Data Processing
Creating a Processing Context
In order to process data with fluxEngine, a processing context must be created. A processing context knows about:
The model that is to be processed
How processing will be parallelized (it takes that information from the current handle) – changing parallelization settings will invalidate a given context
What kind of data is to be processed each time (full HSI cubes or individual PushBroom frames)
The size of data that is to be processed. fluxTrainer models are designed to be camera-independent (to an extent), and thus do not know about the actual spatial dimensions of the data that is to be processed. But once processing is to occur, the spatial dimensions have to be known
The input wavelengths of the data being processed. While a model built in fluxTrainer specifies the wavelengths that will be used during processing, cameras of the same model don't map the exact same wavelengths onto the same pixels, due to production tolerances. For this reason cameras come with calibration information that tells the user what the precise wavelengths of the camera are. The user must specify the actual wavelengths of the input data, so that fluxEngine can interpolate those onto the wavelength range given in the model
Any white (and dark) reference data that is applicable to processing
There are two types of processing contexts that can be created: one for HSI cubes, one for PushBroom frames.
HSI Cube Processing Contexts
To process entire HSI cubes the user must use the constructor of ProcessingContext that takes a ProcessingContext.HSICube argument. It has the following parameters:
The model that is to be processed
The storage order of the cube (BSQ, BIL, or BIP)
The scalar data type of the cube (e.g. 8 bit unsigned integer)
The spatial dimensions of the cube that is to be processed
The wavelengths of the cube
Whether the input data is in intensities or reflectances
An optional set of white reference measurements
An optional set of dark reference measurements
There are two ways to specify the spatial dimensions of a given cube. The first is to fix them at this point, allowing the user to process only cubes that have exactly this size with the processing context. The alternative is to leave them variable, but specify a maximum size. This has the advantage that the user can process differently sized cubes with the same context, but the major disadvantage that if a white reference is used, it will be averaged along all variable axes, meaning that any spatial information of the reference data will be averaged out. (It is also possible to fix only one of the spatial dimensions.)
For referencing it is typically useful to average multiple measurements to reduce the effect of noise. For this reason, any references that are provided have to be tensors of 4th order, with an additional dimension at the beginning for the averages. For example, a cube in BSQ storage order has the shape (λ, y, x), so the references must have the shape (N, λ, y, x), where N may be any positive number, indicating the number of measurements that are to be averaged. A cube in BIP storage order would have a shape of (y, x, λ), leading to a reference shape of (N, y, x, λ).
Note
It is possible to supply only a single cube as a reference measurement; in that case N would be 1. The structure of the data is then effectively only a tensor of third order, but the additional dimension still has to be specified. The function numpy.expand_dims may be used for this purpose to add the additional dimension:
referenceData = numpy.expand_dims(cube, axis=0)
Reference cubes must always have the same storage order as the cubes that are to be processed.
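For instance, if several separate reference cubes have been measured, they can be combined into the required 4th-order tensor by stacking them along a new first axis. The following is only a sketch: captureWhiteReferenceCube() is a hypothetical placeholder for however the reference data is actually acquired, and referenceInfo is a ReferenceInfo object as used in the examples below.
# Hypothetical acquisition of three white reference cubes, each in
# BSQ storage order with shape (bands, y, x)
cubes = [captureWhiteReferenceCube() for _ in range(3)]
# Stack along a new first axis to obtain shape (3, bands, y, x);
# fluxEngine averages over that first axis
referenceInfo.whiteReference = numpy.stack(cubes, axis=0)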
The first example here shows how to create a processing context without any references, assuming that the input data is already in reflectances, with a 32-bit floating point data type, and fixed spatial dimensions:
width = 1024
height = 2150
wavelengths = [900, 901.5, 903, ...]
referenceInfo = fluxEngine.ReferenceInfo(fluxEngine.ValueType.Reflectance)
context = fluxEngine.ProcessingContext(model, fluxEngine.ProcessingContext.HSICube,
                                       storageOrder=fluxEngine.HSICube_StorageOrder.BSQ,
                                       dataType=numpy.float32,
                                       maxHeight=height, height=height,
                                       maxWidth=width, width=width,
                                       wavelengths=wavelengths, referenceInfo=referenceInfo)
Alternatively, to create a processing context that uses a white reference cube, and where the y dimension has a variable size, the following code could be used:
width = 1024
height = 2150
wavelengths = [900, 901.5, 903, ...]
referenceInfo = fluxEngine.ReferenceInfo(fluxEngine.ValueType.Intensity)
# this is just an example, the real reference data
# would come from somewhere
referenceInfo.whiteReference = numpy.ones((1, len(wavelengths), 40, width), numpy.uint8)
context = fluxEngine.ProcessingContext(model, fluxEngine.ProcessingContext.HSICube,
                                       storageOrder=fluxEngine.HSICube_StorageOrder.BSQ,
                                       dataType=numpy.uint8,
                                       maxHeight=height, height=-1,
                                       maxWidth=width, width=width,
                                       wavelengths=wavelengths, referenceInfo=referenceInfo)
Note
The maximum size specified here also determines how much RAM is allocated in fluxEngine internally. Specifying an absurdly large number will cause fluxEngine to exhaust system memory.
Note
The white and dark references may have different spatial dimensions if those dimensions are specified as variable. In the above example, if a dark reference were to be specified, it would have to have the same width and number of bands (because those are both fixed), but it could have a different height.
PushBroom Frame Processing Contexts
To process individual PushBroom frames the user must use the constructor of ProcessingContext that takes a ProcessingContext.PushBroomFrame argument. It has the following parameters:
The model that is to be processed
The storage order of the PushBroom frame
The scalar data type of each PushBroom frame
The spatial width of each PushBroom frame (which will be the actual width of each image if LambdaY storage order is used, or the height of each image if LambdaX storage order is used)
The wavelengths of each PushBroom frame
Whether the input data is in intensities or reflectances
An optional set of white reference measurements
An optional set of dark reference measurements
As PushBroom processing can be thought of as a means to incrementally build up an entire cube (but process data on each line individually), the spatial width must be fixed and cannot be variable. (The number of frames processed, i.e. the number of calls to ProcessingContext.processNext(), is variable, though.)
As it is often useful to average multiple reference measurements to reduce noise, the white and dark references must be supplied as tensors of third order, with a dimension structure of (N, x, λ) or (N, λ, x), depending on the storage order.
The following example shows how to set up a processing context without any references, assuming the input data is already in reflectances, stored as 32-bit floating point numbers:
width = 320
wavelengths = [900, 901.5, 903, ...]
referenceInfo = fluxEngine.ReferenceInfo(fluxEngine.ValueType.Reflectance)
context = fluxEngine.ProcessingContext(model, fluxEngine.ProcessingContext.PushBroomFrame,
                                       storageOrder=fluxEngine.PushBroomFrame_StorageOrder.LambdaY,
                                       dataType=numpy.float32, width=width,
                                       wavelengths=wavelengths, referenceInfo=referenceInfo)
Alternatively, if both white and dark reference measurements are to be supplied, and the input data consists of unsigned 8-bit integers, one could use the following code:
width = 640
wavelengths = [900, 901.5, 903, ...]
referenceInfo = fluxEngine.ReferenceInfo(fluxEngine.ValueType.Intensity)
# Replace this with actually obtaining the reference data.
# In this example the white reference contains 5 measurements,
# and the dark reference contains 10. Since this uses LambdaY
# storage order, these references are effectively HSI cubes
# in BIL storage order.
referenceInfo.whiteReference = numpy.ones((5, len(wavelengths), width), numpy.uint8)
referenceInfo.darkReference = numpy.ones((10, len(wavelengths), width), numpy.uint8)
context = fluxEngine.ProcessingContext(model, fluxEngine.ProcessingContext.PushBroomFrame,
                                       storageOrder=fluxEngine.PushBroomFrame_StorageOrder.LambdaY,
                                       dataType=numpy.uint8, width=width,
                                       wavelengths=wavelengths, referenceInfo=referenceInfo)
Processing Data
Once a processing context has been set up, the user may use it to process data. This happens in two steps:
Set the data pointer for the source data that is to be processed
Process the data
The first step has to be performed each time new data is to be processed. (This is different from the C/C++ API.) Simply provide the source data in the form of a numpy array with the right dimensions.
For HSI cubes the numpy array must have three dimensions in the right storage order, and the number of wavelengths must be fixed. The width and height might be fixed, depending on how the processing context was created.
For PushBroom frames the numpy array must have two dimensions in the right storage order, and both must have the right size, depending on the parameters with which the processing context was created.
fluxEngine will check that the data type and dimensions of the input data match the processing context, and will throw an exception if they do not.
Note
At the moment fluxEngine will create a copy of the data provided here, as the Python/NumPy memory ownership model doesn't match the memory model within fluxEngine.
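If the dimensions or data type of the incoming data may vary at runtime, the call can be guarded accordingly. The following is only a sketch: the concrete exception type raised by fluxEngine is not specified here, so a generic handler is used.
try:
    context.setSourceData(sourceData)
except Exception as e:
    # the data type or dimensions did not match the processing context
    print("Input rejected by fluxEngine: {0}".format(e))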
Once the data pointer has been set, the user may process the data with the ProcessingContext.processNext() method. Therefore, to process data with fluxEngine in Python, the following two method calls are required:
# fetch the source data as a numpy array
sourceData = ...
context.setSourceData(sourceData)
context.processNext()
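For PushBroom frames these two calls are simply repeated for every incoming frame. The following is a sketch only: moreFramesAvailable() and acquireFrame() are hypothetical placeholders for the actual camera interface.
while moreFramesAvailable():
    # each frame is a 2D numpy array matching the processing context
    frame = acquireFrame()
    context.setSourceData(frame)
    context.processNext()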
After processing has completed, the user may obtain the results (see the next section).
PushBroom Resets
As PushBroom cameras can be thought of as incrementally building up a cube line by line, at some point the user may want to indicate that the current cube is considered complete and a new cube starts. In that case the processing context has to be reset, so that all stateful operations, such as object detection, but also kernel-based operations, are reset as well.
To achieve this the method ProcessingContext.resetState() exists. Its usage is simple:
context.resetState()
Note
There is no requirement to actually perform such a reset. If fluxEngine is used to process PushBroom frames that are obtained from a camera above a conveyor belt in a continuous process, for example, it is possible to simply process all incoming frames in a loop and never call the reset method. In that situation the reset method would be called, though, if the conveyor belt is stopped and has to be started up again. Depending on the specific application, that could also mean that a processing context would have to be created again, for example because references have to be measured again, and a simple state reset is not sufficient.
Obtaining Results
After processing has completed via the ProcessingContext.processNext() method, fluxEngine provides a means for the user to obtain the results of that operation.
When designing the model in fluxTrainer that is to be used here, Output Sinks should be added to the model wherever processing results are to be obtained later.
Note
If a model contains no output sinks, it can be processed with fluxEngine, but the user will have no possibility of extracting any kind of result from it.
fluxEngine provides a means to introspect a model to obtain information about the output sinks that it contains. The following two identifiers for output sinks have to be distinguished:
The output sink index: this is just a number starting at 0 and ending at one below the number of output sinks, which may be used to specify the output sink for the purposes of the fluxEngine API. The ordering of output sinks according to this index is non-obvious. Loading the same .fluxmdl file will lead to the same order, but saving models with the same configuration (but constructed separately) can lead to different orders of output sinks.
The output id: a user-assignable id in fluxTrainer that can be used to mark output sinks for a specific purpose. The output id may not be unique (but should be), and is purely there for informational purposes.
For each output sink in the model the user will be able to obtain the output id of that sink. There is also a method ProcessingContext.findOutputSink() that can locate an output sink if the output id of that sink is unique. It will return the index of the sink with that output id.
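For example, if a sink was given a unique output id of 1 in fluxTrainer (the id value here is just an illustration):
sinkIndex = context.findOutputSink(1)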
To obtain information about all output sinks in the context, the method ProcessingContext.outputSinkInfos() exists, which returns a list of OutputSinkInfo objects that contain information about each output sink. The order of the list also defines the output sink index.
sinkInfos = context.outputSinkInfos()
for i, sinkInfo in enumerate(sinkInfos):
    print("Sink with index {0} has name {1}".format(i, sinkInfo.name))
The OutputSinkInfo structure contains the following information:
The output id of the output sink
The name of the output sink as a string (this is the name the user specified when creating the model in fluxTrainer)
The output delay of the output sink (only relevant when PushBroom data is being processed; see Output Delay for PushBroom Cameras for a more detailed discussion of this)
Data structure: what kind of data the output sink will return
Output sinks store either tensor data or detected objects, depending on the configuration of the output sink, and where it sits in the processing chain.
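To distinguish the two cases at runtime, one possibility is to inspect the type of the structure object of the sink. The following sketch assumes that the structure classes described below can be checked via isinstance in the Python bindings:
info = context.outputSinkInfos()[sinkIndex]
if isinstance(info.structure, fluxEngine.OutputSinkTensorStructure):
    print("sink returns tensor data")
else:
    print("sink returns a list of detected objects")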
Tensor Data
Tensor data is always returned as a NumPy array. Tensor data within fluxEngine always has a well-defined storage order, as most algorithms that work on hyperspectral data are at their most efficient in this memory layout. While fluxEngine supports input data of arbitrary storage order, it will be converted to the internal storage order at the beginning of processing. The output data will always have the following structure:
When processing entire HSI cubes it will effectively return data in BIP storage order, meaning that the dimension structure will be (y, x, λ) (for data that still has wavelength information) or (y, x, P), where P is a generic dimension, if the data has already passed through dimensionality reduction filters such as PCA.
When processing PushBroom frames it will effectively return data in LambdaX storage order, with an additional dimension of size 1 at the beginning. In that case it will either be (1, x, λ) or (1, x, P).
Pure per-object data always has a tensor structure of order 2, in the form of (N, λ) or (N, P), where N describes the number of objects detected in this processing iteration. Important: objects themselves are returned as a structure (see below); per-object data is data that is the output of filters such as the Per-Object Averaging or Per-Object Counting filter. Also note that output sinks can combine objects and per-object data, in which case the per-object data will be returned as part of the object structure.
For PushBroom data it is recommended to always combine per-object data with objects before interpreting it, as the output delay of both nodes may be different, and when combining the data the user does not have to keep track of the relative delays themselves.
The output structure of a tensor data output sink is stored in an OutputSinkTensorStructure object assigned to the OutputSinkInfo.structure attribute of the OutputSinkInfo object that characterizes the output sink. It contains the following information:
The scalar data type of the tensor data (this is the same as the data type configured in the output sink)
The order of the tensor, which will be 2 or 3 (see above)
The maximum sizes of the tensor that can be returned here
The fixed sizes of the tensor that will be returned here. If the tensors returned here are always of the same size, the values here will be the same as the maximum sizes. Any dimension that is not always the same will have a value of -1 instead. If all of the sizes of the tensor returned here are fixed, the tensor returned will always be of the same size. (There is one notable exception: if the output sink has a non-zero output delay of m, the first m processing iterations will produce a tensor that does not have any data.)
Please refer to the documentation of the OutputSinkTensorStructure class for more information on how this information is returned.
Using ProcessingContext.outputSinkData() it is possible to obtain that tensor data after a successful processing step. For example, if we know a given sink with index sinkIndex has signed 16-bit integer data that spans the entire cube that is being processed (when processing HSI cubes), the following code could be used to obtain the results:
# Obtain sink index from somewhere
# e.g. sinkIndex = context.findOutputSink(42)
sinkIndex = ...
data = context.outputSinkData(sinkIndex)
# data is now a numpy.array with shape (cubeHeight, cubeWidth, 1)
# and of type numpy.int16
print(data)
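As a PushBroom-oriented counterpart, the per-frame tensor outputs (with shape (1, x, λ) as described above) can be concatenated into a full cube. This is only a sketch: frames is assumed to be an iterable of input frames, sinkIndex is assumed to refer to a tensor data sink, and testing for an empty result via the array's size is an assumption about how results within the output delay are represented.
lines = []
for frame in frames:
    context.setSourceData(frame)
    context.processNext()
    data = context.outputSinkData(sinkIndex)
    if data.size == 0:
        continue  # still within the output delay of this sink
    lines.append(data)
# concatenate the (1, x, lambda) slices along the first (y) axis
cube = numpy.concatenate(lines, axis=0)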
Object Data
Objects will be returned as a list of OutputObject instances, which contain information about objects that were detected in the model.
The output structure of an object list data output sink is stored in an OutputSinkObjectListStructure object assigned to the OutputSinkInfo.structure attribute of the OutputSinkInfo object that characterizes the output sink. It contains the following information:
The maximum number of objects that can be returned in a single iteration.
Whether per-object data was output using the output sink in the model, and if so, how large it is (per-object data is always a vector, i.e. a tensor of order 1)
The scalar type of per-object data (if any)
Please refer to the documentation of OutputSinkObjectListStructure for more information on how this information is returned.
Using ProcessingContext.outputSinkData() it is possible to obtain that object data after a successful processing step. It may be called in the following manner, assuming sinkIndex is a sink that is known to return objects:
# Obtain sink index from somewhere
# e.g. sinkIndex = context.findOutputSink(42)
sinkIndex = ...
objectList = context.outputSinkData(sinkIndex)
for obj in objectList:
    x, y, width, height = obj.boundingBox
    f = "Object: bbox topleft [{0}, {1}] -- bottomright [{2}, {3}], area {4}"
    print(f.format(x, y, x + width - 1, y + height - 1, obj.area))
The mask and additionalData entries of each object are NumPy arrays.
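They can therefore be used directly with standard NumPy operations. As a small sketch (assuming the mask marks object pixels with non-zero values, which is an assumption about its contents):
# count the pixels that belong to the object
maskPixelCount = numpy.count_nonzero(obj.mask)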
Note
When comparing this to the C/C++ APIs: the Python API will use only the extended object logic internally, and always provide all possible information for each object. Also, the numpy arrays that are fields of the fluxEngine.OutputObject structure are always copied by the Python API before returning this information to the user, so the user need not care about the lifetime of that data, which they would have to do in C/C++.