Data Processing
To process data with fluxEngine, a so-called processing context must be created. A processing context knows about:

- How processing will be parallelized (it takes that information from the current handle) – changing parallelization settings will invalidate a given context
- The instrument device that supplies the data that is being processed. This is currently either a PushBroom camera or a spectrometer. (It is also possible to process data loaded from files on disk; see the section Processing an ENVI cube.)
- What kind of processing should occur:
  - Should a preview of the data be generated?
  - Should the data be preprocessed so that it may be recorded to disk?
  - Should a fluxEngine model be used to process the data from the device?
- Any white and dark reference data that should be used here
Processing contexts are represented by the fluxEngine::ProcessingContext class. Special static methods are available to create processing contexts for various purposes.
After a context has been created, data may be processed. The basic logic is the following:

1. Set the source (input) data of the context to a buffer from the instrument device via the correct overload of fluxEngine::ProcessingContext::setSourceData(). Note that this only sets a pointer in the processing context; the buffer must not be returned to the instrument device until processing has completed.
2. Call ctx.processNext() to process the data supplied to the processing context.
3. Obtain the result of the processing via the outputSinkData() method. This returns only a pointer to the processing result that is stored internally in the processing context. This is described in detail further down in Obtaining Results.
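Condensed into code, a single processing iteration looks like the following sketch. This is only an outline: ctx and instrumentDevice are assumed to have been set up already, and the one-second timeout is an arbitrary choice; complete examples follow in the next sections.

fluxEngine::BufferInfo buffer = instrumentDevice->retrieveBuffer(std::chrono::seconds{1});
if (buffer.ok) {
    ctx.setSourceData(buffer);                 // step 1: point the context at the buffer
    ctx.processNext();                         // step 2: process the data
    auto outputData = ctx.outputSinkData(0);   // step 3: fetch the result
    // ... use outputData before the next iteration ...
    instrumentDevice->returnBuffer(buffer.id); // safe now that processing has completed
}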
The following sections will describe different types of processing contexts that may be created for processing data from instrument devices.
Finally, it is possible to manually supply fluxEngine with data that it should process. This is the most complicated setup, as the user must provide fluxEngine with many details about how the data is laid out in memory. This is described in the section Manual Data Processing.
Instrument Preview
Sometimes it is useful to obtain a preview image that may be shown to the user. The following device types will generate the following preview data:
- PushBroom cameras: the preview data will consist of data that is averaged over all wavelengths, so that only an intensity value is returned. The resulting tensor structure will always be (1, width, 1) for each buffer that is being processed.
- Spectrometers: the preview data will be a single spectrum (tensor structure (bands)) that may be plotted.
- HSI imager cameras: the preview data will be a single grayscale image that consists of the intensities averaged over all wavelengths. The tensor structure of the preview data will always be (height, width, 1) for each buffer that is being processed.
To create a processing context for preview data, one may use the fluxEngine::ProcessingContext::createInstrumentPreviewContext() static method. There are two overloads for this method: one that just takes a fluxEngine::InstrumentDevice pointer and will create the processing context associated with the main processing queue set of the fluxEngine handle, and one that takes an additional reference to a fluxEngine::ProcessingQueueSet.
To use the processing context to create the preview, one may call the overload of fluxEngine::ProcessingContext::setSourceData() that takes a fluxEngine::InstrumentDevice::BufferInfo argument. After the source data has been set, a call to fluxEngine::ProcessingContext::processNext() will perform the data processing.
The following example code will endlessly loop to create preview images from data that is acquired from a PushBroom camera:
try {
    fluxEngine::ProcessingContext ctx;
    ctx = fluxEngine::ProcessingContext::createInstrumentPreviewContext(instrumentDevice);

    fluxEngine::InstrumentDevice::AcquisitionParameters parameters;
    instrumentDevice->startAcquisition(parameters);
    while (true) {
        fluxEngine::BufferInfo buffer = instrumentDevice->retrieveBuffer(std::chrono::seconds{1});
        if (!buffer.ok)
            continue;
        ctx.setSourceData(buffer);
        ctx.processNext();
        // See below for details on how the output sink logic works.
        auto outputData = ctx.outputSinkData(0);
        // Get a tensor view on the output data
        TensorData view{outputData};
        assert(view.dimension(0) == 1);
        assert(view.dimension(2) == 1);
        int64_t const width = view.dimension(1);
        for (int64_t x = 0; x < width; ++x) {
            float pixelValue{};
            if (view.dataType() == fluxEngine::DataType::Float32)
                pixelValue = view.at<float>(0, x, 0);
            else if (view.dataType() == fluxEngine::DataType::Float64)
                pixelValue = static_cast<float>(view.at<double>(0, x, 0));
            // (etc. handle the other cases)

            // Do something with the pixel value here
        }
        instrumentDevice->returnBuffer(buffer.id);
    }
    instrumentDevice->stopAcquisition();
} catch (std::exception& e) {
    std::cerr << "An error occurred: " << e.what() << std::endl;
    exit(1);
}
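For a spectrometer the same loop applies, but the preview is a single spectrum of tensor structure (bands). The following sketch of the loop body assumes that the preview sink returns an order-1 tensor, that a single-index at<>() access works for such a tensor, and that the data is Float32; in real code the data type should be checked via dataType() as in the example above.

ctx.setSourceData(buffer);
ctx.processNext();
TensorData view{ctx.outputSinkData(0)};
// Spectrometer preview: a single spectrum with structure (bands)
int64_t const bands = view.dimension(0);
for (int64_t b = 0; b < bands; ++b) {
    float value = view.at<float>(b); // assumes Float32 data; check dataType()
    // e.g. add (b, value) as a point to a spectrum plot
}
instrumentDevice->returnBuffer(buffer.id);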
The context that has been created in this manner may be reused for multiple acquisitions. Note, however, that the structure of the input data at the time of context creation determines what kind of data the context expects. If parameters are changed, especially things such as the ROI, the context may no longer be compatible with the buffers that the device provides.
Recording HSI Data
fluxEngine also has the capability of preprocessing data so that it can be stored as HSI data cubes.
There are certain preprocessing steps that are always done, such as applying corrections to the data obtained from the camera (if the camera has corrections that are to be applied in software), as well as normalizing the storage order. (fluxEngine always uses the BIP data layout internally, with wavelengths in ascending order.)
Additionally, the user may request further normalizations:

- The user may select whether they want to store the recorded data as intensities or as reflectances. If the data is stored as intensities, the user has the option to include white and dark references (if measured).
- The user may select a wavelength grid to use instead of the raw wavelengths from the camera. Due to manufacturing tolerances, hyperspectral cameras will have slight variations in the wavelengths that each pixel corresponds to. For this reason fluxEngine provides the user with the ability to normalize the wavelengths onto a regularized grid, as sketched below. (When fluxEngine processes data in models, wavelengths are always normalized, but this allows the user to already perform the normalization during recording.)
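For example, a regular grid can be built as a simple vector of wavelengths and passed as the target wavelengths when creating a recording context (see the example further down). The 450–950 nm range and 2 nm step here are illustration values only; they must be chosen to lie within the camera's actual spectral range.

// Build a regular target wavelength grid.
// Range and step size are assumptions for illustration.
std::vector<double> targetWavelengths;
for (double wavelength = 450.0; wavelength <= 950.0; wavelength += 2.0)
    targetWavelengths.push_back(wavelength);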
The processing context that is created for recording HSI data requires more input than a preview context. In addition to the device itself (in order to automatically obtain the structure of the input data), the context requires additional information.

The output tensor structure will always be (1, width, bands) for each PushBroom line that is being processed.

The white and dark references may be provided via the fluxEngine::ProcessingContext::InstrumentParameters structure. It allows the user to comfortably provide buffer containers for this purpose (see Measuring References).
The static method used to create a processing context for this purpose is fluxEngine::ProcessingContext::createInstrumentHSIRecordingContext().
The following example shows how the white and dark reference buffer containers that were recorded in Measuring References can be used for creating a processing context for recording HSI data:
// Declare variables for later use
fluxEngine::ProcessingContext::HSIRecordingResult ctxAndInfo;
fluxEngine::ProcessingContext ctx;
// For storing the recording result
fluxEngine::BufferContainer recordedData;
// The following were measured previously:
fluxEngine::BufferContainer whiteReference, darkReference;
try {
    fluxEngine::ProcessingContext::InstrumentParameters parameters;
    parameters.whiteReference = &whiteReference;
    parameters.darkReference = &darkReference;
    // Measure raw intensities
    fluxEngine::ValueType valueType = fluxEngine::ValueType::Intensity;
    // Empty vector -> don't normalize wavelength grid
    std::vector<double> targetWavelengths = {};

    ctxAndInfo = fluxEngine::ProcessingContext::createInstrumentHSIRecordingContext(instrumentDevice,
        valueType, parameters, targetWavelengths);
    ctx = std::move(ctxAndInfo.context);

    // ctxAndInfo.wavelengths contains the actual wavelengths
    // of the HSI data
    // ctxAndInfo.whiteReference contains the white reference
    // data after it has been normalized in the same manner
    // as the original data (or NULL to indicate it's not
    // present)
    // ctxAndInfo.darkReference contains the dark reference
    // data after it has been normalized in the same manner
    // as the original data (or NULL to indicate it's not
    // present)
    // Other fields contain further information

    // Store up to 1000 lines
    recordedData = fluxEngine::createBufferContainer(ctx, 1000);

    fluxEngine::InstrumentDevice::AcquisitionParameters acquisitionParameters;
    instrumentDevice->startAcquisition(acquisitionParameters);
    // Record exactly 1000 lines
    while (recordedData.count() < 1000) {
        fluxEngine::BufferInfo buffer = instrumentDevice->retrieveBuffer(std::chrono::seconds{1});
        if (!buffer.ok)
            continue;
        ctx.setSourceData(buffer);
        ctx.processNext();
        recordedData.addLastResult(ctx);
        instrumentDevice->returnBuffer(buffer.id);
    }
    instrumentDevice->stopAcquisition();

    // The data was stored in recordedData
} catch (std::exception& e) {
    std::cerr << "An error occurred: " << e.what() << std::endl;
    exit(1);
}
The previous example used a fluxEngine::BufferContainer to also store the result of the recording. That is the simplest way to do this, but it is also possible to access the data directly:
// Record exactly 1000 lines (count them manually, as no
// buffer container is filled in this variant)
int64_t linesProcessed = 0;
while (linesProcessed < 1000) {
    fluxEngine::BufferInfo buffer = instrumentDevice->retrieveBuffer(std::chrono::seconds{1});
    if (!buffer.ok)
        continue;
    ctx.setSourceData(buffer);
    ctx.processNext();
    ++linesProcessed;
    // See below for details on how the output sink logic works.
    auto outputData = ctx.outputSinkData(0);
    // Get a tensor view on the output data
    TensorData view{outputData};
    // For HSI data:
    //   view.order() == 3
    //   view.dimension(0) == 1 (1 line)
    //   view.dimension(1) == width
    //   view.dimension(2) == bands (# wavelengths)
    //   view.dataType() will differ, depending on the device,
    //   and with what options the context was created
    // Here user code could do something with the data
    instrumentDevice->returnBuffer(buffer.id);
}
The previous example selected fluxEngine::ValueType::Intensity and provided a white reference when creating the processing context. The following table illustrates the various possible combinations:
Value Type | White reference supplied | Allowed | White reference included in result
-----------|--------------------------|---------|-----------------------------------
Intensity | no | yes | no
Intensity | yes | yes | yes
Reflectance | no | no [1] | -
Reflectance | yes | yes | yes
It is also possible to specify the white reference directly in the form of raw tensor data, instead of specifying it in the form of buffer containers. Please take a look at the reference documentation of the fluxEngine::ProcessingContext::InstrumentParametersEx structure, which may be passed to fluxEngine::ProcessingContext::createInstrumentHSIRecordingContext() instead, for details on this.
Footnotes
Note
As with all context creation functions, it is also possible to specify a processing queue set as an optional second parameter to the method to associate the context with a different processing queue set.
Model Processing
Finally, it is possible to use data from an instrument as the input of models that have been loaded. The user must first have loaded a fluxEngine model from disk using the functions described in Models.
Creating a processing context for model processing takes the following inputs:

- The instrument device to create the context for
- The model to use
- Optionally a white & dark reference, again in the form of a fluxEngine::ProcessingContext::InstrumentParameters structure supplied by the user
The selected model must be compatible with the input data the connected instrument device generates, otherwise processing context creation will fail.
The static method fluxEngine::ProcessingContext::createInstrumentProcessingContext() is used to create a context for instrument data processing. If the model doesn't require reflectance data as its input, it is not necessary to specify a white reference. Most models, however, will require a white reference, as most models will require input data in reflectances.
The following example code shows how to create a processing context that processes data obtained directly from an instrument:
// These were measured previously
fluxEngine::BufferContainer whiteReference, darkReference;
// This was loaded previously
fluxEngine::Model model;
try {
    fluxEngine::ProcessingContext ctx;

    fluxEngine::ProcessingContext::InstrumentParameters parameters;
    parameters.whiteReference = &whiteReference;
    parameters.darkReference = &darkReference;

    ctx = fluxEngine::ProcessingContext::createInstrumentProcessingContext(instrumentDevice,
        model, parameters);

    fluxEngine::InstrumentDevice::AcquisitionParameters acquisitionParameters;
    instrumentDevice->startAcquisition(acquisitionParameters);
    while (true) {
        fluxEngine::BufferInfo buffer = instrumentDevice->retrieveBuffer(std::chrono::seconds{1});
        if (!buffer.ok)
            continue;
        ctx.setSourceData(buffer);
        ctx.processNext();
        // Obtain result data from the context here,
        // see Obtaining Results below
        instrumentDevice->returnBuffer(buffer.id);
    }
    instrumentDevice->stopAcquisition();
} catch (std::exception& e) {
    std::cerr << "An error occurred: " << e.what() << std::endl;
    exit(1);
}
Sequence Ids
When processing data in models, the buffer number (frame number) is used as a so-called sequence id. For imager cameras and spectrometers this is mostly irrelevant, but for PushBroom cameras fluxEngine uses it to slightly modify its behavior depending on whether individual frames have been lost. PushBroom cameras only provide a single line each time a buffer is returned, and an image is constructed by concatenating lines one after another. A missing buffer means that a line is missing, and if the data is naively concatenated, the missing line will cause distortions.
What fluxEngine does to mitigate this is the following:

- For any model that outputs on a per-line basis, still only the line in question will be processed. (It will not generate additional output for missing lines.)
- For any model that uses algorithms that put together the current line with previous lines (such as object detection), if one or more buffers are missing between the last invocation and the current one, the algorithm will behave as if the current line had been repeated as often as there were buffers missing. For example, if a single buffer is missing, the line after the missing buffer will be repeated once when performing any 2D reconstruction, i.e. it will occur twice.

For data processed from the device directly, the buffer number will be used as the sequence id for this. However, when storing a buffer in a buffer container, the sequence id will not be saved – and when extracting a buffer from a buffer container, the user has the ability to select a sequence id to use instead. (By default it would use the index within the buffer container as the sequence id.)
Note
The behavior of repeating the current line for missing buffers only applies to processing within a fluxEngine model – adding a buffer to a buffer container completely ignores the sequence id; if the user wants similar behavior there, it is up to them to implement it.
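If user code wants to react to lost frames itself, for example to log them, it could track buffer numbers across iterations. The following is only a sketch: the member name bufferNumber is an assumption for illustration, and the actual member of the buffer info structure should be taken from the reference documentation.

int64_t lastNumber = -1;
while (true) {
    fluxEngine::BufferInfo buffer = instrumentDevice->retrieveBuffer(std::chrono::seconds{1});
    if (!buffer.ok)
        continue;
    // `bufferNumber` is an assumed member name for the frame number
    if (lastNumber >= 0 && buffer.bufferNumber > lastNumber + 1) {
        int64_t lost = buffer.bufferNumber - lastNumber - 1;
        // `lost` frames are missing; within a model, fluxEngine behaves
        // as if the current line were repeated `lost` extra times
    }
    lastNumber = buffer.bufferNumber;
    ctx.setSourceData(buffer);
    ctx.processNext();
    instrumentDevice->returnBuffer(buffer.id);
}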
PushBroom Resets
As PushBroom cameras can be thought of as incrementally building up a cube line by line, at some point the user may want to indicate that the current cube is considered complete and a new cube starts. In that case the processing context has to be reset, so that all stateful operations, such as object detection, but also kernel-based operations, are reset as well.
To achieve this, the method ProcessingContext::resetState() exists. Its usage is simple:
try {
    context.resetState();
} catch (std::exception& e) {
    std::cerr << "An error occurred: " << e.what() << std::endl;
    exit(1);
}
It is recommended that the user perform such a reset for any model they use whenever acquisition is stopped and then restarted.
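A stop-and-restart sequence could therefore look like the following sketch; all calls shown here appear in the earlier examples, and error handling follows the same pattern.

try {
    instrumentDevice->stopAcquisition();
    // ... e.g. the conveyor belt is stopped and later started again ...
    ctx.resetState(); // discard stateful data (object detection, kernel-based operations)
    fluxEngine::InstrumentDevice::AcquisitionParameters acquisitionParameters;
    instrumentDevice->startAcquisition(acquisitionParameters);
} catch (std::exception& e) {
    std::cerr << "An error occurred: " << e.what() << std::endl;
    exit(1);
}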
Note
There is no requirement to actually perform such a reset. If fluxEngine is used to process PushBroom frames obtained from a camera above a conveyor belt in a continuous process, for example, it is possible to simply process all incoming frames in a loop and never call the reset method. In that situation the reset method would be called, though, if the conveyor belt is stopped and has to be started up again. Depending on the specific application, that could also mean that the processing context has to be created again, for example because references have to be measured again, and a simple state reset is not sufficient.
Obtaining Results
After processing has completed via the ProcessingContext::processNext() method, fluxEngine provides a means for the user to obtain the results of that operation.
When designing the model in fluxTrainer that is to be used here, Output Sinks should be added to the model wherever processing results are to be obtained later.
For instrument preview and instrument recording processing contexts, an automatic output sink with sink index 0 will be created by fluxEngine so the user may extract the preview and/or recording data. Additionally, for instrument recording contexts, the BufferContainer::addLastResult() method may be used to add the last output data of a model to a buffer container. (That method only works for recording contexts.)
Note
If a model contains no output sinks, it can be processed with fluxEngine, but the user will have no possibility of extracting any kind of result from it.
fluxEngine provides a means to introspect a model to obtain information about the output sinks that it contains. The following two identifiers for output sinks have to be distinguished:

- The output sink index: this is just a number starting at 0 and ending at one below the number of output sinks, and it may be used to specify the output sink for the purposes of the fluxEngine API. The ordering of output sinks according to this index is non-obvious. Loading the same .fluxmdl file will lead to the same order, but saving models with the same configuration (but constructed separately) can lead to different orders of output sinks.
- The output id: a user-assignable id in fluxTrainer that can be used to mark output sinks for a specific purpose. The output id may not be unique (but should be), and is purely there for informational purposes.
For each output sink in the model, the user will be able to obtain the output id of that sink. There is also a method ProcessingContext::findOutputSink() that can locate an output sink if the output id of that sink is unique. It will return the index of the sink with that output id.
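For example, if the model designer assigned a unique output id to the sink of interest in fluxTrainer, the sink index could be looked up as in the following sketch. The output id 42 is purely illustrative, and how findOutputSink() reports a missing or non-unique id is described in the reference documentation.

try {
    // 42 is a hypothetical output id assigned in fluxTrainer
    int sinkIndex = context.findOutputSink(42);
    auto outputData = context.outputSinkData(sinkIndex);
    // ... interpret outputData as described below ...
} catch (std::exception& e) {
    std::cerr << "An error occurred: " << e.what() << std::endl;
    exit(1);
}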
To obtain information about all output sinks in the context, the method ProcessingContext::outputSinkMetaInfos() exists, which returns a vector of simple structs that contain information about each output sink. The index into the vector is also the output sink index.
try {
    auto sinkMetaInfos = context.outputSinkMetaInfos();
    for (std::size_t i = 0; i < sinkMetaInfos.size(); ++i) {
        int sinkIndex = static_cast<int>(i);
        std::cout << "Output sink with index " << sinkIndex << " has name "
                  << sinkMetaInfos[i].name << std::endl;
    }
} catch (std::exception& e) {
    std::cerr << "An error occurred: " << e.what() << std::endl;
    exit(1);
}
The ProcessingContext::OutputSinkMetaInfo structure contains the following information:

- The output id of the output sink
- The name of the output sink as a UTF-8 string (this is the name the user specified when creating the model in fluxTrainer)
- The storage type: what kind of data the output sink will return (the current options are either tensor data or detected objects)
- The output delay of the output sink (only relevant when PushBroom data is being processed; see Output Delay for PushBroom Cameras for a more detailed discussion of this)
Output sinks store either tensor data or detected objects, depending on the configuration of the output sink, and where it sits in the processing chain.
Tensor Data
Tensor data within fluxEngine always has a well-defined storage order, as most algorithms that work on hyperspectral data are at their most efficient in this memory layout. While fluxEngine supports input data of arbitrary storage order, it will be converted to the internal storage order at the beginning of processing. The output data will always have the following structure:
- When processing entire HSI cubes, it will effectively return data in BIP storage order, meaning that the dimension structure will be (y, x, λ) (for data that still has wavelength information) or (y, x, P), where P is a generic dimension, if the data has already passed through dimensionality reduction filters such as PCA (see the indexing sketch after this list).
- When processing PushBroom frames, it will effectively return data in LambdaX storage order, with an additional dimension of size 1 at the beginning. In that case it will either be (1, x, λ) or (1, x, P).
- Pure per-object data always has a tensor structure of order 2, in the form of (N, λ) or (N, P), where N describes the number of objects detected in this processing iteration. Important: objects themselves are returned as a structure (see below); per-object data is data that is the output of filters such as the Per-Object Averaging or Per-Object Counting filter. Also note that output sinks can combine objects and per-object data, in which case the per-object data will be returned as part of the object structure.
- For PushBroom data it is recommended to always combine per-object data with objects before interpreting it, as the output delay of both nodes may be different, and when combining the data the user does not have to keep track of the relative delays themselves.
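For dense BIP data this layout makes manual indexing straightforward. The following sketch computes the offset of a single element of a (1, width, bands) PushBroom result, using the data structure returned by ProcessingContext::outputSinkData() as in the manual-access example further down; that the strides array covers all three dimensions here is an assumption to be checked against the reference documentation.

// Sketch: manual BIP indexing for a (1, width, bands) output tensor.
// For densely packed BIP data, data.strides[2] == 1 and
// data.strides[1] == bands.
std::int64_t x = 0, b = 0; // some pixel and band of interest
std::int64_t index = 0 * data.strides[0]  // line (always 0 for PushBroom output)
                   + x * data.strides[1]  // pixel within the line
                   + b * data.strides[2]; // band
// Interpret data.data according to the sink's scalar data type, e.g.:
float value = static_cast<float const*>(data.data)[index]; // assumes Float32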
To obtain the tensor structure of a given output sink, the method ProcessingContext::outputSinkTensorStructure() is available. It returns the following information:
- The scalar data type of the tensor data (this is the same as the data type configured in the output sink)
- The order of the tensor, which will be 2 or 3 (see above)
- The maximum sizes of the tensor that can be returned here
- The fixed sizes of the tensor that will be returned here. If the tensors returned here are always of the same size, the values here will be the same as the maximum sizes. Any dimension that is not always the same will have a value of -1 instead. If all of the sizes of the tensor returned here are fixed, the tensor returned will always be of the same size. (There is one notable exception: if the output sink has a non-zero output delay of m, the first m processing iterations will produce a tensor that does not have any data.)
Please refer to the documentation of ProcessingContext::OutputSinkTensorStructure for more information on how this information is returned.
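As a sketch of how this structure might be used, the following code checks which dimensions of a sink's output are fixed. The member names used here (order, fixedSizes, maxSizes) are assumptions for illustration only; the actual names are listed in the reference documentation.

try {
    auto structure = context.outputSinkTensorStructure(sinkIndex);
    // NOTE: the member names below are assumptions, not the verified API
    for (int d = 0; d < structure.order; ++d) {
        if (structure.fixedSizes[d] < 0)
            std::cout << "dimension " << d << " varies, up to "
                      << structure.maxSizes[d] << std::endl;
        else
            std::cout << "dimension " << d << " is fixed at "
                      << structure.fixedSizes[d] << std::endl;
    }
} catch (std::exception& e) {
    std::cerr << "An error occurred: " << e.what() << std::endl;
    exit(1);
}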
Using ProcessingContext::outputSinkData() it is possible to obtain that tensor data after a successful processing step. It will also return information about the scalar data type, the order, and the stride structure of the resulting tensor, even if that information is in principle reconstructible from the data obtained via the ProcessingContext::outputSinkTensorStructure() method.
Tensor data retrieved from an output sink may be cast into the convenient fluxEngine::TensorData wrapper that allows easy access to tensor elements.
For example, if we know a given sink with index sinkIndex has signed 16-bit integer data that spans one line of the cube being built up (when processing PushBroom data), the following code could be used to obtain the results:
/* obtained from previous introspection */
int sinkIndex = ...;
std::int64_t cube_width = ...;
/* from current buffer */
int64_t bufferNumber = ...;
try {
    auto data = context.outputSinkData(sinkIndex);
    TensorData view{data};
    // PushBroom data
    assert(view.dimension(0) == 1);
    assert(view.dimension(1) == cube_width);
    assert(view.dimension(2) == 1);

    for (std::int64_t x = 0; x < cube_width; ++x) {
        std::cout << "Classification result for pixel (" << x << ", " << bufferNumber << ") = "
                  << view.at<int16_t>(0, x, 0) << "\n";
    }
    std::cout.flush();
} catch (std::exception& e) {
    std::cerr << "An error occurred: " << e.what() << std::endl;
    exit(1);
}
Alternatively, if fluxEngine::TensorData is not used and access happens manually, the following code will give the same output:
/* obtained from previous introspection */
int sinkIndex = ...;
std::int64_t cube_width = ...;
/* from current buffer */
int64_t bufferNumber = ...;
try {
    auto data = context.outputSinkData(sinkIndex);
    auto classificationData = static_cast<int16_t const*>(data.data);
    /* Classification results have an inner dimension of 1, so the
     * actual sizes should be (1, cube_width, 1) for PushBroom data.
     */
    assert(data.sizes[0] == 1);
    assert(data.sizes[1] == cube_width);
    assert(data.sizes[2] == 1);

    int64_t strideY = data.strides[0];
    int64_t strideX = data.strides[1];

    for (std::int64_t x = 0; x < cube_width; ++x) {
        std::int64_t index = 0 * strideY + x * strideX;
        std::cout << "Classification result for pixel (" << x << ", " << bufferNumber << ") = "
                  << classificationData[index] << "\n";
    }
    std::cout.flush();
} catch (std::exception& e) {
    std::cerr << "An error occurred: " << e.what() << std::endl;
    exit(1);
}