This code snippet demonstrates how the DNN module with safe DLA enabled is typically used. Note that error handling is left out for clarity.
Initialize network from file.
To use safe DLA, the model must be generated with the --useSafeDLA option of the TensorRT Optimizer Tool. When initializing DNN, the processor type must be either DW_PROCESSOR_TYPE_DLA_0 or DW_PROCESSOR_TYPE_DLA_1, depending on which DLA engine the inference should run on.
struct dwDNNObject * dwDNNHandle_t
Handles representing Deep Neural Network interface.
DW_API_PUBLIC dwStatus dwDNN_initializeTensorRTFromFile(dwDNNHandle_t *const network, const char8_t *const modelFilename, const dwDNNPluginConfiguration *const pluginConfiguration, dwProcessorType const processorType, dwContextHandle_t const context)
Creates and initializes a TensorRT Network from file.
Check that the loaded network has the expected number of inputs and outputs.
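uint32_t numInputs = 0;
uint32_t numOutputs = 0;
// dnn is the network handle initialized above (variable name assumed).
dwDNN_getInputBlobCount(&numInputs, dnn);
dwDNN_getOutputBlobCount(&numOutputs, dnn);
if (numInputs != 1) {
    std::cerr << "Expected a DNN with one input blob." << std::endl;
    return -1;
}
if (numOutputs != 1) {
    std::cerr << "Expected a DNN with one output blob." << std::endl;
    return -1;
}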
DW_API_PUBLIC dwStatus dwDNN_getInputBlobCount(uint32_t *const count, dwDNNHandle_t const network)
Gets the input blob count.
DW_API_PUBLIC dwStatus dwDNN_getOutputBlobCount(uint32_t *const count, dwDNNHandle_t const network)
Gets the output blob count.
Ask the DNN about the order of the input and output blobs. The network is assumed to contain the input blob "data_in" and the output blob "data_out1".
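uint32_t inputIndex = 0;
uint32_t output1Index = 0;
dwDNN_getInputIndex(&inputIndex, "data_in", dnn); // dnn: network handle from initialization (name assumed)
dwDNN_getOutputIndex(&output1Index, "data_out1", dnn);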
DW_API_PUBLIC dwStatus dwDNN_getOutputIndex(uint32_t *const blobIndex, const char8_t *const blobName, dwDNNHandle_t const network)
Gets the index of an output blob with a given blob name.
DW_API_PUBLIC dwStatus dwDNN_getInputIndex(uint32_t *const blobIndex, const char8_t *const blobName, dwDNNHandle_t const network)
Gets the index of an input blob with a given blob name.
Note that safe DLA requires RGBA input with interleaved channels, and it provides outputs in NCHWx format.
The layout of the NCHWx format is equivalent to a C array with dimensions [N][(C+x-1)/x][H][W][x], with the tensor coordinates (n, c, h, w) mapping to the array subscript [n][c/x][h][w][c%x], where:
N: Batch size
n: Batch index
C: Number of planes
c: Plane index
H: Height
h: Vertical index
W: Width
w: Horizontal index
x: Number of interleaved elements
DLA dictates that x is equal to 32 / sizeof(DataType); therefore, for a tensor with FP16 precision, x is 16.
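The index mapping above can be sketched in a few lines of C++. This is only an illustration of the layout arithmetic: interleaveFor and nchwxOffset are hypothetical helper names, not DriveWorks calls, and the sketch assumes a densely packed buffer addressed in elements rather than bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of interleaved elements x for a given element size: 32 bytes per vector.
constexpr std::size_t interleaveFor(std::size_t elemSizeBytes)
{
    return 32 / elemSizeBytes;
}

// FP16 elements are 2 bytes wide, so x is 16.
static_assert(interleaveFor(sizeof(std::uint16_t)) == 16, "FP16 implies x == 16");

// Hypothetical helper (not a DriveWorks call): linear element offset of the
// coordinate (n, c, h, w) in a buffer laid out as [N][(C+x-1)/x][H][W][x].
inline std::size_t nchwxOffset(std::size_t n, std::size_t c, std::size_t h, std::size_t w,
                               std::size_t C, std::size_t H, std::size_t W, std::size_t x)
{
    assert(c < C && h < H && w < W && x > 0);
    const std::size_t blocks = (C + x - 1) / x; // channel blocks, rounded up
    // Subscript [n][c/x][h][w][c%x] flattened in row-major order.
    return (((n * blocks + c / x) * H + h) * W + w) * x + (c % x);
}
```

For example, with C = 20, H = W = 4 and x = 16, channel 17 at (h, w) = (0, 0) falls into block 1, lane 1, so nchwxOffset(0, 17, 0, 0, 20, 4, 4, 16) evaluates to 1 * 4 * 4 * 16 + 1 = 257.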
Moreover, the input and output of a safe DLA model are expected to be tensors of type NvMedia. To simplify inference, the dwDataConditioner and dwDNN modules provide the streaming and conversion functionality.
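To make concrete what an NCHWx to NCHW conversion has to do, here is a minimal CPU reference. It is a sketch for understanding the data movement only: nchwxToNchw is a hypothetical function, not the DriveWorks implementation, and it assumes a densely packed host buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not the DriveWorks implementation): repack an
// NCHWx buffer into planar NCHW, dropping the padding lanes of the last block.
template <typename T>
std::vector<T> nchwxToNchw(const std::vector<T>& src, std::size_t N, std::size_t C,
                           std::size_t H, std::size_t W, std::size_t x)
{
    const std::size_t blocks = (C + x - 1) / x; // channel blocks, rounded up
    assert(src.size() == N * blocks * H * W * x);
    std::vector<T> dst(N * C * H * W);
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t c = 0; c < C; ++c)
            for (std::size_t h = 0; h < H; ++h)
                for (std::size_t w = 0; w < W; ++w) {
                    // Source subscript [n][c/x][h][w][c%x], destination [n][c][h][w].
                    const std::size_t from = (((n * blocks + c / x) * H + h) * W + w) * x + c % x;
                    const std::size_t to = ((n * C + c) * H + h) * W + w;
                    dst[to] = src[from];
                }
    return dst;
}
```

With N = 1, C = 3, H = 1, W = 2 and x = 2, the source holds two channel blocks, and the second lane of the last block is padding that the conversion skips.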
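In dwDNNTensors, the dimensions for NCHWx are stored as:
dims = {numChannels % x, W, H, (numChannels + x - 1U) / x, N};
Therefore, the number of channels, which is needed for the conversion to NCHW, can be computed as:
numChannels = (dims[3] - 1) * x + dims[0];
Below, we shall make use of these features:
// dimensionSize holds the NCHWx dimensions in the order given above.
uint32_t numChannels = (outputProps1.dimensionSize[3] - 1) * x + outputProps1.dimensionSize[0];
// inputProps is an assumed name for the input tensor properties; one image is processed per call.
dwDataConditioner_initializeFromTensorProperties(&dataConditioner, &inputProps, 1U,
                                                 &metadata.dataConditionerParams, cudaStream,
                                                 contextHandle);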
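The channel-count recovery can be checked with a small round trip. With the storage order {numChannels % x, W, H, (numChannels + x - 1) / x, N}, the block count sits at index 3 and the remainder at index 0. The helpers packNchwxDims and channelsFromDims below are hypothetical names used for illustration; they only mirror the arithmetic and do not touch any DriveWorks type.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: dimension array for NCHWx in the stored order
// {numChannels % x, W, H, (numChannels + x - 1) / x, N}.
inline std::array<std::uint32_t, 5> packNchwxDims(std::uint32_t n, std::uint32_t c,
                                                  std::uint32_t h, std::uint32_t w,
                                                  std::uint32_t x)
{
    return std::array<std::uint32_t, 5>{{c % x, w, h, (c + x - 1U) / x, n}};
}

// Recover the channel count: every block but the last is full, and the last
// block holds dims[0] channels.
inline std::uint32_t channelsFromDims(const std::array<std::uint32_t, 5>& dims, std::uint32_t x)
{
    return (dims[3] - 1U) * x + dims[0];
}
```

For 20 channels and x = 16, the stored dimensions start with a remainder of 4 and contain 2 blocks, so the recovered count is (2 - 1) * 16 + 4 = 20. Taken literally, a channel count that is an exact multiple of x would store a remainder of 0, and this formula would then come out x too low, so that case deserves a separate check against the actual tensor properties.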
struct dwDataConditionerObject * dwDataConditionerHandle_t
Handle to a DataConditioner.
DW_API_PUBLIC dwStatus dwDataConditioner_initializeFromTensorProperties(dwDataConditionerHandle_t *const obj, dwDNNTensorProperties const *const outputProperties, uint32_t const maxNumImages, dwDataConditionerParams const *const dataConditionerParams, cudaStream_t const stream, dwContextHandle_t const ctx)
Initializes a DataConditioner module.
DW_API_PUBLIC dwStatus dwDNN_getOutputTensorProperties(dwDNNTensorProperties *const tensorProps, uint32_t const blobIndex, dwDNNHandle_t const network)
Gets the output tensor properties at blobIndex.
DW_API_PUBLIC dwStatus dwDNN_getInputTensorProperties(dwDNNTensorProperties *const tensorProps, uint32_t const blobIndex, dwDNNHandle_t const network)
Gets the input tensor properties at blobIndex.
DW_API_PUBLIC dwStatus dwDNN_getMetaData(dwDNNMetaData *const metaData, dwDNNHandle_t const network)
Returns the metadata for the associated network model.
uint32_t numDimensions
Number of dimensions of the tensor.
dwDNNTensorType tensorType
Tensor type.
dwDNNTensorLayout tensorLayout
Tensor layout.
uint32_t dimensionSize[DW_DNN_TENSOR_MAX_DIMENSIONS]
Dimensions of the tensor to match the selected layout type.
struct dwDNNTensorObject * dwDNNTensorHandle_t
Handles representing Deep Neural Network interface.
DW_API_PUBLIC dwStatus dwDNNTensor_create(dwDNNTensorHandle_t *const tensorHandle, dwDNNTensorProperties const *const properties, dwContextHandle_t const ctx)
Creates and allocates resources for a dwDNNTensorHandle_t based on the properties.
DW_DNN_TENSOR_TYPE_CUDA
CUDA tensor.
DW_DNN_TENSOR_TYPE_CPU
CPU tensor.
DW_DNN_TENSOR_LAYOUT_NCHW
Planar tensor. This is the most common tensor layout.
Specifies DNNTensor properties.
DW_API_PUBLIC dwStatus dwDNNTensorStreamer_initialize(dwDNNTensorStreamerHandle_t *streamer, const dwDNNTensorProperties *from, dwDNNTensorType to, dwContextHandle_t ctx)
Creates and initializes the tensor streamer capable of moving tensors between different API types.
struct dwDNNTensorStreamerObject * dwDNNTensorStreamerHandle_t
Convert DNN input from image to tensor, then perform DNN inference and stream results back. All operations are performed asynchronously with the host code.
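dwRect rois[1]{0U, 0U, imageWidth, imageHeight};
// inputTensor and imageRgba are assumed names for the input tensor and the RGBA input image.
dwDataConditioner_prepareData(inputTensor, &imageRgba, 1U, rois,
                              cudaAddressModeClamp, dataConditioner);
// Run dwDNN_infer() and stream the output to the host with the tensor streamer before reading it.
void* data1;
// outputTensorHost1 is an assumed name for the tensor received from the streamer.
dwDNNTensor_lock(&data1, outputTensorHost1);
doit(data1);
dwDNNTensor_unlock(outputTensorHost1);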
DW_API_PUBLIC dwStatus dwDataConditioner_prepareData(dwDNNTensorHandle_t const tensorOutput, dwImageHandle_t const *const inputImages, uint32_t const numImages, dwRect const *const rois, cudaTextureAddressMode const addressMode, dwDataConditionerHandle_t const obj)
Runs the configured transformations on an image.
DW_API_PUBLIC dwStatus dwDNN_infer(dwDNNTensorHandle_t *const outputTensors, uint32_t const outputTensorCount, dwConstDNNTensorHandle_t *const inputTensors, uint32_t const inputTensorCount, dwDNNHandle_t const network)
Runs inference pipeline on the given input.
DW_API_PUBLIC dwStatus dwDNNTensor_unlock(dwDNNTensorHandle_t const tensorHandle)
Unlocks the tensor, enabling other threads to lock the tensor and modify the content.
struct dwDNNTensorObject const * dwConstDNNTensorHandle_t
DW_API_PUBLIC dwStatus dwDNNTensor_lock(void **const data, dwDNNTensorHandle_t const tensorHandle)
Locks the tensor and retrieves pointer to the data with write access.
DW_API_PUBLIC dwStatus dwDNNTensorStreamer_producerSend(dwDNNTensorHandle_t tensor, dwDNNTensorStreamerHandle_t streamer)
Sends a tensor through the streamer acting as the producer.
DW_API_PUBLIC dwStatus dwDNNTensorStreamer_consumerReceive(dwDNNTensorHandle_t *tensor, dwTime_t timeout_us, dwDNNTensorStreamerHandle_t streamer)
Receive a pointer to a dwDNNTensorHandle_t from the streamer, acting as a consumer.
DW_API_PUBLIC dwStatus dwDNNTensorStreamer_producerReturn(dwDNNTensorHandle_t *tensor, dwTime_t timeout_us, dwDNNTensorStreamerHandle_t streamer)
The producer streamer waits for the tensor sent to be returned by the consumer.
DW_API_PUBLIC dwStatus dwDNNTensorStreamer_consumerReturn(dwDNNTensorHandle_t *tensor, dwDNNTensorStreamerHandle_t streamer)
Return the received tensor back to the producer.
Finally, free previously allocated memory.
DW_API_PUBLIC dwStatus dwDataConditioner_release(dwDataConditionerHandle_t const obj)
Releases the DataConditioner module.
DW_API_PUBLIC dwStatus dwDNN_release(dwDNNHandle_t const network)
Releases a given network.
DW_API_PUBLIC dwStatus dwDNNTensor_destroy(dwDNNTensorHandle_t const tensorHandle)
Destroys the tensor handle and frees any memory created by dwDNNTensor_create() or dwDNNTensor_create...
DW_API_PUBLIC dwStatus dwDNNTensorStreamer_release(dwDNNTensorStreamerHandle_t streamer)
Releases the tensor streamer.