Model Quantized Compilation

During model quantized compilation, the hb_compile tool generates intermediate-stage ONNX models, as well as an hbm model that can be used for on-board deployment, based on the information in the configuration file.

hb_compile provides two modes for model quantized compilation: fast performance evaluation mode (with fast-perf turned on) and traditional model conversion compilation mode (without fast-perf turned on).

[Figure: model quantized compilation workflow]

Note

When fast performance evaluation mode is turned on, the conversion process generates an hbm model tuned for the highest performance on the board side, and the tool internally performs the following operations:

  • Run BPU executable operators on the BPU whenever possible.
  • Remove removable CPU operators at the beginning and end of the model, including Quantize/Dequantize, Transpose, Cast, Reshape, etc.

Usage

Usage: hb_compile [OPTIONS]

  A tool that maps floating-point models to quantized models and provides
  some additional validation features.

Options:
  -h, --help                      Show this message and exit.
  -c, --config PATH               Model convert config file
  -m, --model PATH                Model to be compiled or modified
  --proto PATH                    Caffe prototxt file
  --march [nash-e|nash-m]         BPU's micro architectures
  -i, --input-shape <TEXT TEXT>...
                                  Specify the model input shape, e.g.
                                  --input-shape input1 1x3x224x224
  --fast-perf                     Build with fast perf mode

Parameters Introduction

Parameter Name | Parameter Description

-h, --help

Show help information and exit.

-c, --config

Configuration file for the model compilation, in YAML format.
--fast-perf

Turns on fast-perf mode, in which the conversion process generates an hbm model tuned for the highest performance on the board side, so you can easily use it for model performance evaluation later.
If you turn on fast-perf, you also need to configure the following:
-m, --model: Floating-point model file of Caffe/ONNX.
--proto: Specifies the prototxt file of the Caffe model.
--march: BPU's micro architecture.

  • For the S100 processor, specify it to nash-e.
  • For the S100P processor, specify it to nash-m.
-i, --input-shape: Optional parameter that specifies the shape of the input node of the model. Currently, when using hb_compile for model quantized compilation, this configuration only takes effect when fast-perf is turned on. It is used in the following way:
  • Specify the shape information of a single input node, example of how to use: --input-shape input_1 1x3x224x224.
  • Specify the shapes of multiple input nodes, example of how to use: --input-shape input_1 1x3x224x224 --input-shape input_2 1x3x224x224.
Attention:
For the non-dynamic input model, --input-shape can be left unconfigured and the tool will automatically read the size information from the model file.
For the dynamic input model:
  • If the model has a single input and --input-shape is not specified, the tool will set the first dimension of the input node, when it is -1, 0, or ?, to 1 by default.
  • If the model has multiple inputs, you must configure this parameter to specify the shape for each input.

--skip

If you don't focus on the compile process and its output during the accuracy debugging stage, you can set this parameter to compile to skip the compilation stage.

The log file generated by the compilation is stored in the directory where the command is executed, under the default name hb_compile.log.

Usage Example

Fast Performance Evaluation Mode

If you want to use the fast performance evaluation mode (i.e., turn on fast-perf), the reference command is as follows:

hb_compile --fast-perf --model ${caffe_model/onnx_model} \
           --proto ${caffe_proto} \
           --march ${march} \
           --input-shape ${input_node_name} ${input_shape}
Attention
  • Please note that if you need to enable fast performance evaluation mode, do not configure the --config parameter, as the tool uses its built-in high-performance configuration in this mode.
  • When using hb_compile for model quantized compilation, the --input-shape parameter configuration only works in fast performance evaluation mode (i.e. fast-perf is turned on).

Traditional Model Conversion Compilation Mode

If you want to use the traditional model conversion compilation mode (without fast-perf enabled), you can refer to the following command:

hb_compile --config ${config_file}
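As a rough sketch (not an authoritative template), a minimal configuration file might look like the following. The group key names other than input_parameters (which appears later in this section) are assumptions inferred from the parameter-group names documented below and may differ in your toolchain version; all file names and values are illustrative:

```yaml
# Hypothetical minimal hb_compile configuration sketch.
# Group keys other than input_parameters are assumed from the
# parameter-group names in this section.
model_parameters:
  onnx_model: 'resnet50.onnx'          # floating-point ONNX model
  march: 'nash-e'                      # S100 processor
  working_dir: './model_output'
  output_model_file_prefix: 'resnet50_224x224_nv12'
input_parameters:
  input_type_rt: 'nv12'                # format fed on the board
  input_type_train: 'rgb'              # format used in training
  input_layout_train: 'NCHW'
  mean_value: '123.675 116.28 103.53'  # per-channel mean (illustrative)
  scale_value: '0.0171'
calibration_parameters:
  cal_data_dir: './calibration_data'
compiler_parameters:
  compile_mode: 'latency'
  optimize_level: 'O2'
```

The individual parameters are documented group by group in the tables below.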

Specific Parameter Information

The configuration file contains four parameter groups: model parameters, input information parameters, calibration parameters, and compilation parameters. All parameter groups must exist in your configuration file. Parameters are either mandatory or optional; optional parameters can be left unconfigured.

The specific parameters are introduced below, in the order of the parameter groups. Required/Optional indicates whether a parameter must be specified in the YAML file.

Model Parameters

Parameter Name | Parameter Description | Required/Optional
prototxt

PURPOSE: This parameter specifies the prototxt filename of the floating-point Caffe model.
PARAMETER TYPE: String.
RANGE: Model path.
DEFAULT VALUE: None.
DESCRIPTIONS: This parameter must be specified when the model is a Caffe model.
FOR EXAMPLE:

prototxt: 'mobilenet_deploy.prototxt'
Required for Caffe models
caffe_model

PURPOSE: This parameter specifies the caffemodel filename of the floating-point Caffe model.
PARAMETER TYPE: String.
RANGE: Model path.
DEFAULT VALUE: None.
DESCRIPTIONS: This parameter must be specified when the model is a Caffe model.
FOR EXAMPLE:

caffe_model: 'mobilenet.caffemodel'
Required for Caffe models
onnx_model

PURPOSE: This parameter specifies the onnx filename of the floating-point ONNX model.
PARAMETER TYPE: String.
RANGE: Model path.
DEFAULT VALUE: None.
DESCRIPTIONS: This parameter must be specified when the model is an ONNX model.
FOR EXAMPLE:

onnx_model: 'resnet50.onnx'
Required for ONNX models
march

PURPOSE: This parameter specifies the platform architecture to run the board-side deployable model.
PARAMETER TYPE: String.
RANGE: 'nash-e' or 'nash-m'.
DEFAULT VALUE: None.
DESCRIPTIONS: These values correspond to the S100 and S100P processors, respectively.
Choose the option that matches the platform you are using.
FOR EXAMPLE:

march: 'nash-e'
required
output_model_file_prefix

PURPOSE: This parameter specifies the prefix of the board-side deployable model filename.
PARAMETER TYPE: String.
RANGE: None.
DEFAULT VALUE: 'model'.
DESCRIPTIONS: This parameter specifies the prefix of the converted fixed-point model filename.
FOR EXAMPLE:

output_model_file_prefix: 'resnet50_224x224_nv12'
optional
working_dir

PURPOSE: This parameter specifies the directory to save the conversion results.
PARAMETER TYPE: String.
RANGE: None.
DEFAULT VALUE: 'model_output'.
DESCRIPTIONS: The tool will create a new directory automatically if it doesn't exist.
FOR EXAMPLE:

working_dir: './model_output'
optional
output_nodes

PURPOSE: This parameter specifies model output node(s).
PARAMETER TYPE: String.
RANGE: Specific node name of the model.
DEFAULT VALUE: None.
DESCRIPTIONS: This parameter is used to support you to specify the node as the model output, the value should be the specific node name of the model.
When there are multiple values, please refer to param_value Configuration.
FOR EXAMPLE:

output_nodes: "OP_name"
optional
remove_node_type

PURPOSE: This parameter sets the type of the deleted node.
PARAMETER TYPE: String.
RANGE: "Quantize", "Transpose", "Dequantize", "Cast", "Reshape", "Softmax". Different types should be separated by ';'.
DEFAULT VALUE: None.
DESCRIPTIONS: Leaving this parameter unset or set to null does not affect the model conversion process.
This parameter is used to support you in setting the type information of the node to be deleted.
The deleted node must be at the beginning or end of the model, connected to the input or output of the model.
When there are multiple values, please refer to param_value Configuration.
FOR EXAMPLE:

remove_node_type: "Dequantize"

Attention: After setting this parameter, the tool matches the deletable nodes of the model against your configuration. If a node of the configured type meets the deletion conditions, it is deleted, and this process repeats until no deletable node matches the configured node type.

optional
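The iterative matching described in the attention note above can be modeled with a short sketch. This is an illustrative re-implementation of the described behavior, not the tool's actual code; remove_boundary_nodes and its inputs are hypothetical names:

```python
# Illustrative sketch (not the tool's implementation) of how boundary
# nodes matching remove_node_type are deleted repeatedly until no
# deletable node at the model's beginning or end matches any more.
def remove_boundary_nodes(node_types, removable_types):
    """node_types: ordered op types from model input to output.
    removable_types: the ';'-separated remove_node_type string."""
    types = {t.strip() for t in removable_types.split(';')}
    nodes = list(node_types)
    changed = True
    while changed:
        changed = False
        if nodes and nodes[0] in types:   # node connected to the model input
            nodes.pop(0)
            changed = True
        if nodes and nodes[-1] in types:  # node connected to the model output
            nodes.pop()
            changed = True
    return nodes

# e.g. remove_node_type: "Quantize;Dequantize"
print(remove_boundary_nodes(
    ["Quantize", "Conv", "Relu", "Dequantize"],
    "Quantize;Dequantize"))  # -> ['Conv', 'Relu']
```

Note that interior nodes are never touched; only nodes still connected to the model input or output are candidates on each pass.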
remove_node_name

PURPOSE: This parameter sets the name of the deleted node.
PARAMETER TYPE: String.
RANGE: The name of the node in the model to be deleted. Different names should be separated by ';'.
DEFAULT VALUE: None.
DESCRIPTIONS: Leaving this parameter unset or set to null does not affect the model conversion process.
This parameter is used to support you in setting the name of the node to be deleted.
The deleted node must be at the beginning or the end of the model, connected to the input or output of the model.
When there are multiple values, please refer to param_value Configuration.
FOR EXAMPLE:

remove_node_name: "OP_name"

Attention: After setting this parameter, the tool matches the deletable nodes of the model against your configuration. If a node with a configured name meets the deletion conditions, it is deleted, and this process repeats until no deletable node matches the configured node names.

optional
debug_mode

PURPOSE: Set debugging parameters for accuracy analysis.
PARAMETER TYPE: String.
RANGE: "dump_all_layers_output", "dump_calibration_data".
DEFAULT VALUE: None.
DESCRIPTIONS:

  • "dump_all_layers_output" specifies whether the board-side deployable model retains the ability to output intermediate layer values. Note that dumping the intermediate layer results is a debugging method, please do not enable it unless it is necessary.
  • "dump_calibration_data" serves to save the calibration data for the accuracy debug analysis and the data format is .npy. This data can be fed directly into the model for inference via np.load(). If you don't set this parameter, you can also save the data yourself and use the accuracy debug tool for accuracy analysis.
FOR EXAMPLE:

debug_mode: 'dump_all_layers_output'

Attention: It is not supported to configure input_source to be resizer after setting dump_all_layers_output.

optional
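As a small sketch of the workflow described for "dump_calibration_data": the dumped data is in .npy format and can be loaded with np.load() and fed to the model directly. The file name and shape below are hypothetical; only the np.save/np.load round trip is illustrated:

```python
import numpy as np

# Hypothetical round trip: calibration data dumped by debug_mode
# 'dump_calibration_data' is stored as .npy; np.load() returns an
# array that can be fed to the model as-is. The file name and the
# 1x3x224x224 shape are illustrative.
sample = np.random.rand(1, 3, 224, 224).astype(np.float32)
np.save("calib_sample_0.npy", sample)

loaded = np.load("calib_sample_0.npy")
print(loaded.shape, loaded.dtype)  # (1, 3, 224, 224) float32
```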

Input Information Parameters

Parameter Name | Parameter Description | Required/Optional
input_name

PURPOSE: This parameter specifies the input node names of the original floating-point model.
PARAMETER TYPE: String.
RANGE: Single input: "" or the input node name, Multiple inputs: "input_name1; input_name2; input_name3..."
DEFAULT VALUE: None.
DESCRIPTIONS: No configuration is required if there is only one input node.
If there is more than one input node, this parameter must be configured to guarantee the correct correspondence between subsequent type configurations and the input order of the calibration data.
For configuration methods of multiple values, please refer to param_value Configuration .
FOR EXAMPLE:

input_name: "data"
Dynamic input: required
Non-Dynamic input:
Optional for single input
Required for multiple inputs
input_type_train

PURPOSE: This parameter specifies the input data type of the original floating-point model.
PARAMETER TYPE: String.
RANGE: 'rgb', 'bgr','yuv444', 'gray' and 'featuremap'.
DEFAULT VALUE: 'featuremap'.
DESCRIPTIONS: Each input node needs to be configured with a defined input data type. If there are multiple input nodes, the order of the nodes must be strictly consistent with the order in the input_name.
For configuration methods of multiple values, please refer to param_value Configuration
For the selection of data types, please refer to the Model Conversion Interpretation section.
FOR EXAMPLE:

input_type_train: 'bgr'
optional
input_type_rt

PURPOSE: This parameter specifies the input data format that the board-side deployable model obtained after conversion must match.
PARAMETER TYPE: String.
RANGE: 'rgb', 'bgr','nv12','gray' and 'featuremap'.
DEFAULT VALUE: 'featuremap'.
DESCRIPTIONS: Here is an indication of the data format you need to use.
It doesn't have to be the same as the data format of the original model, but note that this is the format that will actually feed into your model when running on the computing platform.
Each input node needs to be configured with a defined input data layout. If there are multiple input nodes, the sequence of the configured nodes must be strictly consistent with the input_name sequence.
For configuration methods of multiple values, please refer to param_value Configuration.
For the selection of data types, please refer to the Model Conversion Interpretation section.
FOR EXAMPLE:

input_type_rt: 'yuv444'

Attention: When input_type_rt is configured as featuremap with a non-four-dimensional input, do not specify mean_value, scale_value, or std_value.

optional
input_layout_train

PURPOSE: This parameter specifies the input data layout of the original floating-point model.
PARAMETER TYPE: String.
RANGE: 'NHWC', 'NCHW'.
DEFAULT VALUE: None.
DESCRIPTIONS: Each input node needs to be configured with a defined input data layout that shall be the same as the layout of the original floating-point model.
If there are multiple input nodes, the order of the nodes must be strictly consistent with the input_name sequence.
For configuration methods of multiple values, please refer to param_value Configuration.
For more about data layout, please refer to the Model Conversion Interpretation section.
FOR EXAMPLE:

input_layout_train: 'NCHW'
Required for model with non-featuremap input
Ineffective for model with featuremap input, so no configuration is needed
input_space_and_range

PURPOSE: This parameter specifies special data formats.
PARAMETER TYPE: String.
RANGE: 'regular' and 'bt601_video'.
DEFAULT VALUE: 'regular'.
DESCRIPTIONS: The purpose of this parameter is to deal with the YUV420 formats produced by different ISPs. It only takes effect when input_type_rt is specified as nv12; if the format is not nv12, an error is reported and the process exits.

  • regular is a common YUV420 format ranged between [0,255].
  • bt601_video is another YUV420 video format, ranged between [16,235]. For more information about bt601, refer to publicly available references. The bt601_video specification is only supported when input_type_train is configured as bgr or rgb.
FOR EXAMPLE:

input_space_and_range: 'regular'

Attention: You don't need to configure this parameter without explicit requirements.

optional
input_shape

PURPOSE: This parameter specifies the input data size of the original floating-point model.
PARAMETER TYPE: String.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Dimensions of shape should be separated by x, e.g. '1x3x224x224'.
You don't need to configure this parameter when the original floating-point model is a non-dynamic input, the tool will read the size information from model files automatically.
If you need to configure multiple input nodes, the sequence of configured nodes must be strictly consistent with the input_name sequence.
For configuration methods of multiple values, please refer to param_value Configuration.
FOR EXAMPLE:

input_shape: '1x3x224x224'
Dynamic input: required
Non-Dynamic input: optional
input_batch

PURPOSE: This parameter specifies the input data batch size that the board-side deployable model obtained after conversion must match.
PARAMETER TYPE: Int.
RANGE: 1-4096.
DEFAULT VALUE: 1.
DESCRIPTIONS: This parameter specifies the input data batch size that the board-side deployable model obtained after conversion must match. This parameter only supports specifying a single value, which will act on all inputs of the model when the model has multiple inputs.
If you don't configure this parameter, the default value is 1.
FOR EXAMPLE:

input_batch: 1

Attention:

  • This parameter can only be used for models in which the first dimension of the input_shape is 1. If the model has multiple inputs, the first dimension of the input_shape must be 1 for all inputs.
  • When input_type_rt is specified as nv12 or gray, or input_source is specified as pyramid or resizer (that is, when the model input is pyramid/resizer), only image input is supported. In this case, if you specify input_batch, you need to set separate_batch to True to enable the separated batch mode, or configure the corresponding input nodes in separate_name to split.
  • This parameter is only effective when the original ONNX model itself supports multi-batch inference. However, due to the complexity of the operators, if during the model conversion process you encounter a conversion failure log indicating that the model does not support the input_batch parameter, please try to directly export a multi-batch ONNX model and correctly configure the size of the calibration data to re-convert it (in this case, you no longer need to configure this parameter).

optional
separate_batch

PURPOSE: This parameter specifies whether to enable the separated batch mode.
PARAMETER TYPE: Bool.
RANGE: True, False.
DEFAULT VALUE: False.
DESCRIPTIONS: If you don't configure this parameter, the default value is False, that is, the separated batch mode is not enabled.
When the separated batch mode is not enabled, the input needs to be allocated in a contiguous memory area. For example, if the model input is 1x3x224x224 and input_batch is set to N, you need to prepare an on-board model input of Nx3x224x224.
When the separated batch mode is enabled, the input nodes with this mode enabled are separated into as many inputs as the value you specified with input_batch. You can prepare the input for each batch individually, so the inputs no longer need to be allocated in a contiguous memory area. For example, if the model input is 1x3x224x224 and input_batch is set to N, you need to prepare N on-board model inputs of 1x3x224x224.
FOR EXAMPLE:

separate_batch: False
optional
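The two memory layouts described above can be summarized with a small sketch. This is an illustrative helper (not part of the toolchain) that computes which on-board input shapes you would have to prepare:

```python
# Illustrative calculation (not toolchain code) of the on-board input
# shapes to prepare for input_batch = N, given the model's input shape.
def board_inputs(input_shape, input_batch, separate_batch):
    if separate_batch:
        # separated mode: N independent inputs, one per batch
        return [input_shape] * input_batch
    # contiguous mode: one input whose first dimension becomes N
    return [(input_batch,) + input_shape[1:]]

print(board_inputs((1, 3, 224, 224), 4, separate_batch=False))  # [(4, 3, 224, 224)]
print(board_inputs((1, 3, 224, 224), 4, separate_batch=True))   # four (1, 3, 224, 224) inputs
```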
separate_name

PURPOSE: When separated batch mode is not enabled, the parameter specifies the split node names.
PARAMETER TYPE: String.
RANGE: Single input: "" or the input node name, Multiple inputs: "separate_name1; separate_name2; separate_name3...".
DEFAULT VALUE: None.
DESCRIPTIONS: This parameter is only valid if separate_batch is False, internally it will do the splitting according to the node name you specified.
Ensure that the node to be split is within the input_name range, if there are more than one nodes to be split, for configuration methods of multiple values, please refer to param_value Configuration.
FOR EXAMPLE:

separate_name: ""
optional
mean_value

PURPOSE: This parameter specifies the mean value to be subtracted by the pre-processing method.
PARAMETER TYPE: Float.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Each input node has 2 configuration methods:
If only one value is specified, all channels subtract the same mean value.
Otherwise, you need to specify a mean value for each channel, and the number of values (separated by spaces) must match the number of channels, meaning that each channel subtracts a different mean value.
If a node does not require mean processing, it should be specified as 'None'.
You can specify the parameter like this: mean_value: 'value', and multiple values can be separated by ';': mean_value: 'value1;value2;value3'.
If 'value' itself contains multiple values, they can be separated by spaces or ',': mean_value: 'value11 value12 value13;value21 value22 value23;...' or mean_value: 'value11,value12,value13;value21,value22,value23;...'.
FOR EXAMPLE:

mean_value: '103.94 116.78 123.68'
optional
scale_value

PURPOSE: This parameter specifies the scale factor of the pre-processing method.
PARAMETER TYPE: Float.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: When specifying the factor of the pre-processing method, only one of this parameter and the std_value parameter needs to be specified, and scale_value = 1 / std_value.
Each input node has 2 configuration methods:
If only one value is specified, all channels are multiplied by the same factor.
Otherwise, you need to specify a scale value for each channel, and the number of values (separated by spaces) must match the number of channels, meaning that each channel is multiplied by a different factor.
If a node does not require scale processing, it should be specified as 'None'.
You can specify the parameter like this: scale_value: 'value', and multiple values can be separated by ';': scale_value: 'value1;value2;value3'.
If 'value' itself contains multiple values, they can be separated by spaces or ',': scale_value: 'value11 value12 value13;value21 value22 value23;...' or scale_value: 'value11,value12,value13;value21,value22,value23;...'.
FOR EXAMPLE:

scale_value: '0.017'
optional
std_value

PURPOSE: This parameter specifies the std factor of the pre-processing method.
PARAMETER TYPE: Float.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: When specifying the factor of the pre-processing method, only one of this parameter and the scale_value parameter needs to be specified, and std_value = 1 / scale_value.
Each input node has 2 configuration methods:
If only one value is specified, all channels are divided by the same factor.
Otherwise, you need to specify a std value for each channel, and the number of values (separated by spaces) must match the number of channels, meaning that each channel is divided by a different factor.
If a node does not require std processing, it should be specified as 'None'.
You can specify the parameter like this: std_value: 'value', and multiple values can be separated by ';': std_value: 'value1;value2;value3'.
If 'value' itself contains multiple values, they can be separated by spaces or ',': std_value: 'value11 value12 value13;value21 value22 value23;...' or std_value: 'value11,value12,value13;value21,value22,value23;...'.
FOR EXAMPLE:

std_value: '58.824'
optional
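Assuming the usual preprocessing formula out = (pixel - mean) * scale, with scale_value = 1 / std_value as stated above, the example values from the mean_value and std_value entries relate as follows. This is an illustrative calculation, not toolchain code:

```python
# Illustrative per-channel preprocessing (not toolchain code):
# out = (pixel - mean) * scale, where scale_value = 1 / std_value.
mean = [103.94, 116.78, 123.68]   # the mean_value example above
std = 58.824                      # the std_value example above
scale = 1.0 / std                 # matches the scale_value example (~0.017)

pixel = [110.0, 120.0, 130.0]     # one hypothetical pixel, trained channel order
out = [(p - m) * scale for p, m in zip(pixel, mean)]
print([round(v, 4) for v in out])
```

Configuring both scale_value and std_value is unnecessary: specifying either one determines the other.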

input_type_rt/input_type_train additional description

Camera-captured data are usually in NV12 format. Therefore, if you use the RGB (NCHW) format in model training and expect the model to process NV12 data efficiently, you need to configure as follows:

input_parameters:
  input_type_rt: 'nv12'
  input_type_train: 'rgb'
  input_layout_train: 'NCHW'

In addition to converting the input data to NV12, you can also use different RGB orders in training and runtime inference. The tool automatically adds data conversion nodes according to the data formats specified by input_type_rt and input_type_train. Not every type combination is supported; to avoid misuse, only the fixed type combinations in the following table are open (Y for supported combinations, N for unsupported. The first row of the table lists the data types supported for input_type_rt, and the first column the data types supported for input_type_train):

input_type_train \ input_type_rt | nv12 | yuv444 | rgb | bgr | gray | featuremap
yuv444                           |  Y   |   Y    |  N  |  N  |  N   |     N
rgb                              |  Y   |   Y    |  Y  |  Y  |  N   |     N
bgr                              |  Y   |   Y    |  Y  |  Y  |  N   |     N
gray                             |  N   |   N    |  N  |  N  |  Y   |     N
featuremap                       |  N   |   N    |  N  |  N  |  N   |     Y
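The support matrix above can be encoded as a simple lookup for quick checks. This is an illustrative helper, not part of the toolchain:

```python
# The supported input_type_train -> input_type_rt combinations from the
# table above, encoded as a lookup (illustrative helper only).
SUPPORTED_RT = {
    "yuv444":     {"nv12", "yuv444"},
    "rgb":        {"nv12", "yuv444", "rgb", "bgr"},
    "bgr":        {"nv12", "yuv444", "rgb", "bgr"},
    "gray":       {"gray"},
    "featuremap": {"featuremap"},
}

def combination_supported(input_type_train, input_type_rt):
    return input_type_rt in SUPPORTED_RT.get(input_type_train, set())

print(combination_supported("rgb", "nv12"))   # True: train in rgb, deploy on nv12
print(combination_supported("gray", "nv12"))  # False: gray only pairs with gray
```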
Note

To meet the requirements of Horizon ASICs on input data types and reduce inference costs, when input_type_rt is of the RGB(NHWC/NCHW)/BGR(NHWC/NCHW) type, the input data type of models converted by the conversion tool will be int8. That is, for regular image formats, pixel values should be subtracted by 128; this is already done by the API and you do not need to do it again.

In the final hbm model obtained from the conversion, the conversion from input_type_rt to input_type_train is an internal process. You only need to focus on the data format of input_type_rt. It is of vital importance to understand the requirement of the input_type_rt when preparing the inference data for embedded applications, please refer to the following explanations to each format of the input_type_rt.

  • rgb, bgr, and gray are commonly used image formats. Note that each value is represented using UINT8.
  • yuv444 is a popular image format. Note that each value is represented using UINT8.
  • NV12 is a popular YUV420 image format. Note that each value is represented using UINT8.
  • One special case of NV12 is specifying bt601_video for input_space_and_range. Compared with the typical NV12 format, its value range changes from [0,255] to [16,235]; each value is still represented as UINT8. Note that bt601_video can only be configured via input_space_and_range when input_type_train is bgr or rgb.
  • featuremap is suitable for cases where the formats listed above fail to meet your needs; this type uses float32 for each value. For example, this format is commonly used for inputs such as radar and speech data.
Hint

The above input_type_rt and input_type_train are integrated into the toolchain processing procedure. If you are sure that no format conversion is required, set the two input_type parameters to the same value; identical types are passed straight through and do not affect the actual execution performance of the model.

Similarly, data pre-processing is also integrated into the process. If you don't need any pre-processing, simply do not specify mean_value, scale_value, or std_value; this will not affect the actual execution performance of the model.

Attention

If you specify input_type_rt and input_type_train to different values (each being rgb/bgr/yuv444), or if you set preprocessing-related parameters such as mean_value, scale_value, and std_value, the tool internally inserts a preprocessing node during the model conversion and compilation process. The preprocessing node only supports NHWC input. Therefore, if the data layout of the original input model is NCHW, the data layout of the HBIR and HBM models will be changed to NHWC. This change does not affect the performance or accuracy of the model.

Calibration Parameters

Parameter Name | Parameter Description | Required/Optional
cal_data_dir

PURPOSE: This parameter specifies the directory to save the calibration samples.
PARAMETER TYPE: String.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: The calibration data in the directory must comply with the requirements of input configurations, please refer to the Model Calibration Set Preparation section.
When there are multiple input nodes, the sequence of configured nodes must be strictly consistent with the input_name sequence.
For configuration methods of multiple values, please refer to param_value Configuration.
FOR EXAMPLE:

cal_data_dir: './calibration_data'

Attention: When the cal_data_dir parameter is not configured, the tool performs pseudo calibration to facilitate quick verification. In this case, the model's accuracy is abnormal and suitable for functional testing only.

optional
quant_config

PURPOSE: The S100 platform supports flexible configuration of various quantization parameters. You can use these parameters to configure the computational accuracy of operators in the model, the algorithms used in calibration, and the search methods for calibration parameters.
PARAMETER TYPE: String/Dict.
RANGE: Path of json file or configuration parameters with dict form.
DEFAULT VALUE: None.
DESCRIPTIONS: This parameter supports configuring computational accuracy at multiple levels (model_config, op_config, node_config), multiple computational accuracy data types (int8/int16/float16), multiple calibration algorithms (kl/max), and searching for calibration parameters at different granularities (modelwise/layerwise); please refer to The quant_config Introduction.
FOR EXAMPLE:

  • Mode 1:

quant_config: './quant_config.json'
  • Mode 2:
quant_config: {
    // Configure model-level parameters
    "model_config": {
        // Configure input data types for all nodes at once
        "all_node_type": "int16"/"float16",
        // Configure the data type of the model output
        "model_output_type": "int8"/"int16",
    }
}
optional

Compilation Parameters

Parameter Name | Parameter Description | Required/Optional
compile_mode

PURPOSE: This parameter specifies compilation strategies.
PARAMETER TYPE: String.
RANGE: 'latency', 'bandwidth' and 'balance'.
DEFAULT VALUE: 'latency'.
DESCRIPTIONS: The latency aims to optimize the latency time of inference.
The bandwidth aims to optimize the access bandwidth of DDR.
The balance aims to balance the optimization of latency and bandwidth, to set this option, you need to specify the balance_factor.
It is recommended to use the latency strategy as long as your models don't severely exceed the expected bandwidth.
FOR EXAMPLE:

compile_mode: 'latency'
optional
balance_factor

PURPOSE: This parameter specifies the balance ratio when the compile_mode is specified as balance.
PARAMETER TYPE: Int.
RANGE: 0-100.
DEFAULT VALUE: None.
DESCRIPTIONS: This parameter is only used when the compile_mode is specified as balance, otherwise the configuration will not take effect.

  • Configuration of 0 means that the bandwidth is optimal, which corresponds to the compile strategy with bandwidth as the compile_mode.
  • Configuration of 100 means that the performance is optimal, which corresponds to the compile strategy with latency as the compile_mode.
FOR EXAMPLE:

balance_factor: 100
compile_mode specified as balance: required
core_num

PURPOSE: This parameter specifies the number of cores to run model.
PARAMETER TYPE: Int.
RANGE: 1.
DEFAULT VALUE: 1.
DESCRIPTIONS: Used to configure the number of cores for the model to run on the Horizon platform.
FOR EXAMPLE:

core_num: 1
optional
optimize_level

PURPOSE: This parameter specifies the model optimization levels.
PARAMETER TYPE: String.
RANGE: 'O0' , 'O1','O2'.
DEFAULT VALUE: 'O2'.
DESCRIPTIONS: Optimization level ranges between O0 - O2.
O0: No optimization, fastest compilation speed and lowest optimization level.
O1 to O2: As the optimization level increases, the compiled model is expected to execute faster, but the compilation time is also expected to be longer.
FOR EXAMPLE:

optimize_level: 'O2'
optional
input_source

PURPOSE: This parameter specifies the input source of dev board hbm models.
PARAMETER TYPE: Dict.
RANGE: 'ddr', 'pyramid' and 'resizer'.
DEFAULT VALUE: None, it will be automatically selected from an optional range based on the value of input_type_rt by default:

  • When input_type_rt is specified as nv12 or gray, input_source is automatically selected as pyramid by default.
  • When input_type_rt is specified as any other value, input_source is automatically selected as ddr by default.
  • When this parameter is specified as resizer, input_type_rt only supports specifying as nv12 or gray.
DESCRIPTIONS: This is an option for adapting the engineering environment and you are recommended to configure it after all model validations are complete.
ddr indicates that the data comes from memory; pyramid and resizer indicate fixed hardware on the processor.
This parameter is a bit special, e.g., if the model input name is data and the data source is memory (ddr), then this parameter should be configured as {"data": "ddr"}.
FOR EXAMPLE:

input_source: {"data": "pyramid"}
optional
max_time_per_fc

PURPOSE: This parameter specifies the maximum continuous execution time (in μs) of each of the model's function calls.
PARAMETER TYPE: Int.
RANGE: 0 or 1000-4294967295.
DEFAULT VALUE: 0.
DESCRIPTIONS: The inference of a compiled model on the BPU is executed as one or multiple function calls (the function call is the atomic unit of BPU execution). A value of 0 means no restriction.
This parameter specifies the maximum execution time of each function call. The model can only be preempted when the execution of a single function call has finished.
Please refer to the Model Preemption Control section.
FOR EXAMPLE:

max_time_per_fc: 1000

Attention:

  • Note that this parameter is only used to implement the model preemption function and can be ignored otherwise.
  • The model preemption function is only supported on the board, not in the simulator.

optional
jobs

PURPOSE: This parameter sets the number of processes when compiling the hbm model.
PARAMETER TYPE: Int.
RANGE: Within the maximum number of cores supported by the machine.
DEFAULT VALUE: 16.
DESCRIPTIONS: This sets the number of processes used when compiling the hbm model.
Increasing it can improve the compilation speed to some extent.
FOR EXAMPLE:

jobs: 8
optional
advice

PURPOSE: This parameter sets a threshold (in microseconds) on the predicted increase in operator execution time during model compilation; operators exceeding it are reported in the log.
PARAMETER TYPE: Int.
RANGE: Natural number.
DEFAULT VALUE: 0.
DESCRIPTIONS: During model compilation, the toolchain performs an internal time-consumption analysis. In practice, operations such as operator data alignment increase execution time. After this parameter is set, whenever the deviation between an OP's actual and theoretical computation time exceeds the specified value, a log is printed with the change in time, the shape and padding ratio before and after data alignment, and other details.
FOR EXAMPLE:

advice: 0
optional
cache_path

PURPOSE: This parameter is used to configure the path for the compilation cache.
PARAMETER TYPE: String. The path may contain lowercase letters (a-z), uppercase letters (A-Z), digits (0-9), underscores (_), hyphens (-), periods (.), and any combination of these.
RANGE: None.
DEFAULT VALUE: None. Once this path is configured, the compilation cache is enabled by default.
DESCRIPTIONS: By configuring the path for the compilation cache, you can enable cache acceleration. Cache acceleration can effectively reduce secondary compilation time and improve compilation speed.
If the compilation cache path you configured already exists and is valid, the target content will be automatically created in that path.
Attention: Please do not store any other proprietary files in the cache directory you configured, to prevent your own files from being deleted along with the cache directory when it is removed.
FOR EXAMPLE:

cache_path: './cache'
optional
cache_mode

PURPOSE: This parameter is used to set the compilation cache mode.
PARAMETER TYPE: String.
RANGE: enable, force_overwrite, disable.
DEFAULT VALUE: disable.
DESCRIPTIONS:

  • enable: This indicates enabling the compilation cache. Once enabled, it can prevent repeated compilation of subgraphs with identical compilation parameters and operator parameters, thereby improving the compilation speed.
  • force_overwrite: This also enables the compilation cache, but unlike enable mode, any cache entries hit in the current session are forcibly refreshed: the existing entries are deleted first, then the subgraphs are recompiled and added back to the cache.
  • disable: This indicates disabling the compilation cache and recompiling.
FOR EXAMPLE:

cache_mode: 'disable'
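Putting the two cache parameters together, a sketch of enabling cache acceleration in compiler_parameters might look like this (the cache path is a placeholder for illustration):

```yaml
compiler_parameters:
  # store the compilation cache under ./cache and reuse it across runs
  cache_path: './cache'
  cache_mode: 'enable'
```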
optional
extra_params

PURPOSE: This parameter provides additional flexible configuration of some model compilation related parameters.
PARAMETER TYPE: Dict.
RANGE: input_no_padding , output_no_padding.
DEFAULT VALUE: {}.
DESCRIPTIONS: This parameter supports configuring the following features (they may be configured together):

  • input_no_padding: if not specified, the default is False. If set to True, padding will be removed from all non-image inputs of the model.
  • output_no_padding: if not specified, the default is False. If set to True, padding will be removed from all outputs of the model.

Attention: The extra_params parameter can only appear once in the configuration file; to set both options at the same time, combine them in a single dict as in the last example below.
FOR EXAMPLE:

  • Individual configuration of input_no_padding:
    extra_params: {"input_no_padding": True}
  • Individual configuration of output_no_padding:
    extra_params: {"output_no_padding": True}
  • Simultaneous configuration of input_no_padding and output_no_padding:
    extra_params: {"input_no_padding": True, "output_no_padding": True}
optional

param_value Configuration

Parameters are specified as param_name: 'param_value'; multiple values are separated by ';', e.g. param_name: 'param_value1;param_value2;param_value3'.

Hint

To avoid parameter-ordering problems, you are strongly advised to specify the parameters (such as input_shape, etc.) explicitly for multi-input models.
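As an illustration of the ';' separator, a hypothetical two-input model (input names input1 and input2, and the second shape, are assumptions) could be configured explicitly like this:

```yaml
# values are ordered consistently with input_name
input_name: 'input1;input2'
input_shape: '1x3x224x224;1x3x128x128'
input_type_train: 'rgb;rgb'
```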

Attention

Please note that if input_type_rt is set to nv12, the model's input dimensions must not contain odd numbers.

Configuration File Template

The ResNet50 model is used as a sample here; a complete configuration file template is shown below:

Note

The configuration file below is for illustration only. In an actual configuration file, the parameters to pass depend on the type of the original floating-point model: the caffe_model and onnx_model parameters cannot coexist.

For a Caffe model, pass caffe_model + prototxt; for an ONNX model, pass onnx_model.

That is, you must choose either caffe_model + prototxt or onnx_model when configuring.
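A minimal sketch of the two mutually exclusive model_parameters variants (the file paths are placeholders; only one variant appears in a real configuration file):

```yaml
# Variant 1: Caffe model (caffe_model + prototxt)
model_parameters:
  caffe_model: './model.caffemodel'
  prototxt: './model.prototxt'
  march: 'nash-e'

# Variant 2: ONNX model (onnx_model only)
model_parameters:
  onnx_model: './model.onnx'
  march: 'nash-e'
```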

# model parameters
model_parameters:
  # The original ONNX model file
  onnx_model: '../../01_common/model_zoo/mapper/classification/resnet50/resnet50.onnx'
  # The target processor architecture of conversion
  march: 'nash-e'
  # The prefix of the converted model file which will run on the dev board
  output_model_file_prefix: 'resnet50_224x224_nv12'
  # The directory where the conversion results will be saved
  working_dir: './model_output'
  # Batch delete nodes of a certain type
  remove_node_type: "Dequantize"

# input information parameters
input_parameters:
  # The input node name of the floating-point model
  input_name: ""
  # The input data format of the original floating-point model (quantity/sequence consistent with input_name)
  input_type_train: 'rgb'
  # The input data layout of the original floating-point model (quantity/sequence consistent with input_name)
  input_layout_train: 'NCHW'
  # The input data size of the original floating-point model
  input_shape: '1x3x224x224'
  # The data batch_size input to the neural network when the network is actually executed
  input_batch: 1
  # The mean value of the image subtracted by the preprocessing method; channel means must be separated by a space
  mean_value: '123.675 116.28 103.53'
  # The image scaling of the preprocessing method; channel scales must be separated by a space
  scale_value: '0.01712475 0.017507 0.01742919'
  # The input data format which the board-side deployable model needs to match
  # (quantity/sequence consistent with input_name)
  input_type_rt: 'nv12'
  # Special input data format
  input_space_and_range: 'regular'

# Calibration parameters
calibration_parameters:
  # The directory where the calibration samples will be saved
  cal_data_dir: './calibration_data_rgb'

# compilation parameters
compiler_parameters:
  # Select compilation strategy
  compile_mode: 'latency'
  # Number of cores to run the model
  core_num: 1
  # Select the optimization level of model compilation
  optimize_level: 'O2'
  # Specify the maximum continuous execution time for each function call of the model
  max_time_per_fc: 1000
  # Specify the number of processes when compiling the model
  jobs: 8