Model Modification

During model conversion and compilation you may need to modify the model. Below we give sample code for several common scenarios, together with a comparison of the HBIR model structure before and after each modification.

Attention

Please note that if you convert and compile the model through the PTQ pipeline API, this path does not save the pre-conversion HBIR model (*.bc file) by default. If you need to visualize or otherwise modify this file, you can save it as follows.

from hbdk4.compiler.onnx import export
from hbdk4.compiler import save, convert
import onnx

ptq_onnx = onnx.load("ptq_model.onnx")
ptq_model = export(proto=ptq_onnx, name="model_name")
save(ptq_model, "ptq_model.bc")

# If you also want to save quantized.bc, run the following
# (march is the target BPU march used elsewhere in your workflow)
quantized_model = convert(m=ptq_model, march=march)
save(quantized_model, "quantized.bc")

Multi-batch Splitting

Scenario

For a model with batch 1 input converted and compiled through the PTQ pipeline, you can use the hb_compile tool and configure the input_batch, separate_batch, and separate_name parameters in the yaml file to produce a model for on-board inference. See section Specific Parameter Information for how to configure these parameters.

For a model with batch n input, if you need to split along the batch dimension in order to compile a model that can run inference on the board, call the compiler's insert_split interface.

insert_split parameter: dim, the input dimension along which to split; the data type is int and may be negative (negative values index dimensions from the end).
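As an analogy for what the split does to a batch-n input, the following NumPy sketch (illustrative only; the compiler operates on the HBIR model, not on arrays) separates a hypothetical batch-4 NCHW tensor along dim 0:

```python
import numpy as np

# Hypothetical batch-4 NCHW input; np.split along axis 0 mirrors the
# effect of insert_split(dim=0) on the input signature.
batched = np.zeros((4, 3, 224, 224))
parts = np.split(batched, batched.shape[0], axis=0)
print(len(parts), parts[0].shape)  # 4 (1, 3, 224, 224)
```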

Method

  1. For a model with batch n input converted and compiled through the PTQ pipeline, first convert the original floating-point model with the hb_compile tool or the HMCT API; after generating ptq.onnx, refer to the following commands to split the batches:

    import onnx
    from hbdk4.compiler.onnx import export

    # Load the onnx model
    ptq_onnx = onnx.load("ptq_model.onnx")
    # Convert the onnx model to an hbir model
    ptq_model = export(proto=ptq_onnx, name="model_name")
    func = ptq_model.functions[0]
    # Insert a split op to separate the multi-batch input
    func.inputs[0].insert_split(dim=0)
  2. For a model with batch n input converted and compiled through the QAT path, refer to the following commands to split the batches:

    from hbdk4.compiler import load

    # Load the hbir model
    qat_model = load("qat.bc")
    func = qat_model.functions[0]
    # Insert a split op to separate the multi-batch input
    func.inputs[0].insert_split(dim=0)

HBIR Model Structure Comparison Before and After Operation

Note

The HBIR files shown here before and after the operation are the HBIR files before and after the conversion (ptq_model.bc and quantized.bc), which are saved by the save command.

Pre-operation: before_split
Post-operation: after_split

Preprocessing Node Insertion

Scenario

When generating the HBIR model (*.bc) during model conversion and compilation, if you need to perform color transformation and mean/scale/std processing on the input data inside the HBIR model, you can do so by inserting a preprocessing node.

If you convert and compile the model using the PTQ pipeline, the hb_compile tool already wraps this capability: you can complete these preparations by configuring the relevant parameters in the yaml.

If you convert and compile the model through the PTQ pipeline API or the QAT path, this data preparation requires calling the compiler's insert_image_preprocess interface, which takes the following parameters:

  • mode, the color conversion mode; optional values include:

    • "yuvbt601full2rgb": convert from YUVBT601Full to RGB mode (default).

    • "yuvbt601full2bgr": convert from YUVBT601Full to BGR mode.

    • "yuvbt601video2rgb": convert from YUVBT601Video to RGB mode.

    • "yuvbt601video2bgr": convert from YUVBT601Video to BGR mode.

    • "bgr2rgb": convert from BGR to RGB mode.

    • "rgb2bgr": convert from RGB to BGR mode.

    • "skip": no image format transformation is performed, only preprocessing is performed.

  • divisor: data conversion divisor, data type is int, default value is 255.

  • mean: data set mean, data type is double, length is aligned with input c direction, default is [0.485, 0.456, 0.406].

  • std: data set standard deviation value, data type is double, length is aligned with input c direction, default is [0.229, 0.224, 0.225].

  • is_signed: whether the input is signed, i.e., whether the input is shifted by -128; the data type is bool and the default is True. False is not currently supported.
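The normalization performed by the inserted node can be sketched in plain NumPy. This is a hedged illustration of the formula (x / divisor - mean) / std only, not the compiler's implementation; the actual on-device arithmetic is quantized:

```python
import numpy as np

def image_preprocess(x, divisor=255, mean=(0.485, 0.456, 0.406),
                     std=(0.229, 0.224, 0.225)):
    # Illustrative helper (not a compiler API): scale raw 8-bit data by
    # the divisor, then standardize per channel. The -128 shift implied
    # by is_signed is omitted here.
    x = x.astype(np.float32) / divisor
    return (x - np.asarray(mean)) / np.asarray(std)

raw = np.full((1, 4, 4, 3), 128, dtype=np.uint8)  # NHWC, as the node requires
out = image_preprocess(raw)
print(out.shape)  # (1, 4, 4, 3)
```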

Attention

Please note that, the preprocessing node only supports NHWC input. Therefore, if the data layout of the original input model is NCHW, the data layout of the HBIR and HBM models will be changed to NHWC after inserting the preprocessing node. This change does not affect the performance and accuracy of the model.

Method

  1. If the model conversion and compilation is performed through the PTQ pipeline, for image-type inputs you can configure the input_type_rt, input_type_train, mean_value, and scale_value/std_value parameters in the yaml; for details on parameter configuration, refer to Specific Parameter Information.

  2. To convert and compile the model through the PTQ pipeline API, first convert the original floating-point model; after generating ptq.onnx, refer to the following commands to insert the preprocessing node:

    import onnx
    from hbdk4.compiler.onnx import export

    # Load the onnx model
    ptq_onnx = onnx.load("ptq_model.onnx")
    # Convert the onnx model to an hbir model
    ptq_model = export(proto=ptq_onnx, name="model_name")
    func = ptq_model.functions[0]
    # Insert a node for color conversion and normalization
    func.inputs[0].insert_image_preprocess(
        mode="yuvbt601full2rgb",
        divisor=255,
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
        is_signed=True,
    )
  3. To convert and compile the model by the QAT path, you can refer to the following command for preprocessing node insertion:

    from hbdk4.compiler import load

    # Load the hbir model
    qat_model = load("qat.bc")
    func = qat_model.functions[0]
    # Insert a node for color conversion and normalization
    func.inputs[0].insert_image_preprocess(
        mode="yuvbt601full2rgb",
        divisor=255,
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
        is_signed=True,
    )

HBIR Model Structure Comparison Before and After Operation

Note

The HBIR files shown here before and after the operation are the HBIR files before and after the conversion (ptq_model.bc and quantized.bc), which are saved by the save command.

Pre-operation: ptq_model_bc
Post-operation: insert_image_preprocess

Pyramid Input Insertion

Scenario

With Pyramid input, the model on the BPU takes YUV420SP (NV12) as input. When preparing the preprocessing of the input data, you need to set up the Pyramid input to specify the input data source.

If you convert and compile the model using the PTQ pipeline and the command-line tool, our hb_compile tool has been encapsulated to support the insertion of the Pyramid input node by configuring the relevant parameters in the yaml file.

If you perform model conversion and compilation through the PTQ pipeline API or the QAT path, the Pyramid input is set up by calling the compiler's insert_image_convert interface.

insert_image_convert parameters:

mode, specifies the conversion mode, optional values include:

  • "nv12": NV12 mode (default), the new input parameters become two (the y component and the uv component), with c dimensions of 1 and 2 respectively.

  • "gray": grayscale mode, there is still one input parameter, containing only the y component, with a c dimension of 1.
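The resulting input shapes follow the YUV420SP layout: the uv plane is subsampled by 2 in both H and W. A small helper (hypothetical, not a compiler API) makes this concrete:

```python
def nv12_input_shapes(n, h, w):
    # Shapes of the two new inputs after insert_image_convert(mode="nv12"):
    # full-resolution y plane with c=1, half-resolution uv plane with c=2.
    y_shape = (n, h, w, 1)
    uv_shape = (n, h // 2, w // 2, 2)
    return y_shape, uv_shape

print(nv12_input_shapes(1, 224, 224))  # ((1, 224, 224, 1), (1, 112, 112, 2))
```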

Attention

Please note that due to hardware limitations, it is necessary to ensure that W >= 16 when inserting the Pyramid input.

Method

  1. If the model conversion and compilation is performed via the PTQ pipeline and the command-line tool, the Pyramid input can be set by configuring the input_type_rt and input_source parameters in the yaml; for details on parameter configuration, refer to Specific Parameter Information.

  2. To convert and compile the model through the PTQ pipeline API, first convert the original floating-point model; after generating ptq.onnx, refer to the following commands to set up the Pyramid input:

    import onnx
    from hbdk4.compiler.onnx import export

    # Load the onnx model
    ptq_onnx = onnx.load("ptq_model.onnx")
    # Convert the onnx model to an hbir model
    ptq_model = export(proto=ptq_onnx, name="model_name")
    func = ptq_model.functions[0]
    # Insert a node for conversion from nv12 to yuv444
    func.inputs[0].insert_image_convert(mode="nv12")
  3. To convert and compile the model through the QAT path, you can refer to the following command for Pyramid input:

    from hbdk4.compiler import load

    # Load the hbir model
    qat_model = load("qat.bc")
    func = qat_model.functions[0]
    # Insert a node for conversion from nv12 to yuv444
    func.inputs[0].insert_image_convert(mode="nv12")

HBIR Model Structure Comparison Before and After Operation

Note

The HBIR files shown here before and after the operation are the HBIR files before and after the conversion (ptq_model.bc and quantized.bc), which are saved by the save command.

Pre-operation: ptq_model_bc
Post-operation: insert_image_convert

Resizer Input Insertion

Scenario

With Resizer input, the model on the BPU takes YUV420SP (NV12) plus a rectangular ROI as input. When preparing the preprocessing of the input data, you need to set up the Resizer input to specify the input data source.

If you convert and compile the model using the PTQ pipeline and the command-line tool, our hb_compile tool has been encapsulated to support the insertion of the Resizer input node by configuring the relevant parameters in the yaml file.

If you perform model conversion and compilation through the PTQ pipeline API or the QAT path, the Resizer input is set up by calling the compiler's insert_roi_resize interface.

insert_roi_resize parameters:

  • mode, specifies the conversion mode, optional values include:

    • "nv12": NV12 mode (default), the new input parameters become three: the y component (c dimension 1), the uv component (c dimension 2), and a component used to specify the ROI.

    • "gray": grayscale mode, the new input parameters become two: the y component (c dimension 1) and a component used to specify the ROI.

  • interpolation_mode, specifies the interpolation mode, optional values include:

    • "bilinear": bilinear interpolation mode (default).

    • "nearest": nearest point interpolation mode.

  • When the coordinates of the given ROI exceed the original input range, the out-of-bounds area is padded. The padding parameters include:

    • pad_mode, the padding mode for the area outside the original input range; optional values:

      • "constant": pad with a constant value (default).

      • "border": pad with the edge values of the input data.

    • pad_value, the constant padding value; the default is (0, -128), corresponding to the y and uv component padding respectively. It takes effect only when pad_mode is "constant".

ROI Introduction and Constraints

ROI (Region of Interest) is described by four boundary coordinates:

  • left: the horizontal coordinates of the left boundary of the ROI region.

  • top: the vertical coordinate of the upper boundary of the ROI.

  • right: the horizontal coordinate of the right border of the ROI area.

  • bottom: the vertical coordinate of the lower boundary of the ROI region.

The constraints of the ROI input model are as follows (Wout and Hout represent the width (W) and height (H) of the resized output image):

  1. The size of the original NV12 image input requires:

    • S100&S100P: the stride in the W direction must satisfy 32 <= stride <= 262144 and be a multiple of 32.
  2. The ROI must intersect with the image, and 2 <= ROI_w <= 4096, 2 <= ROI_h <= 4096.

  3. The coordinates of the ROI are represented as [w_begin, h_begin, w_end, h_end], where both the begin point (w_begin, h_begin) and the end point (w_end, h_end) are included within the ROI range.

  4. The dimensions of the resized output image must satisfy 2 <= Wout <= 4096 and 2 <= Hout <= 4096.

  5. The size constraints for the ROI and the output image are as follows:

    • S100&S100P:

      Start alignment: h_begin_align = floor_align(h_begin, 2), w_begin_align = floor_align(w_begin, 32).

      Size alignment: ROI_H = ceil_align((h_end - h_begin_align + 1), 2), ROI_W = ceil_align((w_end - w_begin_align + 1), 16).

      Constraint: ROI_H * ROI_W + Hout * Wout < 1.5MB.

  6. ROI scaling multiplier limits: 1/3.5 <= Wout/ROI_w < 65536 and 1/3.5 <= Hout/ROI_h < 65536.
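The S100/S100P alignment rules above can be checked with a short sketch (illustrative helpers, not a compiler API; the ROI values used are hypothetical):

```python
def floor_align(v, a):
    # Round v down to a multiple of a
    return v // a * a

def ceil_align(v, a):
    # Round v up to a multiple of a
    return (v + a - 1) // a * a

def roi_aligned_size(h_begin, h_end, w_begin, w_end):
    # Apply the S100/S100P start and size alignment rules listed above
    h_begin_align = floor_align(h_begin, 2)
    w_begin_align = floor_align(w_begin, 32)
    roi_h = ceil_align(h_end - h_begin_align + 1, 2)
    roi_w = ceil_align(w_end - w_begin_align + 1, 16)
    return roi_h, roi_w

# Example ROI [w_begin, h_begin, w_end, h_end] = [35, 11, 300, 200]
print(roi_aligned_size(11, 200, 35, 300))  # (192, 272)
```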

Method

  1. If the model conversion and compilation is performed via the PTQ pipeline and the command-line tool, the Resizer input can be set by configuring the input_type_rt and input_source parameters in the yaml; for details on parameter configuration, refer to Specific Parameter Information.

  2. To convert and compile the model through the PTQ pipeline API, first convert the original floating-point model; after generating ptq.onnx, refer to the following commands to set up the Resizer input:

    import onnx
    from hbdk4.compiler.onnx import export

    # Load the onnx model
    ptq_onnx = onnx.load("ptq_model.onnx")
    # Convert the onnx model to an hbir model
    ptq_model = export(proto=ptq_onnx, name="model_name")
    func = ptq_model.functions[0]
    # Insert a resizer node taking nv12 input plus an ROI
    func.inputs[0].insert_roi_resize(mode="nv12")
  3. To convert and compile the model through the QAT path, you can refer to the following command for Resizer input:

    from hbdk4.compiler import load

    # Load the hbir model
    qat_model = load("qat.bc")
    func = qat_model.functions[0]
    # Insert a resizer node taking nv12 input plus an ROI
    func.inputs[0].insert_roi_resize(mode="nv12")

HBIR Model Structure Comparison Before and After Operation

Note

The HBIR files shown here before and after the operation are the HBIR files before and after the conversion (ptq_model.bc and quantized.bc), which are saved by the save command.

Pre-operation: ptq_model_bc
Post-operation: insert_roi_resize

Adjustment of Input/Output Data Layout

Scenario

In some scenarios you need to adjust the data layout of the input/output data. For example, during data preprocessing, the Pyramid/Resizer input only supports NHWC. If your original floating-point model takes NCHW input, you need to readjust the input data layout; this is implemented by inserting a transpose node.

If you convert and compile the model using the PTQ pipeline and the command-line tool, the hb_compile tool is already encapsulated: you specify the layout through yaml parameters, and the tool decides internally whether the data layout needs to be adjusted.

If you convert and compile the model through the PTQ pipeline API or the QAT path, you need to call the compiler's insert_transpose interface to adjust the data layout.

insert_transpose parameter: permutes, the dimension permutation; the data type is List, and you need to explicitly specify all dimensions of the original input, starting from 0.

  • When insert_transpose is applied to an input parameter (i.e., adjusting the input data layout), permutes is derived from the dimensions of the original input Tensor and the desired external input. For example, if the required external input layout is NHWC and the original input Tensor layout is NCHW, the original NCHW dimensions are obtained by taking the NHWC dimensions in the order [0,3,1,2], so permutes should be set to [0,3,1,2].

  • When insert_transpose is applied to an output parameter (i.e., adjusting the output data layout), permutes can be set directly. For example, if the original output Tensor is [1,32,16,3] and the required output is [16,3,32,1], the required output is obtained by taking the original output dimensions in the order [2,3,1,0], so permutes should be set to [2,3,1,0].
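The permutes semantics match NumPy's np.transpose, so a permutation can be sanity-checked on array shapes before inserting the node (the shapes here are hypothetical):

```python
import numpy as np

# Input-side example from above: an external NHWC input is mapped back to
# the model's original NCHW layout with permutes=[0, 3, 1, 2].
nhwc = np.zeros((1, 16, 32, 3))           # N, H, W, C
nchw = np.transpose(nhwc, (0, 3, 1, 2))
print(nchw.shape)  # (1, 3, 16, 32)

# Output-side example from above: [1, 32, 16, 3] -> [16, 3, 32, 1]
out = np.zeros((1, 32, 16, 3))
print(np.transpose(out, (2, 3, 1, 0)).shape)  # (16, 3, 32, 1)
```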

Method

  1. If the model conversion and compilation is performed via the PTQ pipeline and the command-line tool, the input data layout of the original floating-point model can be specified in the yaml via the input_layout_train parameter; for details on parameter configuration, refer to Specific Parameter Information.

  2. To convert and compile the model by PTQ pipeline API method, you need to convert the original floating-point model first, and after generating ptq.onnx, you can refer to the following commands to make adjustments to the input data layout:

    import onnx
    from hbdk4.compiler.onnx import export

    # Load the onnx model
    ptq_onnx = onnx.load("ptq_model.onnx")
    # Convert the onnx model to an hbir model
    ptq_model = export(proto=ptq_onnx, name="model_name")
    func = ptq_model.functions[0]
    # Insert a transpose node so the external input is NHWC while the
    # model keeps its original NCHW layout
    func.inputs[0].insert_transpose(permutes=[0, 3, 1, 2])
  3. To convert and compile the model through the QAT path, you can refer to the following command to make adjustments to the input data layout:

    from hbdk4.compiler import load

    # Load the hbir model
    qat_model = load("qat.bc")
    func = qat_model.functions[0]
    # Insert a transpose node so the external input is NHWC while the
    # model keeps its original NCHW layout
    func.inputs[0].insert_transpose(permutes=[0, 3, 1, 2])

HBIR Model Structure Comparison Before and After Operation

Note

The HBIR files shown here before and after the operation are the HBIR files before and after the conversion (ptq_model.bc and quantized.bc), which are saved by the save command.

Pre-operation: before_insert_transpose
Post-operation: insert_transpose

Operator Deletion

Scenario

After the model is converted, we support removing Dequantize, Quantize, Cast, Transpose, Softmax, and Reshape operators at the beginning and end of the model.

If you convert and compile the model using the PTQ pipeline and the command-line tool, the hb_compile tool is encapsulated to support removing operators by configuring the relevant parameters in the yaml.

If you convert and compile the model through the PTQ pipeline API or the QAT path, you need to call the compiler's remove_io_op interface to remove the relevant operators.

remove_io_op parameters:

  • op_types: the types of operators to delete; the data type is List. After it is specified, the input and output nodes are traversed and operators of the specified types are deleted. Supported types: ["Quantize", "Dequantize", "Transpose", "Reshape", "Cast", "Softmax"].

  • op_names: the names of the operators to delete; the data type is List. The operators with the specified names, e.g. ["transpose_1", "Reshape0"], will be deleted.

Attention

Specify either op_types or op_names; if both are specified, only op_names takes effect.

Method

  1. If the model conversion and compilation is performed via the PTQ pipeline and the command-line tool, the operators to be removed can be specified in the yaml via the remove_node_type and remove_node_name parameters; for details on parameter configuration, refer to Specific Parameter Information.

  2. To convert and compile the model by PTQ pipeline API method, you need to convert the original floating-point model first, and after generating ptq.onnx, you can refer to the following commands to remove the operators:

    import onnx
    from hbdk4.compiler.onnx import export
    from hbdk4.compiler import convert

    # Load the onnx model
    ptq_onnx = onnx.load("ptq_model.onnx")
    ptq_model = export(proto=ptq_onnx, name="model_name")
    # march is the target BPU march used elsewhere in your workflow
    quantized_model = convert(m=ptq_model, march=march)
    func = quantized_model.functions[0]
    # Delete removable nodes recursively
    func.inputs[0].remove_io_op(op_types=["Dequantize", "Quantize"])
  3. To convert and compile the model through the QAT path, you can refer to the following command to remove the operators:

    from hbdk4.compiler import load, convert

    # Load the hbir model
    qat_model = load("qat.bc")
    # march is the target BPU march used elsewhere in your workflow
    quantized_model = convert(m=qat_model, march=march)
    func = quantized_model.functions[0]
    # Delete removable nodes recursively
    func.inputs[0].remove_io_op(op_types=["Dequantize", "Quantize"])

HBIR Model Structure Comparison Before and After Operation

Note

The HBIR file shown here before and after the operation is the model before and after the operation on quantized.bc after conversion (quantized and dequantized nodes are inserted only after conversion), which is saved by the save command.

Pre-operation: before_remove_dequantize
Post-operation: after_remove_dequantize