The model conversion function provided by HMCT takes an ONNX model as input and returns the quantized model produced by model conversion and quantization.
The output is a quantized ONNX model that can be used to evaluate the accuracy of the quantized model and can be compiled into a deployment model via hbdk.

| Parameter | Type | Default | Description |
|---|---|---|---|
| onnx_model | ModelProto | Required, no default value | The input ONNX model. |
| march | str | Required, no default value | The target computing platform. |
| cali_data | Union[Sequence[np.ndarray], Dict[str, Sequence[np.ndarray]]] | Required, no default value | Calibration data. |
| quant_config | Optional[Union[str, Dict[str, Any]]] | None | Quantization configuration parameters. For a detailed description, see The quant_config Introduction. |
| input_dict | Optional[Dict[str, Any]] | None | Modifies the inputs of the converted model according to the specified parameters. Note: the caller must ensure that the modifications to the model inputs are legal. |
| name_prefix | Optional[str] | None | The save path for the objects generated during model conversion. |
| verbose | Optional[bool] | True | Whether to print detailed information during model conversion. |
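
The required and optional parameters in the table can be summarized in a small argument-checking sketch. The helper `build_conversion_kwargs` is hypothetical and is not part of HMCT; it only encodes the required names and default values listed above, so that a keyword dictionary could be checked before it is passed to the conversion function.

```python
from typing import Any, Dict

# Required parameters and optional defaults, copied from the parameter table.
REQUIRED = ("onnx_model", "march", "cali_data")
DEFAULTS: Dict[str, Any] = {
    "quant_config": None,
    "input_dict": None,
    "name_prefix": None,
    "verbose": True,
}


def build_conversion_kwargs(**user_args: Any) -> Dict[str, Any]:
    """Hypothetical helper (not an HMCT API): check arguments against the table."""
    # Every required parameter must be supplied by the caller.
    missing = [name for name in REQUIRED if name not in user_args]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    # Reject names that do not appear in the table at all.
    unknown = sorted(set(user_args) - set(REQUIRED) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"unknown parameters: {', '.join(unknown)}")
    # Optional parameters fall back to the documented defaults.
    return {**DEFAULTS, **user_args}
```

For example, `build_conversion_kwargs(onnx_model=model, march=march, cali_data=data)` would return a dictionary in which `verbose` is `True` and the remaining optional parameters are `None`; here `model`, `march`, and `data` are placeholders for values the caller provides.
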

| Generated file | Description |
|---|---|
| original_float_model.onnx | The output of the original model conversion phase, which covers opset and IR version conversion and input_shape modification. |
| optimized_float_model.onnx | The output of the model optimization phase, which covers constant folding, operator fusion, useless operator removal, operator replacement, and operator splitting. |
| calibrated_model.onnx | The output of the model calibration phase, which covers inserting calibration nodes, collecting data distribution statistics, and calculating quantization parameters. |
| ptq_model.onnx | The output of the model quantization phase, which covers tuning and converting the quantization parameters based on the specified march. |
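
To see which conversion phases have produced output, the generated files can be looked up by name. The following is a minimal sketch using only the Python standard library. The helper `find_stage_outputs` is hypothetical, and it assumes that each generated file name ends with one of the names in the table; how `name_prefix` is combined with those names is not specified here, so the sketch matches on the suffix only.

```python
from pathlib import Path
from typing import Dict, Optional

# File names from the table, in the order the phases are listed.
STAGE_FILES = (
    "original_float_model.onnx",
    "optimized_float_model.onnx",
    "calibrated_model.onnx",
    "ptq_model.onnx",
)


def find_stage_outputs(directory: str) -> Dict[str, Optional[Path]]:
    """Hypothetical helper: map each phase's file name to a matching path, or None."""
    found: Dict[str, Optional[Path]] = {}
    for stage_file in STAGE_FILES:
        # Assumption: generated files end with the table's file name,
        # optionally preceded by a prefix.
        matches = sorted(Path(directory).glob(f"*{stage_file}"))
        found[stage_file] = matches[0] if matches else None
    return found
```

A phase whose entry is `None` has not written a file with a matching name in the given directory.
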