hmct.api.build_model

Interface Description

The model conversion function provided by HMCT. It takes an onnx model as input and outputs the quantized model produced by model conversion and quantization.

Interface Form

def build_model(
    onnx_model: ModelProto,
    march: str,
    cali_data: Union[Sequence[np.ndarray], Dict[str, Sequence[np.ndarray]]],
    quant_config: Optional[Union[str, Dict[str, Any]]] = None,
    input_dict: Optional[Dict[str, Any]] = None,
    name_prefix: Optional[str] = None,
    verbose: Optional[bool] = True,
) -> ModelProto

Return Value

Returns a quantized onnx model that can be used for quantized-model accuracy evaluation and can be compiled into a deployment model via hbdk.

Parameters

onnx_model : ModelProto, required
    The input onnx model.

march : str, required
    Target computing platform.

cali_data : Union[Sequence[np.ndarray], Dict[str, Sequence[np.ndarray]]], required
    Calibration data:
      1. Single-input models specify calibration data as Sequence[np.ndarray]:
         cali_data = [sample0, sample1, ...]
      2. Multi-input models specify calibration data as Dict[str, Sequence[np.ndarray]]:
         cali_data = {'input_name0': [sample0, sample1, ...], 'input_name1': [sample0, sample1, ...]}
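As a sketch of the two cases above (input names, shapes, and sample count are hypothetical, chosen only for illustration), calibration data can be prepared with NumPy like this:

```python
import numpy as np

# Hypothetical values for illustration only.
NUM_SAMPLES = 8
SHAPE = (1, 3, 224, 224)

# Single-input model: a sequence of arrays, one per calibration sample.
cali_data_single = [
    np.random.rand(*SHAPE).astype(np.float32) for _ in range(NUM_SAMPLES)
]

# Multi-input model: one sample sequence per input name.
cali_data_multi = {
    "input_name0": [np.random.rand(*SHAPE).astype(np.float32) for _ in range(NUM_SAMPLES)],
    "input_name1": [np.random.rand(*SHAPE).astype(np.float32) for _ in range(NUM_SAMPLES)],
}
```

In practice the samples would come from a representative dataset after the same preprocessing the deployed model expects, not from random noise.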
quant_config : Optional[Union[str, Dict[str, Any]]], default None
    Specifies quantization configuration parameters; for a detailed description of each parameter, see The quant_config Introduction.
      1. Use the default quantization method:
         quant_config = None
      2. Configure quantization parameters via a json file:
         quant_config = quant_config_json_file
      3. Configure quantization parameters via a dict:
         quant_config = {'node_config': {'Conv0': {'qtype': 'int16'}}}
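A minimal sketch tying the two forms together, assuming the json file simply mirrors the dict layout (the node name "Conv0" is illustrative):

```python
import json
import tempfile

# Dict form: request int16 quantization for node "Conv0".
quant_config = {"node_config": {"Conv0": {"qtype": "int16"}}}

# The same parameters can be kept in a json file and its path passed as
# quant_config instead (assumption: the file mirrors the dict layout).
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(quant_config, f)
    quant_config_json_file = f.name

# Reading the file back yields the identical configuration.
with open(quant_config_json_file) as f:
    loaded = json.load(f)
```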
input_dict : Optional[Dict[str, Any]], default None
    Modify the converted model's inputs according to the specified parameters. Note: the caller must ensure that the modifications to the model inputs are legal.
      1. Modify the input_shape of the specified input:
         input_dict = {'input_name0': {'input_shape': [x,x,x,x]}}
      2. Modify the batch_size of the specified input:
         input_dict = {'input_name0': {'input_batch': 4}}
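Since input_dict is keyed by input name, several inputs can plausibly be modified in one call; a hedged sketch (input names and the new shape are hypothetical, and the caller must still verify each change is legal for the model):

```python
# Reshape "input_name0" and change the batch size of "input_name1".
input_dict = {
    "input_name0": {"input_shape": [1, 3, 512, 512]},
    "input_name1": {"input_batch": 4},
}
```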
name_prefix : Optional[str], default None
    Specifies the save path for the artifacts generated during model conversion.
      1. None: artifacts are saved in the current directory.
      2. A path relative to the current directory in which to save the generated artifacts:
         # Saved in the current directory, file names prefixed with temp_:
         name_prefix = 'temp_'
         # Saved in the ./tmp01 directory, file names prefixed with temp_:
         name_prefix = './tmp01/temp_'

verbose : Optional[bool], default True
    Specifies whether detailed information is printed during model conversion.

Artifact Description

original_float_model.onnx
    Output of the original model conversion phase. This phase includes: opset and ir version conversion, and input_shape modification.
optimized_float_model.onnx
    Output of the model optimization phase. This phase includes: constant folding, operator fusion, useless-operator removal, operator replacement, and operator splitting.
calibrated_model.onnx
    Output of the model calibration phase. This phase includes: inserting calibration nodes, collecting data distributions, and computing quantization parameters.
ptq_model.onnx
    Output of the model quantization phase. This phase includes: tuning and converting the quantization parameters according to the specified march.
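Combining the artifact names above with the name_prefix parameter, the expected artifact paths can be sketched as follows (assumption: the prefix is prepended verbatim to each artifact file name, as the name_prefix examples suggest; the exact naming scheme is not guaranteed here):

```python
# Artifact file names as listed in the table above.
ARTIFACTS = (
    "original_float_model.onnx",
    "optimized_float_model.onnx",
    "calibrated_model.onnx",
    "ptq_model.onnx",
)

def expected_artifact_paths(name_prefix: str) -> list:
    """Prepend the prefix to each artifact name, e.g. './tmp01/temp_ptq_model.onnx'."""
    return [name_prefix + name for name in ARTIFACTS]

paths = expected_artifact_paths("./tmp01/temp_")
```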