horizon_plugin_pytorch still supports quantization in eager mode, although we no longer recommend using it. The overall eager-mode flow follows PyTorch's own quantization interface and design, so we recommend first reading the eager mode section of the official PyTorch documentation.
The main differences between eager mode and fx mode in horizon_plugin_pytorch are as follows. In eager mode, you must manually replace certain floating-point operators in the model with their horizon_plugin_pytorch equivalents:
| Original floating-point operator | Replacement operator |
|---|---|
| torch.nn.functional.relu | torch.nn.ReLU() |
| a + b / torch.add | horizon.nn.quantized.FloatFunctional().add |
| Tensor.exp | horizon.nn.Exp() |
| torch.nn.functional.interpolate | horizon.nn.Interpolate() |
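For example, a bare `a + b` in `forward` has no module that observers can attach to, so it must be rewritten as a `FloatFunctional` call. The sketch below uses PyTorch's own `torch.nn.quantized.FloatFunctional` for illustration; per the table above, `horizon.nn.quantized.FloatFunctional` is used in the same way.

```python
import torch
import torch.nn as nn
# Illustrative stand-in; with the plugin you would use
# horizon.nn.quantized.FloatFunctional instead.
from torch.nn.quantized import FloatFunctional

class AddBlock(nn.Module):
    """Eager-mode rewrite: `a + b` becomes a module call."""
    def __init__(self):
        super().__init__()
        # One FloatFunctional instance per arithmetic op in the model.
        self.add = FloatFunctional()

    def forward(self, a, b):
        # Instead of `return a + b`:
        return self.add.add(a, b)

block = AddBlock()
a = torch.ones(2)
b = torch.full((2,), 2.0)
out = block(a, b)  # numerically identical to a + b in float mode
```

Because each op gets its own module instance, the quantization tooling can later attach a distinct observer/fake-quant to every addition in the network.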
In addition, you must manually specify which operators can be fused and fuse them by calling the fuser_func provided in horizon_plugin_pytorch. Finally, eager mode cannot use QconfigSetter because there is no computation graph; the qconfig can only be configured by setting attributes on the modules.
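A minimal sketch of both points, using PyTorch's eager-mode `torch.quantization.fuse_modules` API. The fused patterns must be listed by hand; with horizon_plugin_pytorch you would additionally pass the plugin's fuser_func via the `fuser_func` argument (consult the plugin documentation for its exact import path). The qconfig is then set as a plain module attribute, since no QconfigSetter is available.

```python
import torch
import torch.nn as nn
import torch.quantization

class ConvBNReLU(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

model = ConvBNReLU().eval()

# Manually list the fusible pattern by submodule name. With the plugin,
# pass its fuser_func here: fuse_modules(..., fuser_func=<plugin fuser_func>).
fused = torch.quantization.fuse_modules(model, [["conv", "bn", "relu"]])
# After fusion, `fused.conv` is a fused conv+bn+relu module and
# `fused.bn` / `fused.relu` are replaced with nn.Identity.

# No QconfigSetter in eager mode: configure qconfig via attributes.
fused.qconfig = torch.quantization.get_default_qconfig("fbgemm")
fused.conv.qconfig = None  # example: exclude one submodule from quantization
```

Per-module attribute assignment is the only way to scope qconfigs in eager mode, so excluding or specializing a submodule means setting `.qconfig` on it directly, as in the last line.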