Before reading this section, we recommend reading the torch.fx — PyTorch documentation to gain an initial understanding of PyTorch's FX mechanism.
FX uses a symbolic tracing approach that captures a model as a graph at the nn.Module or function level, enabling automatic operator fusion and other graph-based optimizations.
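To illustrate what graph-level rewriting makes possible, here is a minimal pure-Python sketch of a fusion pass over a toy graph IR. This is a conceptual illustration only; the `fuse_conv_bn` function and the string-based node list are hypothetical, not the actual torch.fx API.

```python
# Conceptual sketch of graph-level operator fusion (hypothetical IR,
# not the actual torch.fx API): each node is just an op name here.
def fuse_conv_bn(nodes):
    """Merge every adjacent ("conv", "bn") pair into one "conv_bn" node."""
    fused, i = [], 0
    while i < len(nodes):
        if i + 1 < len(nodes) and nodes[i] == "conv" and nodes[i + 1] == "bn":
            fused.append("conv_bn")  # the pass rewrites the graph automatically
            i += 2
        else:
            fused.append(nodes[i])
            i += 1
    return fused

graph = ["conv", "bn", "relu", "conv", "bn"]
print(fuse_conv_bn(graph))  # ['conv_bn', 'relu', 'conv_bn']
```

Because the whole graph is visible to the pass, no user annotation of fusible pairs is needed; this is the property the real FX-based interfaces rely on.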
Because FX can see the computational graph, operator fusion can be automated: there is no need to manually specify which operators to fuse; simply call the interface directly.
fuse_fx does not support the inplace parameter, because a symbolic trace must be performed internally on the model to generate a GraphModule, so the model cannot be modified in place.

fused_model and model share almost all attributes (including submodules and operators), so do not make any changes to model after fusing, as this may also affect fused_model.

It is usually unnecessary to call the fuse_fx interface explicitly, as the subsequent prepare interface integrates the fuse procedure internally.

The global march must be set according to the target hardware platform before calling the prepare interface, which internally performs a fuse procedure (even if the model has already been fused) and then replaces the eligible operators in the model with implementations from horizon.nn.qat.
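The warning above about fused_model and model sharing attributes can be made concrete with a plain-Python analogy. The `Model` and `Submodule` classes below are hypothetical stand-ins, not the horizon or torch API; the point is only that the two wrappers hold references to the same submodule objects rather than deep copies.

```python
# Plain-Python analogy (hypothetical classes, not the horizon API) for why
# mutating `model` after fusion also affects `fused_model`: both wrappers
# reference the same submodule objects instead of deep copies.
class Submodule:
    def __init__(self, weight):
        self.weight = weight

class Model:
    def __init__(self):
        self.conv = Submodule(weight=1.0)

model = Model()
fused_model = Model.__new__(Model)  # a new wrapper object...
fused_model.conv = model.conv       # ...sharing the same submodule

model.conv.weight = 999.0           # "modifying model after fuse"
print(fused_model.conv.weight)      # the fused model sees the change: 999.0
```

This is why the documentation tells you to treat model as read-only once fuse (or prepare) has been called.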
Like fuse_fx, this interface does not support the inplace parameter, and please do not make any changes to the input model after prepare.

In most cases, FX quantization interfaces can directly replace eager-mode quantization interfaces, but the two cannot be mixed. Some models require modifications to the code structure in the following cases.
Both of these cases can be avoided by using wrap, as illustrated by RetinaNet.
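The reason wrap avoids these cases is that a wrapped function is recorded as a single opaque graph node instead of being traced through, so code the tracer cannot follow (e.g. data-dependent control flow in post-processing) never reaches it. Below is a minimal pure-Python sketch of that idea; the `wrap`, `postprocess`, and `trace` names are hypothetical and do not reproduce the actual torch.fx or horizon interface.

```python
# Conceptual sketch of the "wrap" idea (hypothetical tracer, NOT the real
# torch.fx / horizon API): wrapped functions stay opaque leaf nodes.
WRAPPED = set()

def wrap(fn):
    """Mark fn so the tracer treats it as a leaf call."""
    WRAPPED.add(fn.__name__)
    return fn

@wrap
def postprocess(x):
    # branches on a runtime value -- a symbolic tracer could not follow this
    return x if x > 0 else -x

def trace(pipeline):
    """Record one graph node per step; wrapped functions are not entered."""
    return ["leaf:%s" % f.__name__ if f.__name__ in WRAPPED
            else "traced:%s" % f.__name__
            for f in pipeline]

print(trace([abs, postprocess]))  # ['traced:abs', 'leaf:postprocess']
```

In the real toolchain the same effect keeps untraceable detection heads (as in RetinaNet) out of the captured graph while the rest of the model is still quantized normally.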