Horizon OpenExplorer currently provides two model quantization schemes: post-training quantization (PTQ) and quantization-aware training (QAT).
Neither scheme interferes with the training phase of the floating-point model, which remains your own responsibility.
For reference, Horizon also provides open-source PyTorch implementations of efficient models for classification, detection, segmentation,
and other scenarios in samples/ai_toolchain/horizon_model_train_sample, with support for training and reproduction on the host.
For the PTQ scheme, quantize the model in the host development environment, then copy the compiled .hbm model to the dev board environment for subsequent deployment.
For the QAT scheme, complete QAT training of the model in the host development environment, perform the quantization conversion, then copy the compiled .hbm model to the dev board environment for subsequent deployment.
For both quantization schemes and the efficient-model development environment, Horizon offers two setup options: local manual installation and Docker containers. We strongly recommend the Docker containers, as they do not pollute the local environment and are easy to use. Both options are described in the following sections.
To use the toolchain smoothly, we recommend that your development machine meet the following requirements:
| HW/OS | Requirements |
|---|---|
| CPU | Intel i3 or above, or an equivalent Xeon E3/E5-class processor |
| Memory | 16 GB or above |
| GPU | CUDA 12.6, Linux driver version >= 550.163.01. Verified graphics cards include, but are not limited to: GeForce RTX 3090, GeForce RTX 2080 Ti, NVIDIA TITAN V, Tesla V100S-PCIE-32GB, A100 |
| OS | Native Ubuntu 22.04 |
For more information about CUDA compatibility with graphics cards, refer to the NVIDIA website.
Horizon requires the following Docker base environment; please complete its installation on your host machine in advance.
After completing the installation of the Docker environment, remember to add your non-root user to the docker group by running the command below:
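For example, the standard Docker post-install step looks like this (adjust the user name as needed):

```shell
# Add the current user to the docker group (requires sudo).
sudo usermod -aG docker ${USER}
# Log out and back in, or run `newgrp docker`, for the change to take effect.
```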
To help you quickly use the toolchain, we provide a Docker image containing the complete development environment, which greatly simplifies the deployment process of the development environment.
If you have downloaded the offline image, you need to use the following command to load the image locally first.
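A typical load command looks like the following; the archive file name here is illustrative and depends on the OE release you downloaded:

```shell
# Load the offline image archive into the local Docker image store.
docker load -i ai_toolchain_docker_image.tar.gz   # hypothetical file name
# Verify that the image is now listed locally.
docker images
```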
You can start the Docker container corresponding to the current OE version by running the following script directly from the first level of the OE package:
Here, data is the path to the evaluation dataset folder. Create this path before running the command; otherwise mounting problems may occur.
If you want to use the CPU version of the Docker image, you need to add the cpu parameter:
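If you need to start a container manually instead of through the provided script, the invocation is roughly as follows; the image tag and mount paths below are assumptions and must be adapted to your OE release:

```shell
# Hypothetical manual start of the OE container (GPU version).
docker run -it --rm \
  --gpus all \
  -v "$(pwd)":/open_explorer \
  -v /path/to/data:/data \
  openexplorer/ai_toolchain:latest /bin/bash   # hypothetical image tag
```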
Because the environment variables PATH and LD_LIBRARY_PATH are configured while the OE Docker image is built,
entering the container in a non-recommended way (e.g., docker attach) may result in these variables not being loaded correctly,
which can cause tools such as CMake, GCC, and CUDA to behave abnormally.
If you want the Docker container to survive after exit, start it manually with docker run -it and omit the --rm option.
If you want the Docker container to run in the background after startup, add the -d option to the docker run -it command line; the container ID is printed once the container starts,
and you can then re-enter the container with docker exec -it {container ID} /bin/bash.
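The background-run workflow described above can be sketched as follows (the image tag is a placeholder):

```shell
# Start the container detached; docker run prints the container ID.
container_id=$(docker run -itd openexplorer/ai_toolchain:latest /bin/bash)  # hypothetical image tag
# Re-enter the running container at any time.
docker exec -it "${container_id}" /bin/bash
```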
For manual local environment setup, you can enter the package/host folder and install the required files.
The PTQ scheme depends on the following base software in the development machine environment:
When installing the QAT quantization environment locally, ensure that the following base environment is in place.
The dependencies required by the quantization-aware training tool are listed below:
| Dependency | GPU | CPU |
|---|---|---|
| OS | Ubuntu 22.04 | Ubuntu 22.04 |
| CUDA | 12.6 | N/A |
| Python | 3.10 | 3.10 |
| torch | 2.6.0+cu126 | 2.6.0+cpu |
| torchvision | 0.21.0+cu126 | 0.21.0+cpu |
| Recommended graphics cards | TITAN V / 2080 Ti / V100 / 3090 | N/A |
After completing QAT training of the model, you can install the relevant toolkits in the current training environment and complete the subsequent model conversion directly through API calls.
Once the model has been quantized, the compiled model can be deployed on the dev board environment for inference and execution.
To deploy the runtime environment, you need to prepare a dev board with the system image programmed, and then copy the relevant supplementary files to the dev board.
At this stage, you need to verify the usability of the dev board and program the available system images to the board.
Some of the supplementary tools of the toolchain are not included in the system image, but can be copied to the dev board by running the installation script in the OE package in the host environment, as follows:
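For illustration only, the installer invocation has roughly this shape; the actual script name and arguments are defined in your OE package:

```shell
# Hypothetical invocation of the supplementary-file installer from the host.
bash install.sh ${board_ip}   # script name is a placeholder; check the OE package
```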
Here, ${board_ip} is the IP address you assigned to the dev board. Make sure this IP is reachable from the development machine.
After the supplementary files are installed, restart the dev board and run hrt_model_exec --help on it to verify that the installation succeeded.