The logs in the NN module are divided into seven levels:
The log level can be set to 0, 1, 2, 3, 4, 5, or 6, corresponding to Trace, Debug, Info, Warning, Error, Critical, and Never; the default is Warning.
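Assuming the log level is exposed through an environment variable (the variable name HB_UCP_LOG_LEVEL below is an assumption for illustration, not confirmed by this document), raising verbosity to Info might look like:

```shell
# Hypothetical variable name -- check your release notes for the actual one.
# 2 corresponds to the Info level (0=Trace ... 6=Never; default is Warning=3).
export HB_UCP_LOG_LEVEL=2
```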
Log level setting rules:
DNN supports inference of HBIR models in X86 environments. Because your environment may not have a GPU, GPU acceleration is disabled by default.
If your machine has a GPU, you can enable GPU acceleration by setting the environment variable HB_NN_HBIR_GPU_ENABLE to true.
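For example, in the shell used to launch your inference process:

```shell
# Enable GPU acceleration for HBIR model inference on X86
export HB_NN_HBIR_GPU_ENABLE=true
```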
When using GPU acceleration, make sure that the corresponding GPU driver and CUDA environment are installed on your machine. For environment requirements, please refer to Environment Deployment.
When using GPU acceleration, make sure libhbdnn.so is in one of the directories listed in LD_LIBRARY_PATH.
When using GPU acceleration, you can set the deviceId in the control parameters to specify which GPU performs the computation. Make sure the value is in the range [0, gpu_device_count).
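To check how many GPUs are visible on the machine (and therefore the valid deviceId range), you can query the NVIDIA driver directly:

```shell
# Each output line is one GPU; valid deviceId values are 0 .. (count - 1)
nvidia-smi --list-gpus | wc -l
```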
Before BPU memory can actually be used by the BPU, the NN module must perform special processing on it. Processing memory frequently increases CPU load and may cause performance problems.
To solve this problem, the inference library provides a memory LRU cache function, which can be used by setting the environment variable HB_NN_ENABLE_MEM_LRU_CACHE to true. The setting method is as follows:
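For example:

```shell
# Enable the memory LRU cache in the inference library
export HB_NN_ENABLE_MEM_LRU_CACHE=true
```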
Input and output memory is managed internally according to the LRU (least recently used) principle. Therefore, with the cache enabled, memory you allocate is not released immediately when you call the release interface; it is actually freed only when the cache is cleaned. You can control the cache cleanup interval by setting the HB_NN_MEM_LRU_CACHE_CLEAN_INTERVAL environment variable.
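A sketch of setting the cleanup interval (the unit of the value is an assumption here, as this document does not state it; check your release documentation):

```shell
# Clean the LRU cache periodically; the interval unit (e.g. seconds) is assumed
export HB_NN_MEM_LRU_CACHE_CLEAN_INTERVAL=10
```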
Scenario constraint: the cache pays off only when the memory you allocate is reused repeatedly.
This cache feature is available only for the NN module.
The UCP-side L2 Cache uses a static mapping scheme. You need to configure the amount of L2 Cache allocated to each BPU core via the environment variable HB_DNN_USER_DEFINED_L2M_SIZES. Sizes for the cores are separated by ':' and are specified in MB; only integer values are supported.
During the inference preparation phase, UCP will allocate the L2 Cache for all cores according to your configuration. Below are several configuration examples and explanations:
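As a sketch (the number of BPU cores shown here is an assumption; provide one size per core on your hardware):

```shell
# Four BPU cores assumed: give 2 MB of L2 Cache to cores 0 and 1, none to cores 2 and 3.
# Sizes are integers in MB, separated by ':'.
export HB_DNN_USER_DEFINED_L2M_SIZES=2:2:0:0
```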
When L2 Cache optimization is enabled, inference priority cannot be set to preemptive priority.
In a multi-process environment, the L2 Cache allocated to each process is independent of the others. You must therefore ensure that the total L2 Cache configured across all processes does not exceed the hardware's maximum L2 Cache size.
The maximum L2 Cache size can be viewed with the command cat /sys/kernel/debug/ion/heaps/custom.