FAQ

General Issues

Why is the HB_UCP_INVALID_ARGUMENT error code returned after task creation or submission?

The cause can usually be identified from the UCP error logs. Common cases are:

  • Operator constraint violation: most acceleration operators must satisfy their usage constraints when they are created; otherwise an error code is returned.

  • No proper backend: if the log prints "op $1 of task has no proper backend, user expect $2", no suitable backend is available for execution. Here $1 is the task type and $2 is the backend parameter given at task submission, shown in binary form. Configure the backend according to the number of cores that each S100 series backend supports.

How to understand the physical and virtual addresses of hbUCPSysMem?

In the S100 processor architecture, all hardware units share the DDR memory, and a physically contiguous block of memory can be requested through the hbUCPMallocCached and hbUCPMalloc interfaces.

The return values of these functions are wrapped in the hbUCPSysMem data structure, and the phyAddr and virAddr fields correspond to the physical and virtual addresses of its memory space, respectively.

Because this memory space is contiguous, the whole block can be identified, read, and written through its starting address, whether physical or virtual. In practice, prefer the virtual address unless the physical address is strictly required.
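The relationship between the two addresses can be sketched in plain C. The DemoSysMem struct below is a hypothetical stand-in that mirrors only the phyAddr and virAddr fields mentioned above, plus an assumed size field; it is not the real hbUCPSysMem definition. The buffer comes from malloc rather than hbUCPMalloc, and the physical base is an invented number. The sketch only illustrates that, for a contiguous block, the byte at a given offset sits at base + offset in both address spaces.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for hbUCPSysMem: only phyAddr and virAddr come from
 * the text above; memSize is an assumption added for bounds checking. */
typedef struct {
    uint64_t phyAddr;  /* physical base address, as used by backend hardware */
    void    *virAddr;  /* virtual base address, as used by the CPU process */
    uint64_t memSize;  /* size of the contiguous block in bytes */
} DemoSysMem;

/* Allocate a demo block. malloc stands in for hbUCPMalloc, and fake_phy_base
 * is an invented physical address used only for illustration. */
static int demo_alloc(DemoSysMem *mem, uint64_t size, uint64_t fake_phy_base) {
    mem->virAddr = malloc((size_t)size);
    if (mem->virAddr == NULL) {
        return -1;
    }
    memset(mem->virAddr, 0, (size_t)size);
    mem->phyAddr = fake_phy_base;
    mem->memSize = size;
    return 0;
}

/* CPU-side pointer to the byte at `offset`, or NULL when out of range. */
static uint8_t *demo_vir_at(const DemoSysMem *mem, uint64_t offset) {
    if (offset >= mem->memSize) {
        return NULL;
    }
    return (uint8_t *)mem->virAddr + offset;
}

/* Physical address of the same byte, or 0 when out of range. */
static uint64_t demo_phy_at(const DemoSysMem *mem, uint64_t offset) {
    if (offset >= mem->memSize) {
        return 0;
    }
    return mem->phyAddr + offset;
}

/* Release the demo block and reset the bookkeeping fields. */
static void demo_free(DemoSysMem *mem) {
    free(mem->virAddr);
    mem->virAddr = NULL;
    mem->memSize = 0;
}
```

CPU code would normally read and write through the pointer returned by demo_vir_at, while the value from demo_phy_at is the kind of address that would only be needed when something outside the process has to locate the same byte.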

How to understand cacheable and non-cacheable hbmem?

UCP's memory management interface provides hbUCPMallocCached and hbUCPMalloc to allocate DDR read/write memory that is physically contiguous and can be accessed by the BPU, DSP, and other IPs. hbUCPMallocCached allocates memory with the cacheable attribute and is paired with the hbUCPMemFlush function, which synchronizes the cache with memory.

The cache mechanism is determined by the memory architecture of the platform, as shown in the following figure. A data cache sits between the CPU and main memory, but there is no cache between main memory and the BPU/DSP/JPU/VPU (Video Process Unit)/PYRAMID/STITCH/GDC backend hardware. Misusing the cache can therefore cause incorrect data reads and writes as well as reduced efficiency.

[Figure: runtime_dev_faq]
  • After the CPU has finished writing data, it must actively flush the cached data to memory; otherwise other hardware accessing the same memory space may read stale data.

  • After other backend hardware has finished writing data, the CPU must actively invalidate the corresponding data in the cache before reading it; otherwise the CPU will read the stale data it cached earlier.

  • During continuous model inference, it is recommended to allocate cacheable memory for buffers that the CPU needs to read, such as model outputs, because this speeds up repeated CPU reads and writes. Buffers that are only written and never read by the CPU, such as model inputs, can use non-cacheable memory.
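The two stale-data cases above can be reproduced with a toy model in plain C. Everything in this sketch is hypothetical: DemoMem, cpu_write, dev_read, cache_clean, and cache_invalidate are illustrative names, not UCP APIs, and a single write-back cache covering the whole buffer is assumed. In real code both maintenance steps would go through hbUCPMemFlush; its exact flags are not shown in this section, so consult the UCP API reference for them.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_SIZE 8

/* Toy model: `ddr` is main memory and `cache` is the CPU data cache in front
 * of it. Device accesses (BPU, DSP, and so on) bypass the cache entirely. */
typedef struct {
    uint8_t ddr[DEMO_SIZE];
    uint8_t cache[DEMO_SIZE];
    int     cached;  /* 1 if the cache currently holds a copy of the buffer */
    int     dirty;   /* 1 if the cached copy has CPU stores not yet in DDR */
} DemoMem;

static void demo_init(DemoMem *m) { memset(m, 0, sizeof(*m)); }

/* Load the buffer into the cache if no copy is present. */
static void demo_fill(DemoMem *m) {
    if (!m->cached) {
        memcpy(m->cache, m->ddr, DEMO_SIZE);
        m->cached = 1;
    }
}

/* CPU store: lands in the cache only (write-back behaviour). */
static void cpu_write(DemoMem *m, int i, uint8_t v) {
    demo_fill(m);
    m->cache[i] = v;
    m->dirty = 1;
}

/* CPU load: served from the cached copy. */
static uint8_t cpu_read(DemoMem *m, int i) {
    demo_fill(m);
    return m->cache[i];
}

/* Device accesses go straight to DDR. */
static void dev_write(DemoMem *m, int i, uint8_t v) { m->ddr[i] = v; }
static uint8_t dev_read(const DemoMem *m, int i) { return m->ddr[i]; }

/* Clean (flush): write dirty cached data back to DDR. */
static void cache_clean(DemoMem *m) {
    if (m->cached && m->dirty) {
        memcpy(m->ddr, m->cache, DEMO_SIZE);
        m->dirty = 0;
    }
}

/* Invalidate: drop the cached copy so the next CPU load refetches from DDR. */
static void cache_invalidate(DemoMem *m) {
    m->cached = 0;
    m->dirty = 0;
}
```

In this model, a dev_read issued right after cpu_write still sees the old DDR contents until cache_clean runs, and a cpu_read issued after dev_write still sees the old cached value until cache_invalidate runs. Those are the two orderings the first two bullets describe.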

Model Inference

What are the possible causes of a timeout on the hbUCPWaitTaskDone interface during model inference?

  • The model itself takes a long time to execute and the timeout passed to the asynchronous wait interface is too short, or the task queues for a long time because the computing resources are heavily loaded. Either can trigger an interface timeout.

  • A memory leak exists. When system memory runs low, memory allocation slows down, which can lead to an inference timeout.

  • CPU load is too high. If the scheduling thread cannot get CPU time, the completion of a finished task cannot be reported to the user interface in time, which results in an inference timeout.

Reasons why model inference gets stuck

Model problem: a low-level runtime error caused by the model's instructions is not reported, so the task hangs. In this case, the BPU task status can be viewed with cat /sys/devices/system/bpu/bpu0/task_running, as shown below:

[Figure: task_running]

If s_time is not null, the task started normally. If p_time is null, the task has not returned normally. A task with a non-null s_time and a null p_time can therefore be assumed to have hung on the BPU; contact sr or the compiler team to resolve it.

What are the ROI input model constraints?

See ROI Introduction and Constraints for the constraints that apply to ROI input models.