This section describes the alignment rules that apply when using the BPU.
BPU does not restrict the model input size or parity. Both 416x416 inputs (e.g., YOLO) and 227x227 inputs (e.g., SqueezeNet) can be supported.
For NV12 inputs, both the height (H) and the width (W) must be even, so that the UV data is exactly half the size of the Y data.
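As a minimal sketch of the NV12 constraint, the helpers below check that both dimensions are even and compute the plane sizes of a tightly packed NV12 image (one byte per Y sample, interleaved UV plane with half as many rows). The function names are hypothetical and are not part of the BPU API; row padding required by alignment is deliberately ignored here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when height and width are positive and even,
   which NV12 needs so the UV plane is exactly half the Y plane. */
bool nv12_dims_valid(int height, int width) {
    return height > 0 && width > 0 && (height % 2 == 0) && (width % 2 == 0);
}

/* Bytes in the Y plane of a tightly packed image (no row padding). */
size_t nv12_y_size(int height, int width) {
    return (size_t)height * (size_t)width;
}

/* Bytes in the interleaved UV plane: half the rows of Y, same bytes per row. */
size_t nv12_uv_size(int height, int width) {
    return (size_t)(height / 2) * (size_t)width;
}
```

With these helpers, a 227x227 image is rejected, while for a 224x224 image the UV plane comes out at half the size of the Y plane.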
The BPU has alignment requirements for data. The valid data layout and the aligned data layout are described by validShape and stride in hbDNNTensorProperties.
validShape is the shape of the valid data. stride gives the stride of each dimension of validShape, that is, the number of bytes needed to step across each dimension of the tensor. Note that models with NV12 or Y-type inputs are a special case: these input types only need to satisfy a fixed alignment constraint in the width direction, and the stride obtained for them is -1. For detailed rules, refer to the Dynamic input Introduction section. The correct data layout of the model input and output tensors can be obtained from validShape and stride.
For example, suppose the obtained model input properties are hbDNNDataType = HB_DNN_TENSOR_TYPE_U16, validShape = [1, 3, 212, 212], and stride = [301056, 100352, 448, 2].
This indicates that the valid input size of the model is 1x3x212x212, and the strides read as follows.
stride[3] = 2 indicates that each element is 16 bits (2 bytes) in size.
stride[2] = 448 = 2 * 224 indicates that the dimension at index=3 is aligned to 224 elements, so each step along the dimension at index=2 covers 448 bytes.
stride[1] = 100352 = 448 * 224 indicates that the dimension at index=2 is also aligned to 224, so each step along the dimension at index=1 covers 100352 bytes.
stride[0] = 301056 = 100352 * 3 indicates that the dimension at index=1 is aligned to 3, which matches its valid size, so each step along the dimension at index=0 covers 301056 bytes.
In subsequent usage scenarios, because of these alignment requirements, it is recommended to allocate memory according to alignedByteSize whenever alignedByteSize > 0.
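The stride arithmetic in this example can be sketched in C as follows. This is an illustration only: element_offset and copy_to_padded are hypothetical helper names, the arrays stand in for the values read from hbDNNTensorProperties, and the copy assumes a 4-D tensor whose innermost stride equals the element size (as it does in the example).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offset of the element at `index` in a buffer laid out by `stride`. */
size_t element_offset(const int64_t *stride, const int32_t *index, int num_dims) {
    size_t offset = 0;
    for (int d = 0; d < num_dims; ++d) {
        offset += (size_t)index[d] * (size_t)stride[d];
    }
    return offset;
}

/* Copy a tightly packed 4-D tensor into a padded buffer, one row at a time.
   Assumes stride[3] is the element size, so each valid row is contiguous. */
void copy_to_padded(const uint8_t *src, uint8_t *dst,
                    const int32_t valid_shape[4], const int64_t stride[4]) {
    const size_t row_bytes = (size_t)valid_shape[3] * (size_t)stride[3];
    for (int32_t n = 0; n < valid_shape[0]; ++n) {
        for (int32_t c = 0; c < valid_shape[1]; ++c) {
            for (int32_t h = 0; h < valid_shape[2]; ++h) {
                const int32_t index[4] = {n, c, h, 0};
                /* Destination rows are spaced by the aligned strides. */
                memcpy(dst + element_offset(stride, index, 4), src, row_bytes);
                src += row_bytes; /* source rows are back to back */
            }
        }
    }
}
```

For the strides above, the element at index [0, 1, 2, 3] sits at byte offset 1 * 100352 + 2 * 448 + 3 * 2 = 101254. Each valid row holds 212 * 2 = 424 bytes, and the remaining 24 bytes of every 448-byte row are padding.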
You can use the following condition to determine whether the model input needs to be aligned. If the condition does not hold, you need to perform additional alignment operations on the input data.
stride[i] == stride[i+1] * validShape[i+1], for i = 0, 1, ..., n-2, where n = validShape.numDimensions.
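A minimal sketch of such a check, assuming the intended condition is that each stride equals the next stride multiplied by the extent of the next dimension (that is, the layout contains no padding). The function name layout_is_packed and the plain arrays are hypothetical stand-ins for the fields read from hbDNNTensorProperties. NV12 and Y-type inputs, whose stride is -1, are outside the scope of this check.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when no dimension is padded: every stride equals the next stride
   times the next dimension's valid extent. When this returns false, the
   input data needs additional alignment before use. */
bool layout_is_packed(const int32_t *valid_shape, const int64_t *stride,
                      int num_dims) {
    for (int i = 0; i + 1 < num_dims; ++i) {
        if (stride[i] != stride[i + 1] * (int64_t)valid_shape[i + 1]) {
            return false;
        }
    }
    return true;
}
```

For validShape = [1, 3, 212, 212] and stride = [301056, 100352, 448, 2], the check fails at i = 2, because 448 != 2 * 212, so the input must be aligned.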
The BPU also restricts the start address of the model input and output memory: the start address must be aligned to 32 or 64 bytes, depending on the computing platform.
The start address of memory allocated by the hbUCPMalloc and hbUCPMallocCached interfaces is aligned to 64 by default.
If you allocate memory and pass an address at an offset into it as the model input, check that the address after the offset still meets the alignment requirement.
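The offset check can be sketched with two small helpers. The names is_aligned and align_up are hypothetical and not part of the UCP API, and the alignment is assumed to be a power of two such as 32 or 64.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when `addr` is a multiple of `alignment` bytes. */
bool is_aligned(const void *addr, size_t alignment) {
    return ((uintptr_t)addr % alignment) == 0;
}

/* Round `offset` up to the next multiple of `alignment`.
   Assumes `alignment` is a power of two. */
size_t align_up(size_t offset, size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

For example, if the base address is 64-aligned and the second input would start at byte offset 100, rounding the offset up with align_up(100, 64) gives 128, which keeps the resulting address 64-aligned.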