S100 Torch Operator BPU Constraint List

Note

The following alias substitution is assumed below:

import horizon_plugin_pytorch as horizon

In the table below:

lhs: left-hand side, the left operand of an operation.

rhs: right-hand side, the right operand of an operation.

Torch Operator / Eager Mode Operator / Constraint
torch.abs
torch.Tensor.abs
input:
Type: int8, int16, float16
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.acos
torch.Tensor.acos
horizon.nn.Acos
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.acosh
torch.Tensor.acosh
horizon.nn.Acosh
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.add
torch.Tensor.add
torch.nn.quantized.FloatFunctional OR
horizon.nn.quantized.FloatFunctional
lhs:
Type: int8, int16, int32, float16, float32
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
others:
When inputs are of integer or quantized type, if the result overflows, it will saturate to the maximum or minimum value of the result data type.
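The saturation behavior described above can be modeled in plain Python. This is a minimal sketch of saturating integer addition under an assumed two's-complement range, not the plugin's implementation:

```python
def saturating_add(a: int, b: int, bits: int = 8) -> int:
    """Model of BPU integer add: on overflow, clamp the result to the
    minimum/maximum representable value of the (bits)-wide signed type."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, a + b))

print(saturating_add(100, 100))               # int8: 200 overflows, saturates to 127
print(saturating_add(30000, 10000, bits=16))  # int16: saturates to 32767
```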
torch.all
torch.Tensor.all
input:
Type: bool8, int8, int16
Shape: [*]
Dim: reduce axis dim size ∈ [1, 65535]
Element : reduce Elements size ∈ [1, 65535]
output:
Type: bool8
Shape: reduce dim will be 1 or fused, depending on keepDim
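The output-shape rule used by the reduction ops ("reduce dim will be 1 or fused depending on keepDim") can be illustrated with a small helper; the helper name is ours, for illustration only:

```python
def reduce_shape(shape, dims, keep_dim):
    """Output shape of a reduction: reduced dims become 1 when keep_dim is
    True, or are removed from the shape when keep_dim is False."""
    dims = {d % len(shape) for d in dims}  # normalize negative dims
    if keep_dim:
        return tuple(1 if i in dims else s for i, s in enumerate(shape))
    return tuple(s for i, s in enumerate(shape) if i not in dims)

print(reduce_shape((2, 3, 4), {1}, True))   # (2, 1, 4)
print(reduce_shape((2, 3, 4), {1}, False))  # (2, 4)
```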
torch.any
torch.Tensor.any
input:
Type: bool8, int8, int16
Shape: [*]
Dim: reduce axis dim size ∈ [1, 65535]
Element : reduce Elements size ∈ [1, 65535]
output:
Type: bool8
Shape: reduce dim will be 1 or fused, depending on keepDim
torch.argmax
torch.Tensor.argmax
input:
Type: int8, int16, int32, float16, float32
Shape: [*]
Dim: if type is int8, int16 or float16, reduce axis dim size ∈ [1, 32767]
Element : if type is int8 or int16, reduce Elements size ∈ [1, 65535]
dims:
If type is int32, float16, float32, only support one dim.
output:
Type: int8, int16, int32
Shape: reduce dim will be 1 or fused, depending on keepDims
torch.argmin
torch.Tensor.argmin
input:
Type: int8, int16, int32, float16, float32
Shape: [*]
Dim: if type is int8, int16 or float16, reduce axis dim size ∈ [1, 32767]
Element : if type is int8 or int16, reduce Elements size ∈ [1, 65535]
dims:
If type is int32, float16, float32, only support one dim.
output:
Type: int8, int16, int32
Shape: reduce dim will be 1 or fused, depending on keepDims
torch.asin
torch.Tensor.asin
horizon.nn.Asin
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.asinh
torch.Tensor.asinh
horizon.nn.Asinh
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.atan 
torch.Tensor.atan
horizon.nn.Atan
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.atanh
torch.Tensor.atanh
horizon.nn.Atanh
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.bitwise_and
lhs:
Type: int8, int16
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.bitwise_not
input:
Type: int8, int16
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.bitwise_or
lhs:
Type: int8, int16
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.bitwise_xor
lhs:
Type: int8, int16
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.ceil
torch.Tensor.ceil
horizon.nn.Ceil
input:
Type: int8, int16, float16
output:
Type: int8, int16, float16. only support combinations (int8→int8), (int16→int8,int16), (float16→float16)
torch.clamp
torch.clip
torch.Tensor.clamp
torch.Tensor.clip
if min/max is a scalar:
input:
Type: int8, int16, float16
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
if min/max is a Tensor:
lhs:
Type: int8, int16, float16, float32
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.cat
torch.concat
torch.concatenate
torch.nn.quantized.FloatFunctional OR
horizon.nn.quantized.FloatFunctional
input:
Type: No limits
Arg Number: input number ∈ [1, 1024]
Dim: all dims < 131072
Size: size < 2G
output:
The input and output types need to be the same.
Dim: all dims < 131072
Size: size < 2G
torch.cosh
torch.Tensor.cosh
horizon.nn.Cosh
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.cumsum
torch.Tensor.cumsum
horizon.nn.CumSum
input:
Type: int8, int16, float16, float32
Shape: [*, dim[axis], *]
Dim: * ∈ [1, 65536]; dim[axis] ∈ [1, 8192]
output:
Type: int8, int16, int32, float16, float32. only support combinations (int8→int8,int16,int32), (int16→int16,int32), (float16→float16), (float32→float32)
The shape and dim are the same as the input.
exclusive:
If type is float16, float32, only support exclusive == 0; otherwise, exclusive is 0 or 1
reverse:
If type is float16, float32, only support reverse == 0; otherwise, reverse is 0 or 1
torch.div
torch.Tensor.div
lhs:
Type: quantized types support int8, int16; others support int16, int32, float16, float32
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
For quantized types, support (int8,int16→int8,int16); otherwise the input and output types need to be the same.
Shape: [*]
rounding_mode:
For integer types, only support TRUNC; for float types, only support NONE.
others:
When inputs are of integer or quantized type, if the result overflows, it will saturate to the maximum or minimum value of the result data type.
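TRUNC rounding for integer division rounds toward zero, which differs from Python's floor division for negative results. A minimal sketch of the semantics (illustrative helper, not the toolchain's code):

```python
def bpu_int_div(a: int, b: int) -> int:
    """Integer division with TRUNC rounding mode (round toward zero).
    Python's // floors instead, so adjust when signs differ."""
    q = a // b
    if a % b != 0 and (a < 0) != (b < 0):
        q += 1  # floor rounded away from zero; step back toward zero
    return q

print(bpu_int_div(-7, 2))  # -3 (TRUNC), whereas -7 // 2 == -4 (floor)
print(bpu_int_div(7, 2))   # 3
```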
torch.eq
torch.Tensor.eq
 lhs:
Type: int8, int16, int32, float16, float32, bool8
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: bool8
Shape: [*]
torch.gather
torch.Tensor.gather
 input:
Type: int8, int16, int32, float16, float32
Shape: [*]
Input will transpose to [N, W, C]. W is inputShape[dim], N is the product of inputShape[:dim], C is the product of inputShape[dim+1:].
N, C ∈ [1, 1048576]. N × C should not be larger than 1048576
W ∈ [1, 4096]. If input type is int8, int16, W ∈ [1, 32768].
indices:
Type: int8, int16, int32; Unsupported negative indices.
Shape: [*]. Indices values should not be larger than 32768
Indices will transpose to [N, D, C]. D is indicesShape[dim], N is the product of indicesShape[:dim], C is the product of indicesShape[dim+1:].
N, C ∈ [1, 1048576], D ∈ [1, 737280(720*1024)].
IndicesShape[i] <= inputShape[i] for all dimensions i != dim.
output:
The input and output types need to be the same.
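The [N, W, C] factorization used in the gather constraints can be computed directly from the input shape and dim. A sketch with hypothetical helper names (the bounds follow the table above):

```python
import math

def gather_layout(shape, dim):
    """Factor a shape into (N, W, C) around `dim`, as the constraints describe:
    W = shape[dim], N = prod(shape[:dim]), C = prod(shape[dim+1:])."""
    n = math.prod(shape[:dim])
    w = shape[dim]
    c = math.prod(shape[dim + 1:])
    return n, w, c

def gather_input_ok(shape, dim, dtype="int8"):
    """Check N*C <= 1048576 and the W bound (32768 for int8/int16, else 4096)."""
    n, w, c = gather_layout(shape, dim)
    w_max = 32768 if dtype in ("int8", "int16") else 4096
    return 1 <= n * c <= 1048576 and 1 <= w <= w_max

print(gather_layout((2, 3, 4, 5), 1))  # (2, 3, 20)
```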
torch.gt
torch.greater
torch.Tensor.gt
torch.Tensor.greater
 lhs:
Type: int8, int16, int32, float16, float32
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: bool8
Shape: [*]
torch.ge
torch.greater_equal
torch.Tensor.ge
torch.Tensor.greater_equal
 lhs:
Type: int8, int16, int32, float16, float32
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: bool8
Shape: [*]
torch.lt
torch.less
torch.Tensor.lt
torch.Tensor.less
 lhs:
Type: int8, int16, int32, float16, float32
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: bool8
Shape: [*]
torch.le
torch.less_equal
torch.Tensor.le
torch.Tensor.less_equal
 lhs:
Type: int8, int16, int32, float16, float32
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: bool8
Shape: [*]
torch.erf
torch.Tensor.erf
horizon.nn.Erf
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.exp
torch.Tensor.exp
horizon.nn.Exp
input:
Type: int8, int16
output:
Type: int8, int16. only support combinations (int8→int8), (int16→int8,int16)
torch.Tensor.expand
input:
Type: No limits
Arg Number: input number ∈ [1, 1024]
Dim: all dims < 131072
Size: size < 2G
output:
The input and output types need to be the same.
Dim: all dims < 131072
Size: size < 2G
torch.flatten
torch.Tensor.flatten
torch.nn.Flatten
input:
Type: No limits
output:
The input and output types need to be the same.
torch.flip
torch.Tensor.flip
input:
Type: int8, int16, int32
output:
The input and output types need to be the same.
torch.floor
torch.Tensor.floor
horizon.nn.Floor
input:
Type: int8, int16, float16
output:
Type: int8, int16, float16. only support combinations (int8→int8), (int16→int8,int16), (float16→float16)
torch.fmod
torch.remainder
horizon.nn.FMod
horizon.nn.Remainder
lhs:
Type: int16, int32. Quantized types are not supported.
Shape: [*]
Value range: must be non-negative.
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
Value range: must be positive.
output:
The input and output types need to be the same.
Shape: [*]
torch.index_select
torch.Tensor.index_select
torch.unbind
torch.Tensor.unbind
input:
Type: int8, int16, int32, float16, float32
Shape: [*]
Input will transpose to [N, W, C]. W is inputShape[dim], N is the product of inputShape[:dim], C is the product of inputShape[dim+1:].
N, C ∈ [1, 1048576], W ∈ [1, 4096]. If input type is int8, int16, W ∈ [1, 32768].
index:
Type: int8, int16, int32; Unsupported negative indices.
Shape: [*]. Index values should not be larger than 32768, and the product of all index dims should be in range [1, 737280 (720×1024)], because all index dims will be reduced to the W dim of the indices and output. If the output W is larger than 737280, this op will be split into too many sub-ops.
output:
The input and output types need to be the same.
torch.log
torch.Tensor.log
horizon.nn.HardLog
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.logical_and
torch.Tensor.logical_and
lhs:
Type: int8, int16, bool8, float16
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: bool8
Shape: [*]
torch.logical_not
torch.Tensor.logical_not
input:
Type: int8, int16, bool8, float16
Shape: [*]
output:
Type: bool8
Shape: [*]
torch.logical_or
torch.Tensor.logical_or
lhs:
Type: int8, int16, bool8, float16
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: bool8
Shape: [*]
torch.logical_xor
torch.Tensor.logical_xor
lhs:
Type: int8, int16, bool8, float16
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: bool8
Shape: [*]
torch.Tensor.masked_fill
input:
Type: int8, int16
output:
The output and input types need to be the same.
torch.matmul
torch.Tensor.matmul
horizon.nn.quantized.FloatFunctional
lhs:
Type: int8, int16
Shape: [*,M,C]
Dim: * ∈ [1, 4096], M,C ∈ [1, 8192]
rhs:
Type: int8, int16
Shape: [*,C,N]
Dim: * ∈ [1, 4096]; C ∈ [1, 8192], N ∈  [1, 1048576]
output:
Type: int8, int16, int32
Shape: [*,M,N]
Other constraints: Same as lhs and rhs
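The matmul shape rule [*,M,C] × [*,C,N] → [*,M,N] can be checked with a small helper (a sketch; the dim bounds are as in the table above):

```python
def matmul_out_shape(lhs, rhs):
    """[*, M, C] x [*, C, N] -> [*, M, N]; batch dims (*) must match."""
    if lhs[:-2] != rhs[:-2]:
        raise ValueError("batch dims mismatch")
    if lhs[-1] != rhs[-2]:
        raise ValueError("inner dims mismatch")
    return lhs[:-2] + (lhs[-2], rhs[-1])

print(matmul_out_shape((2, 8, 16), (2, 16, 4)))  # (2, 8, 4)
```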
torch.max
torch.Tensor.max
torch.min
torch.Tensor.min
 input:
Type: int8, int16, int32, float16, float32
Shape: [*]
Dim: if type is int8, int16 or float16, reduce axis dim size ∈ [1, 32767]
Element : if type is int8 or int16, reduce Elements size ∈ [1, 65535]
dims:
If type is int32, float16, float32, only support one dim
output:
Value Type: The input and output types need to be the same.
Index Type: int8, int16, int32
Shape: reduce dim will be 1 or fused, depending on keepDim
torch.maximum
torch.Tensor.maximum
horizon.nn.quantized.FloatFunctional
lhs:
Type: int8, int16, float16, float32
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.mean
torch.Tensor.mean
horizon.nn.quantized.FloatFunctional
input:
Type: int8, int16, float16, float32
Shape: [*]
Dim: reduce axis dim size ∈ [1, 65535]
Element : reduce Elements size ∈ [1, 65535]
output:
Type: int8, int16, float16, float32. only support combinations (int8→int8,int16), (int16→int16), (float16→float16), (float32→float32)
Shape: reduce dim will be 1 or fused, depending on keepDim
dims:
If type is float16 or float32, only support one dim.
torch.minimum
torch.Tensor.minimum
horizon.nn.quantized.FloatFunctional
lhs:
Type: int8, int16, float16, float32
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.mul
torch.Tensor.mul
torch.nn.quantized.FloatFunctional or
horizon.nn.quantized.FloatFunctional
lhs:
Type: int8, int16, float16, float32
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: int8, int16, int32, float16, float32. only support (int8,int16→int8,int16,int32), (float16→float16), (float32→float32).
Shape: [*]
others:
When inputs are of integer or quantized type, if the result overflows, it will saturate to the maximum or minimum value of the result data type.
torch.neg
torch.negative
torch.Tensor.neg
torch.Tensor.negative
input:
Type: int8, int16, int32, float16
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.ne
torch.not_equal
torch.Tensor.ne
torch.Tensor.not_equal
 lhs:
Type: int8, int16, int32, float16, float32, bool8
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: bool8
Shape: [*]
torch.permute
torch.Tensor.permute
 input:
Type: No limits
output:
The input and output types need to be the same.
torch.pow
torch.Tensor.pow
horizon.nn.Pow
if exponent is scalar 2:
input:
Type: int8, int16, float16, float32
Shape: [*]
output:
Type: int8, int16, int32, float16, float32. only support (int8,int16→int8,int16,int32), (float16→float16), (float32→float32).
Shape: [*]
others:
When inputs are of integer or quantized type, if the result overflows, it will saturate to the maximum or minimum value of the result data type.
if exponent is scalar and not 2:
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.reciprocal
torch.Tensor.reciprocal
horizon.nn.Reciprocal
input:
Type: int8, int16, float16, float32
output:
Type: int8, int16, float16, float32. only support combinations (int8→int8), (int16→int8,int16), (float16→float16), (float32→float32)
torch.Tensor.repeat
input:
Type: No limits
output:
The input and output types need to be the same.
torch.repeat_interleave
torch.Tensor.repeat_interleave
input:
Type: No limits
Dim: all dims < 2097152
output:
The input and output types need to be the same.
Dim: all dims < 131072
Size: size < 2G
torch.reshape
torch.Tensor.reshape
torch.Tensor.view
input:
Type: No limits
output:
The input and output types need to be the same.
torch.roll
torch.Tensor.roll
input:
Type: No limits
output:
The input and output types need to be the same.
torch.round
torch.Tensor.round
input:
Type: int8, int16, float16, float32
output:
The output and input types need to be the same.
torch.rsqrt
torch.Tensor.rsqrt
horizon.nn.Rsqrt
input:
Type: int8, int16, float16, float32
output:
Type: int8, int16, float16, float32. only support combinations (int8→int8), (int16→int8,int16), (float16→float16), (float32→float32)
torch.sign
torch.Tensor.sign
input:
Type: float16
Shape: [*]
output:
The shape and type are the same as the input.
torch.sinh
torch.Tensor.sinh
horizon.nn.Sinh
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.slice_scatter
horizon.nn.SliceScatter
input:
Type: No limits
Dim: all dims < 2097152
output:
The input and output types need to be the same.
The input and output constraints are the same.
torch.split
torch.Tensor.split
 input:
Type: No limits
Dim: all dims < 2097152
output:
The input and output types need to be the same.
The input and output constraints are the same.
torch.sqrt
horizon.nn.Sqrt
input:
Type: int8, int16, float16, float32
output:
Type: int8, int16, float16, float32. only support combinations (int8→int8), (int16→int8,int16), (float16→float16), (float32→float32)
torch.squeeze
torch.Tensor.squeeze
 input:
Type: No limits
output:
The input and output types need to be the same.
torch.stack
horizon.nn.quantized.FloatFunctional
input:
Type: No limits
Arg Number: input number ∈ [1, 1024]
Dim: all dims < 131072
Size: size < 2G
output:
The input and output types need to be the same.
Dim: all dims < 131072
Size: size < 2G
torch.sub
torch.Tensor.sub
horizon.nn.quantized.FloatFunctional
lhs:
Type: int8, int16, float16
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
others:
When inputs are of integer or quantized type, if the result overflows, it will saturate to the maximum or minimum value of the result data type.
torch.sum
torch.Tensor.sum
horizon.nn.quantized.FloatFunctional
input:
Type: int8, int16, float16, float32
Shape: [*]
Dim: reduce axis dim size ∈ [1, 65535]
Element : reduce Elements size ∈ [1, 65535]
output:
Type: int8, int16, float16, float32. only support combinations (int8→int8,int16), (int16→int16), (float16→float16), (float32→float32)
Shape: reduce dim will be 1 or fused, depending on keepDim
dims:
If type is float16 or float32, only support one dim.
torch.tan
horizon.nn.Tan
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.tile
torch.Tensor.tile
input:
Type: No limits
output:
The input and output types need to be the same.
torch.Tensor.to
input:
Type: int8, int16, int32, float16, float32, bool8
Shape: [*]
output:
Type: int8, int16, int32, float16, float32, bool8. only support combinations (int8→int8,int16,int32,float16,float32,bool8), (int16→int8,int16,int32,float16,float32,bool8), (int32→int16,int32,float16,float32), (float16→int8,int16,int32,float16, float32), (float32→int8,int16,int32,float16,float32), (bool8→int8,int16,int32,float16,float32,bool8)
Shape: [*]
torch.Tensor.float
input:
Type: int8, int16, int32, float16, float32
Shape: [*]
output:
Type: float32
Shape: [*]
torch.topk
torch.Tensor.topk
horizon.functional.stable_topk
 input:
Type: int8, int16, int32, float16, float32
output:
The output value type is the same as the input, and the indices are an integer type.
k:
K <= 1024
others:
The combined size in bytes of the op's operands and outputs must be no more than 4.8MB.
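The 4.8 MB operand-plus-output budget can be estimated from the shapes. This sketch assumes topk along the last dim and a 4-byte index type; both byte widths are our assumptions for illustration:

```python
import math

def topk_fits_bpu(shape, k, value_bytes, index_bytes=4):
    """Rough check of the topk constraints: k <= 1024 and the combined
    size of the input, output values, and output indices <= 4.8 MB."""
    if not 1 <= k <= 1024:
        return False
    in_elems = math.prod(shape)
    out_elems = math.prod(shape[:-1]) * k  # assumes reduction along last dim
    total = in_elems * value_bytes + out_elems * (value_bytes + index_bytes)
    return total <= int(4.8 * 1024 * 1024)

print(topk_fits_bpu((1, 1000, 64), 32, 1))    # small int8 tensor: fits
print(topk_fits_bpu((1, 1000, 64), 2000, 1))  # rejected: k > 1024
```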
torch.transpose
torch.Tensor.transpose
 input:
Type: No limits
output:
The input and output types need to be the same.
torch.tril
torch.triu
input:
Type: int8, int16
output:
The output and input types need to be the same.
torch.unsqueeze
torch.Tensor.unsqueeze
 input:
Type: No limits
output:
The input and output types need to be the same.
torch.where
horizon.nn.Where
input num must be 3.
condition:
Type: bool8
Shape: [*]
lhs:
Type: int8, int16
Shape: [*]
rhs:
The lhs and rhs types need to be the same.
Shape: [*]
output:
Type: float16, float32
Shape: [*]
input:
Type: int8,int16,int32
Shape: [*]
torch.zeros_like
torch.ones_like
No limits
torch.linalg.norm
horizon.nn.LinalgNorm
only support ord in (1.0, 2.0, "fro")
input:
Type: int8, int16
Dim: normalized dim size ∈ [1, 65535]
output:
Type: int8, int16
torch.nn.functional.avg_pool1d
torch.nn.AvgPool1d
torch.nn.functional.avg_pool2d
torch.nn.AvgPool2d
torch.nn.functional.adaptive_avg_pool1d
torch.nn.AdaptiveAvgPool1d
torch.nn.functional.adaptive_avg_pool2d
torch.nn.AdaptiveAvgPool2d
input:
Type: int8, int16
Shape: [*,H,W,C] or [*,L,C]
output:
The input and output types need to be the same.
kernel:
Shape: [KL] or [KH,KW], only support 1d or 2d now
Dim: 1d: KL ∈ [1, 256], KL*bitWidth/8 <= 24576; 2d: KH, KW ∈ [1, 256], KH*KW*bitWidth/8 <= 24576
stride:
Shape: [SH,SW] or [SL]
Dim: SH, SW, SL ∈ [1, 256]
pad:
Shape: [PH_BEGIN,PW_BEGIN,PH_END,PW_END] or [PL_BEGIN,PL_END]
PH_BEGIN,PW_BEGIN,PL_BEGIN,PH_END,PW_END,PL_END ∈ [-255, 256]
dilation:
Shape: 1d: [DW]; 2d: [DH, DW]
Dim: 1d: DW ∈ {1}; 2d: DH, DW ∈ {1}
torch.nn.functional.affine_grid
lhs:
Type: int8, int16
Shape: [*,M,C]
Dim: * ∈ [1, 4096], M,C ∈ [1, 8192]
rhs:
Type: int8, int16
Shape: [*,C,N]
Dim: * ∈ [1, 4096]; C ∈ [1, 8192], N ∈  [1, 1048576]
output:
Type: int8, int16, int32
Shape: [*,M,N]
Other constraints: Same as lhs and rhs
torch.nn.functional.dropout
torch.nn.Dropout
torch.nn.functional.dropout1d
torch.nn.Dropout1d
torch.nn.functional.dropout2d
torch.nn.Dropout2d
torch.nn.functional.dropout3d
torch.nn.Dropout3d
torch.nn.Dropout
torch.nn.Dropout1d
torch.nn.Dropout2d
torch.nn.Dropout3d
N/A, collapsed in graph optimization phase
torch.nn.functional.elu
torch.nn.ELU
torch.nn.ELU
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.nn.functional.embedding
torch.nn.Embedding
input:
Type: int8, int16, int32, float16, float32
Shape: [*]
If gather1d, W = inputShape[batchDim], W ∈ [1, 4096]. If input type is int8, int16 W ∈ [1, 32768]
If gather2d, H = inputShape[batchDim], W = inputShape[batchDim+1], H, W ∈ [1, 4096]. If input type is int8, int16, H, W ∈ [1, 32768]. H and W cannot both be greater than 4096 at the same time.
B is product of inputShape[0: batchDim], B ∈ [1, 1048576].
C is product of inputShape[batchDim+D:], C ∈ [1, 1048576].
indices:
Type: int8, int16, int32; Unsupported negative indices.
Shape: [*, D] indices value should not be larger than 32768. D ∈ [1, 2].
Size: I is product of indicesShape[batchDim: batchDim+D], I ∈ [1, 737280].
output:
Shape: [*]
The input and output types need to be the same.
batchDim:
The number of batch dimensions. The gather of indexing starts from dimension of input[batchDim:]
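The B / gather-dim / C split described by batchDim can be computed from the input shape. A sketch with hypothetical names (D is the number of gathered dims: 1 for gather1d, 2 for gather2d):

```python
import math

def embedding_layout(shape, batch_dim, d=1):
    """Split an input shape per the batchDim description:
    B = prod(shape[:batch_dim]), gather dims = shape[batch_dim:batch_dim+d],
    C = prod(shape[batch_dim+d:])."""
    b = math.prod(shape[:batch_dim])
    gather_dims = tuple(shape[batch_dim:batch_dim + d])
    c = math.prod(shape[batch_dim + d:])
    return b, gather_dims, c

print(embedding_layout((4, 100, 8), 1))        # gather1d: B=4, W=(100,), C=8
print(embedding_layout((4, 64, 64, 8), 1, 2))  # gather2d: B=4, (H, W)=(64, 64), C=8
```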
torch.nn.functional.gelu
torch.nn.GELU
torch.nn.GELU
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.nn.functional.glu
torch.nn.GLU
torch.nn.GLU
input:
Type: int8, int16
output:
Type: int8, int16
torch.nn.functional.grid_sample
input:
Type: int8
Shape: [*,H,W,C]
Dim: H ∈ [1, 32768], W ∈ [1, 32768], other dims ∈ [1, 65536].
NOTE: H and W cannot both be greater than 4096 at the same time.
grid:
Type: int16
Shape: [*,H,W,2]
output:
Same as input except Dim constraints
mode:
Only support bilinear and nearest
padding_mode:
Only support zeros and border
torch.nn.functional.hardsigmoid
torch.nn.HardSigmoid
torch.nn.HardSigmoid
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.nn.functional.interpolate
torch.nn.Upsample
torch.nn.UpsamplingNearest2d
torch.nn.UpsamplingBilinear2d
 input:
--case 1: int8--
Type: int8
Shape: [*,H,W,C]
--case 2: float16, int16--
Type: float16, int16(with per tensor quant info)
Shape: [*,H,W,C]
Dim: H ∈ [1, 32768], W ∈ [1, 32768], C ∈ [1, 65536], other dims ∈ [1, 16384]. H and W cannot both be greater than 4096 at the same time. KN*KH*KW*KC*bitWidth/8 <= 786432
output:
The input and output types need to be the same.
mode:
If Type is int8, support nearest and bilinear
If Type is float16 or int16, only support bilinear
padValue:
If Type is int8, when padvalue is not equal to 0 and quantized, input only supports per tensor quantization
step:
If Type is int8, the integer part of step ∈ [-256, 255]
If Type is float16 or int16, only support steps less than or equal to 1
expansionMode:
If Type is int8, support border and constant
If Type is float16 or int16, only support border
torch.nn.functional.leaky_relu
torch.nn.LeakyReLU
torch.nn.LeakyReLU
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.nn.functional.log_softmax
torch.nn.LogSoftmax
torch.nn.LogSoftmax
input:
Type: int8, int16
Shape: [*]
Dim: reduce axis dim size ∈ [1, 65535]
Element : reduce Elements size ∈ [1, 65535]
output:
Type: int8, int16
Shape: [*]
torch.nn.functional.mish
torch.nn.Mish
torch.nn.Mish
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.nn.functional.normalize
horizon.nn.Normalize
only support p in (1.0, 2.0)
input:
Type: int8, int16
Dim: normalized dim size ∈ [1, 65535]
output:
Type: int8, int16
torch.nn.functional.pad
torch.nn.ConstantPad1d
torch.nn.ConstantPad2d
torch.nn.ConstantPad3d
torch.nn.ReplicationPad1d
torch.nn.ReplicationPad2d
torch.nn.ReplicationPad3d
torch.nn.ZeroPad2d
 input:
Type: int8, int16, float16, float32
Dim: all dims < 737280 when expansionMode is not 'constant' else no constraints
output:
The input and output constraints are the same.
begin/end:
Value should be in range [1, 1024]
torch.nn.functional.pixel_shuffle
torch.nn.PixelShuffle
 input:
dim ∈ [3, 7]
Type: No limits
output:
The output and input types need to be the same.
torch.nn.functional.pixel_unshuffle
torch.nn.PixelUnshuffle
 input:
Type: No limits.
output:
The output and input types need to be the same.
torch.nn.PReLU
torch.nn.PReLU
input:
Type: int8, int16
output:
The output and input types need to be the same.
torch.nn.functional.relu
torch.nn.ReLU
torch.nn.ReLU
input:
Type: int8, int16, int32, float16. If type is int32, this op must be fusible into a Conv op.
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.nn.functional.relu6
torch.nn.ReLU6
torch.nn.ReLU6
input:
Type: int8, int16, float16
Shape: [*]
output:
The input and output types need to be the same.
Shape: [*]
torch.nn.functional.silu
torch.nn.SiLU
torch.nn.SiLU
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.nn.functional.softmax
torch.nn.Softmax
torch.softmax
torch.Tensor.softmax
torch.nn.Softmax
input:
Type: int8, int16
Shape: [*]
Dim: reduce axis dim size ∈ [1, 65535]
Element : reduce Elements size ∈ [1, 65535]
output:
Type: int8, int16
Shape: [*]
torch.nn.functional.softplus
torch.nn.Softplus
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.nn.BatchNorm1d
torch.nn.BatchNorm2d
torch.nn.BatchNorm3d
 input:
Type: int8, int16
Shape: [*,H,W,C]
mean:
Type: float32
Shape: [C]
var:
Type: float32
Shape: [C]
weight:
Type: float32
Shape: [C]
bias:
Type: float32
Shape: [C]
output:
The input and output types need to be the same.
torch.nn.Conv1d
torch.nn.Conv2d
torch.nn.Conv3d
input:
--conv 1d--
Type: int8, int16
Shape: [*,L,C]
Dim: * ∈ [1, 4096]; L,C ∈ [1, 65536]
--conv 2d--
Type: int8, int16
Shape: [*,H,W,C]
Dim: * ∈ [1, 4096]; H,W,C ∈ [1, 65536]
--conv 3d--
Type: int8
Shape: [*,D,H,W,C]
Dim: * ∈ [1, 128]; D,H,W ∈ [1, 65536]; C ∈ [1, 4096];
weight:
--conv 1d--
Type: int8, int16
Shape: [N,KL,C]
Dim: C ∈ [1, 8192]; KL ∈ [1, 31]; N ∈ [1, 65536] if fout is the last layer of conv else [1, 8192]
Size: KL × C ∈ [1, 65535]
--conv 2d--
Type: int8, int16
Shape: [N,KH,KW,C]
Dim: C ∈ [1, 8192]; KH,KW ∈ [1, 31]; N ∈ [1, 65536] if fout is the last layer of conv else [1, 8192]
Size: KH × KW × C ∈ [1, 65535]
--conv 3d--
Type: int8
Shape: [N,KD,KH,KW,C]
N ∈ [1, 65535]; KD,KH,KW ∈ [1, 9]; Dim: C ∈ [1, 4096];
Size: KD × KH × KW × C ∈ [1, 131072]
bias:
Type: float32
output:
--conv 1d--
Type: int8, int16, int32
Shape: [*,L,C]
Dim: * ∈ [1, 4096]; L,C ∈ [1, 65536]
--conv 2d--
Type: int8, int16, int32
Shape: [*,H,W,C]
Dim: * ∈ [1, 4096]; H,W,C ∈ [1, 65536]
--conv 3d--
Type: int8, int16, int32
Shape: [*,D,H,W,C]
Dim: * ∈ [1, 128]; D,H,W ∈ [1, 65536]; C ∈ [1, 4096];
stride:
--conv 1d--
Shape: [SL]
Dim: SL ∈ [1, 256]; SL ∈ {1} if dilation > 1
--conv 2d--
Shape: [SH,SW]
Dim: SH,SW ∈ [1, 256]; SH,SW ∈ {1} if dilation > 1
--conv 3d--
Shape: [SD,SH,SW]
Dim: SD,SH,SW must be 1 or 2 and equal to each other.
pad:
--conv 1d--
Shape: [P_left,P_right]
Dim: P_left,P_right ∈ (-L, 256]
--conv 2d--
Shape: [P_top,P_left,P_bottom,P_right]
Dim: P_top,P_bottom ∈ (-H, 256], P_left,P_right ∈ (-W, 256]
--conv 3d--
Shape: [P_front, P_top, P_left, P_back, P_bottom, P_right]
Dim: P_front,P_back ∈ [0, KD/2], P_top,P_bottom ∈ [0, KH/2], P_left,P_right ∈ [0, KW/2]
groupNum:
Fin.c must be divisible by the group number; conv 3d only supports 1
dilation:
--conv 1d--
Shape: [DL]
Dim: DL ∈ [1, 18]
--conv 2d--
Shape: [DH,DW]
Dim: DH,DW ∈ [1, 18]
--conv 3d--
Shape: [DD,DH,DW]
DD,DH,DW = 1
others:
--conv 1d--
Stride only supports odd numbers and 2 when the conv is an int16 depthwise conv
If groupNum > 1, for each group, fin.c' ∈ [1, 65535], KL × fin.c' ∈ [1, 65535]
--conv 2d--
Stride only supports odd numbers and 2 when the conv is an int16 depthwise conv
If groupNum > 1, for each group, fin.c' ∈ [1, 65535], KH × KW × fin.c' ∈ [1, 65535]
Fin.c' = fin.c × min(lcm(fout.c × (lcm(fin.c, 4) / fin.c), 8) / fout.c, groupNum)
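The Fin.c' expression above is built from least common multiples; translated to Python (math.lcm, Python ≥ 3.9) it reads as below. Variable names are ours; every division in the formula is exact, so integer arithmetic is safe:

```python
import math

def effective_fin_c(fin_c, fout_c, group_num):
    """Fin.c' = fin.c * min(lcm(fout.c * (lcm(fin.c, 4) / fin.c), 8) / fout.c, groupNum)."""
    inner = fout_c * (math.lcm(fin_c, 4) // fin_c)
    return fin_c * min(math.lcm(inner, 8) // fout_c, group_num)

print(effective_fin_c(16, 8, 4))  # 16
print(effective_fin_c(2, 3, 6))   # 12
```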
torch.nn.ConvTranspose1d
torch.nn.ConvTranspose2d
torch.nn.ConvTranspose3d
input:
--conv 1d/2d--
Type: int8, int16; input and weight cannot both be int16
1d_Shape: [*,W,C]
1d_Dim: * ∈ [1, 128]; W ∈ [1, 65536]; C ∈ [1, 2048]
2d_Shape: [*,H,W,C]
2d_Dim: * ∈ [1, 128]; H,W ∈ [1, 65536]; C ∈ [1, 2048]
--conv 3d--
Type: int8
3d_Shape: [*,D,H,W,C]
3d_Dim: * ∈ [1, 128]; D,H,W ∈ [1, 65536]; C ∈ [1, 2048]
weight:
--conv 1d/2d--
Type: int8, int16; input and weight cannot both be int16
1d_Shape: [N,KW,C]
1d_Dim: N,C ∈ [1, 2048]; KW ∈ [1, 14]
1d_Size: KW × C ∈ [1, 65535]
2d_Shape: [N,KH,KW,C]
2d_Dim: N,C ∈ [1, 2048]; KH,KW ∈ [1, 14]; KH,KW cannot both be 1
2d_Size: KH × KW × C ∈ [1, 65535]
--conv 3d--
Type: int8
3d_Shape: [N,KD,KH,KW,C]
3d_Dim: N,C ∈ [1, 2048]; KD,KH,KW ∈ [1, 14]; KD,KH,KW cannot all be 1
3d_Size: KD × KH × KW × C ∈ [1, 65535]
bias:
Type: float32
output:
Type: int8, int16, int32
stride:
1d_Shape: [SW]
1d_Dim: SW ∈ [1, 14];
2d_Shape: [SH,SW]
2d_Dim: SH,SW ∈ [1, 14];
3d_Shape: [SD,SH,SW]
3d_Dim: SD,SH,SW ∈ [1, 14];
pad:
1d_Shape: [P_left,P_right]
1d_Dim: P_left,P_right ∈ [0, 256]
2d_Shape: [P_top,P_left,P_bottom,P_right]
2d_Dim: P_top,P_left,P_bottom,P_right ∈ [0, 256]
3d_Shape: [P_front,P_top,P_left,P_back,P_bottom,P_right]
3d_Dim: P_front,P_top,P_left,P_back,P_bottom,P_right ∈ [0, 256]
groupNum:
Fin.c must be divisible by the group number; conv 3d only supports 1
dilation:
1d_Shape: [DW]
1d_Dim: DW ∈ {1}
2d_Shape: [DH,DW]
2d_Dim: DH,DW ∈ {1}
3d_Shape: [DD,DH,DW]
3d_Dim: DD,DH,DW ∈ {1}
torch.nn.GRU
dropout must be 0.0
input:
Type: int8, int16
Dim: C_in ∈ [1, 65535], Seq length < 1024, other dims < 2097152
output:
Type: int8, int16
Dim: all dims < 131072
size < 2G
torch.nn.LSTM
input:
Type: int8, int16
Dim: C_in ∈ [1, 65535], Seq length < 1024, other dims < 2097152
output:
Type: int8, int16
Dim: all dims < 131072
size < 2G
torch.nn.Identity
N/A, collapsed in graph optimization phase
torch.nn.LayerNorm
torch.nn.GroupNorm
torch.nn.InstanceNorm1d
torch.nn.InstanceNorm2d
torch.nn.InstanceNorm3d
horizon.nn.LayerNorm
input:
Type: int8, int16
Dim: normalized dim size ∈ [1, 65535]
output:
Type: int8, int16
torch.nn.Linear
lhs:
Type: int8, int16
Shape: [*,C_in]
Dim: *, C_in ∈ [1, 65536]
weight:
Type: int8, int16
Shape: [C_out, C_in]
Dim: C_out ∈ [1, 1048576]; C_in ∈ [1, 8192]
bias:
Type: float32
output:
Type: int8, int16, int32
Other constraints: Same as input
torch.nn.functional.max_pool1d
torch.nn.MaxPool1d
torch.nn.functional.max_pool2d
torch.nn.MaxPool2d
torch.nn.functional.adaptive_max_pool1d
torch.nn.AdaptiveMaxPool1d
torch.nn.functional.adaptive_max_pool2d
torch.nn.AdaptiveMaxPool2d
 input:
Type: int8, int16
Shape: [*,H,W,C]
output:
The input and output types need to be the same.
kernel:
Shape: [KL] or [KH,KW], only support 1d or 2d now
Dim: 1d: KL ∈ [1, 256], KL*bitWidth/8 <= 24576; 2d: KH, KW ∈ [1, 256], KH*KW*bitWidth/8 <= 24576
stride:
Shape: [SH,SW] or [SL]
Dim: SH, SW, SL ∈ [1, 256]
pad:
Shape: [PH_BEGIN,PW_BEGIN,PH_END,PW_END] or [PL_BEGIN,PL_END]
PH_BEGIN,PW_BEGIN,PL_BEGIN,PH_END,PW_END,PL_END ∈ [-255, 256]
dilation:
Shape: 1d: [DW]; 2d: [DH, DW]
Dim: 1d: DW ∈ {1}; 2d: DH, DW ∈ {1}
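The kernel limits shared by the pooling ops (each dim in [1, 256] and prod(kernel) × bitWidth / 8 ≤ 24576) can be checked directly. An illustrative helper, not part of the toolchain:

```python
import math

def pool_kernel_ok(kernel, bit_width):
    """Check the pooling kernel limits: every dim in [1, 256] and the
    kernel footprint prod(kernel) * bitWidth / 8 <= 24576 bytes.
    kernel is [KL] for 1d or [KH, KW] for 2d."""
    if not all(1 <= k <= 256 for k in kernel):
        return False
    return math.prod(kernel) * bit_width // 8 <= 24576

print(pool_kernel_ok((156, 156), 8))   # True: 24336 bytes
print(pool_kernel_ok((157, 157), 8))   # False: 24649 bytes
print(pool_kernel_ok((110, 110), 16))  # True: 24200 bytes
```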
torch.nn.MultiheadAttention
src_len, tgt_len, head_dim ∈ [1, 8192]
embed_dim, kdim, vdim ∈ [1, 65536]
input:
Type: int8, int16
output:
Type: int8, int16
torch.nn.functional.selu
torch.nn.SELU
torch.nn.SELU
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.nn.functional.sigmoid
torch.sigmoid
torch.Tensor.sigmoid
torch.nn.Sigmoid
torch.nn.Sigmoid
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.tanh
torch.Tensor.tanh
torch.nn.Tanh
torch.nn.Tanh
input:
Type: int8, int16
output:
Type: int8, int16. Only support (int8→int8), (int16→int8,int16)
torch.nn.TransformerDecoderLayer
xxx_is_causal is not supported
input:
Type: int8, int16
output:
Type: int8, int16
torch.nn.TransformerEncoderLayer
xxx_is_causal is not supported
input:
Type: int8, int16
output:
Type: int8, int16
torch.quantization.DeQuantStub
input:
Type: int8, int16, int32
output:
Type: float16, float32
torch.quantization.QuantStub
input:
Type: float16, float32
output:
Type: int8, int16
horizon.nn.AnchorGenerator
No limits
horizon.nn.BaseGridGenerator
No limits
horizon.nn.functional.filter
input:
Type: int8, int16
Shape: [*, H, W, C]
BPU filter batch dim must be 1 when rank is 4; H/W must be in range (0, 32768)
W*C < L1M_SIZE/4 && W <= 4096 when H != 1
threshold:
Type: int8, int16
output:
The input and output types need to be the same.
others:
All ops between the filterData and the last layer of the model should be cpu ops
horizon.nn.GridSample
input:
Type: nearest mode supports int8, int16, int32, float16, float32, and pad must be 0 when bit width > 8; the other modes only support int8
Shape: [*,H,W,C]
Dim: H ∈ [1, 32768], W ∈ [1, 32768], other dims ∈ [1, 65536].
NOTE: H and W cannot both be greater than 4096 at the same time.
grid:
Type: int16.
Shape: [*,H,W,2]
output:
The input and output types need to be the same except Dim constraints
torchvision.ops.DeformConv2d
input:
Type: int8
Shape: [*,H,W,C]
Dim: H,W ∈ [1, 1024]; H × W ≤ 720 × 1024; other dims ∈ [1, 65536]
offset:
Type: int16
Shape: [*,OH,OW,2 × offsetGroupNum × KH × KW]
Size: 2 × offsetGroupNum × KH × KW ∈ [2, 256], OH × KH × OW × KW ≤ 720 × 1024
mask:
Type: int8
Shape: [*,OH,OW,offsetGroupNum × KH × KW]
Size: offsetGroupNum × KH × KW ∈ [1, 128]
The value of mask is usually [0, 1]
weight:
Type: int8
Shape: [N,KH,KW,C]
Dim: C ∈ [1, 8192]; KH,KW ∈ [1, 8]; N ∈ [1, 4096]
Size: KH × KW × C ∈ [1, 65535]
bias:
Type: float32
output:
Type: int8, int16, int32
Other constraints: Same as fin
stride:
Shape: [SH,SW]
Dim: SH,SW ∈ [1]
pad:
Shape: [P_top,P_left,P_bottom,P_right]
Dim: P_top,P_bottom ∈ [-H/2, 256], P_left,P_right ∈ [-W/2, 256]
groupNum:
Fin.c is divisible by group number
offsetGroupNum:
Fin.c is divisible by offset group number
Size: offsetGroupNum ∈ [1, 2]
dilation:
Shape: [DH,DW]
Dim: DH,DW ∈ [1]
others:
For each group, fin.c ∈ [1, 8192], KH × KW × fin.c ∈ [1, 65535], fin.c = C when group = 1
torch.Tensor.__getitem__ (if indices is an index Tensor)
input:
Type: int8, int16, int32, float16, float32
Shape: [*]
Input will transpose to [N, W, C]. W is inputShape[dim], N is the product of inputShape[:dim], C is the product of inputShape[dim+1:].
N, C ∈ [1, 1048576], W ∈ [1, 4096]. If input type is int8, int16, W ∈ [1, 32768].
index:
Type: int8, int16, int32; Unsupported negative indices.
Shape: [*]. Index values should not be larger than 32768, and the product of all index dims should be in range [1, 737280 (720×1024)], because all index dims will be reduced to the W dim of the indices and output. If the output W is larger than 737280, this op will be split into too many sub-ops.
output:
The input and output types need to be the same.
torch.Tensor.__getitem__ (if indices is an int scalar)
input:
Type: No limits
output:
The input and output types need to be the same.
torch.Tensor.__getitem__ (if indices is a slice)
input:
Type: No limits
Dim: all dims < 2097152
output:
The input and output types need to be the same.
The input and output constraints are the same.
torch.Tensor.clone
torch.Tensor.contiguous
torch.Tensor.detach
 N/A, collapsed in graph optimization phase