Image Processing Transformer

This section explains the concepts and parameters of each transformer used when scaling and cropping images, and provides usage examples to make the transformers easier to apply.

Before reading the contents of the document, please note the following:

Attention

Image data is three-dimensional, but the transformers provided by Horizon treat the first dimension of their input as N (the batch dimension) and loop over it, processing one slice at a time. If you need to process images, therefore, provide four-dimensional data.
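As a minimal sketch of the note above, assuming the image is held in a NumPy array (the array names are illustrative and not part of the Horizon API), a single three-dimensional image can be given a leading N dimension before it is passed to a transformer:

```python
import numpy as np

# A hypothetical 3-D image in HWC layout: height 4, width 6, 3 channels.
image_hwc = np.zeros((4, 6, 3), dtype=np.uint8)

# Add a leading batch dimension N so the data becomes 4-D (NHWC).
batch_nhwc = image_hwc[np.newaxis, ...]

print(batch_nhwc.shape)
```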

AddTransformer

Description:

Adds a value to every pixel in the input image. The output data type is float32.

Parameters:

  • value: Value to be added to each pixel. Note that value can be negative, e.g., -128.

Examples of use:

    # Subtracts 128 from the image data
    AddTransformer(-128)
    # Adds 127 to the image data
    AddTransformer(127)

MeanTransformer

Description:

Subtracts means from all pixel values in the input image.

Parameters:

  • means: Value subtracted from each pixel. Note that the value can be a negative number, e.g., -128.
  • data_format: Layout type of the input. Value range: ["CHW", "HWC"], defaults to "CHW".

Examples of use:

    # Each pixel subtracts 128.0. Input type: CHW
    MeanTransformer(np.array([128.0, 128.0, 128.0]))
    # Each pixel subtracts a different value: 103.94, 116.78, 123.68. Input type: HWC
    MeanTransformer(np.array([103.94, 116.78, 123.68]), data_format="HWC")
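The role of data_format can be illustrated with plain NumPy broadcasting. This is only a sketch of per-channel mean subtraction under the assumption that the three means correspond to the three channels; it is not the transformer's implementation, and the array names are made up:

```python
import numpy as np

means = np.array([103.94, 116.78, 123.68], dtype=np.float32)

# HWC: channels are the last axis, so the means broadcast directly.
image_hwc = np.full((2, 2, 3), 128.0, dtype=np.float32)
out_hwc = image_hwc - means

# CHW: channels are the first axis, so reshape the means to (3, 1, 1).
image_chw = np.transpose(image_hwc, (2, 0, 1))
out_chw = image_chw - means.reshape(3, 1, 1)

print(out_hwc[0, 0], out_chw[:, 0, 0])
```

Both layouts give the same per-channel result; only the axis along which the means are applied differs.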

ScaleTransformer

Description:

Multiplies all pixel values in the input image by scale_value.

Parameters:

  • scale_value: Factor to be multiplied, e.g., 0.0078125 (that is, 1/128).

Examples of use:

    # Adjusts all pixels within the range [-128, 127] to [-1, 1].
    ScaleTransformer(0.0078125)
    # or
    ScaleTransformer(1/128)
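The arithmetic behind the example above can be sketched in NumPy. This only illustrates an add-then-scale mapping (the equivalent of adding -128 and then scaling by 1/128); it is not the transformer implementation, and `pixels` is a made-up array:

```python
import numpy as np

# Illustrative uint8 pixel values covering the full range.
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

# Equivalent of adding -128: convert to float32 first so that
# negative results can be represented.
shifted = pixels.astype(np.float32) + (-128)

# Equivalent of scaling by 1/128 (0.0078125).
scaled = shifted * 0.0078125

print(scaled)
```

Note that 127 maps to 127/128 = 0.9921875, so the upper bound of 1 is approached but not reached.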

NormalizeTransformer

Description:

Normalizes the input image by dividing it by std. The output data type is float32.

Parameters:

  • std: Value by which the input image is divided.

Examples of use:

    # Adjusts all pixels within the range [-128, 127] to [-1, 1]
    NormalizeTransformer(128)

TransposeTransformer

Description:

Performs layout conversion.

Parameters:

  • order: Axis order of the image after the layout conversion, expressed relative to the original layout. For example, if the axes of an HWC input are numbered (0,1,2), converting to CHW uses the order (2,0,1).

Examples of use:

    # HWC to CHW
    TransposeTransformer((2, 0, 1))
    # CHW to HWC
    TransposeTransformer((1, 2, 0))
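The meaning of the order tuple matches that of the axes argument of `numpy.transpose`. The following sketch uses NumPy directly on a made-up three-dimensional array to show how the two orders above move the axes; it does not call the transformer itself:

```python
import numpy as np

# Hypothetical HWC image: height 2, width 3, 4 channels
# (distinct sizes make the axes easy to follow).
image_hwc = np.zeros((2, 3, 4), dtype=np.float32)

# Order (2, 0, 1): new axis 0 is old axis 2 (C), followed by
# old axes 0 (H) and 1 (W).
image_chw = np.transpose(image_hwc, (2, 0, 1))

# Order (1, 2, 0) undoes it: CHW back to HWC.
restored_hwc = np.transpose(image_chw, (1, 2, 0))

print(image_chw.shape, restored_hwc.shape)
```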

HWC2CHWTransformer

Description:

Converts NHWC to NCHW.

Parameters: None.

Examples of use:

    # NHWC to NCHW
    HWC2CHWTransformer()

CHW2HWCTransformer

Description:

Converts NCHW to NHWC.

Parameters: None.

Examples of use:

    # NCHW to NHWC
    CHW2HWCTransformer()

CenterCropTransformer

Description:

Crops a square image from the center of the image by direct truncation. By default the output data type is float32; when data_type is "uint8", the output is uint8.

Parameters:

  • crop_size: Size of the sides of the square cropped from the center.
  • data_type: Type of the output result. Value range: ["float", "uint8"].

Examples of use:

    # Performs center cropping with 224*224. Default output type: float32
    CenterCropTransformer(crop_size=224)
    # Performs center cropping with 224*224. Output type: uint8
    CenterCropTransformer(crop_size=224, data_type="uint8")
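A center crop amounts to choosing offsets so that the square sits in the middle of the image. The helper below is a hypothetical sketch of that calculation; it assumes the offsets are rounded down with integer division, which the description above does not specify:

```python
def center_crop_box(height, width, crop_size):
    """Return (top, left, bottom, right) of a centered crop_size square.

    Assumption: offsets are rounded down with integer division.
    """
    top = (height - crop_size) // 2
    left = (width - crop_size) // 2
    return top, left, top + crop_size, left + crop_size


# A 256 (height) x 320 (width) image cropped to 224*224.
box = center_crop_box(256, 320, 224)
print(box)
```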

PILCenterCropTransformer

Description:

Crops a square image from the center of the image by using the PIL method. The output data type is float32.

Parameters:

  • size: Size of the sides of the square cropped from the center.

Examples of use:

    # Performs center cropping with 224*224 by using the PIL method.
    PILCenterCropTransformer(size=224)

LongSideCropTransformer

Description:

Crops the longer side. The output data type is float32.

When width > height, crops a square based on the height, e.g., suppose width=100 and height=70, then the size after cropping is 70*70.

When height > width, crops a rectangle whose width remains the same and whose height is (height-width)/2 + width, e.g., suppose width=70 and height=100, then the size after cropping is 70*((100-70)/2+70), that is, a rectangle of size 70*85.

Parameters: None.

Examples of use:

    LongSideCropTransformer()
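The two size rules in the description can be checked with a few lines of arithmetic. The helper below is hypothetical and only computes the resulting size; it assumes the height formula uses integer division and does not say which region of the image is kept:

```python
def long_side_crop_size(width, height):
    """Return (width, height) after cropping, following the two rules."""
    if width > height:
        # Wider than tall: crop a square based on the height.
        return height, height
    # Taller than wide: width is kept, height becomes (height-width)/2 + width.
    # For a square input this leaves the size unchanged.
    return width, (height - width) // 2 + width


print(long_side_crop_size(100, 70))
print(long_side_crop_size(70, 100))
```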

PadResizeTransformer

Description:

Enlarges the image by padding. The output data type is float32.

Parameters:

  • target_size: Target size. The value is a tuple, e.g., (240,240).
  • pad_value: Value to be padded into the array, defaults to 127.
  • pad_position: Padding position. Value range: ["boundary", "bottom_right"], defaults to "boundary".

Examples of use:

    # Pads the image to 512*512, padding at the bottom right, with a padding value of 0
    PadResizeTransformer((512, 512), pad_position='bottom_right', pad_value=0)
    # Pads the image to 608*608, padding at the boundary, with a padding value of 127
    PadResizeTransformer(target_size=(608, 608))

ResizeTransformer

Description:

Resizes the image.

Parameters:

  • target_size: Target size. The value is a tuple, e.g., (240,240): width=240, height=240.

  • mode: Image processing mode. Value range: ("skimage", "opencv"), defaults to "skimage".

  • method: Interpolation method. This parameter only takes effect in the skimage mode. Value range: [0,5], defaults to 1, where:

    • 0: Nearest-neighbor;
    • 1: Bi-linear(default);
    • 2: Bi-quadratic;
    • 3: Bi-cubic;
    • 4: Bi-quartic;
    • 5: Bi-quintic.
  • data_type: Output type. Value range: (uint8, float), defaults to float. When set to uint8, the output type is uint8, otherwise it is float32.

  • interpolation: Interpolation method. This parameter only takes effect when mode is opencv. Value range: (OpenCV's interpolation methods), defaults to empty. Currently, interpolation can only be left empty or set to OpenCV's INTER_CUBIC; when it is empty, the INTER_LINEAR method is used.

    The following are the interpolation methods available in OpenCV and their descriptions (methods not yet supported by the transformer will be added gradually in subsequent iterations):

    • INTER_NEAREST: Nearest Neighbor Interpolation.
    • INTER_LINEAR: Bi-linear interpolation, used by default when the interpolation is empty.
    • INTER_CUBIC: Bi-cubic interpolation within a 4x4 pixel neighborhood.
    • INTER_AREA: Resampling using pixel area relation. It may be the preferred method for image decimation as it can provide moiré-free results. But when the image is enlarged, it is similar to the INTER_NEAREST method.
    • INTER_LANCZOS4: Lanczos interpolation of 8x8 neighborhood.
    • INTER_LINEAR_EXACT: Bit-accurate bilinear interpolation.
    • INTER_NEAREST_EXACT: Bit-exact nearest neighbor interpolation. This will produce the same results as the nearest neighbor method in PIL, scikit-image, or Matlab.
    • INTER_MAX: Mask for interpolation code.
    • WARP_FILL_OUTLIERS: Flag, fills all of the target image pixels. If some of them correspond to outliers in the source image, they are set to zero.
    • WARP_INVERSE_MAP: Flag, inverse transformation.

Examples of use:

    # Resizes the input image to 224*224 using OpenCV. Interpolation: bilinear, output: float32
    ResizeTransformer(target_size=(224, 224), mode='opencv', method=1)
    # Resizes the input image to 256*256 using skimage. Interpolation: bilinear, output: float32
    ResizeTransformer(target_size=(256, 256))
    # Resizes the input image to 256*256 using skimage. Interpolation: bilinear, output: uint8
    ResizeTransformer(target_size=(256, 256), data_type="uint8")

PILResizeTransformer

Description:

Resizes the image by using the PIL library.

Parameters:

  • size: Target size. The value is a tuple, e.g., (240,240).
  • interpolation: Specifies the interpolation method. Value range: (Image.NEAREST, Image.BILINEAR, Image.BICUBIC, Image.LANCZOS), defaults to Image.BILINEAR.
    • Image.NEAREST: Nearest neighbor sampling;
    • Image.BILINEAR: Linear interpolation;
    • Image.BICUBIC: Cubic spline interpolation;
    • Image.LANCZOS: High quality downsampling filter.

Examples of use:

    # Resizes the input image to 256*256 using linear interpolation
    PILResizeTransformer(size=256)
    # Resizes the input image to 256*256 using a high-quality downsampling filter
    PILResizeTransformer(size=256, interpolation=Image.LANCZOS)

ShortLongResizeTransformer

Description:

Scales the input image while keeping its original aspect ratio; the size of the new image depends on the parameters set.

Perform the operation as follows:

  1. First, divide short_size by the smaller of the width and height of the original image, and use the result as the scaling factor.
  2. If the scaling factor multiplied by the larger of the width and height of the original image is greater than long_size, change the scaling factor to long_size divided by that larger value.
  3. Use the resize method in OpenCV to resize the image by the scaling factor obtained above.
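The first two steps above can be sketched as a small function. The function name is hypothetical, and the sketch only computes the scaling factor; the actual resize in step 3 is left out:

```python
def short_long_scale(width, height, short_size, long_size):
    """Compute the scaling factor described in steps 1 and 2."""
    short_side = min(width, height)
    long_side = max(width, height)

    # Step 1: scale so that the short side reaches short_size.
    scale = short_size / short_side

    # Step 2: if the long side would exceed long_size, cap it instead.
    if scale * long_side > long_size:
        scale = long_size / long_side
    return scale


print(short_long_scale(640, 480, short_size=600, long_size=1000))
print(short_long_scale(1000, 200, short_size=600, long_size=1000))
```

In the first call the long side becomes 800, which is within long_size, so the factor from step 1 is kept. In the second call step 1 would stretch the long side to 3000, so the factor is replaced by 1000/1000.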

Parameters:

  • short_size: Expected length of the short edge after resizing.
  • long_size: Expected length of the long edge after resizing.
  • include_im: Defaults to True. When set to True, it will return the original image in addition to the processed image.

Examples of use:

    # Short edge length is 20, long edge length is 100; returns both the processed image and the original image
    ShortLongResizeTransformer(short_size=20, long_size=100)

PadTransformer

Description:

Resizes the image by a scaling factor equal to the target size divided by the larger of the width and height of the input image; the original width and height are multiplied by this factor.

Then each side of the resized image is divided by size_divisor, rounded up, and multiplied by size_divisor to give the width and height of the new image.
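The size calculation described above can be sketched as follows. The function name is hypothetical, and rounding the resized sides to the nearest integer is an assumption; the text only specifies the round-up to a multiple of size_divisor:

```python
import math


def pad_transformer_sizes(width, height, size_divisor=128, target_size=512):
    """Return ((resized_w, resized_h), (padded_w, padded_h))."""
    # Scale so that the larger side equals target_size.
    scale = target_size / max(width, height)
    resized_w = round(width * scale)   # assumption: round to nearest
    resized_h = round(height * scale)

    # Round each side up to the next multiple of size_divisor.
    padded_w = math.ceil(resized_w / size_divisor) * size_divisor
    padded_h = math.ceil(resized_h / size_divisor) * size_divisor
    return (resized_w, resized_h), (padded_w, padded_h)


print(pad_transformer_sizes(640, 360))
```

For a 640*360 input with the default parameters, the image is first scaled to 512*288, and the height is then rounded up from 288 to 384, the next multiple of 128.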

Parameters:

  • size_divisor: Size divisor, defaults to 128.
  • target_size: Target size, defaults to 512.

Examples of use:

    # The pad size is 1024*1024
    PadTransformer(size_divisor=1024, target_size=1024)

ShortSideResizeTransformer

Description:

Given the expected length of the short side, this transformer derives the new image size from the current ratio of the long and short sides, and then crops the image to that size from the image center.

Parameters:

  • short_size: Expected length of short side.

  • data_type: Type of the output result. Value range: ("float", "uint8"). By default the output type is float32; when set to "uint8", the output type is uint8.

  • interpolation: Interpolation method. Value range: (OpenCV's interpolation methods), defaults to empty. Currently, interpolation can only be left empty or set to OpenCV's INTER_CUBIC; when it is empty, the INTER_LINEAR method is used.

    The following are the interpolation methods available in OpenCV and their descriptions (methods not yet supported by the transformer will be added gradually in subsequent iterations):

    • INTER_NEAREST: Nearest Neighbor Interpolation.
    • INTER_LINEAR: Bi-linear interpolation, which is used by default when the interpolation is empty.
    • INTER_CUBIC: Bi-cubic interpolation within a 4x4 pixel neighborhood.
    • INTER_AREA: Resampling using pixel area relation. It may be the preferred method for image decimation as it can provide moiré-free results. But when the image is enlarged, it is similar to the INTER_NEAREST method.
    • INTER_LANCZOS4: Lanczos interpolation of 8x8 neighborhood.
    • INTER_LINEAR_EXACT: Bit-accurate bilinear interpolation.
    • INTER_NEAREST_EXACT: Bit-exact nearest neighbor interpolation. This will produce the same results as the nearest neighbor method in PIL, scikit-image, or Matlab.
    • INTER_MAX: Mask for interpolation code.
    • WARP_FILL_OUTLIERS: Flag, fills all of the target image pixels. If some of them correspond to outliers in the source image, they are set to zero.
    • WARP_INVERSE_MAP: Flag, inverse transformation.

Examples of use:

    # Resizes the short side to 256 using bilinear interpolation
    ShortSideResizeTransformer(short_size=256)
    # Resizes the short side to 256 using Lanczos interpolation within the 8x8 pixel neighborhood
    ShortSideResizeTransformer(short_size=256, interpolation=Image.LANCZOS4)

PaddedCenterCropTransformer

Description:

Crops the center of the image with padding.

Attention

Applicable only to EfficientNet-lite related instance models.

Calculation method:

  1. Calculates the coefficient: float(image_size) / (image_size + crop_pad).
  2. Calculates the size of the center crop: int(coefficient * np.minimum(height of original image, width of original image)).
  3. Crops the image from its center according to the calculated size.
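The calculation above can be sketched as follows, assuming that the integer truncation applies to the final size rather than to the ratio (truncating the ratio itself would always give 0). The function name is hypothetical:

```python
def padded_center_crop_size(height, width, image_size=224, crop_pad=32):
    """Side length of the centered square crop for the given image."""
    # Step 1: ratio of the image size to the padded image size.
    coefficient = float(image_size) / (image_size + crop_pad)
    # Step 2: apply the ratio to the shorter side and truncate.
    return int(coefficient * min(height, width))


# With the defaults the ratio is 224/256 = 0.875.
print(padded_center_crop_size(512, 384))
```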

Parameters:

  • image_size: Size of the image, defaults to 224.
  • crop_pad: Size of the center padding, defaults to 32.

Examples of use:

    # Crop size is 240*240, padding value is 32
    PaddedCenterCropTransformer(image_size=240, crop_pad=32)
    # Crop size is 224*224, padding value is 32
    PaddedCenterCropTransformer()

BGR2RGBTransformer

Description:

Converts the input format from BGR to RGB.

Parameters:

  • data_format: Data format. Value range: (CHW,HWC), defaults to CHW.

Examples of use:

    # When the layout is NCHW, converts BGR to RGB
    BGR2RGBTransformer()
    # When the layout is NHWC, converts BGR to RGB
    BGR2RGBTransformer(data_format="HWC")
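Converting between BGR and RGB amounts to reversing the channel axis, which is why data_format matters: it tells the transformer where that axis is. A NumPy sketch on made-up data (not the transformer's implementation):

```python
import numpy as np

# Hypothetical NHWC batch holding one 1x1 pixel: B=10, G=20, R=30.
bgr_nhwc = np.array([[[[10, 20, 30]]]], dtype=np.uint8)

# In NHWC the channel axis is the last one; reversing it swaps B and R.
rgb_nhwc = bgr_nhwc[..., ::-1]

# In NCHW the channel axis is axis 1 instead.
bgr_nchw = np.transpose(bgr_nhwc, (0, 3, 1, 2))
rgb_nchw = bgr_nchw[:, ::-1, :, :]

print(rgb_nhwc[0, 0, 0], rgb_nchw[0, :, 0, 0])
```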

RGB2BGRTransformer

Description:

Converts the input format from RGB to BGR.

Parameters:

  • data_format: Data format. Value range: (CHW,HWC), defaults to CHW.

Examples of use:

    # When the layout is NCHW, converts RGB to BGR
    RGB2BGRTransformer()
    # When the layout is NHWC, converts RGB to BGR
    RGB2BGRTransformer(data_format="HWC")

RGB2GRAYTransformer

Description:

Converts the input format from RGB to GRAY.

Parameters:

  • data_format: Input layout type. Value range: ("CHW", "HWC"), defaults to "CHW".

Examples of use:

    # When the layout is NCHW, converts RGB to GRAY
    RGB2GRAYTransformer(data_format='CHW')
    # When the layout is NHWC, converts RGB to GRAY
    RGB2GRAYTransformer(data_format='HWC')

BGR2GRAYTransformer

Description:

Converts the input format from BGR to GRAY.

Parameters:

  • data_format: Input layout type. Value range: ["CHW", "HWC"], defaults to "CHW".

Examples of use:

    # When the layout is NCHW, converts BGR to GRAY
    BGR2GRAYTransformer(data_format='CHW')
    # When the layout is NHWC, converts BGR to GRAY
    BGR2GRAYTransformer(data_format='HWC')

RGB2GRAY_128Transformer

Description:

Converts the input format from RGB to GRAY_128. Value range of GRAY_128: (-128,127).

Parameters:

  • data_format: Input layout type. Value range: ["CHW", "HWC"], required field, defaults to "CHW".

Examples of use:

    # When the layout is NCHW, converts RGB to GRAY_128
    RGB2GRAY_128Transformer(data_format='CHW')
    # When the layout is NHWC, converts RGB to GRAY_128
    RGB2GRAY_128Transformer(data_format='HWC')
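As a rough illustration of how a GRAY_128 value ends up in the range (-128,127), the sketch below assumes that the grayscale value is the common BT.601 weighted sum of R, G and B, shifted down by 128. The weights and the function name are assumptions and are not taken from the transformer:

```python
def rgb_to_gray_128(r, g, b):
    """Grayscale value shifted from [0, 255] to [-128, 127]."""
    # Assumed weights: the widely used BT.601 luma coefficients.
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return gray - 128


print(rgb_to_gray_128(0, 0, 0))
print(rgb_to_gray_128(255, 255, 255))
```

Black maps to the lower bound and white to (approximately) the upper bound of the GRAY_128 range.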

RGB2YUV444Transformer

Description:

Converts the input format from RGB to YUV444.

Parameters:

  • data_format: Input layout type. Value range: ["CHW", "HWC"], required field, defaults to "CHW".

Examples of use:

    # When the layout is NCHW, converts RGB to YUV444
    RGB2YUV444Transformer(data_format='CHW')
    # When the layout is NHWC, converts RGB to YUV444
    RGB2YUV444Transformer(data_format='HWC')

BGR2YUV444Transformer

Description:

Converts the input format from BGR to YUV444.

Parameters:

  • data_format: Input layout type. Value range: ["CHW", "HWC"], required field, defaults to "CHW".

Examples of use:

    # When the layout is NCHW, converts BGR to YUV444
    BGR2YUV444Transformer(data_format='CHW')
    # When the layout is NHWC, converts BGR to YUV444
    BGR2YUV444Transformer(data_format='HWC')

BGR2YUV444_128Transformer

Description:

Converts the input format from BGR to YUV444_128. Value range of YUV444_128: (-128,127).

Parameters:

  • data_format: Input layout type. Value range: ["CHW", "HWC"], required field, defaults to "CHW".

Examples of use:

    # When the layout is NCHW, converts BGR to YUV444_128
    BGR2YUV444_128Transformer(data_format='CHW')
    # When the layout is NHWC, converts BGR to YUV444_128
    BGR2YUV444_128Transformer(data_format='HWC')

RGB2YUV444_128Transformer

Description:

Converts the input format from RGB to YUV444_128. Value range of YUV444_128: (-128,127).

Parameters:

  • data_format: Input layout type. Value range: ["CHW", "HWC"], required field, defaults to "CHW".

Examples of use:

    # When the layout is NCHW, converts RGB to YUV444_128
    RGB2YUV444_128Transformer(data_format='CHW')
    # When the layout is NHWC, converts RGB to YUV444_128
    RGB2YUV444_128Transformer(data_format='HWC')

BGR2YUVBT601VIDEOTransformer

Description:

Converts the input format from BGR to YUV_BT601_Video_Range.

YUV_BT601_Video_Range: some cameras output data in the YUV BT601 (Video Range) format, whose value range is 16~235. This transformer generates data in that format.

Parameters:

  • data_format: Input layout type. Value range: ["CHW","HWC"], required field, defaults to "CHW".

Examples of use:

    # When the layout is NCHW, converts BGR to YUV_BT601_Video_Range
    BGR2YUVBT601VIDEOTransformer(data_format='CHW')
    # When the layout is NHWC, converts BGR to YUV_BT601_Video_Range
    BGR2YUVBT601VIDEOTransformer(data_format='HWC')
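To give a feel for the video-range format, the sketch below applies the commonly quoted BT.601 video-range coefficients to a single 8-bit pixel. The coefficients, the rounding and the function name are assumptions for illustration; the exact values used by the transformer are not stated here. For BGR input, the same formula applies with the channels reordered:

```python
def rgb_to_yuv_bt601_video(r, g, b):
    """Convert one 8-bit RGB pixel to BT.601 video-range Y, U, V."""
    # Assumed coefficients: the commonly quoted BT.601 video-range matrix.
    y = 16 + 0.257 * r + 0.504 * g + 0.098 * b
    u = 128 - 0.148 * r - 0.291 * g + 0.439 * b
    v = 128 + 0.439 * r - 0.368 * g - 0.071 * b
    return round(y), round(u), round(v)


print(rgb_to_yuv_bt601_video(0, 0, 0))
print(rgb_to_yuv_bt601_video(255, 255, 255))
```

Black and white land on the two ends of the 16~235 luma range, with both chroma values at the neutral point 128.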

RGB2YUVBT601VIDEOTransformer

Description:

Converts the input format from RGB to YUV_BT601_Video_Range.

YUV_BT601_Video_Range: some cameras output data in the YUV BT601 (Video Range) format, whose value range is 16~235. This transformer generates data in that format.

Parameters:

  • data_format: Input layout type. Value range: ["CHW","HWC"], required field, defaults to "CHW".

Examples of use:

    # When the layout is NCHW, converts RGB to YUV_BT601_Video_Range
    RGB2YUVBT601VIDEOTransformer(data_format='CHW')
    # When the layout is NHWC, converts RGB to YUV_BT601_Video_Range
    RGB2YUVBT601VIDEOTransformer(data_format='HWC')

YUVTransformer

Description:

Converts the input format to YUV444.

Parameters:

  • color_sequence: Color sequence, required field.

Examples of use:

    # Converts images read in as BGR to YUV444
    YUVTransformer(color_sequence="BGR")
    # Converts images read in as RGB to YUV444
    YUVTransformer(color_sequence="RGB")

ReduceChannelTransformer

Description:

Reduces the C channel to a single channel, e.g., a shape of 1*3*224*224 becomes 1*1*224*224. In practice, the layout of the data must match the data_format value; otherwise the wrong channel may be removed.

Parameters:

  • data_format: Input layout type. Value range: ["CHW", "HWC"], defaults to "CHW".

Examples of use:

    # Delete the C channel with layout as NCHW
    ReduceChannelTransformer()
    # Or
    ReduceChannelTransformer(data_format="CHW")
    # Delete the C channel with layout as NHWC
    ReduceChannelTransformer(data_format="HWC")

BGR2NV12Transformer

Description:

Converts the input format from BGR to NV12.

Parameters:

  • data_format: Input layout type. Value range: ["CHW","HWC"], defaults to "CHW".
  • cvt_mode: Conversion mode. Value range: (rgb_calc, opencv), defaults to rgb_calc.
    • rgb_calc: Image processing using mergeUV.
    • opencv: Image processing using OpenCV.

Examples of use:

    # When the layout is NCHW, converts BGR to NV12, and the image is processed by rgb_calc
    BGR2NV12Transformer()
    # Or
    BGR2NV12Transformer(data_format="CHW")
    # When the layout is NHWC, converts BGR to NV12, and the image is processed by OpenCV
    BGR2NV12Transformer(data_format="HWC", cvt_mode="opencv")
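For orientation, NV12 stores a full-resolution Y plane followed by a single plane of interleaved U and V samples that is subsampled by 2 in both directions, so an image of height H and width W occupies H*W*3/2 samples. A small sketch of that arithmetic follows; the helper name is hypothetical, and it says nothing about how the transformer arranges its output array:

```python
def nv12_plane_sizes(height, width):
    """Return (y_size, uv_size, total) in samples for an NV12 image."""
    if height % 2 or width % 2:
        raise ValueError("NV12 needs an even height and width")
    # Full-resolution luma plane.
    y_size = height * width
    # One U and one V sample for every 2x2 block of pixels, interleaved.
    uv_size = (height // 2) * (width // 2) * 2
    return y_size, uv_size, y_size + uv_size


print(nv12_plane_sizes(224, 224))
```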

RGB2NV12Transformer

Description:

Converts the input format from RGB to NV12.

Parameters:

  • data_format: Input layout type. Value range: ["CHW", "HWC"], defaults to "CHW".
  • cvt_mode: Conversion mode. Value range: (rgb_calc, opencv), defaults to rgb_calc.
    • rgb_calc: Image processing using mergeUV.
    • opencv: Image processing using OpenCV.

Examples of use:

    # When the layout is NCHW, converts RGB to NV12, and the image is processed by rgb_calc
    RGB2NV12Transformer()
    # Or
    RGB2NV12Transformer(data_format="CHW")
    # When the layout is NHWC, converts RGB to NV12, and the image is processed by OpenCV
    RGB2NV12Transformer(data_format="HWC", cvt_mode="opencv")

NV12ToYUV444Transformer

Description:

Converts the input format from NV12 to YUV444.

Parameters:

  • target_size: Target size. Value is a tuple, e.g., (240,240).
  • yuv444_output_layout: YUV444 output layout. Value range: (HWC,CHW), defaults to "HWC".

Examples of use:

    # Layout is NCHW, size is 768*768, converts NV12 to YUV444
    NV12ToYUV444Transformer(target_size=(768, 768))
    # Layout is NHWC, size is 224*224, converts NV12 to YUV444
    NV12ToYUV444Transformer((224, 224), yuv444_output_layout="HWC")

WarpAffineTransformer

Description:

Performs image affine transformations.

Parameters:

  • input_shape: Input shape value.
  • scale: Factor to be multiplied.

Examples of use:

    # The size is 512*512, the length of the long side is 1.0
    WarpAffineTransformer((512, 512), 1.0)

F32ToS8Transformer

Description:

Converts the input format from float32 to int8.

Parameters: None.

Examples of use:

    # Converts the input format from float32 to int8
    F32ToS8Transformer()

F32ToU8Transformer

Description:

Converts the input format from float32 to uint8.

Parameters: None.

Examples of use:

    # Converts the input format from float32 to uint8
    F32ToU8Transformer()