Facebook has open-sourced QNNPACK (Quantized Neural Network PACKage), a library for running AI workloads on mobile devices. The technology is used in Facebook applications for image processing. Because mobile devices have far less computing power than data center servers, the developers drew on recent advances in neural network optimization to keep performance at an acceptable level.
The library is built around convolutional neural networks, which are considered the most suitable type of network for recognizing visual images. To improve performance, the engineers applied a modified version of the im2col memory transformation.
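The im2col transformation unrolls each patch of the input image covered by the filter into a column vector that spans all input channels, so that the convolution reduces to a matrix multiplication. The creators of QNNPACK refined this scheme by adding a bypass buffer, which the library itself calls an indirection buffer.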
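To make the transformation concrete, here is a minimal pure-Python sketch of standard im2col, not QNNPACK's code. The function names, the `image[channel][row][col]` layout, stride 1 and the absence of padding are all assumptions chosen for illustration.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of `image` into one column.

    Assumed layout: image[channel][row][col]; stride 1, no padding.
    Returns a matrix with C*kh*kw rows and one column per output pixel.
    """
    channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    patches = []
    for oy in range(height - kh + 1):
        for ox in range(width - kw + 1):
            # Copy the pixel values of this patch, channel by channel.
            patches.append([image[c][oy + ky][ox + kx]
                            for c in range(channels)
                            for ky in range(kh)
                            for kx in range(kw)])
    # Transpose so that each patch becomes a column of the result.
    return [list(row) for row in zip(*patches)]


def conv_via_im2col(image, kernel):
    """Apply one filter, kernel[channel][ky][kx], as a row-times-matrix product."""
    kh, kw = len(kernel[0]), len(kernel[0][0])
    flat_kernel = [w for plane in kernel for row in plane for w in row]
    matrix = im2col(image, kh, kw)
    return [sum(flat_kernel[r] * matrix[r][col] for r in range(len(flat_kernel)))
            for col in range(len(matrix[0]))]


# One channel, 3 x 3 image, 2 x 2 filter of ones.
image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
kernel = [[[1, 1], [1, 1]]]
print(conv_via_im2col(image, kernel))  # [12, 16, 24, 28]
```

Note that every pixel value is copied into the matrix once for each patch it belongs to, which is where the memory cost of plain im2col comes from.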
The bypass buffer holds pointers to the rows of input pixels needed to compute each output pixel. Because it stores pointers rather than copies of the pixel data, the developers were able to make it smaller than the buffer used by standard im2col implementations.
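The idea of a buffer of pointers can be sketched in a few lines of Python, where object references stand in for C pointers. This is an illustration under assumptions, not the library's implementation: the function names, the `image[y][x]` layout (one list of channel values per pixel), stride 1 and the lack of padding are all hypothetical.

```python
def build_indirection_buffer(image, kh, kw):
    """For each output pixel, collect references to the input pixels it needs.

    Assumed layout: image[y][x] is a list with the channel values of one
    pixel. No pixel data is copied; each entry refers to an existing list.
    """
    height, width = len(image), len(image[0])
    buffer = []
    for oy in range(height - kh + 1):
        for ox in range(width - kw + 1):
            buffer.append([image[oy + ky][ox + kx]
                           for ky in range(kh) for kx in range(kw)])
    return buffer


def convolve(image, kernel):
    """Apply one filter, kernel[ky][kx][channel], by following the references."""
    kh, kw = len(kernel), len(kernel[0])
    taps = [kernel[ky][kx] for ky in range(kh) for kx in range(kw)]
    outputs = []
    for pointers in build_indirection_buffer(image, kh, kw):
        acc = 0
        for pixel, weights in zip(pointers, taps):
            acc += sum(p * w for p, w in zip(pixel, weights))
        outputs.append(acc)
    return outputs


# 3 x 3 image with two channels per pixel; the filter reads only channel 0.
image = [[[v, v] for v in row] for row in ([1, 2, 3], [4, 5, 6], [7, 8, 9])]
kernel = [[[1, 0], [1, 0]], [[1, 0], [1, 0]]]
print(convolve(image, kernel))  # [12, 16, 24, 28]
```

In this sketch the buffer has one entry per filter tap per output pixel, regardless of the number of channels, whereas an im2col matrix would hold one copied value per tap per channel.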
Facebook engineers also refined the depthwise convolution algorithm by adding batched processing of 3 × 3 groups. The batched computation relies on general-purpose registers (GPRs). Processing a 3 × 3 convolution requires 18 registers (9 for the input and 9 for the filter), while the 32-bit ARM architecture provides only 14. But since the filter remains unchanged during processing, the developers were able to reduce the resources needed to address it to a single register.
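One way to see how a constant filter can be reached through a single register is to repack it ahead of time so that it is only ever read sequentially. The sketch below illustrates that idea in Python; the packing order, the function names and the use of a running index `w` in place of an address register are assumptions, and the real kernel is written for ARM, not Python.

```python
def pack_filters(filters):
    """Flatten filters[channel][ky][kx] into tap-major order.

    All channels for tap (0, 0) come first, then all channels for
    tap (0, 1), and so on. This is done once, before inference.
    """
    channels = len(filters)
    return [filters[c][ky][kx]
            for ky in range(3) for kx in range(3) for c in range(channels)]


def depthwise_3x3_pixel(input_pixels, packed):
    """Compute one output pixel for every channel of a depthwise 3 x 3 convolution.

    `input_pixels` holds 9 lists of channel values (the 9 input addresses).
    `packed` is read through the single running index `w`, which stands in
    for the one register that addresses the filter.
    """
    channels = len(input_pixels[0])
    acc = [0] * channels
    w = 0
    for pixel in input_pixels:        # 9 taps
        for c in range(channels):     # each channel uses its own weight
            acc[c] += pixel[c] * packed[w]
            w += 1                    # the filter is read strictly in order
    return acc


# Two channels: channel 0 uses a filter of ones, channel 1 a filter of twos.
filters = [[[1] * 3 for _ in range(3)], [[2] * 3 for _ in range(3)]]
pixels = [[i, 1] for i in range(1, 10)]
print(depthwise_3x3_pixel(pixels, pack_filters(filters)))  # [45, 18]
```

With the filter read through one index, the number of addresses to keep drops from 9 + 9 = 18 to 9 + 1 = 10, which fits within the 14 registers mentioned above.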
QNNPACK also implements other advanced approaches to optimizing neural networks, such as low-precision computation.
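As an illustration of what low-precision computation can look like, the following sketch maps real numbers to unsigned 8-bit integers using a scale and a zero point, then computes a dot product in integer arithmetic. The article does not describe the exact scheme, so this common affine scheme, the function names and the parameter values are assumptions.

```python
def quantize(values, scale, zero_point):
    """Map real numbers to integers in [0, 255]: q = round(x / scale) + zero_point."""
    return [min(255, max(0, round(x / scale) + zero_point)) for x in values]


def dequantize(quantized, scale, zero_point):
    """Recover approximate real values: x = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in quantized]


def quantized_dot(qa, qb, zp_a, zp_b, scale_a, scale_b):
    """Accumulate the dot product in integers and rescale once at the end."""
    acc = sum((a - zp_a) * (b - zp_b) for a, b in zip(qa, qb))
    return acc * scale_a * scale_b


q = quantize([0.0, 0.5, 1.0], scale=0.5, zero_point=10)
print(q)                                          # [10, 11, 12]
print(dequantize(q, scale=0.5, zero_point=10))    # [0.0, 0.5, 1.0]
print(quantized_dot(q, q, 10, 10, 0.5, 0.5))      # 1.25
```

The inner loop touches only small integers, which is what makes this style of arithmetic attractive on mobile processors; values outside the representable range are clamped to 0 or 255.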
The tool is published as part of the PyTorch 1.0 framework, released in early October 2018.