Abstract
Quantization is one of the most popular and widely used methods of speeding up neural network inference. The current standard is 8-bit uniform quantization; however, uniform low-bit quantization (4- and 6-bit) offers significant advantages in inference speed and resource requirements. We present a quantization algorithm for uniform low-bit quantization that is faster than quantization-aware training from scratch and more accurate than methods that only select thresholds and reduce quantization noise. We also investigated quantization noise in neural networks under low-bit quantization and concluded that quantization noise is not always a good metric of quantization quality.
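As a rough illustration of the uniform quantization and quantization-noise measurement the abstract refers to (a minimal sketch, not the paper's algorithm; the helper `uniform_quantize` and all parameter choices are assumptions for illustration):

```python
import numpy as np

def uniform_quantize(x, num_bits=4, x_min=None, x_max=None):
    """Simulated ("fake") uniform quantization: snap x onto a grid of
    2**num_bits evenly spaced levels between x_min and x_max, then map
    back to float. Illustrative helper, not the paper's method."""
    x_min = float(np.min(x)) if x_min is None else x_min
    x_max = float(np.max(x)) if x_max is None else x_max
    levels = 2 ** num_bits - 1           # number of quantization steps
    scale = (x_max - x_min) / levels     # step size of the uniform grid
    q = np.clip(np.round((x - x_min) / scale), 0, levels)
    return q * scale + x_min

# Measure quantization noise (MSE between original and quantized weights)
# on a synthetic Gaussian weight tensor at different bit widths.
rng = np.random.default_rng(0)
w = rng.normal(size=10_000).astype(np.float32)
for bits in (8, 6, 4):
    noise = np.mean((w - uniform_quantize(w, bits)) ** 2)
    print(f"{bits}-bit uniform quantization, MSE noise: {noise:.6f}")
```

Dropping from 8 to 4 bits shrinks the grid from 256 to 16 levels, so the mean-squared quantization noise grows sharply; the abstract's point is that this noise alone does not always predict the quantized network's accuracy.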
Original language | English |
---|---|
Article number | 012004 |
Journal | Journal of Physics: Conference Series |
Volume | 2134 |
Issue number | 1 |
DOIs | |
Publication status | Published - 20 Dec 2021 |
Event | 8th International Young Scientists Conference on Information Technologies, Telecommunications and Control Systems, ITTCS 2021, Innopolis, Russian Federation, 16 Dec 2021 → 17 Dec 2021 |
OECD FOS+WOS
- 1.03 PHYSICAL SCIENCES AND ASTRONOMY