Trainable Thresholds for Neural Network Quantization

Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev

Research output: Contribution to book/report/conference proceedings › Conference article › Research › Peer-reviewed

Abstract

Embedded computer vision applications for robotics, security cameras, and mobile phone apps require mobile neural network architectures such as MobileNet-v2 or MNAS-Net in order to reduce RAM consumption and accelerate processing. A further option for reducing resource consumption is 8-bit neural network quantization. Unfortunately, the known methods of neural network quantization lead to a significant accuracy reduction (more than 1.2%) for mobile architectures and require long training with the quantization procedure. To overcome this limitation, we propose a method that makes it possible to quantize mobile neural networks without significant accuracy loss. Our approach is based on trainable quantization thresholds for each neural network filter, which accelerates training with the quantization procedure by up to 10 times compared with the standard techniques. Using the proposed technique, we quantize modern mobile neural network architectures with an accuracy loss not exceeding 0.1%. Ready-for-use models and code are available at: https://github.com/agoncharenko1992/FAT-fast-adjustable-threshold.
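The core idea of threshold-based quantization can be sketched as follows: activations (or weights) are clipped to a symmetric range set by a threshold, mapped to 8-bit integer levels, and mapped back, so the rounding error is visible during training and the threshold itself can be adjusted by gradient descent. This is a minimal NumPy illustration of such fake quantization, not the authors' implementation; the function name and the exact parameterization of the trainable threshold are assumptions here.

```python
import numpy as np

def fake_quantize(x, threshold, bits=8):
    """Simulate symmetric fixed-point quantization with an adjustable threshold.

    Values are clipped to [-threshold, threshold], scaled to integer
    levels, rounded, and scaled back. In the paper's setting the
    threshold is a trainable parameter (one per filter), updated with
    a straight-through estimator for the rounding step.
    """
    levels = 2 ** (bits - 1) - 1           # 127 usable levels for signed 8-bit
    scale = threshold / levels             # real-valued step size
    x_clipped = np.clip(x, -threshold, threshold)
    x_int = np.round(x_clipped / scale)    # integer grid the hardware would use
    return x_int * scale                   # dequantized values seen by the next layer
```

For example, with `threshold=1.0` a value of `-2.0` is clipped to `-1.0` and survives exactly, while in-range values pick up a small rounding error of at most half a quantization step; training the per-filter threshold trades clipping error against rounding error.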

Original language: English
Title of host publication: Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings
Editors: Ignacio Rojas, Gonzalo Joya, Andreu Catala
Publisher: Springer-Verlag GmbH and Co. KG
Pages: 302-312
Number of pages: 11
ISBN (print): 9783030205171
DOI: 10.1007/978-3-030-20518-8_26
Publication status: Published - 1 Jan 2019
Event: 15th International Work-Conference on Artificial Neural Networks, IWANN 2019 - Gran Canaria, Spain
Duration: 12 Jun 2019 - 14 Jun 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11507 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 15th International Work-Conference on Artificial Neural Networks, IWANN 2019
Country: Spain
City: Gran Canaria
Period: 12.06.2019 - 14.06.2019


Cite this

Goncharenko, A., Denisov, A., Alyamkin, S., & Terentev, E. (2019). Trainable Thresholds for Neural Network Quantization. In I. Rojas, G. Joya, & A. Catala (Eds.), Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings (pp. 302-312). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11507 LNCS). Springer-Verlag GmbH and Co. KG. https://doi.org/10.1007/978-3-030-20518-8_26