On Practical Approach to Uniform Quantization of Non-redundant Neural Networks

Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › Peer-reviewed

Abstract

Neural network quantization is a highly desirable step before running neural networks on mobile devices. Quantization without fine-tuning leads to an accuracy drop, whereas the commonly used training with quantization is performed on the full labeled dataset and is therefore both time- and resource-consuming. Real-life applications require a simplified and accelerated quantization procedure that maintains the accuracy of the full-precision neural network, especially for modern mobile architectures such as MobileNet-v1, MobileNet-v2 and MNAS. Here we present two methods that significantly optimize training with quantization. The first introduces trainable scale factors for the discretization thresholds, separate for each filter. The second is based on mutual rescaling of consecutive depth-wise separable convolution and convolution layers. Using the proposed techniques, we quantize modern mobile neural network architectures with a training set of only ∼10% of the full ImageNet 2012 sample. Such a reduction of the training set size, together with the small number of trainable parameters, allows the network to be fine-tuned within several hours while maintaining the high accuracy of the quantized model (the accuracy drop was less than 0.5%). The ready-for-use models and code are available at: https://github.com/agoncharenko1992/FAT-fast-adjustable-threshold.
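The abstract describes the two techniques only at a high level; the Python sketches below (using PyTorch) illustrate how they might look in code. The class and function names, the initialization of the scale factors at 1.0, and the symmetric clipping range are illustrative assumptions, not the authors' implementation (which is available at the repository linked above). The first sketch shows a uniform quantizer whose per-filter clipping thresholds are adjusted by trainable scale factors, with a straight-through estimator so that gradients reach the thresholds during fine-tuning:

import torch
import torch.nn as nn

class TrainableThresholdQuantizer(nn.Module):
    """Uniform quantizer with a trainable scale factor per filter.

    Sketch only: the initial thresholds t_init are assumed to come from
    a calibration pass; a trainable factor alpha (one per output filter)
    adjusts them during a short fine-tuning run.
    """

    def __init__(self, init_thresholds: torch.Tensor, bits: int = 8):
        super().__init__()
        self.bits = bits
        # One calibration threshold per output filter, shape (C,).
        self.register_buffer("t_init", init_thresholds.clone())
        # Trainable per-filter scale factor; starting at 1.0 is an assumption.
        self.alpha = nn.Parameter(torch.ones_like(init_thresholds))

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        # Effective threshold per filter: T = alpha * T_init,
        # broadcast over weights of shape (C, C_in, k, k).
        t = (self.alpha * self.t_init).view(-1, 1, 1, 1)
        scale = (2 ** self.bits - 1) / (2 * t)  # symmetric range [-T, T]
        w_scaled = torch.clamp(w, -t, t) * scale
        # Straight-through estimator: round in the forward pass, identity
        # in the backward pass, so gradients reach both w and alpha.
        w_rounded = w_scaled + (torch.round(w_scaled) - w_scaled).detach()
        return w_rounded / scale  # simulated ("fake") quantization

The second sketch illustrates mutual rescaling of a depth-wise separable convolution and the following 1x1 convolution. For a strictly positive per-channel factor r and a ReLU-like activation between the two layers (with batch normalization assumed already folded into the weights), multiplying a depth-wise output channel by r[c] and dividing the matching point-wise input channel by r[c] leaves the network function unchanged while equalizing per-channel ranges before quantization:

def rescale_pair(dw_w: torch.Tensor, dw_b: torch.Tensor,
                 pw_w: torch.Tensor, r: torch.Tensor):
    """Rescale a depth-wise conv (dw) and the following point-wise conv (pw).

    Assumed shapes: dw_w (C, 1, k, k), dw_b (C,), pw_w (C_out, C, 1, 1),
    r (C,) with strictly positive entries.
    """
    return (dw_w * r.view(-1, 1, 1, 1),   # scale depth-wise filters
            dw_b * r,                     # and their biases
            pw_w / r.view(1, -1, 1, 1))   # undo the scale in the next layer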

Original language: English
Title of host publication: Artificial Neural Networks and Machine Learning – ICANN 2019
Subtitle of host publication: Deep Learning - 28th International Conference on Artificial Neural Networks, Proceedings
Editors: Igor V. Tetko, Pavel Karpov, Fabian Theis, Vera Kurková
Publisher: Springer-Verlag GmbH and Co. KG
Pages: 349-360
Number of pages: 12
ISBN (print): 9783030304836
DOI
Status: Published - 1 Jan 2019
Event: 28th International Conference on Artificial Neural Networks, ICANN 2019 - Munich, Germany
Duration: 17 Sep 2019 - 19 Sep 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11728 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 28th International Conference on Artificial Neural Networks, ICANN 2019
Country: Germany
City: Munich
Period: 17.09.2019 - 19.09.2019
