Trainable Thresholds for Neural Network Quantization

Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Research > peer-review

Abstract

Embedded computer vision applications for robotics, security cameras, and mobile phone apps require mobile neural network architectures such as MobileNet-v2 or MNAS-Net in order to reduce RAM consumption and accelerate processing. A further option for reducing resource consumption is 8-bit neural network quantization. Unfortunately, the known methods of neural network quantization lead to significant accuracy reduction (more than 1.2%) for mobile architectures and require long training with the quantization procedure. To overcome this limitation, we propose a method that quantizes mobile neural networks without significant accuracy loss. Our approach is based on trainable quantization thresholds for each neural network filter, which accelerates training with the quantization procedure by up to 10 times compared with standard techniques. Using the proposed technique, we quantize modern mobile neural network architectures with an accuracy loss not exceeding 0.1%. Ready-for-use models and code are available at: https://github.com/agoncharenko1992/FAT-fast-adjustable-threshold.
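The core idea the abstract describes, fake quantization with a clipping threshold that is itself trained by backpropagation, can be sketched in a few lines. Below is a minimal PyTorch illustration, assuming a symmetric signed 8-bit scheme, a single scalar threshold, and a straight-through estimator for the rounding step; the class and parameter names (FakeQuantize, init_threshold) are illustrative, not the authors' API, and the actual per-filter thresholds of the method live in the linked repository.

```python
import torch
import torch.nn as nn

class FakeQuantize(nn.Module):
    """Symmetric 8-bit fake quantization with a trainable clipping
    threshold. A sketch of the idea, not the authors' implementation."""

    def __init__(self, init_threshold=6.0, num_bits=8):
        super().__init__()
        # Trainable clipping threshold T; the paper trains one threshold
        # per filter, here a single scalar is used for brevity.
        self.threshold = nn.Parameter(torch.tensor(float(init_threshold)))
        self.levels = 2 ** (num_bits - 1) - 1  # 127 for signed 8-bit

    def forward(self, x):
        t = self.threshold.abs()
        scale = t / self.levels
        # Clip to [-T, T]; minimum/maximum keep the gradient w.r.t. T.
        x_clipped = torch.maximum(torch.minimum(x, t), -t)
        # Quantize to the integer grid, then dequantize.
        x_q = torch.round(x_clipped / scale) * scale
        # Straight-through estimator: the forward pass returns x_q,
        # the backward pass sees the gradient of x_clipped.
        return x_clipped + (x_q - x_clipped).detach()

# Usage: the threshold receives a gradient and can be trained jointly
# with the network weights.
fq = FakeQuantize(init_threshold=1.0)
x = torch.randn(4, 16)
fq(x).sum().backward()
print(fq.threshold.grad)  # nonzero: the threshold is trainable
```

The straight-through estimator is what makes the threshold trainable end to end: rounding has zero gradient almost everywhere, so the detached residual lets gradients flow through the clipping path instead.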

Original language: English
Title of host publication: Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings
Editors: Ignacio Rojas, Gonzalo Joya, Andreu Catala
Publisher: Springer-Verlag GmbH and Co. KG
Pages: 302-312
Number of pages: 11
ISBN (Print): 9783030205171
DOIs
Publication status: Published - 1 Jan 2019
Event: 15th International Work-Conference on Artificial Neural Networks, IWANN 2019 - Gran Canaria, Spain
Duration: 12 Jun 2019 - 14 Jun 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11507 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th International Work-Conference on Artificial Neural Networks, IWANN 2019
Country: Spain
City: Gran Canaria
Period: 12.06.2019 - 14.06.2019

Keywords

  • Distillation
  • Machine learning
  • Neural networks
  • Quantization
