Iterative Adaptation to Quantization Noise

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Abstract

Quantization allows significant acceleration of neural networks, especially on mobile processors. Existing quantization methods either require training the neural network from scratch or cause a significant accuracy drop for the quantized model. Low-bit quantization (e.g., 4- or 6-bit) is a much more resource-consuming problem than 8-bit quantization: it requires a significant amount of labeled training data. We propose a new low-bit quantization method for mobile neural network architectures that does not require training from scratch or a large amount of labeled training data and avoids a significant accuracy drop.
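The abstract does not spell out the procedure, but the setting it describes — simulating low-bit quantization of a trained network and adapting it to the resulting noise — can be illustrated with generic uniform fake-quantization. Below is a minimal sketch, not the authors' method; the function `quantize_weights` and the symmetric per-tensor scheme are assumptions made purely for illustration.

```python
import numpy as np

def quantize_weights(w: np.ndarray, num_bits: int = 4) -> np.ndarray:
    """Uniform symmetric fake-quantization of a weight tensor.

    Maps weights onto a low-bit integer grid and back to floats, which is
    the standard way to simulate quantization noise when adapting or
    fine-tuning an already trained model (illustrative sketch only).
    """
    # Symmetric grid: integers span [-q_max, q_max] for the given bit-width.
    q_max = 2 ** (num_bits - 1) - 1
    w_absmax = float(np.max(np.abs(w)))
    scale = w_absmax / q_max if w_absmax > 0 else 1.0
    # Round to the nearest grid point, clip to the representable range,
    # then map back to floating point ("fake" quantization).
    w_int = np.clip(np.round(w / scale), -q_max, q_max)
    return w_int * scale

# Example: simulate 4-bit quantization noise on a random weight matrix.
w = np.random.randn(64, 64).astype(np.float32)
w_q = quantize_weights(w, num_bits=4)
print("mean abs quantization error:", np.mean(np.abs(w - w_q)))
```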

Original language: English
Title of host publication: Advances in Computational Intelligence - 16th International Work-Conference on Artificial Neural Networks, IWANN 2021, Proceedings
Editors: Ignacio Rojas, Gonzalo Joya, Andreu Catala
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 303-310
Number of pages: 8
ISBN (Print): 9783030850296
DOIs
Publication status: Published - 2021
Event: 16th International Work-Conference on Artificial Neural Networks, IWANN 2021 - Virtual, Online
Duration: 16 Jun 2021 – 18 Jun 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12861 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Work-Conference on Artificial Neural Networks, IWANN 2021
City: Virtual, Online
Period: 16.06.2021 – 18.06.2021

Keywords

  • Distillation
  • Machine learning
  • Neural networks
  • Quantization

OECD FOS+WOS

  • 1.01 MATHEMATICS
  • 1.02 COMPUTER AND INFORMATION SCIENCES
