Please use this identifier to cite or link to this item:
https://hdl.handle.net/2440/124484
Type: | Conference paper |
Title: | Towards effective low-bitwidth convolutional neural networks |
Author: | Zhuang, B.; Shen, C.; Tan, M.; Liu, L.; Reid, I. |
Citation: | Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 7920-7928 |
Publisher: | IEEE |
Publisher Place: | Piscataway, NJ. |
Issue Date: | 2018 |
Series/Report no.: | IEEE Conference on Computer Vision and Pattern Recognition |
ISBN: | 1538664208; 9781538664209 |
ISSN: | 1063-6919; 2575-7075 |
Conference Name: | IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (18 Jun 2018 - 23 Jun 2018 : Salt Lake City, USA) |
Statement of Responsibility: | Bohan Zhuang, Chunhua Shen, Mingkui Tan, Lingqiao Liu, Ian Reid |
Abstract: | This paper tackles the problem of training a deep convolutional neural network with both low-precision weights and low-bitwidth activations. Optimizing a low-precision network is very challenging since the training process can easily get trapped in a poor local minimum, which results in substantial accuracy loss. To mitigate this problem, we propose three simple yet effective approaches to improve network training. First, we propose a two-stage optimization strategy to progressively find good local minima. Specifically, we first optimize a network with quantized weights and only then quantize its activations; this is in contrast to traditional methods, which optimize both simultaneously. Second, in a similar spirit to the first method, we propose another progressive optimization approach that gradually decreases the bit-width from high precision to low precision during the course of training. Third, we adopt a novel learning scheme that jointly trains a full-precision model alongside the low-precision one. By doing so, the full-precision model provides hints to guide the low-precision model's training. Extensive experiments on various datasets (i.e., CIFAR-100 and ImageNet) show the effectiveness of the proposed methods. Notably, using our methods to train a 4-bit precision network leads to no performance decrease in comparison with its full-precision counterpart on standard network architectures (i.e., AlexNet and ResNet-50). |
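The two-stage strategy in the abstract (quantize weights first, then activations) is concrete enough to sketch. Below is a minimal, hypothetical PyTorch illustration: the `uniform_quantize` helper, the `QuantConv2d` layer, the tanh-based weight transform, and the 4-bit settings are assumptions chosen for illustration, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def uniform_quantize(x, bits):
    """k-bit uniform quantization of values assumed to lie in [0, 1],
    with a straight-through estimator so gradients pass through unchanged."""
    levels = 2 ** bits - 1
    q = torch.round(x * levels) / levels
    return x + (q - x).detach()  # forward pass uses q, backward uses identity

class QuantConv2d(nn.Conv2d):
    """Convolution with quantized weights and, optionally, quantized inputs.
    a_bits=None corresponds to stage 1 (full-precision activations)."""
    def __init__(self, *args, w_bits=4, a_bits=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.w_bits, self.a_bits = w_bits, a_bits

    def forward(self, x):
        if self.a_bits is not None:  # stage 2 only: quantize activations
            x = uniform_quantize(x.clamp(0, 1), self.a_bits)
        # Squash weights into [0, 1], quantize, then rescale to [-1, 1].
        w = torch.tanh(self.weight)
        w = w / (2 * w.abs().max()) + 0.5
        w = 2 * uniform_quantize(w, self.w_bits) - 1
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

# Stage 1: train the network with quantized weights only (a_bits=None).
# Stage 2: reload the stage-1 checkpoint, set a_bits (e.g., 4), and continue
# training so activation quantization starts from an already-good solution.
```

In the same spirit, the third scheme described in the abstract could be realized as an additional distillation-style loss that penalizes the distance between the low-precision model's outputs and those of a jointly trained full-precision model, though the exact form of that guidance loss is not specified here.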
Rights: | © 2018 IEEE |
DOI: | 10.1109/CVPR.2018.00826 |
Grant ID: | http://purl.org/au-research/grants/arc/DE170101259 http://purl.org/au-research/grants/arc/FL130100102 |
Published version: | https://ieeexplore.ieee.org/xpl/conhome/8576498/proceeding |
Appears in Collections: | Aurora harvest 4; Computer Science publications |
Files in This Item:
There are no files associated with this item.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.