Any-Precision Deep Neural Networks

Abstract

We present Any-Precision Deep Neural Networks (Any-Precision DNNs), which are trained with a new method that empowers learned DNNs to be flexible in any numerical precision during inference. The same model at runtime can be flexibly and directly set to different bit-widths, by truncating the least significant bits, to support a dynamic speed and accuracy trade-off.

Introduction
  • While state-of-the-art deep learning models can achieve very high accuracy on various benchmarks, runtime cost is another crucial factor to consider in practice.
  • (Figure: weights of the same model viewed at different bit-widths, labeled 1-bit, n_1-bit, ..., n_i-bit, ..., n_L-bit.)
  • To alleviate this issue, a number of approaches have been proposed from different perspectives.
  • One line of work adaptively modifies general deep learning model inference, dynamically determining the execution path during the feed-forward pass to save computation at the cost of a potential accuracy drop [10, 28, 31, 29]
Highlights
  • While state-of-the-art deep learning models can achieve very high accuracy on various benchmarks, runtime cost is another crucial factor to consider in practice
  • We propose a method to train deep learning models to be flexible in numerical precision, namely Any-Precision deep neural networks
  • We introduce Any-Precision DNN to address the practical efficiency/accuracy trade-off dilemma from a new perspective
  • Instead of seeking a better operating point, we enable runtime adjustment of the model precision level to support a flexible efficiency/accuracy trade-off without additional storage or computation cost (a small sketch follows this list)
  • When all layers are set to low bit-widths, we show that the model achieves accuracy comparable to dedicated models trained at the same precision
  • We evaluate our method on three major image classification datasets with multiple network architectures
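As a rough illustration of the runtime precision adjustment above, the abstract's "truncating the least significant bits" can be sketched in PyTorch as follows. This is a minimal sketch under an assumed DoReFa-style unsigned weight code; `read_at_k_bits` and its arguments are illustrative names, not the authors' released code.

```python
import torch

def read_at_k_bits(w_code: torch.Tensor, stored_bits: int, k: int) -> torch.Tensor:
    """Drop the (stored_bits - k) least significant bits of an unsigned integer
    weight code, then map the k-bit code back to a real value in [0, 1]."""
    assert 1 <= k <= stored_bits
    w_k = w_code // (2 ** (stored_bits - k))  # integer floor division == dropping the LSBs
    return w_k.float() / (2 ** k - 1)         # de-quantize, assuming codes span [0, 1]

# Example: weights stored once as 8-bit codes, read out at 8, 4, and 2 bits at runtime.
codes = torch.randint(0, 2 ** 8, (3, 3))
for k in (8, 4, 2):
    print(k, read_at_k_bits(codes, stored_bits=8, k=k))
```

Because only the most significant bits of the stored codes are kept, no extra weight copies are needed for the lower precisions, which is consistent with the "no additional storage" claim.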
Methods
  • The authors validate the method with several network architectures and datasets. These networks include an 8-layer CNN (named Model C in [36]), AlexNet [19], Resnet-8 [12], Resnet-20, and Resnet-50 [12]. The datasets are summarized below (see also Table 1).

    Dataset                     Cifar-10 [18]   SVHN [23]   ImageNet [7]
    Image Number (Train/Test)   50k/10k         604k/26k    1.3M/50k

    (Figure panels: (a) Cifar-10, (b) SVHN, (c) ImageNet.)
  • On Cifar-10, the authors train AlexNet and Resnet-20 models for 200 epochs with an initial learning rate of 0.001, decayed by 0.1 at epochs {100, 150, 280}.
  • On SVHN, the 8-layer CNN and Resnet-8 models are trained for 100 epochs with an initial learning rate of 0.001, decayed by 0.1 at epochs {50, 75, 90}.
  • On ImageNet, the authors train the Resnet-50 model with the SGD optimizer for 120 epochs with an initial learning rate of 0.1, decayed by 0.1 at epochs {30, 60, 85, 95, 105} (a training-loop sketch follows this list)
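For concreteness, the reported ImageNet schedule can be sketched with standard PyTorch components as below. The momentum, weight decay, and the dummy data loader are illustrative assumptions, not details taken from the paper, and the plain `resnet50` stands in for the any-precision variant.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import MultiStepLR
from torchvision.models import resnet50

model = resnet50()                         # stand-in for the any-precision ResNet-50
criterion = nn.CrossEntropyLoss()
# Momentum and weight decay values are illustrative assumptions.
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
scheduler = MultiStepLR(optimizer, milestones=[30, 60, 85, 95, 105], gamma=0.1)

# A tiny random dataset stands in for ImageNet so the sketch is self-contained.
dummy = torch.utils.data.TensorDataset(
    torch.randn(8, 3, 224, 224), torch.randint(0, 1000, (8,))
)
train_loader = torch.utils.data.DataLoader(dummy, batch_size=4)

for epoch in range(120):                   # 120 epochs, lr decayed by 0.1 at the milestones
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```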
Results
  • When all layers are set to low bit-widths, the authors show that the model achieves accuracy comparable to dedicated models trained at the same precision.
  • On Cifar-10, our 1-bit model achieves an accuracy of 89.99%, while the recent work of Ding et al. [8] reports 89.90%
Conclusion
  • The authors introduce Any-Precision DNN to address the practical efficiency/accuracy trade-off dilemma from a new perspective.
  • The model accuracy drops gracefully when bit-width gets smaller.
  • To train an Any-Precision DNN, the authors propose dynamic model-wise quantization in training and employ dynamically changed BatchNorm layers to align activation distributions across different bit-widths (a BatchNorm sketch follows this list).
  • When running at low bit-width by bit-shifting the pre-trained weights and quantizing the activations, the model achieves accuracy comparable to dedicatedly trained low-precision models
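One way to realize the "dynamically changed BatchNorm layers" mentioned above is to keep a separate BatchNorm per supported bit-width and route activations through the one matching the currently selected precision. The sketch below is illustrative only; the class name, the bit-width list, and the `set_bits` interface are assumptions rather than the authors' implementation.

```python
import torch
from torch import nn

class SwitchableBatchNorm2d(nn.Module):
    """Keeps one BatchNorm2d per supported bit-width so that each precision has
    its own running statistics and affine parameters (illustrative sketch)."""
    def __init__(self, num_features: int, bit_widths=(1, 2, 4, 8, 32)):
        super().__init__()
        self.bns = nn.ModuleDict({str(b): nn.BatchNorm2d(num_features) for b in bit_widths})
        self.active_bits = bit_widths[-1]   # default to full precision

    def set_bits(self, bits: int) -> None:
        self.active_bits = bits             # select which statistics/affine params to use

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.bns[str(self.active_bits)](x)

# Usage: before a k-bit forward pass, switch every such layer to k bits.
bn = SwitchableBatchNorm2d(64)
bn.set_bits(4)
out = bn(torch.randn(2, 64, 8, 8))
```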
Tables
  • Table1: Details of datasets used in our experiments
  • Table2: Comparison of the proposed Any-Precision DNN to dedicated models: the proposed method achieved the strong baseline accuracy in most cases, even occasionally outperforms the baselines in low-precision. We hypothesize that the gain is mainly from the knowledge distillation from high-precision models in training
  • Table3: Comparison to other post-training quantization methods: All models are Resnet-20 trained on Cifar-10. When bit-width drops from their original training setting, our method consistently outperforms them
  • Table4: Comparison to other post-training quantization methods: All models are Resnet-50 trained on ImageNet. When bit-width drops from their original training setting, our method consistently outperforms them
  • Table5: Classification accuracy of Resnet-20 with different bit-width combinations in training on Cifar-10
  • Table6: Classification accuracy of Resnet-8 with different knowledge distillation on SVHN test set
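Tables 2 and 6 attribute part of the low-bit accuracy to knowledge distillation from the high-precision model during training. A minimal sketch of such a distillation loss, following the standard soft-target formulation of Hinton et al., is given below; the temperature `T`, the weighting `alpha`, and the function name are illustrative choices, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T: float = 1.0, alpha: float = 0.5):
    """Cross-entropy on ground-truth labels plus KL divergence toward the detached
    full-precision (teacher) outputs; T and alpha are illustrative choices."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits.detach() / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * hard + (1 - alpha) * soft

# Example: logits from a low-bit forward pass distilled toward the 32-bit pass of the same model.
low_bit_logits = torch.randn(4, 10)
full_precision_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(low_bit_logits, full_precision_logits, labels)
```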
Study subjects and analysis
datasets: 3
For example, on Cifar-10, our 1-bit model achieves an accuracy of 89.99%, while the recent work of Ding et al. [8] reports 89.90%. As shown in Table 2, on all three datasets, the proposed Any-Precision DNN achieves performance comparable to the competitive dedicated models. We also compare our method to alternative post-training quantization methods.

major image classification datasets with multiple network architectures: 3
To train an Any-Precision DNN, we propose dynamic model-wise quantization in training and employ dynamically changed BatchNorm layers to align activation distributions across different bit-widths. We evaluate our method on three major image classification datasets with multiple network architectures. When running in low-bit by simply bit-shifting the pre-trained weights and quantizing the activations, our model achieves accuracy comparable to dedicatedly trained low-precision models.

References
  • Yoshua Bengio, Nicholas Leonard, and Aaron Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432, 2013.
  • Han Cai, Ligeng Zhu, and Song Han. ProxylessNAS: Direct neural architecture search on target task and hardware. arXiv preprint arXiv:1812.00332, 2018.
  • Zhaowei Cai, Xiaodong He, Jian Sun, and Nuno Vasconcelos. Deep learning with low precision by half-wave Gaussian quantization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5918–5926, 2017.
  • Xin Chen, Lingxi Xie, Jun Wu, and Qi Tian. Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. arXiv preprint arXiv:1904.12760, 2019.
  • Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, and Kailash Gopalakrishnan. PACT: Parameterized clipping activation for quantized neural networks. arXiv preprint arXiv:1805.06085, 2018.
  • Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1. arXiv preprint arXiv:1602.02830, 2016.
  • J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
  • Ruizhou Ding, Ting-Wu Chin, Zeye Liu, and Diana Marculescu. Regularizing activation distribution for training binarized deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 11408–11417, 2019.
  • Zhen Dong, Zhewei Yao, Amir Gholami, Michael Mahoney, and Kurt Keutzer. HAWQ: Hessian aware quantization of neural networks with mixed-precision. arXiv preprint arXiv:1905.03696, 2019.
  • Michael Figurnov, Maxwell D. Collins, Yukun Zhu, Li Zhang, Jonathan Huang, Dmitry Vetrov, and Ruslan Salakhutdinov. Spatially adaptive computation time for residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1039–1048, 2017.
  • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pages 1026–1034, 2015.
  • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
  • Geoffrey Hinton. Neural networks for machine learning, 2012.
  • Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.
  • Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, et al. Searching for MobileNetV3. arXiv preprint arXiv:1905.02244, 2019.
  • Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.
  • Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. Technical report, 2009.
  • Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
  • Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, and Kevin Murphy. Progressive neural architecture search. In Proceedings of the European Conference on Computer Vision (ECCV), pages 19–34, 2018.
  • Zechun Liu, Baoyuan Wu, Wenhan Luo, Xin Yang, Wei Liu, and Kwang-Ting Cheng. Bi-Real Net: Enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In Proceedings of the European Conference on Computer Vision (ECCV), pages 722–737, 2018.
  • Markus Nagel, Mart van Baalen, Tijmen Blankevoort, and Max Welling. Data-free quantization through weight equalization and bias correction. arXiv preprint arXiv:1906.04721, 2019.
  • Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng. Reading digits in natural images with unsupervised feature learning. 2011.
  • Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. In NIPS Autodiff Workshop, 2017.
  • Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. In European Conference on Computer Vision, pages 525–542. Springer, 2016.
  • Mingxing Tan and Quoc V. Le. EfficientNet: Rethinking model scaling for convolutional neural networks. arXiv preprint arXiv:1905.11946, 2019.
  • Wei Tang, Gang Hua, and Liang Wang. How to train a compact binary neural network with high accuracy? In Thirty-First AAAI Conference on Artificial Intelligence, 2017.
  • Surat Teerapittayanon, Bradley McDanel, and Hsiang-Tsung Kung. BranchyNet: Fast inference via early exiting from deep neural networks. In 2016 23rd International Conference on Pattern Recognition (ICPR), pages 2464–2469. IEEE, 2016.
  • Andreas Veit and Serge Belongie. Convolutional networks with adaptive inference graphs. In Proceedings of the European Conference on Computer Vision (ECCV), pages 3–18, 2018.
  • Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, and Song Han. HAQ: Hardware-aware automated quantization with mixed precision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8612–8620, 2019.
  • Zuxuan Wu, Tushar Nagarajan, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, and Rogerio Feris. BlockDrop: Dynamic inference paths in residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8817–8826, 2018.
  • Jiahui Yu and Thomas Huang. Universally slimmable networks and improved training techniques. arXiv preprint arXiv:1903.05134, 2019.
  • Jiahui Yu, Linjie Yang, Ning Xu, Jianchao Yang, and Thomas Huang. Slimmable neural networks. arXiv preprint arXiv:1812.08928, 2018.
  • Dongqing Zhang, Jiaolong Yang, Dongqiangzi Ye, and Gang Hua. LQ-Nets: Learned quantization for highly accurate and compact deep neural networks. In Proceedings of the European Conference on Computer Vision (ECCV), pages 365–382, 2018.
  • Xiaofan Zhang, Haoming Lu, Cong Hao, Jiachen Li, Bowen Cheng, Yuhong Li, Kyle Rupnow, Jinjun Xiong, Thomas Huang, Honghui Shi, et al. SkyNet: A hardware-efficient method for object detection and tracking on embedded systems. arXiv preprint arXiv:1909.09709, 2019.
  • Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, and Yuheng Zou. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv preprint arXiv:1606.06160, 2016.
  • Yiren Zhou, Seyed-Mohsen Moosavi-Dezfooli, Ngai-Man Cheung, and Pascal Frossard. Adaptive quantization for deep neural network. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
  • Bohan Zhuang, Chunhua Shen, Mingkui Tan, Lingqiao Liu, and Ian Reid. Towards effective low-bitwidth convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7920–7928, 2018.