In Search for a SAT-friendly Binarized Neural Network Architecture

ICLR, 2020.


Abstract:

Analyzing the behavior of neural networks is one of the most pressing challenges in deep learning. Binarized Neural Networks are an important class of networks that allow equivalent representation in Boolean logic and can be analyzed formally with logic-based reasoning tools like SAT solvers. Such tools can be used to answer existential…

Introduction
  • Deep neural networks are among the most successful AI technologies, making an impact in a variety of practical applications ranging from vision to speech recognition and natural language processing (Goodfellow et al, 2016).
  • One way to address the challenge of analyzing network behavior is to define properties that we expect the network to satisfy.
  • The first approach, the certification of neural networks, trains a verified network that satisfies given properties, e.g. a network that is guaranteed to be robust to adversarial perturbations (Wong & Kolter, 2018; Dvijotham et al, 2018; Raghunathan et al, 2018; Mirman et al, 2018).
Highlights
  • Deep neural networks are among the most successful AI technologies, making an impact in a variety of practical applications ranging from vision to speech recognition and natural language processing (Goodfellow et al, 2016)
  • Binarized neural networks: we summarise the standard BNN structure (Hubara et al, 2016)
  • The first issue is that the weight matrix A is dense by design in BNNs. To deal with this issue, we propose using a ternary quantization procedure instead of binary quantization, learning sparse matrices for the linear transformations
  • Our results show that a significant speed-up can be obtained if we design the BNN architecture and its training procedure with the potential bottlenecks of SAT solvers in mind
  • We demonstrated how to train BNNs so that the resulting network is easier to verify for logic-based engines, such as SAT solvers or approximate model counters
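As an illustration of the sparsification idea above, a minimal threshold-based ternary quantizer can be sketched as follows. The threshold rule and the `delta_ratio` hyperparameter are illustrative assumptions for this sketch, not the paper's exact training procedure:

```python
def ternarize(weights, delta_ratio=0.05):
    """Map real-valued weights to {-1, 0, +1}.

    Weights with magnitude below a threshold delta become 0 -- this is
    what makes the learned matrix sparse -- and the rest keep their sign.
    delta_ratio is a hypothetical hyperparameter for this sketch.
    """
    delta = delta_ratio * max(abs(w) for w in weights)
    return [0 if abs(w) < delta else (1 if w > 0 else -1) for w in weights]

# Small weights vanish, producing the sparsity that helps SAT solvers:
print(ternarize([0.8, -0.01, 0.3, -0.6, 0.001]))  # [1, 0, 1, -1, 0]
```

A sparse ternary matrix yields fewer literals per neuron in the SAT encoding, which is exactly the bottleneck the Highlights identify for dense binary matrices.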
Methods
  • We focus on the improvements we proposed for the neuron level of granularity. We use three datasets from (Narodytska et al, 2018; Baluta et al, 2019; Khalil et al, 2019).
  • The input of the network is an image of size 28 × 28.
  • The network has the following dimensions for all layers: [784, 500, 300, 200, 100, 10].
  • This gives 623K parameters, which is 3 to 10 times bigger than the networks in (Narodytska et al, 2018; Baluta et al, 2019).
  • As the first layer's inputs are real-valued, we used an additional BN + sign layer after the
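The 623K figure follows directly from the layer dimensions above (counting only the weight matrices; biases and batch-normalization parameters add a small overhead):

```python
# Layer widths of the network described in Methods.
dims = [784, 500, 300, 200, 100, 10]

# Each consecutive pair (a, b) contributes an a-by-b weight matrix.
n_weights = sum(a * b for a, b in zip(dims, dims[1:]))
print(n_weights)  # 623000, i.e. the 623K quoted above
```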
Results
  • If a solver times out or runs out of memory on more than 70% of the benchmarks, we do not plot these results.
Conclusion
  • We analyze architectural design choices of BNNs and discuss how they affect the performance of logic-based reasoners, focusing on the individual neuron and block levels.
  • We demonstrated how to train BNNs so that the resulting network is easier to verify for logic-based engines, such as SAT solvers or approximate model counters
Tables
  • Table1: Structure of internal and output blocks, which, stacked together, form a BNN
  • Table2: The test accuracy (%) and the number of non-zero parameters (#prms) of the trained networks
  • Table3: The original (top row) and successfully perturbed (bottom row) images for MNIST-BG for ε = 3, 5, 15, 20
  • Table4: The average size of the SAT encoding in terms of the number of variables (#vars) and clauses (#cls), the probability of a perturbation being an adversarial attack (P(adv)), and the number of solved benchmarks out of all benchmarks that can be attacked for ε = 5
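Table 1's internal block (a linear layer, batch normalization, then a sign activation) can be sketched in plain Python; the parameter names below are generic placeholders rather than the paper's notation:

```python
import math

def bnn_internal_block(x, W, bias, gamma, beta, mean, var, eps=1e-5):
    """One internal BNN block: LIN -> BN -> sign (cf. Table 1).

    W holds binary/ternary weights; BN uses per-neuron statistics.
    Because the output of sign is in {-1, +1}, each neuron admits an
    exact Boolean encoding (a cardinality constraint on its inputs).
    """
    # Linear layer: integer dot products when W and x are in {-1, 0, +1}.
    lin = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(W, bias)]
    # Batch normalization with fixed (inference-time) statistics.
    bn = [g * (v - m) / math.sqrt(s + eps) + bt
          for v, g, bt, m, s in zip(lin, gamma, beta, mean, var)]
    # Sign activation keeps the signal binary for the next block.
    return [1 if v >= 0 else -1 for v in bn]

out = bnn_internal_block(x=[1, -1, 1],
                         W=[[1, -1, 0], [0, 1, -1]], bias=[0, 0],
                         gamma=[1, 1], beta=[0, 0],
                         mean=[0, 0], var=[1, 1])
print(out)  # [1, -1]
```

Since BN with fixed statistics is an affine map, the test `sign(BN(lin)) >= 0` per neuron reduces to a linear threshold on binary inputs, which is what the SAT encodings in Table 4 capture.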
Funding
  • Proposes changes to the BNN architecture and the training procedure to get a simpler network for SAT solvers without sacrificing accuracy on the primary task
  • Proposes a modified training procedure that makes the resulting network easier for logic-based verification tools, like SAT solvers, to reason about
Study subjects and analysis
datasets: 3
We focus on the improvements we proposed for the neuron level of granularity. We use three datasets from (Narodytska et al, 2018; Baluta et al, 2019; Khalil et al, 2019). For each, we experiment with two tasks

datasets: 3
However, on average, we get about 40% of signs stabilized. Figure 1 shows the performance of a SAT solver on three datasets. Results are averaged over 100 benchmarks

datasets: 3
As can be seen from these plots, in all except one case at least 95% of the benchmarks were solved within the 100-second timeout. The plots show that ternary quantization greatly improves performance on all three datasets: for the vanilla case we were not able to solve any benchmarks at all. Using the ‘L1’ regularizer slightly improves performance on two datasets and helps substantially on MNIST

datasets: 2
The plots show that ternary quantization greatly improves performance on three datasets as we were not even able to solve any benchmarks for the vanilla case. Using the ‘L1’ regularizer slightly improves performance in two datasets and helps a lot on MNIST. Finally, the stabilization of signs method consistently improves performance of a SAT solver across datasets for both ‘Sparse’ and ‘Sparse+L1’ networks, e.g. we get from 3x to 6x speedup due to stabilization

datasets: 2
For ILP solvers we observe similar patterns: sparsification and stabilization are helpful for all datasets, as we cannot solve most of the benchmarks without these techniques. As can be seen from the graph, the ILP solver exhibits a bell shape on two datasets, MNIST and FASHION: if ε is small or large, the instance is easier to solve; ε values in the middle are the hardest for an ILP solver. We have not observed this bell shape for SAT solvers

Reference
  • Milad Alizadeh, Javier Fernandez-Marques, Nicholas D. Lane, and Yarin Gal. An empirical study of binary neural networks’ optimisation. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019. URL https://openreview.net/forum?id=rJfUCoR5KX.
  • Gilles Audemard and Laurent Simon. On the glucose SAT solver. International Journal on Artificial Intelligence Tools, 27(1):1–25, 2018. doi: 10.1142/S0218213018400018. URL https://doi.org/10.1142/S0218213018400018.
  • Teodora Baluta, Shiqi Shen, Shweta Shinde, Kuldeep S. Meel, and Prateek Saxena. Quantitative verification of neural networks and its security applications. ACM Conference on Computer and Communications Security (CCS 2019), abs/1906.10395, 2019. URL http://arxiv.org/abs/1906.10395.
  • Armin Biere, Marijn Heule, Hans van Maaren, and Toby Walsh (eds.). Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications. IOS Press, 2009. ISBN 978-1-58603-929-5.
  • Aaron R. Bradley. SAT-based model checking without unrolling. In Ranjit Jhala and David A. Schmidt (eds.), Verification, Model Checking, and Abstract Interpretation - 12th International Conference, VMCAI 2011, Austin, TX, USA, January 23-25, 2011. Proceedings, volume 6538 of Lecture Notes in Computer Science, pp. 70–87. Springer, 2011. doi: 10.1007/978-3-642-18275-4_7. URL https://doi.org/10.1007/978-3-642-18275-4_7.
  • Supratik Chakraborty, Kuldeep S. Meel, and Moshe Y. Vardi. A scalable approximate model counter. In Christian Schulte (ed.), Principles and Practice of Constraint Programming - 19th International Conference, CP 2013, Uppsala, Sweden, September 16-20, 2013. Proceedings, volume 8124 of Lecture Notes in Computer Science, pp. 200–216.
  • Chih-Hong Cheng, Georg Nührenberg, Chung-Hao Huang, and Harald Ruess. Verification of binarized neural networks via inter-neuron factoring (short paper). In Ruzica Piskac and Philipp Rümmer (eds.), Verified Software. Theories, Tools, and Experiments - 10th International Conference, VSTTE 2018, Oxford, UK, July 18-19, 2018, Revised Selected Papers, volume 11294 of Lecture Notes in Computer Science, pp. 279–290. Springer, 2018. doi: 10.1007/978-3-030-03592-1_16. URL https://doi.org/10.1007/978-3-030-03592-1_16.
  • Arthur Choi, Weijia Shi, Andy Shih, and Adnan Darwiche. Compiling neural networks into tractable Boolean circuits. 2019.
  • Tri Dao, Albert Gu, Matthew Eichhorn, Megan Leszczynski, Nimit Sohoni, Amit Blonder, Atri Rudra, and Chris Re. Butterflies are all you need: A universal building block for structured linear maps. 2019a. URL https://dawn.cs.stanford.edu//2019/06/13/butterfly/.
  • Tri Dao, Albert Gu, Matthew Eichhorn, Atri Rudra, and Christopher Re. Learning fast algorithms for linear transforms using butterfly factorizations. In Kamalika Chaudhuri and Ruslan Salakhutdinov (eds.), Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, volume 97, pp. 1517–1527, 2019b.
  • Sajad Darabi, Mouloud Belbahri, Matthieu Courbariaux, and Vahid Partovi Nia. BNN+: improved binary network training. CoRR, abs/1812.11800, 2018. URL http://arxiv.org/abs/1812.11800.
  • Krishnamurthy Dvijotham, Sven Gowal, Robert Stanforth, Relja Arandjelovic, Brendan O’Donoghue, Jonathan Uesato, and Pushmeet Kohli. Training verified learners with learned verifiers. CoRR, abs/1805.10265, 2018. URL http://arxiv.org/abs/1805.10265.
  • Stefano Ermon, Carla P. Gomes, Ashish Sabharwal, and Bart Selman. Taming the curse of dimensionality: Discrete integration by hashing and optimization. In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, 16-21 June 2013, volume 28, pp. 334–342. JMLR.org, 2013.
  • EU Data Protection Regulation. Regulation (EU) 2016/679 of the European Parliament and of the Council, 2016.
  • Jonathan Frankle and Michael Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019. URL https://openreview.net/forum?id=rJl-b3RcF7.
  • Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. The MIT Press, 2016. ISBN 0262035618, 9780262035613.
  • Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In ICLR’15, 2015.
  • Bryce Goodman and Seth R. Flaxman. European Union regulations on algorithmic decision-making and a ”right to explanation”. AI Magazine, 38(3):50–57, 2017.
  • Arie Gurfinkel, Temesghen Kahsai, Anvesh Komuravelli, and Jorge A. Navas. The SeaHorn verification framework. In Daniel Kroening and Corina S. Pasareanu (eds.), Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part I, volume 9206 of Lecture Notes in Computer Science, pp. 343–361. Springer, 2015. doi: 10.1007/978-3-319-21690-4_20. URL https://doi.org/10.1007/978-3-319-21690-4_20.
  • Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Binarized neural networks. In Advances in Neural Information Processing Systems 29, pp. 4107–4115. Curran Associates, Inc., 2016. URL http://papers.nips.cc/paper/6573-binarized-neural-networks.pdf.
  • Alexey Ignatiev, Antonio Morgado, and Joao Marques-Silva. PySAT: A Python toolkit for prototyping with SAT oracles. In SAT, pp. 428–437, 2018. doi: 10.1007/978-3-319-94144-8_26. URL https://doi.org/10.1007/978-3-319-94144-8_26.
  • Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML'15, pp. 448–456, 2015a.
  • Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv, 2015b.
  • Guy Katz, Clark W. Barrett, David L. Dill, Kyle Julian, and Mykel J. Kochenderfer. Reluplex: An efficient SMT solver for verifying deep neural networks. In CAV, pp. 97–117, 2017.
  • Guy Katz, Derek A. Huang, Duligur Ibeling, Kyle Julian, Christopher Lazarus, Rachel Lim, Parth Shah, Shantanu Thakoor, Haoze Wu, Aleksandar Zeljic, David L. Dill, Mykel J. Kochenderfer, and Clark W. Barrett. The Marabou framework for verification and analysis of deep neural networks. In CAV, pp. 443–452, 2019.
  • Elias B. Khalil, Amrita Gupta, and Bistra Dilkina. Combinatorial attacks on binarized neural networks. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019. URL https://openreview.net/forum?id=S1lTEh09FQ.
  • Jaeha Kung, David Zhang, Gooitzen Van der Wal, Sek Chai, and Saibal Mukhopadhyay. Efficient Object Detection Using Embedded Binarized Neural Networks. Journal of Signal Processing Systems, pp. 1–14, 2017.
  • Jean-Marie Lagniez and Pierre Marquis. An improved decision-dnnf compiler. In IJCAI, pp. 667– 673, 2017.
  • Bradley McDanel, Surat Teerapittayanon, and H. T. Kung. Embedded binarized neural networks. In EWSN, pp. 168–173. Junction Publishing, Canada / ACM, 2017.
  • Matthew Mirman, Timon Gehr, and Martin T. Vechev. Differentiable abstract interpretation for provably robust neural networks. In Jennifer G. Dy and Andreas Krause (eds.), Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmassan, Stockholm, Sweden, July 10-15, 2018, volume 80 of Proceedings of Machine Learning Research, pp. 3575– 3583. PMLR, 2018. URL http://proceedings.mlr.press/v80/mirman18b.html.
  • Christian Muise, Sheila A. McIlraith, J. Christopher Beck, and Eric Hsu. DSHARP: Fast d-DNNF Compilation with sharpSAT. In Canadian Conference on Artificial Intelligence, 2012.
  • Nina Narodytska, Shiva Prasad Kasiviswanathan, Leonid Ryzhyk, Mooly Sagiv, and Toby Walsh. Verifying properties of binarized deep neural networks. In Sheila A. McIlraith and Kilian Q. Weinberger (eds.), Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018, pp. 6615–6624. AAAI Press, 2018. URL https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16898.
  • Nina Narodytska, Aditya A. Shrotri, Kuldeep S. Meel, Alexey Ignatiev, and Joao Marques-Silva. Assessing heuristic machine learning explanations with model counting. In Mikolas Janota and Ines Lynce (eds.), Theory and Applications of Satisfiability Testing - SAT 2019 - 22nd International Conference, SAT 2019, volume 11628 of Lecture Notes in Computer Science, pp. 267–278.
  • NIPS IML Symposium. NIPS interpretable ML symposium, December 2017.
  • Aditi Raghunathan, Jacob Steinhardt, and Percy Liang. Semidefinite relaxations for certifying robustness to adversarial examples. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montreal, Canada., pp. 10900–10910, 2018.
  • T. Sang, P. Beame, and H. Kautz. Performing Bayesian inference by weighted model counting. In Proc. of AAAI, pp. 475–481, 2005.
  • Andy Shih, Adnan Darwiche, and Arthur Choi. Verifying binarized neural networks by Angluin-style learning. In Mikolas Janota and Ines Lynce (eds.), Theory and Applications of Satisfiability Testing - SAT 2019 - 22nd International Conference, SAT 2019, volume 11628 of Lecture Notes in Computer Science, pp. 354–370.
  • Gagandeep Singh, Timon Gehr, Matthew Mirman, Markus Puschel, and Martin T. Vechev. Fast and effective robustness certification. In NeurIPS, pp. 10825–10836, 2018. URL http://papers.nips.cc/paper/8278-fast-and-effective-robustness-certification.
  • Gagandeep Singh, Timon Gehr, Markus Puschel, and Martin T. Vechev. Boosting robustness certification of neural networks. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019. URL https://openreview.net/forum?id=HJgeEh09KQ.
  • Carsten Sinz. Towards an optimal CNF encoding of Boolean cardinality constraints. In Principles and Practice of Constraint Programming - CP 2005. Springer, 2005a.
  • Carsten Sinz. Towards an optimal CNF encoding of Boolean cardinality constraints. Technical report, University of Tübingen, 2005b.
  • Mate Soos and Kuldeep S. Meel. Bird: Engineering an efficient cnf-xor sat solver and its applications to approximate model counting. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), 1 2019.
  • Mate Soos, Karsten Nohl, and Claude Castelluccia. Extending SAT solvers to cryptographic problems. In Theory and Applications of Satisfiability Testing - SAT 2009, 12th International Conference, SAT 2009, Swansea, UK, June 30 - July 3, 2009. Proceedings, pp. 244–257, 2009.
  • Marc Thurley. sharpSAT - counting models with advanced component caching and implicit BCP. In Armin Biere and Carla P. Gomes (eds.), Theory and Applications of Satisfiability Testing - SAT 2006, 9th International Conference, Seattle, WA, USA, August 12-15, 2006, Proceedings, volume 4121 of Lecture Notes in Computer Science, pp. 424–429. Springer, 2006. doi: 10.1007/11814948_38. URL https://doi.org/10.1007/11814948_38.
  • Vincent Tjeng, Kai Y. Xiao, and Russ Tedrake. Evaluating robustness of neural networks with mixed integer programming. In ICLR, 2019.
  • Leslie G. Valiant. The complexity of enumeration and reliability problems. SIAM J. Comput., 8(3):410–421, 1979. doi: 10.1137/0208032. URL https://doi.org/10.1137/0208032.
  • Leslie G Valiant. A theory of the learnable. In Proceedings of the sixteenth annual ACM symposium on Theory of computing, pp. 436–445. ACM, 1984.
  • Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Luca Daniel, Duane S. Boning, and Inderjit S. Dhillon. Towards fast computation of certified robustness for relu networks. In ICML, pp. 5273–5282, 2018.
  • Eric Wong and J. Zico Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. In Jennifer G. Dy and Andreas Krause (eds.), Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmassan, Stockholm, Sweden, July 10-15, 2018, volume 80 of Proceedings of Machine Learning Research, pp. 5283–5292. PMLR, 2018. URL http://proceedings.mlr.press/v80/wong18a.html.
  • Kai Y. Xiao, Vincent Tjeng, Nur Muhammad (Mahi) Shafiullah, and Aleksander Madry. Training for faster adversarial robustness verification via inducing relu stability. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019. URL https://openreview.net/forum?id=BJfIVjAcKm.
  • Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, and Luca Daniel. Efficient neural network robustness certification with general activation functions. In NeurIPS, pp. 4944–4953, 2018.
  • Chenzhuo Zhu, Song Han, Huizi Mao, and William J. Dally. Trained ternary quantization. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net, 2017. URL https://openreview.net/forum?id=S1_pAu9xl.
  • Sequential counters encoding. We recall the definition of the sequential counters encoding from (Sinz, 2005a;b), which is used to encode cardinality constraints. Consider a cardinality constraint: Σ_{i=1}^{m} l_i