Motivating the Rules of the Game for Adversarial Example Research
arXiv preprint arXiv:1807.06732, 2018.
Defenses against restricted-perturbation adversarial examples are often motivated by security concerns, but the security motivation for the standard set of game rules appears much weaker than for other possible rule sets.
Advances in machine learning have led to broad deployment of systems with impressive performance on important problems. Nonetheless, these systems can be induced to make errors on data that are surprisingly similar to examples the learned system handles correctly. The existence of these errors raises a variety of questions about out-of-sa…
- Machine learning models for classification, regression, and decision making are becoming ubiquitous in everyday systems.
- If there is an easier way for the attacker to achieve their goal, one that does not require causing a labeling error in the particular system of interest, it would be a sign that adversarial example games might not capture something important about machine learning security for that system.
- The most general assumption we can make about the attacker's action space is that the attacker can produce any input example they want, with no restriction that they perturb some starting point or deliver some specific content.
- Argues that it is important for security-motivated work within the machine learning community to expand its focus to include a broader set of rules for attackers and adversarial inputs.
- Practical systems that expect to face attackers who perform simple, limited-query attacks and are not constrained to imperceptible perturbations, for example, would likely avoid defense mechanisms that improve lp robustness at the expense of a higher error rate on the expected real-world distribution.
- As we discuss further in Section 5.2, reducing test error with respect to various classes of noisy image distributions that exceed the threshold of typical imperceptible modifications is well motivated and a necessary step towards securing a model against attackers constrained to content-preserving perturbations.
- We have not found real-world examples for the indistinguishable perturbation setting, so even if small lp perturbation norm constraints from the standard rules were a perfect approximation for human perception, the standard perturbation defense rules currently lack a strong security motivation.
- Based upon the lack of concrete security scenarios that require the standard game rules, we encourage future work to adopt one of two paths: either adopt a broader view of the types of adversarial examples that can be provided by an attacker, in a way that is consistent with a realistic threat model, or present a more specific rationale for the use of the standard game rules or the study of small adversarial perturbations.
- Future papers that study adversarial examples from a security perspective, but within the machine learning community, must take extra care to build on prior work on ML security. (Footnote: author Ian Goodfellow notes that machine learning research into small-norm perturbations that illuminates fundamental techniques, shortcomings, or advances in deep networks supports a useful form of basic research that may later have implications for the types of security we discuss in this paper.)
- To have the largest impact, we should both recast future adversarial example research as a contribution to core machine learning functionality and develop new abstractions that capture realistic threat models.
- Argues that adversarial example defense papers have, to date, mostly considered abstract, toy games that do not relate to any specific security concern
- Introduces a taxonomy of rules governing games between an attacker and defender in the context of a machine learning system
- The taxonomy introduced in this work can be viewed as extending the "exploratory attack" framework presented in Barreno et al.
- The taxonomy expands upon the ones introduced in Papernot et al. and Barreno et al., and is intended to model real-world security settings.
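To make the "standard game rules" discussed above concrete: they restrict the attacker to inputs within a small lp ball around a natural starting point. A minimal NumPy sketch of that constraint, using the l-infinity norm and an FGSM-style sign step (the function names and toy gradient here are illustrative, not from the paper):

```python
import numpy as np

def linf_ball_project(x, x_adv, eps):
    """Project a candidate x_adv back into the l-infinity ball of radius
    eps around the starting point x, then clip to the valid range [0, 1].
    This is exactly the 'restricted perturbation' rule of the standard game."""
    x_adv = np.clip(x_adv, x - eps, x + eps)
    return np.clip(x_adv, 0.0, 1.0)

def fgsm_step(x, grad, eps):
    """One FGSM-style attacker move: shift every coordinate by eps in the
    direction of the loss gradient's sign, staying inside the allowed ball."""
    return linf_ball_project(x, x + eps * np.sign(grad), eps)

# Toy demonstration with a hypothetical loss gradient.
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=(4, 4))   # stand-in for a natural input
grad = rng.normal(size=(4, 4))           # stand-in for d(loss)/d(input)
eps = 0.05
x_adv = fgsm_step(x, grad, eps)

# The standard rules are satisfied by construction:
assert np.max(np.abs(x_adv - x)) <= eps + 1e-12
```

The paper's point is that this ε-ball is an abstraction: an unrestricted real-world attacker simply submits whatever input they like, so nothing forces `x_adv` to stay near any particular `x`.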