Understanding Variation in Subpopulation Susceptibility to Poisoning Attacks
CoRR (2023)
Abstract
Machine learning is susceptible to poisoning attacks, in which an attacker
controls a small fraction of the training data and chooses that data with the
goal of inducing some behavior unintended by the model developer in the trained
model. We consider a realistic setting in which an adversary with the ability
to insert a limited number of data points attempts to control the model's
behavior on a specific subpopulation. Inspired by previous observations on the
disparate effectiveness of random label-flipping attacks on different
subpopulations, we investigate the properties that can impact the effectiveness
of state-of-the-art poisoning attacks against different subpopulations. For a
family of 2-dimensional synthetic datasets, we empirically find that dataset
separability plays a dominant role in subpopulation vulnerability for less
separable datasets. However, well-separated datasets exhibit more dependence on
individual subpopulation properties. We further discover that a crucial
subpopulation property is captured by the difference in loss on the clean
dataset between the clean model and a target model that misclassifies the
subpopulation, and a subpopulation is much easier to attack if the loss
difference is small. This property also generalizes to high-dimensional
benchmark datasets. For the Adult benchmark dataset, we show that we can find
semantically meaningful subpopulation properties that are related to the
susceptibilities of a selected group of subpopulations. The results in this
paper are accompanied by a fully interactive web-based visualization of
subpopulation poisoning attacks found at
https://uvasrg.github.io/visualizing-poisoning