Fake or JPEG? Revealing Common Biases in Generated Image Detection Datasets
CoRR(2024)
摘要
The widespread adoption of generative image models has highlighted the urgent
need to detect artificial content, which is a crucial step in combating
widespread manipulation and misinformation. Consequently, numerous detectors
and associated datasets have emerged. However, many of these datasets
inadvertently introduce undesirable biases, thereby impacting the effectiveness
and evaluation of detectors. In this paper, we emphasize that many datasets for
AI-generated image detection contain biases related to JPEG compression and
image size. Using the GenImage dataset, we demonstrate that detectors indeed
learn from these undesired factors. Furthermore, we show that removing the
named biases substantially increases robustness to JPEG compression and
significantly alters the cross-generator performance of evaluated detectors.
Specifically, it leads to more than 11 percentage points increase in
cross-generator performance for ResNet50 and Swin-T detectors on the GenImage
dataset, achieving state-of-the-art results.
We provide the dataset and source codes of this paper on the anonymous
website: https://www.unbiased-genimage.org
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要