The Impact of Prompts on Zero-Shot Detection of AI-Generated Text
CoRR (2024)
Abstract
In recent years, there have been significant advancements in the development
of Large Language Models (LLMs). While their practical applications are now
widespread, their potential for misuse, such as generating fake news and
committing plagiarism, has raised significant concerns. To address this issue,
detectors have been developed to evaluate whether a given text is
human-generated or AI-generated. Among others, zero-shot detectors stand out as
effective approaches that do not require additional training data and are often
likelihood-based. In chat-based applications, users commonly input prompts and
utilize the AI-generated texts. However, zero-shot detectors typically analyze
these texts in isolation, neglecting the impact of the original prompts. It is
conceivable that this approach may lead to a discrepancy in likelihood
assessments between the text generation phase and the detection phase. So far,
it has not been verified how the presence or absence of prompts affects the
detection accuracy of zero-shot detectors. In this paper, we
introduce an evaluative framework to empirically analyze the impact of prompts
on the detection accuracy of AI-generated text. We assess various zero-shot
detectors using both white-box detection, which leverages the prompt, and
black-box detection, which operates without prompt information. Our experiments
reveal the significant influence of prompts on detection accuracy. Remarkably,
compared with black-box detection without prompts, the white-box methods using
prompts demonstrate an increase in AUC of at least 0.1 across all zero-shot
detectors tested. Code is available:
.
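
To make the likelihood discrepancy described above concrete, the following is a minimal sketch and not the paper's released code. It assumes a hypothetical toy bigram model (the `BIGRAM` table, the `FLOOR` value, and the function name `mean_log_likelihood` are all invented for illustration) and scores the same text twice: once conditioned on the prompt (white-box) and once in isolation (black-box). A real zero-shot detector would take token probabilities from an LLM and condition on the full prompt, not only its last token.

```python
import math

# Hypothetical toy bigram model P(next | prev); the numbers are made up
# for illustration and do not come from the paper.
BIGRAM = {
    "<bos>": {"the": 0.5, "often": 0.3, "cats": 0.1, "sleep": 0.1},
    "about": {"cats": 0.9, "the": 0.1},
    "cats": {"sleep": 0.7, "often": 0.2, "the": 0.1},
    "sleep": {"often": 0.8, "the": 0.1, "cats": 0.1},
}
FLOOR = 1e-6  # assumed fallback probability for unseen transitions


def mean_log_likelihood(text_tokens, prompt_tokens=None):
    """Average per-token log-likelihood of text_tokens.

    With prompt_tokens (white-box), scoring starts from the last prompt
    token; without them (black-box), it starts from a generic <bos> token.
    """
    if not text_tokens:
        raise ValueError("text_tokens must not be empty")
    prev = prompt_tokens[-1] if prompt_tokens else "<bos>"
    total = 0.0
    for tok in text_tokens:
        p = BIGRAM.get(prev, {}).get(tok, FLOOR)
        total += math.log(p)
        prev = tok
    return total / len(text_tokens)


prompt = ["write", "about"]
text = ["cats", "sleep", "often"]

white_box = mean_log_likelihood(text, prompt)  # detector sees the prompt
black_box = mean_log_likelihood(text)          # detector sees only the text

print(f"white-box: {white_box:.3f}, black-box: {black_box:.3f}")
```

In this toy setting the first token of the text is far more probable given the prompt than given `<bos>`, so the black-box score underestimates the likelihood the text had at generation time. This is the kind of mismatch a likelihood-based detector may suffer from when the prompt is unavailable.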