
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection.

Computing Research Repository (CoRR), 2024

Mohamed bin Zayed University of Artificial Intelligence | Technical University of Darmstadt

Cited 29 | Views 69
Abstract
Large language models (LLMs) have demonstrated remarkable capability to generate fluent responses to a wide variety of user queries. However, this has also raised concerns about the potential misuse of such texts in journalism, education, and academia. In this study, we strive to create automated systems that can detect machine-generated texts and pinpoint potential misuse. We first introduce a large-scale benchmark M4, which is a multi-generator, multi-domain, and multi-lingual corpus for machine-generated text detection. Through an extensive empirical study of this dataset, we show that it is challenging for detectors to generalize well on instances from unseen domains or LLMs. In such cases, detectors tend to misclassify machine-generated text as human-written. These results show that the problem is far from solved and that there is a lot of room for improvement. We believe that our dataset will enable future research towards more robust approaches to this pressing societal problem. The dataset is available at https://github.com/mbzuai-nlp/M4.
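The abstract frames machine-generated text detection as a binary classification task. The sketch below is purely illustrative and is not the detection approach evaluated in the paper: it implements a hand-rolled multinomial Naive Bayes classifier over whitespace tokens, with labels assumed to be 0 for human-written and 1 for machine-generated text.

```python
import math
from collections import Counter

def train(texts, labels):
    """Fit a multinomial Naive Bayes model: per-class token counts and priors."""
    counts = {0: Counter(), 1: Counter()}   # 0 = human, 1 = machine
    priors = Counter(labels)
    for text, y in zip(texts, labels):
        counts[y].update(text.lower().split())
    vocab = set(counts[0]) | set(counts[1])
    totals = {y: sum(c.values()) for y, c in counts.items()}
    return counts, priors, vocab, totals

def predict(model, text):
    """Return the most probable class label for a new text under the model."""
    counts, priors, vocab, totals = model
    n = sum(priors.values())
    best, best_score = None, -math.inf
    for y in (0, 1):
        score = math.log(priors[y] / n)
        for tok in text.lower().split():
            # Add-one (Laplace) smoothing over the shared vocabulary,
            # so unseen tokens do not zero out the class likelihood
            score += math.log((counts[y][tok] + 1) / (totals[y] + len(vocab)))
        if score > best_score:
            best, best_score = y, score
    return best
```

Such a lexical baseline is exactly the kind of detector the paper shows generalizes poorly: token statistics learned from one domain or generator need not transfer to unseen ones.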
Key words
Language Modeling, Detection, Natural Language Processing, Multilingual Neural Machine Translation, Complex Word Identification
Related Papers

Evade ChatGPT Detectors Via A Single Space

Shuyang Cai, Wanyun Cui
ICLR 2024

Cited 7

IMGTB: A Framework for Machine-Generated Text Detection Benchmarking

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, Vol. 3: System Demonstrations, 2024

Cited 1

Team QUST at SemEval-2024 Task 8: A Comprehensive Study of Monolingual and Multilingual Approaches for Detecting AI-generated Text

Xiaoman Xu, Xiangrun Li, Taihang Wang, Jianxiang Tian, Ye Jiang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), 2024

Cited 0

RFBES at SemEval-2024 Task 8: Investigating Syntactic and Semantic Features for Distinguishing AI-Generated and Human-Written Texts

Mohammad Heydari Rad, Farhan Farsi, Shayan Bali, Romina Etezadi, Mehrnoush Shamsfard
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), 2024

Cited 0

TM-TREK at SemEval-2024 Task 8: Towards LLM-Based Automatic Boundary Detection for Human-Machine Mixed Text

Xiaoyan Qu, Xiangfeng Meng
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), 2024

Cited 0

AISPACE at SemEval-2024 Task 8: A Class-balanced Soft-voting System for Detecting Multi-generator Machine-generated Text

Renhua Gu, Xiangfeng Meng
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), 2024

Cited 0

PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text?

Kseniia Petukhova, Roman Kazakov, Ekaterina Kochmar
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), 2024

Cited 0

Chat Paper

Key points: This study aims to build automated systems that detect machine-generated text and pinpoint potential misuse, introducing M4, a large-scale benchmark dataset for machine-generated text detection.

Methods: Through an extensive empirical study on the M4 dataset, the authors show that detectors generalize poorly to instances from unseen domains or LLMs, tending to misclassify machine-generated text as human-written.

Experiments: The results indicate that the problem is far from solved and that there is substantial room for improvement. The authors hope the dataset will enable future research towards more robust approaches to this pressing societal problem. The dataset is available at https://github.com/mbzuai-nlp/M4.