The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning

Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin Li, Ann-Kathrin Dombrowski, Shashwat Goel, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Ariel Herbert-Voss, Cort Breuer, Andy Zou, Mantas Mazeika, Zifan Wang, Palash Oswal, Weiran Lin, Adam Hunt, Justin Tienken-Harder, Kevin Shih, Kemper Talley, John Guan, Ian Steneker, David Campbell, Brad Jokubaitis, Steven Basart, Stephen Fitz, Ponnurangam Kumaraguru, Kallol Karmakar, Uday Tupakula, Vijay Varadharajan, Yan Shoshitaishvili, Jimmy Ba, Kevin Esvelt, Alexandr Wang, Dan Hendrycks

ICML 2024 (2024)

Cited 9 | Views 101
Key words
Intrusion Detection