Faster MoE LLM Inference for Extremely Large Models

Haoqi Yang, Luohe Shi, Qiwei Li, Zuchao Li, Ping Wang, Bo Du, Mengjia Shen, Hai Zhao

arXiv (2025)
