Calculon: a Methodology and Tool for High-Level Codesign of Systems and Large Language Models

SC23: The International Conference for High Performance Computing, Networking, Storage and Analysis (2023)

Abstract
This paper presents a parameterized analytical performance model of transformer-based Large Language Models (LLMs) for guiding high-level algorithm-architecture codesign studies. The model derives from an extensive survey of performance optimizations that have been proposed for the training and inference of LLMs; its parameters capture application characteristics, the hardware system, and the space of implementation strategies. With such a model, we can systematically explore a joint space of hardware and software configurations to identify optimal system designs under given constraints, such as the total amount of system memory. We implemented this model and methodology in a Python-based open-source tool called Calculon. Using it, we identified novel system designs that differ significantly from current training and inference systems, and we quantitatively estimated their potential for higher efficiency, lower cost, and better scalability.
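The core idea in the abstract, sweeping a joint hardware/software configuration space through a fast analytical model and filtering out designs that violate a constraint such as memory capacity, can be illustrated with a minimal Python sketch. Everything below (the class names, the step-time formula, the byte counts, and the parallelism factors) is an illustrative assumption for exposition, not Calculon's actual API or performance model.

```python
# Hypothetical sketch of a constrained joint hardware/software search.
# Formulas and constants are coarse assumptions, not Calculon's model.
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class Hardware:
    flops: float    # peak matrix throughput, FLOP/s per processor
    mem_bw: float   # memory bandwidth, B/s per processor
    mem_cap: float  # memory capacity, B per processor
    net_bw: float   # interconnect bandwidth, B/s per processor

@dataclass(frozen=True)
class Strategy:
    tensor_par: int  # tensor-parallel width
    pipe_par: int    # pipeline-parallel depth
    data_par: int    # data-parallel replicas

def estimate_step_time(params, tokens, hw, st):
    """Coarse analytical estimate: compute vs. weight traffic, plus collectives."""
    procs = st.tensor_par * st.pipe_par * st.data_par
    flop = 6.0 * params * tokens                 # common 6*N*T training FLOP approximation
    compute = flop / (procs * hw.flops)
    weights = 2.0 * params                       # fp16 weights, bytes
    mem_time = weights / (st.tensor_par * st.pipe_par) / hw.mem_bw
    comm = weights / hw.net_bw if st.data_par > 1 else 0.0  # rough gradient all-reduce cost
    return max(compute, mem_time) + comm

def fits_memory(params, hw, st):
    """Constraint: per-processor weights, gradients, and optimizer state must fit."""
    shards = st.tensor_par * st.pipe_par
    bytes_per_param = 2 + 2 + 12                 # fp16 weights + fp16 grads + Adam state
    return params * bytes_per_param / shards <= hw.mem_cap

def search(params, tokens, hardware_options, procs):
    """Exhaustively sweep hardware choices and parallelism splits; keep the fastest feasible one."""
    best = None
    for hw in hardware_options:
        for tp, pp in itertools.product([1, 2, 4, 8], [1, 2, 4, 8]):
            if procs % (tp * pp):
                continue
            st = Strategy(tp, pp, procs // (tp * pp))
            if not fits_memory(params, hw, st):
                continue
            t = estimate_step_time(params, tokens, hw, st)
            if best is None or t < best[0]:
                best = (t, hw, st)
    return best

# Example sweep: a 70B-parameter model on 64 processors of one assumed hardware point.
hw = Hardware(flops=3e14, mem_bw=2e12, mem_cap=8e10, net_bw=5e10)
print(search(params=70e9, tokens=4096 * 512, hardware_options=[hw], procs=64))
```

Because each candidate evaluates in microseconds rather than requiring detailed simulation, a sweep of this kind can cover the full cross-product of hardware and implementation-strategy parameters, which is what makes the codesign study described in the abstract tractable.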
Keywords
Co-design, Large Language Models, System Design, Implementation Strategies, Training System, Configuration Space, Inference System, Amount Of Memory, Hardware Configuration, Software Configuration, Amount Of Time, Impact Of Change, Statistical Distribution, Graphics Processing Unit, Multilayer Perceptron, System Size, System Configuration, Memory Capacity, Detailed Simulation, Memory System, Transformer Block, Standard Desktop Computer, Backward Pass, Forward Pass, Parallel Strategy, Number Of Processors, Memory Usage, Strategy Execution, Communication Cost, Software Implementation