Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression.
dblp(2023)
Key words
Distributed Systems, Systems for Machine Learning, Large-scale NLP Training, Pipeline Parallelism, 3D Parallelism, Gradient Compression, Communication Optimization