Scaling Laws for Autoregressive Generative Modeling

Mor Katz
Mark Chen
Christopher Hesse
Jacob Jackson
Heewoo Jun
Prafulla Dhariwal
Chris Hallacy
Benjamin Mann

Abstract:

We identify empirical scaling laws for the cross-entropy loss in four domains: generative image modeling, video modeling, multimodal image$\leftrightarrow$text models, and mathematical problem solving. In all cases autoregressive Transformers smoothly improve in performance as model size and compute budgets increase, following a power-l...
