Multi-scale Transformer Language Models

Abstract:

We investigate multi-scale transformer language models that learn representations of text at multiple scales, and present three different architectures that have an inductive bias to handle the hierarchical nature of language. Experiments on large-scale language modeling benchmarks empirically demonstrate favorable likelihood vs. memory trade-offs ...
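
Only the abstract survives on this page, so the following is a minimal, hypothetical sketch (in PyTorch) of one way a transformer block could combine a fine token-level stream with a coarser pooled stream, giving the kind of hierarchical inductive bias the abstract describes. All names here (MultiScaleBlock, d_model, n_heads, merge) are our own illustration, not the authors' architecture.

```python
# Hypothetical sketch, not the authors' code: a transformer block that mixes a
# fine (token-level) stream with a coarse (2x-pooled) stream of the sequence.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleBlock(nn.Module):
    """Illustrative two-scale transformer block (names are ours)."""

    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        # One transformer layer per scale; causal masking is omitted for brevity.
        self.fine = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.coarse = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.merge = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); seq_len is assumed even in this sketch.
        fine = self.fine(x)
        # Downsample: average-pool adjacent token pairs -> half-length sequence.
        pooled = F.avg_pool1d(x.transpose(1, 2), kernel_size=2).transpose(1, 2)
        coarse = self.coarse(pooled)
        # Upsample: repeat each coarse position twice to restore full length.
        coarse_up = coarse.repeat_interleave(2, dim=1)
        # Merge the two scales token by token.
        return self.merge(torch.cat([fine, coarse_up], dim=-1))


if __name__ == "__main__":
    block = MultiScaleBlock()
    tokens = torch.randn(2, 16, 256)   # (batch, seq_len, d_model)
    print(block(tokens).shape)         # torch.Size([2, 16, 256])
```

Attention over the pooled stream spans half as many positions, so its attention maps take roughly a quarter of the memory of the fine stream; the abstract's likelihood vs. memory comparison presumably measures trade-offs of this kind.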
