DeLighT: Very Deep and Light-weight Transformer


Abstract:

We introduce a very deep and light-weight transformer, DeLighT, that delivers similar or better performance than transformer-based models with significantly fewer parameters. DeLighT more efficiently allocates parameters both (1) within each Transformer block using DExTra, a deep and light-weight transformation, and (2) across blocks usi...
