EdgeBERT: Optimizing On-Chip Inference for Multi-Task NLP
Abstract:
Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks. However, their hefty computational and memory demands make them challenging to deploy to resource-constrained edge platforms with strict latency requirements. We present EdgeBERT, an in-depth algorithm-hardware co-design for latency-aware energy optimizations for multi-task NLP.
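To give a rough sense of the memory demands the abstract refers to, here is a back-of-the-envelope sketch in Python; the ~110M parameter count is the commonly cited figure for BERT-base (an assumption for illustration, not a number from this paper), and the int8 line illustrates why quantization is attractive on the edge:

# Back-of-the-envelope estimate of BERT-base's weight memory footprint.
# ~110M parameters is the widely cited figure for BERT-base; actual
# counts vary with vocabulary size and head configuration.
params = 110_000_000          # approximate parameter count of BERT-base
bytes_fp32 = params * 4       # 32-bit floats: 4 bytes per weight
bytes_int8 = params * 1       # 8-bit quantized weights: 1 byte per weight
print(f"fp32 weights: {bytes_fp32 / 2**20:.0f} MiB")   # ~420 MiB
print(f"int8 weights: {bytes_int8 / 2**20:.0f} MiB")   # ~105 MiB

Even the quantized footprint exceeds the on-chip SRAM of typical edge accelerators, which is the deployment gap motivating the work.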