Leveraging Continuously Differentiable Activation Functions for Learning in Quantized Noisy Environments
CoRR (2024)
Abstract
Real-world analog systems intrinsically suffer from noise that can impede
model convergence and accuracy across a variety of deep learning models. We
demonstrate that continuously differentiable activations such as GELU and SiLU
enable robust gradient propagation, which helps mitigate the analog
quantization error ubiquitous to all analog systems. We analyze and train
convolutional, linear, and transformer networks in the presence of quantized
noise, and we demonstrate that continuously differentiable activation
functions are significantly more noise resilient than conventional rectified
activations. For ReLU, gradient errors near zero are 100x higher than those of
GELU. Our findings provide guidance for selecting activations that yield
performant and reliable hardware implementations across machine learning
domains such as computer vision, signal processing, and beyond.
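A minimal sketch, not from the paper, of why the abstract's claim holds: ReLU's derivative is a hard step at zero, so quantizing an input near zero can flip the gradient between 0 and 1, while GELU's smooth derivative changes only slightly under the same perturbation. The uniform quantizer step `delta`, the input range, and the exact (erf-based) GELU derivative are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

def relu_grad(x):
    # ReLU'(x): a hard 0/1 step at the origin
    return (x > 0).astype(float)

def gelu_grad(x):
    # d/dx [x * Phi(x)] = Phi(x) + x * phi(x) for the exact GELU
    return norm.cdf(x) + x * norm.pdf(x)

def quantize(x, delta=0.1):
    # uniform mid-tread quantizer; a hypothetical stand-in for
    # the analog quantization error discussed in the abstract
    return delta * np.round(x / delta)

rng = np.random.default_rng(0)
x = rng.uniform(-0.05, 0.05, 10_000)  # pre-activations clustered near zero
xq = quantize(x)                      # quantization collapses them to 0

relu_err = np.abs(relu_grad(xq) - relu_grad(x)).mean()
gelu_err = np.abs(gelu_grad(xq) - gelu_grad(x)).mean()
print(f"mean |grad error|  ReLU: {relu_err:.4f}  GELU: {gelu_err:.4f}")
print(f"ReLU/GELU error ratio: {relu_err / max(gelu_err, 1e-12):.1f}x")
```

Under these toy settings, roughly half the ReLU gradients flip from 1 to 0 after quantization (mean error near 0.5), while GELU's gradient error stays on the order of the input perturbation, reproducing the qualitative gap the abstract describes; the exact ratio depends on the quantizer step and input distribution.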