Smish: A Novel Activation Function for Deep Learning Methods

Electronics (2022)

Abstract

Activation functions are crucial in deep learning networks because the nonlinearity they provide is what gives deep neural networks their learning capability. Nonlinear activation functions such as the rectified linear unit (ReLU), hyperbolic tangent (tanh), sigmoid, Swish, Mish, and Logish perform well in deep learning models; however, only a few of them are widely used across most applications, owing to inconsistencies in their behavior. Inspired by the MB-C-BSIF method, this study proposes Smish, a novel nonlinear activation function expressed as f(x) = x·tanh[ln(1 + sigmoid(x))], whose favorable properties allow it to outperform other activation functions. A logarithmic operation is first applied to reduce the range of sigmoid(x), and the tanh operator is then applied to the result. Finally, the input is multiplied by this value, which yields negative output regularization. Experiments show that Smish tends to operate more efficiently than Logish, Mish, and other activation functions on EfficientNet models with open datasets. Moreover, we evaluated the performance of Smish in various deep learning models and examined the parameters of its generalized form f(x) = αx·tanh[ln(1 + sigmoid(βx))]; Smish achieved the highest accuracy when α = 1 and β = 1. The experimental results show that, with Smish, the EfficientNetB3 network achieves a Top-1 accuracy of 84.1% on the CIFAR-10 dataset, the EfficientNetB5 network achieves a Top-1 accuracy of 99.89% on the MNIST dataset, and the EfficientNetB7 network achieves a Top-1 accuracy of 91.14% on the SVHN dataset. These values are superior to those obtained with other state-of-the-art activation functions, which suggests that Smish is more suitable for complex deep learning models.
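The formula stated in the abstract can be sketched in a few lines of plain Python. This is a minimal scalar illustration, not the authors' implementation: the function names `sigmoid` and `smish`, the keyword arguments `alpha` and `beta`, and the numerically stable form of the sigmoid are assumptions made here for clarity.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x| (illustrative helper)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def smish(x: float, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Smish as given in the abstract: alpha * x * tanh(ln(1 + sigmoid(beta * x))).

    With alpha = beta = 1 this reduces to x * tanh(ln(1 + sigmoid(x))).
    """
    # log1p(s) computes ln(1 + s); since 0 < s < 1, the result lies in (0, ln 2).
    squeezed = math.log1p(sigmoid(beta * x))
    # tanh maps that range into (0, 0.6); multiplying by the input restores the sign.
    return alpha * x * math.tanh(squeezed)


if __name__ == "__main__":
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"smish({x:+.1f}) = {smish(x):+.4f}")
```

Because tanh(ln 2) = 3/5, the output approaches 0.6·alpha·x for large positive inputs, while for large negative inputs it approaches zero from below, so negative outputs stay small and bounded.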

Keywords

activation function, deep learning, image classification
