Choosing the Sample with Lowest Loss makes SGD Robust
AISTATS, pp. 2120-2130, 2020.
Outliers can significantly skew the parameters of machine learning models trained via stochastic gradient descent (SGD). In this paper we propose a simple variant of the vanilla SGD method: in each step, first choose a set of k samples, then from these choose the one with the smallest current loss, and perform an SGD-like update using only that sample.
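The selection rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the model (least-squares regression), function names, and hyperparameters are all assumptions chosen for the example.

```python
import numpy as np

def min_loss_sgd_step(w, X, y, k, lr, rng):
    """One update of the min-loss SGD variant on a least-squares model.

    Draw k candidate samples, keep the one with the smallest current
    loss, and take a plain SGD step on that single sample.
    All names and settings here are illustrative.
    """
    idx = rng.choice(len(X), size=k, replace=False)   # k candidate samples
    losses = (X[idx] @ w - y[idx]) ** 2               # current per-sample losses
    i = idx[np.argmin(losses)]                        # keep the lowest-loss sample
    grad = 2 * (X[i] @ w - y[i]) * X[i]               # gradient on that sample only
    return w - lr * grad

# Toy data with a few gross outliers in the labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=200)
y[:10] += 50.0                                        # corrupt 5% of the labels

w = np.zeros(3)
for _ in range(5000):
    w = min_loss_sgd_step(w, X, y, k=4, lr=0.01, rng=rng)
```

Because corrupted samples keep a large loss, they are rarely the minimum among the k candidates, so the update stream is dominated by clean samples; setting k=1 recovers vanilla SGD.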