Rethinking the Role of Hyperparameter Tuning in Optimizer Benchmarking


Many optimizers have been proposed for training deep neural networks, and they often have multiple hyperparameters, which make it tricky to benchmark their performance. In this work, we propose a new benchmarking protocol to evaluate both end-to-end efficiency (training a model from scratch without knowing the best hyperparameter configuration) and data-addition training efficiency (the previously selected hyperparameters are used for periodically re-training the model with newly collected data). For end-to-end efficiency, unlike previous work that assumes random hyperparameter tuning, which may over-emphasize the tuning time, we propose to evaluate with a bandit hyperparameter tuning strategy. For data-addition training, we design a new protocol for assessing the hyperparameter sensitivity to data shift. We then apply the proposed benchmarking framework to 7 optimizers on various tasks, including computer vision, natural language processing, reinforcement learning, and graph mining. Our results show that there is no clear winner across all the tasks.
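To make the "bandit hyperparameter tuning strategy" mentioned above concrete, here is a minimal sketch of one well-known bandit-style scheme, successive halving: cheaply evaluate many configurations, then repeatedly keep the best fraction and give the survivors a larger training budget. The `train_eval` function and the learning-rate-only search space are hypothetical placeholders for illustration, not the paper's actual benchmark setup.

```python
import random

def train_eval(config, budget):
    # Hypothetical stand-in for training an optimizer configuration for
    # `budget` epochs and returning a validation loss. Here we use a
    # synthetic score that prefers lr near 0.1 and improves with budget.
    return abs(config["lr"] - 0.1) + 1.0 / budget

def successive_halving(configs, min_budget=1, eta=2, rounds=3):
    """Bandit-style search: keep the best 1/eta of configs each round,
    multiplying the per-config training budget by eta."""
    budget = min_budget
    survivors = list(configs)
    for _ in range(rounds):
        # Evaluate every surviving configuration at the current budget.
        scored = sorted(survivors, key=lambda c: train_eval(c, budget))
        # Promote the top fraction; always keep at least one survivor.
        survivors = scored[: max(1, len(scored) // eta)]
        budget *= eta
    return survivors[0]

random.seed(0)
configs = [{"lr": 10 ** random.uniform(-4, 0)} for _ in range(8)]
print(successive_halving(configs))
```

Compared with the random search assumed by prior benchmarks, this allocation spends most of the training budget on promising configurations, which is why the authors argue it gives a fairer picture of tuning cost.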