「Effective hyperparameter optimization using Nelder-Mead method in deep learning」
［IPSJ Transactions on Computer Vision and Applications 2017, 9:20］
In deep learning, deep neural network (DNN) hyperparameters can severely affect network performance. Currently, such hyperparameters are frequently optimized by several methods, such as Bayesian optimization and the covariance matrix adaptation evolution strategy. However, it is difficult for non-experts to employ these methods. In this paper, we adapted the simpler coordinate-search and Nelder-Mead methods to optimize hyperparameters. Several hyperparameter optimization methods were compared by configuring DNNs for character recognition and age/gender classification. Numerical results demonstrated that the Nelder-Mead method outperforms the other methods and achieves state-of-the-art accuracy for age/gender classification.
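For readers unfamiliar with the Nelder-Mead method named in the abstract, the following is a minimal sketch of the textbook simplex search applied to a two-dimensional hyperparameter space. It is not the authors' implementation, and any adaptations made in the paper are not reproduced here. The function `toy_loss` and its optimum (log10 learning rate of -3, dropout of 0.3) are hypothetical stand-ins for training a DNN and returning its validation error.

```python
def nelder_mead(f, x0, step=0.5, max_iter=500, tol=1e-10):
    """Minimize f with the textbook Nelder-Mead simplex rules (illustrative only)."""
    n = len(x0)
    # Initial simplex: x0 plus one point offset along each coordinate axis.
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)
    values = [f(p) for p in simplex]

    for _ in range(max_iter):
        # Order vertices from best (lowest loss) to worst.
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if abs(values[-1] - values[0]) < tol:
            break

        # Centroid of every vertex except the worst one.
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point at centroid + t * (centroid - worst).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        xr = along(1.0)  # reflection
        fr = f(xr)
        if fr < values[0]:
            xe = along(2.0)  # expansion
            fe = f(xe)
            if fe < fr:
                simplex[-1], values[-1] = xe, fe
            else:
                simplex[-1], values[-1] = xr, fr
        elif fr < values[-2]:
            simplex[-1], values[-1] = xr, fr
        else:
            # Outside contraction if the reflected point beats the worst, else inside.
            xc = along(0.5 if fr < values[-1] else -0.5)
            fc = f(xc)
            if fc < min(fr, values[-1]):
                simplex[-1], values[-1] = xc, fc
            else:
                # Shrink every vertex halfway toward the best one.
                best = simplex[0]
                for i in range(1, n + 1):
                    simplex[i] = [b + 0.5 * (x - b) for b, x in zip(best, simplex[i])]
                    values[i] = f(simplex[i])

    i_best = min(range(n + 1), key=lambda i: values[i])
    return simplex[i_best], values[i_best]


def toy_loss(h):
    # Hypothetical validation loss over (log10 learning rate, dropout rate).
    log_lr, dropout = h
    return (log_lr + 3.0) ** 2 + 4.0 * (dropout - 0.3) ** 2


best, loss = nelder_mead(toy_loss, [-1.0, 0.5])
print(best, loss)
```

In practice each call to the objective would train and evaluate a network, so the number of evaluations, not the simplex arithmetic, dominates the cost. The method needs only function values, with no gradients or surrogate model, which is one reason it is approachable for non-experts.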
［Reasons for the award］
The performance of deep neural networks, which have attracted much attention in recent years, is greatly influenced by the setting of hyperparameters. In this paper, the authors propose a method that searches for optimal hyperparameters using the Nelder-Mead method and that incorporates highly original ideas. The effectiveness of the proposed method is demonstrated by constructing deep neural networks with the hyperparameters it selects and by achieving performance superior to other methods in character recognition, gender classification, and age estimation from face images. The importance of this topic is widely recognized, and the proposed method outperforms related methods. For these reasons, this paper deserves the Best Paper Award.
He is an engineer at GREE, Inc. He is also a Specified Concentrated Research Specialist at AIST. His main research areas are blackbox optimization and automated machine learning.
He is a data scientist at DeNA Co., Ltd., where he analyzes taxi data for mobility services. He received the B.Eng. degree from the University of Tsukuba in 2018.
He is currently a team leader at the Artificial Intelligence Research Center of the National Institute of Advanced Industrial Science and Technology (AIST), Japan. His research work concentrates on optimization of machine learning algorithms and crowd behavior analysis.