Hyperparameter Optimization for Machine Learning Models Based on Bayesian Optimization |
| |
Authors: | Jia Wu, Xiu-Yun Chen, Hao Zhang, Li-Dong Xiong, Hang Lei, Si-Hao Deng |
| |
Affiliation: | 1. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China; 2. Université de Technologie de Belfort-Montbéliard, Belfort 90010, France |
| |
Abstract: | Hyperparameters are important for machine learning algorithms because they directly control the behavior of training algorithms and significantly affect the performance of machine learning models. Several hyperparameter tuning techniques have been developed and successfully applied in certain application domains. However, tuning demands professional knowledge and expert experience, and sometimes has to resort to brute-force search. Therefore, an efficient hyperparameter optimization algorithm that can optimize any given machine learning method would greatly improve the efficiency of machine learning. In this paper, we model the relationship between the performance of machine learning models and their hyperparameters with Gaussian processes. In this way, hyperparameter tuning can be abstracted as an optimization problem, which Bayesian optimization is used to solve. Bayesian optimization is based on Bayes' theorem. It sets a prior over the objective function and uses the information from previous samples to update the posterior of that function. A utility function then selects the next sample point so as to maximize the objective function. Several experiments were conducted on standard test datasets. The experimental results show that the proposed method can find the best hyperparameters for widely used machine learning models, such as the random forest algorithm and neural networks, and even the multi-grained cascade forest, when time cost is taken into account. |
| |
Keywords: | Bayesian optimization; Gaussian process; hyperparameter optimization; machine learning |
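The loop described in the abstract (a Gaussian process prior over the objective, a posterior updated from previous samples, and a utility function that picks the next point) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single hyperparameter scaled to [0, 1], a squared-exponential kernel, and expected improvement as the utility function (the abstract does not say which utility is used). All function names, the kernel length scale, and `toy_score`, which stands in for training and validating a real model, are hypothetical.

```python
import math
import random

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two scalar inputs (assumed choice)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Return lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(xs, ys, noise=1e-6):
    """Condition a GP on the samples; return a function giving (mean, std) at a point."""
    xs = list(xs)
    y_mean = sum(ys) / len(ys)  # centre targets so a zero-mean prior is reasonable
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, [y - y_mean for y in ys]))

    def predict(x_new):
        k_star = [rbf(a, x_new) for a in xs]
        mean = y_mean + sum(k * a for k, a in zip(k_star, alpha))
        v = solve_lower(L, k_star)
        var = max(rbf(x_new, x_new) - sum(t * t for t in v), 1e-12)
        return mean, math.sqrt(var)

    return predict

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf

def toy_score(x):
    """Hypothetical validation score as a function of one hyperparameter; peak at 0.7."""
    return -(x - 0.7) ** 2

def bayes_opt(objective, n_init=3, n_iter=12, seed=0):
    """Maximize objective on [0, 1]; return the best (x, y) observed."""
    rng = random.Random(seed)
    grid = [i / 200 for i in range(201)]        # candidate hyperparameter values
    xs = [rng.random() for _ in range(n_init)]  # random initial samples
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        predict = fit_gp(xs, ys)                # posterior given all samples so far
        best = max(ys)
        candidates = [x for x in grid if all(abs(x - s) > 1e-9 for s in xs)]
        x_next = max(candidates,
                     key=lambda x: expected_improvement(*predict(x), best))
        xs.append(x_next)
        ys.append(objective(x_next))            # the expensive step in practice
    i = max(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]

if __name__ == "__main__":
    best_x, best_y = bayes_opt(toy_score)
    print(best_x, best_y)
```

In a real setting, `toy_score` would be replaced by a function that trains a model with the given hyperparameter and returns a validation metric, and the grid search over candidates would typically give way to a continuous optimizer over several hyperparameters.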
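This article is indexed in ScienceDirect and other databases. |
| The original abstract is available from the Journal of Electronic Science and Technology (English edition) |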