A hybrid algorithm for Bayesian network structure learning with application to multi-label learning
Affiliation: 1. Department of Electronics Convergence Engineering, Wonkwang University, 344-2, Shinyong-Dong, Iksan, Jeonbuk 570-749, South Korea; 2. Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2G7, Canada; 3. Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland; 1. Department of Computer Science and Information Engineering, National Taichung University of Science and Technology, No. 129, Sec. 3, Sanmin Rd., Taichung, Taiwan, ROC; 2. Department of Electrical Engineering, National Chung Hsing University, No. 250 Kuo Kuang Rd., Taichung, Taiwan, ROC; 1. Department of Systems Engineering and Engineering Management, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon Tong, Hong Kong; 2. Centre for Systems Informatics Engineering, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon Tong, Hong Kong; 3. School of Management, Hefei University of Technology, Hefei, Box 270, Hefei 230009, Anhui, PR China; 4. Key Laboratory of Process Optimization and Intelligent Decision-making, Ministry of Education, Hefei, Box 270, Hefei 230009, Anhui, PR China
Abstract: We present a novel hybrid algorithm for Bayesian network structure learning, called H2PC. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. The algorithm is based on divide-and-conquer constraint-based subroutines that learn the local structure around a target variable. We conduct two series of experimental comparisons of H2PC against Max–Min Hill-Climbing (MMHC), which is currently the most powerful state-of-the-art algorithm for Bayesian network structure learning. First, we use eight well-known Bayesian network benchmarks with various data sizes to assess the quality of the structures learned by the two algorithms. Our extensive experiments show that H2PC outperforms MMHC in terms of goodness of fit to new data and quality of the network structure with respect to the true dependence structure of the data. Second, we investigate H2PC’s ability to solve the multi-label learning problem. We provide theoretical results that characterize and graphically identify the so-called minimal label powersets, which appear as irreducible factors in the joint distribution under the faithfulness condition. The multi-label learning problem is then decomposed into a series of multi-class classification problems, where each multi-class variable encodes a label powerset. H2PC is shown to compare favorably to MMHC in terms of global classification accuracy over ten multi-label data sets covering different application domains. Overall, our experiments support the conclusion that local structure learning with H2PC, in the form of local neighborhood induction, is a theoretically well-motivated and empirically effective learning framework that is well suited to multi-label learning. The source code (in R) of H2PC and all data sets used for the empirical tests are publicly available.
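The decomposition step mentioned in the abstract, where each multi-class variable encodes a label powerset, can be illustrated with a minimal sketch. This is written in Python for illustration and is not the authors' R implementation. The function names `encode_label_powersets` and `decode_label_powersets` are hypothetical, and the partition of labels into `groups` is supplied by hand here, whereas the paper derives it from the learned graph.

```python
from typing import Dict, List, Sequence, Tuple

Codebook = Dict[Tuple[int, ...], int]


def encode_label_powersets(
    Y: Sequence[Sequence[int]],
    groups: Sequence[Sequence[int]],
) -> Tuple[List[List[int]], List[Codebook]]:
    """Turn each group of binary label columns into one multi-class variable.

    Y      : binary label matrix, one row per example (assumed input format).
    groups : a partition of the label column indices (assumed to be given).
    Returns the class matrix (one column per group) and one codebook per group.
    """
    encoded = [[0] * len(groups) for _ in Y]
    codebooks: List[Codebook] = []
    for g, cols in enumerate(groups):
        codebook: Codebook = {}
        for i, row in enumerate(Y):
            combo = tuple(row[c] for c in cols)
            # Each distinct label combination gets the next free class id.
            if combo not in codebook:
                codebook[combo] = len(codebook)
            encoded[i][g] = codebook[combo]
        codebooks.append(codebook)
    return encoded, codebooks


def decode_label_powersets(
    classes: Sequence[Sequence[int]],
    groups: Sequence[Sequence[int]],
    codebooks: Sequence[Codebook],
    n_labels: int,
) -> List[List[int]]:
    """Map predicted class ids back to the original binary label vectors."""
    inverse = [{code: combo for combo, code in cb.items()} for cb in codebooks]
    decoded = []
    for row in classes:
        labels = [0] * n_labels
        for g, cols in enumerate(groups):
            combo = inverse[g][row[g]]
            for col, value in zip(cols, combo):
                labels[col] = value
        decoded.append(labels)
    return decoded


# Toy data: three labels, with labels 0 and 1 grouped and label 2 on its own.
Y_toy = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [1, 0, 1]]
groups_toy = [[0, 1], [2]]
classes_toy, codebooks_toy = encode_label_powersets(Y_toy, groups_toy)
print(classes_toy)  # [[0, 0], [1, 0], [2, 1], [0, 0]]
```

Each column of the encoded matrix can then be handed to any multi-class classifier, and `decode_label_powersets` maps the predicted class ids back to label vectors. Class ids are assigned in order of first appearance, so a label combination never seen in training has no class, which is a known limitation of label powerset encodings.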
Keywords: Bayesian networks; Multi-label learning; Markov boundary; Feature subset selection
This article is indexed in ScienceDirect and other databases.