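Multi-dimensional weight algorithm for solving the Lasso problem
Cite this article:CHEN Shanxiong, LIU Xiaojuan, CHEN Chunrong, ZHENG Fangyuan. Multi-dimensional weight algorithm for solving the Lasso problem[J]. Journal of Computer Applications, 2017, 37(6): 1674-1679.
Authors:CHEN Shanxiong  LIU Xiaojuan  CHEN Chunrong  ZHENG Fangyuan
Affiliation:1. College of Computer and Information Science, Southwest University, Chongqing 400715; 2. School of Information Engineering, Guizhou University of Engineering Science, Bijie, Guizhou 551700
Funding:National Natural Science Foundation of China (61303227); Guizhou Province Support Program for Top Science and Technology Talents in Regular Higher Education Institutions (Qian Jiao He KY Zi [2016]098); Joint Fund Project of the Guizhou Provincial Department of Science and Technology (Qian Ke He LH Zi [2016]7053).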
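Abstract:The least absolute shrinkage and selection operator (Lasso) offers strong computational advantages in data dimensionality reduction and anomaly detection. To address the low detection accuracy of Lasso when it is used for anomaly detection, a least angle regression (LARS) algorithm based on multi-dimensional weights is proposed to solve the Lasso problem. First, each regression variable carries a different weight in the regression model, that is, each attribute variable differs in its relative importance to the overall evaluation; therefore, when the LARS algorithm computes the angular bisector, the joint correlation between each regression variable and the residual is taken into account to distinguish the influence of different attribute variables on the detection result. Then, three weight estimation methods, namely principal component analysis (PCA), the independent weight method, and CRiteria Importance Through Intercriteria Correlation (CRITIC), are incorporated into the LARS algorithm, and the choice of forward direction and entering variable in the LARS solution is further optimized. Finally, the Pima Indians Diabetes dataset is used to verify the performance of the algorithm. The experimental results show that, under a constraint with a smaller threshold, the LARS algorithm with multi-dimensional weights yields a more accurate solution to the Lasso problem and is better suited to anomaly detection.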

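Keywords:least absolute shrinkage and selection operator  variable selection  least angle regression  multiple linear regression  weighting
Received:2016-11-07
Revised:2017-01-12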

Method for solving Lasso problem by utilizing multi-dimensional weight
CHEN Shanxiong, LIU Xiaojuan, CHEN Chunrong, ZHENG Fangyuan. Method for solving Lasso problem by utilizing multi-dimensional weight[J]. Journal of Computer Applications, 2017, 37(6): 1674-1679.
Authors:CHEN Shanxiong  LIU Xiaojuan  CHEN Chunrong  ZHENG Fangyuan
Affiliation:1. College of Computer and Information Science, Southwest University, Chongqing 400715, China; 2. School of Information Engineering, Guizhou University of Engineering Science, Bijie, Guizhou 551700, China
Abstract:The least absolute shrinkage and selection operator (Lasso) performs well in data dimension reduction and anomaly detection. To address the low accuracy of Lasso-based anomaly detection, a Least Angle Regression (LARS) algorithm based on multi-dimensional weight was proposed for solving the Lasso problem. Firstly, each regression variable was considered to carry a different weight in the regression model, that is, each attribute variable differs in importance to the overall evaluation. Therefore, when the angular bisector is calculated in the LARS algorithm, the joint correlation between each regression variable and the residual vector was introduced to distinguish the effects of different attribute variables on the detection results. Then, three weight estimation methods, Principal Component Analysis (PCA), independent weight evaluation and CRiteria Importance Through Intercriteria Correlation (CRITIC), were each added to the LARS algorithm, and the approach direction and the selection of the approach variable in the LARS solution were further optimized. Finally, the Pima Indians Diabetes dataset was used to verify the performance of the proposed algorithm. The experimental results show that, under a constraint condition with a smaller threshold value, the LARS algorithm based on multi-dimensional weight solves the Lasso problem more accurately than the traditional LARS and is more suitable for anomaly detection.
Keywords:Least absolute shrinkage and selection operator (Lasso)  variable selection  Least Angle Regression (LARS)  Multiple Linear Regression (MLR)  weighting  
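
Of the three weight estimation methods named in the abstract, CRITIC has a compact textbook definition, so a minimal sketch of it may help readers who have not met it. The code below follows the commonly cited formulation (min-max normalization, then standard deviation multiplied by the summed conflict 1 - r with every other attribute) and is not taken from the paper; the exact variant the authors use, and the way the weights enter the LARS bisector computation, are not given in the abstract and are not shown here. The names `pearson`, `critic_weights` and `sample` are hypothetical, and the sample values are invented for illustration (they are not from the Pima Indians Diabetes dataset).

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def critic_weights(rows):
    """Textbook CRITIC weights for the columns (attributes) of `rows`.

    Assumes every column is non-constant and that there are at least
    two rows; a constant column would cause a division by zero.
    """
    columns = [list(col) for col in zip(*rows)]

    # Step 1: min-max normalize each attribute to [0, 1].
    normalized = []
    for col in columns:
        lo, hi = min(col), max(col)
        normalized.append([(v - lo) / (hi - lo) for v in col])

    n = len(rows)
    m = len(normalized)
    information = []
    for j in range(m):
        # Step 2: contrast intensity = sample standard deviation.
        mean_j = sum(normalized[j]) / n
        std_j = math.sqrt(
            sum((v - mean_j) ** 2 for v in normalized[j]) / (n - 1)
        )
        # Step 3: conflict = sum over all attributes of (1 - correlation).
        conflict_j = sum(
            1.0 - pearson(normalized[j], normalized[k]) for k in range(m)
        )
        information.append(std_j * conflict_j)

    # Step 4: scale the information content so the weights sum to 1.
    total = sum(information)
    return [c / total for c in information]


# Invented sample: 4 observations, 3 attributes.
# Attributes 0 and 1 are perfectly correlated; attribute 2 runs against them.
sample = [
    [1.0, 2.0, 9.0],
    [2.0, 4.0, 7.0],
    [3.0, 6.0, 8.0],
    [4.0, 8.0, 1.0],
]
weights = critic_weights(sample)
print(weights)
```

In this sample the two perfectly correlated attributes receive identical weights, while the attribute that conflicts with both receives a larger one, which is the behaviour CRITIC is designed to produce: redundant attributes are discounted relative to attributes that carry independent information.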