Weighted Random Forest Algorithm Based on Spark Framework for Financial Credit Risk Control
Citation: HU Chan-juan, YU Lian-zhi, XUE Zhen. Weighted Random Forest Algorithm Based on Spark Framework for Financial Credit Risk Control[J]. Mini-micro Systems (小型微型计算机系统), 2020(2): 369-374.
Authors: HU Chan-juan, YU Lian-zhi, XUE Zhen
Affiliation: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Funding: Supported by the National Natural Science Foundation of China (61603257).
Abstract: Online lending in the Internet era produces a very large volume of loan applications and requires data models that can be iterated quickly. Starting from the characteristics of online loan business, this paper designs and implements a weighted random forest algorithm, built on the Spark distributed computing engine, that can process imbalanced data in parallel. The algorithm is parallelized and optimized in three respects: sampling statistics for feature split points, feature binning, and layer-by-layer training. It improves the classification accuracy of the random forest algorithm and reduces the occurrence of ties during decision making. For imbalanced data, the SMOTE algorithm is used to reconstruct the data set while largely preserving the information in the original data. Experiments show that the algorithm improves the efficiency and timeliness of lending and greatly improves productivity.

Keywords: big data; Spark; parallelization; random forest; risk control
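
The abstract states that weighting reduces ties in the forest's decision, but it does not give the weighting scheme. The following is a minimal sketch of the general idea only, not the authors' method: it assumes each tree carries a weight (for example, its accuracy on held-out data) and compares one-tree-one-vote with weighted voting. The function names, labels, and weight values are all hypothetical.

```python
from collections import defaultdict


def tally_votes(predictions, weights=None):
    """Sum the vote mass per class label.

    predictions: one class label per tree.
    weights: one non-negative weight per tree; None means one tree, one vote.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per tree")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return dict(totals)


def winners(totals):
    """Return every label that shares the highest total, sorted."""
    best = max(totals.values())
    return sorted(label for label, total in totals.items() if total == best)


# Hypothetical forest of four trees scoring one loan application.
tree_predictions = ["default", "repay", "default", "repay"]
# Assumed per-tree weights, e.g. accuracy measured on held-out data.
tree_weights = [0.90, 0.60, 0.70, 0.65]

plain = winners(tally_votes(tree_predictions))
weighted = winners(tally_votes(tree_predictions, tree_weights))
print(plain)     # more than one label means the unweighted vote is tied
print(weighted)  # a single label means the weights broke the tie
```

With an even number of trees split two against two, the unweighted tally cannot separate the classes, whereas unequal weights almost always can. Exact ties remain possible if the weighted sums coincide, so a production system would still need a tie-breaking rule.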
This article is indexed in VIP (维普) and other databases.