
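Distributed Traffic Flow Data Prediction System Based on Spark (基于Spark的分布式交通流数据预测系统)
Cite as: Huang Tinghui, Wang Yuliang, Wang Zhen, Cui Gengshen. Distributed Traffic Flow Data Prediction System Based on Spark [J]. Application Research of Computers (计算机应用研究), 2018, 35(2): 405-409, 416.
Authors: Huang Tinghui (黄廷辉), Wang Yuliang (王玉良), Wang Zhen (汪振), Cui Gengshen (崔更申)
Affiliation (all authors): School of Computer Science and Information Security, Guilin University of Electronic Technology
Funding: CERNET Next Generation Internet Technology Innovation Project (NGII20160306); Guangxi Science and Technology Research Project (PD160189)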
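Abstract: In the era of big data, real-time and accurate traffic flow prediction in complex urban traffic environments is a necessary precondition for intelligent transportation systems. This paper proposes a distributed urban traffic flow prediction model based on the gradient optimization decision tree on the Spark platform (distributed urban traffic prediction with GBDT, DUTP-GBDT). It also proposes three optimizations for implementing the gradient optimization decision tree in a distributed setting: split-point sampling, feature binning, and level-by-level training, all of which improve training efficiency in that setting. The DUTP-GBDT model draws on the efficiency, reliability, and elastic scalability of the Spark distributed computing platform and on the relatively high accuracy and low time complexity of the gradient optimization decision tree. It is built from time features, road-condition features, weather features, and other feature parameters, and delivers real-time, accurate traffic flow prediction. A comparison with the GA-BP, GA-KNN, and MSTAR models shows that DUTP-GBDT on Spark improves both accuracy and training speed in a distributed environment and meets the requirements of an urban traffic flow prediction system.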

Keywords: traffic flow prediction; distributed computing; Spark platform; gradient optimization decision tree (GBDT) model
Received: 2017-03-21
Revised: 2017-12-26
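
The abstract names split-point sampling and feature binning as two of its distributed optimizations but does not spell them out. The sketch below illustrates the general idea only: candidate thresholds are taken from approximate quantiles of a random sample rather than from every distinct value, and each raw value is then replaced by a small bin index. The function names `sample_split_candidates` and `to_bin` and all parameters are hypothetical; this is plain Python, not the authors' implementation and not a Spark API.

```python
import bisect
import random


def sample_split_candidates(values, max_bins, sample_size, seed=0):
    """Return at most max_bins - 1 ascending thresholds taken from a sample.

    Hypothetical helper: sampling keeps the sort cheap when `values` is large.
    """
    rng = random.Random(seed)
    if len(values) > sample_size:
        sample = rng.sample(values, sample_size)
    else:
        sample = list(values)
    ordered = sorted(sample)
    if not ordered:
        return []
    # Approximate quantiles of the sample serve as candidate split points.
    candidates = {ordered[i * len(ordered) // max_bins] for i in range(1, max_bins)}
    return sorted(candidates)


def to_bin(value, thresholds):
    """Map a raw feature value to its bin index (0 .. len(thresholds))."""
    return bisect.bisect_right(thresholds, value)


if __name__ == "__main__":
    flows = list(range(100))  # made-up feature values for illustration
    thresholds = sample_split_candidates(flows, max_bins=4, sample_size=1000)
    print(thresholds, [to_bin(v, thresholds) for v in (10, 25, 99)])
```

With binned features, a tree learner only has to evaluate a handful of thresholds per feature, which is what makes per-partition statistics cheap to aggregate in a distributed setting.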

Distributed Traffic Flow Data Prediction System Based on Spark
Huang Tinghui, Wang Yuliang, Wang Zhen and Cui Gengshen. Distributed Traffic Flow Data Prediction System Based on Spark [J]. Application Research of Computers, 2018, 35(2): 405-409, 416.
Authors: Huang Tinghui, Wang Yuliang, Wang Zhen and Cui Gengshen
Affiliation: School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi
Abstract: In the era of big data and complex urban traffic environments, real-time and accurate traffic flow forecasting is a prerequisite for implementing intelligent transportation systems. In this paper, we present a distributed urban traffic flow forecasting model, DUTP-GBDT, based on the gradient optimization decision tree (GBDT) on the Spark platform. The gradient optimization decision tree combines the boosting algorithm from ensemble learning with the decision tree model, improving prediction accuracy and avoiding over-fitting while keeping the time complexity of the decision tree model as low as possible. We also propose three optimizations for the gradient optimization decision tree in the distributed case: split-point sampling, feature binning, and layer-by-layer training. All three improve the training efficiency of the model in the distributed case. The DUTP-GBDT model builds on the efficiency, reliability, and elastic scalability of the Spark distributed computing platform and on the high accuracy and low time complexity of the gradient optimization decision tree, and it uses time, road-condition, and weather features as inputs to deliver real-time and accurate traffic flow prediction. Compared with the GA-BP, GA-KNN, and MSTAR models, DUTP-GBDT on the Spark platform improves both accuracy and training speed in the distributed environment and meets the requirements of an urban traffic flow forecasting system.
Keywords: traffic flow forecast; distributed computing; Spark platform; gradient optimization decision tree model
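
The abstract describes the model as boosting combined with decision trees. As a minimal illustration of that combination, the sketch below fits depth-1 regression trees (stumps) one after another to the residuals of the current prediction under squared loss. The names `fit_stump`, `fit_boosted_stumps`, and `predict`, the single-feature setup, and the toy data are all assumptions made for illustration; the paper's actual model, features, and Spark implementation are not reproduced here.

```python
def fit_stump(xs, residuals):
    """Pick the threshold on one feature that minimises squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = sum((r - left_mean) ** 2 for r in left)
        err += sum((r - right_mean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return t, left_mean, right_mean


def fit_boosted_stumps(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting for squared loss: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss the negative gradient is simply y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, stumps, lr = model
    return base + sum(lr * (lm if x <= t else rm) for t, lm, rm in stumps)


if __name__ == "__main__":
    hours = [1, 2, 3, 4, 5, 6]          # made-up feature values
    flow = [10, 10, 10, 20, 20, 20]     # made-up target values
    model = fit_boosted_stumps(hours, flow)
    print(round(predict(model, 2), 2), round(predict(model, 5), 2))
```

The small learning rate shrinks each tree's contribution, which is one common way boosted trees limit over-fitting; deeper trees and multiple features would follow the same residual-fitting loop.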
This article is indexed in VIP (维普) and other databases.