
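Event Log Merging Method for Process Mining Based on a Hybrid Artificial Immune Algorithm
Cite this article: XU Yang, YUAN Feng, LIN Qi, TANG De-You, LI Dong. Event log merging method for process mining based on a hybrid artificial immune algorithm[J]. Journal of Software (软件学报), 2018, 29(2): 396-416.
Authors: XU Yang  YUAN Feng  LIN Qi  TANG De-You  LI Dong
Affiliations: School of Software Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China; Institute of Software Application Technology, Guangzhou & Chinese Academy of Sciences, Guangzhou 511458, Guangdong, China; School of Software Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China; School of Software Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China; School of Software Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China
Funding: National Natural Science Foundation of China (71090403); Science and Technology Planning Project of Guangdong Province (2014B090901001, 2015B010103002, 2016B090918062, 2016B050502001); Science and Technology Planning Project of Guangzhou (201604010127); 985 Discipline Construction Fund of the School of Software Engineering, South China University of Technology (x2rjD615015III)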
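Abstract: Process mining is a research hotspot at the intersection of process management and data mining. In real business environments, process execution data is often recorded across different event logs, and these logs must be merged into a single event log file before current process mining techniques, which assume a single event log, can be applied. However, event log merging is highly challenging because execution instances in different logs may match many-to-many and because the information needed for merging may be missing. This paper gives a formal definition of the event log merging problem, shows that it is a search optimization problem, and proposes an event log merging method based on a hybrid artificial immune algorithm. The initial population is generated by a heuristic method. Taking the clonal selection theory of artificial immune systems as its foundation, the method obtains the "best" merging solution through immune evolution, and thereby supports log merging with many-to-many instance matching. Two instance-level factors are considered, the occurrence frequency of process execution paths and the temporal matching relationship between process instances, so that individuals in the evolving population are evaluated along two dimensions: "quantity" matching and "time" matching. An immune memory and a simulated annealing mechanism preserve the diversity of each new generation and reduce the probability of premature convergence. The experimental results show that the method achieves event log merging with many-to-many instance matching, and that generating the initial population heuristically speeds up immune evolution compared with generating it randomly. With a view to improving merging performance through distributed techniques, the paper also discusses the data partitioning problem in the distributed merging of large-scale event logs.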

Keywords: event log merging  process mining  artificial immune system  log preprocessing
Received: 2016/10/10
Revised: 2016/12/12

Merging Event Logs for Process Mining with a Hybrid Artificial Immune Algorithm
XU Yang, YUAN Feng, LIN Qi, TANG De-You and LI Dong. Merging Event Logs for Process Mining with a Hybrid Artificial Immune Algorithm[J]. Journal of Software, 2018, 29(2): 396-416.
Authors:XU Yang  YUAN Feng  LIN Qi  TANG De-You and LI Dong
Affiliation:School of Software Engineering, South China University of Technology, Guangzhou 510006, China; Institute of Software Application Technology, Guangzhou & Chinese Academy of Sciences, Guangzhou 511458, China; School of Software Engineering, South China University of Technology, Guangzhou 510006, China; School of Software Engineering, South China University of Technology, Guangzhou 510006, China; School of Software Engineering, South China University of Technology, Guangzhou 510006, China
Abstract:Process mining is an active research topic at the intersection of process management and data mining. In real business environments, the data recorded for a process execution, which may be supported by several computer systems, is often scattered across different event log files. These logs must be merged into a single event log file before current process mining techniques and tools can be applied. The task remains challenging, however, because of the complex relationships between cases in the two logs and the possible lack of information needed for merging. In this paper, event log merging for process mining is formally defined and regarded as a search and optimization problem, and a merging approach based on a hybrid artificial immune algorithm is presented that supports many-to-many relationships between cases in the two event logs. The approach takes the clonal selection principle as its underlying principle: the matching process undergoes iterations of clonal selection, hypermutation, and receptor editing in order to obtain the best solution. The algorithm starts from an initial population produced by a heuristic approach. Two factors, occurrence frequency and temporal relation, are built into the affinity function to evaluate the individuals in the population. Immunological memory and simulated annealing are exploited to help the artificial immune merging escape local optima. The experimental results show that the hybrid algorithm performs well in merging logs with complex case relationships, and that the heuristic approach to generating the initial population can speed up the evolution. This paper also discusses data partitioning strategies for distributing the log merging problem.
Keywords:event log merging  process mining  artificial immune system  log preprocessing
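
The search loop outlined in the abstract (heuristic initial population, clonal selection with hypermutation, an immune memory, and simulated-annealing acceptance) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy logs `LOG_A` and `LOG_B`, the helper names, the parameter values, and in particular the `affinity` function, which is a purely temporal stand-in and omits the occurrence-frequency factor. Receptor editing is also left out.

```python
import math
import random

# Hypothetical toy logs: each case is (case_id, start_time, end_time).
LOG_A = [("a1", 0, 5), ("a2", 10, 14), ("a3", 20, 26)]
LOG_B = [("b1", 6, 9), ("b2", 15, 18), ("b3", 27, 30), ("b4", 16, 19)]

def affinity(solution):
    """Stand-in score (an assumption, not the paper's function): reward pairs
    where the log-B case starts shortly after the log-A case ends."""
    score = 0.0
    for i, j in solution:
        gap = LOG_B[j][1] - LOG_A[i][2]
        if gap >= 0:
            score += 1.0 / (1.0 + gap) - 0.3  # 0.3 is a per-pair cost
        else:
            score -= 1.0  # the B case starts before the A case ends
    return score

def heuristic_individual():
    """Hypothetical heuristic: pair each A case with the next B case to start."""
    pairs = set()
    for i, (_, _, end) in enumerate(LOG_A):
        later = [(b[1] - end, j) for j, b in enumerate(LOG_B) if b[1] >= end]
        if later:
            pairs.add((i, min(later)[1]))
    return frozenset(pairs)

def mutate(solution, rate, rng):
    """Hypermutation: toggle each candidate (i, j) pair with probability `rate`.
    A solution is a set of pairs, so many-to-many matches are representable."""
    pairs = set(solution)
    for i in range(len(LOG_A)):
        for j in range(len(LOG_B)):
            if rng.random() < rate:
                pairs ^= {(i, j)}
    return frozenset(pairs)

def clonal_selection(generations=60, pop_size=8, n_clones=4,
                     temp=1.0, cooling=0.95, seed=0):
    rng = random.Random(seed)
    start = heuristic_individual()
    population = [start] + [mutate(start, 0.2, rng) for _ in range(pop_size - 1)]
    memory = max(population, key=affinity)  # immune memory: best solution so far
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=affinity, reverse=True)
        next_pop = []
        for rank, parent in enumerate(ranked):
            # Lower-affinity parents are mutated more aggressively.
            rate = 0.05 + 0.25 * rank / max(1, pop_size - 1)
            clones = [mutate(parent, rate, rng) for _ in range(n_clones)]
            best_clone = max(clones, key=affinity)
            delta = affinity(best_clone) - affinity(parent)
            # Annealing-style acceptance: keep improvements, and occasionally
            # accept a worse clone to preserve diversity.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                next_pop.append(best_clone)
            else:
                next_pop.append(parent)
        population = next_pop
        candidate = max(population, key=affinity)
        if affinity(candidate) > affinity(memory):
            memory = candidate
        history.append(affinity(memory))
        temp *= cooling
    return memory, history

if __name__ == "__main__":
    best, history = clonal_selection()
    print(sorted((LOG_A[i][0], LOG_B[j][0]) for i, j in best))
    print(history[-1])
```

Because the memory only ever replaces its content with a strictly better solution, the best affinity recorded per generation never decreases, even when the annealing step lets the working population temporarily get worse.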