首页 | 官方网站   微博 | 高级检索  
     

异质信息网络中最大路径连通Steiner分量查询算法
引用本文:李源,范晓林,孙晶,赵会群,杨森,王国仁.异质信息网络中最大路径连通Steiner分量查询算法[J].软件学报,2023,34(2):655-675.
作者姓名:李源  范晓林  孙晶  赵会群  杨森  王国仁
作者单位:北方工业大学 信息学院, 北京 100144;北京理工大学 计算机学院, 北京 100081
基金项目:科技创新2030——“新一代人工智能”重大项目(2020AAA0108503);国家自然科学基金(61902004,61672041,61772124,61977001,61732003);北京市教委科技项目(KM202010009009)
摘    要:异质信息网络(HINs)是包含多种类型对象(顶点)和链接(边)的有向图,能够表达丰富复杂的语义和结构信息.HINs中的稠密子图查询问题,即给定一个查询点q,在HINs中查询包含q的稠密子图,已成为该领域的热点和重点研究问题,并在活动策划、生物分析和商品推荐等领域具有广泛应用.但现有方法主要存在以下两个问题:(1)基于模体团和关系约束查询的稠密子图具有多种类型顶点,导致其不能解决仅关注某种特定类型顶点的场景;(2)基于元路径的方法虽然可查询到某种特定类型顶点的稠密子图,但其忽略了子图中顶点之间基于元路径的连通度.为此,首先在HINs中提出了基于元路径的边不相交路径的连通度,即路径连通度;然后,基于路径连通度提出了k-路径连通分量(k-PCC)模型,该模型要求子图的路径连通度至少为k;其次,基于k-PCC模型提出了最大路径连通Steiner分量(SMPCC)概念,其为包含q的具有最大路径连通度的k-PCC;最后,提出一种高效的基于图分解的k-PCC发现算法,并在此基础上提出了优化查询SMPCC算法.大量基于真实和合成HINs数据的实验结果验证了所提出模型和算法的有效性和高效性.

关 键 词:异质信息网络  稠密子图查询  k-路径连通分量  最大路径连通Steiner分量  元路径
收稿时间:2021/8/1 0:00:00
修稿时间:2022/1/17 0:00:00

Querying Algorithm for Steiner Maximum Path-connected Components in Heterogeneous Information Networks
LI Yuan,FAN Xiao-Lin,SUN Jing,ZHAO Hui-Qun,YANG Sen,WANG Guo-Ren.Querying Algorithm for Steiner Maximum Path-connected Components in Heterogeneous Information Networks[J].Journal of Software,2023,34(2):655-675.
Authors:LI Yuan  FAN Xiao-Lin  SUN Jing  ZHAO Hui-Qun  YANG Sen  WANG Guo-Ren
Affiliation:School of Information Science and Technology, North China University of Technology, Beijing 100144, China; School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
Abstract:Heterogeneous information networks (HINs) are directed graphs including multi-typed objects (vertices) and links (edges), which can express rich semantic information and complex structural information. The problem of cohesive subgraph query in HINs, i.e., given a query vertex q, it could be found that the cohesive subgraphs containing q in HINs has become an important problem, and has been widely used in various areas such as event planning, biological analysis, and product recommendation. Yet existing methods mainly have the two deficiencies:(1) cohesive subgraphs based on relational constraint and motif cliques contain multiple types of vertices, which makes it hard to solve the scenario of focusing on a specific type of vertices; (2) although the method based on meta-path can query the cohesive subgraphs with a specific type of vertices, it ignores the meta-path-based connectivity between the vertices in the subgraphs. Therefore, the connectivity with novel edge-disjoint paths is firstly proposed based on meta-path in HINs, i.e., path-connectivity. Then, the k-path connected component (k-PCC) is raised based on path-connectivity, which requires the path-connectivity of subgraph to be at least k. Next, the Steiner maximum path-connected component (SMPCC) is further proposed, which is the k-PCC containing q with the maximum path-connectivity. Finally, an efficient graph decomposition-based k-PCC discovery algorithm is designed, and based on this, an optimized SMPCC query algorithm is proposed. A large number of experiments on five real and synthetic HINs prove the effectiveness and efficiency of the proposed approaches.
Keywords:heterogeneous information networks (HINs)  cohesive subgraph query  k-path connected component  Steiner maximum-path-connected component  meta-path
点击此处可从《软件学报》浏览原始摘要信息
点击此处可从《软件学报》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号