
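Method of Entity Set Expansion Based on Frequent Pattern Under Meta Path
Cite this article:ZHENG Yu-Yan, TIAN Ying, SHI Chuan. Method of Entity Set Expansion Based on Frequent Pattern Under Meta Path[J]. Journal of Software, 2018, 29(10): 2915-2930.
Authors:ZHENG Yu-Yan  TIAN Ying  SHI Chuan
Affiliation:School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China (all three authors)
Funding:National Natural Science Foundation of China (61772082, 61375058); National Key Basic Research and Development Program of China (973 Program) (2013CB329606); Co-construction Project of the Beijing Municipal Commission of Education.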
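Abstract:Entity set expansion refers to obtaining more entities of a particular category from a few known seed entities of that category, according to certain rules. As a classic data mining task, entity set expansion has many applications, such as dictionary construction and query suggestion. Existing entity set expansion methods rely mainly on text or Web page information; that is, the relations between entities are inferred from their co-occurrence in text or Web pages. With the rise of knowledge graph research, it has become possible to study entity set expansion based on the co-occurrence of knowledge in a knowledge graph. This paper studies the entity set expansion problem in knowledge graphs: given several seed entities, use the knowledge graph to obtain more entities of the same category. We first model the knowledge graph as a heterogeneous information network, that is, a network containing multiple entity types or relation types, and propose a new entity set expansion method based on frequent patterns under meta paths, called FPMP_ESE. FPMP_ESE uses meta paths in the heterogeneous information network to capture the latent common features of the seed entities. To find the important meta paths between seed entities, we design a new frequent-pattern-based algorithm for automatic meta path generation, FPMPG. Then, to better assign a weight to each meta path, we design a heuristic method and a PU learning method. Finally, experiments on the real dataset Yago verify that the proposed method achieves better performance and higher efficiency than other methods on the entity set expansion task.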

Keywords:knowledge graph  entity set expansion  heterogeneous information network  meta path  frequent pattern  PU learning
Received:2017/7/20
Revised:2017/11/8

Method of Entity Set Expansion Based on Frequent Pattern Under Meta Path
ZHENG Yu-Yan, TIAN Ying and SHI Chuan. Method of Entity Set Expansion Based on Frequent Pattern Under Meta Path[J]. Journal of Software, 2018, 29(10): 2915-2930.
Authors:ZHENG Yu-Yan  TIAN Ying and SHI Chuan
Affiliation:Department of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China,Department of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China and Department of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
Abstract:Entity set expansion (ESE) refers to obtaining a more complete set of entities according to certain rules, given several seed entities with a specific semantic meaning. As a popular data mining task, ESE has many applications, such as dictionary construction and query suggestion. Contemporary ESE methods mainly utilize text or Web information; that is, the intrinsic relations among entities are inferred from their co-occurrences in text or on the Web. With the surge of knowledge graphs in recent years, it has become possible to expand entity sets according to co-occurrences in a knowledge graph. In this paper, we study the problem of entity set expansion in knowledge graphs: given several seed entities, more entities are obtained by leveraging the knowledge graph. We first model the knowledge graph as a heterogeneous information network (HIN), which contains multiple types of entities or relationships, and propose a novel method of entity set expansion based on frequent patterns under meta paths, called FPMP_ESE. FPMP_ESE employs meta paths to capture the implicit common traits of seed entities. To find the important meta paths between entities, we design an automatic meta path generation method based on frequent patterns, called FPMPG. Then, we design two kinds of methods, heuristic and PU learning, to assign weights to the meta paths. Finally, experiments on the real dataset Yago demonstrate that the proposed method is more effective and more efficient than other methods.
Keywords:knowledge graph  entity set expansion  heterogeneous information network  meta path  frequent pattern  PU learning
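
Illustrative sketch (not from the paper): the abstract describes finding meta paths shared by the seed entities, weighting them, and ranking candidate entities. The Python below is a toy approximation of that general idea under strong assumptions. It is not the authors' FPMPG algorithm or their PU learning weighting. It only considers length-2 meta paths of the form entity -r1-> x <-r2- entity, uses the fraction of ordered seed pairs a path connects as both the support and a heuristic weight, and runs on made-up triples. All function names and the `min_support` parameter are hypothetical.

```python
from collections import defaultdict
from itertools import permutations

# Toy knowledge graph as (head, relation, tail) triples; all data is made up.
TRIPLES = [
    ("A1", "actedIn", "M1"), ("A2", "actedIn", "M1"),
    ("A2", "actedIn", "M2"), ("A3", "actedIn", "M2"),
    ("A4", "actedIn", "M2"),
    ("A1", "bornIn", "C1"), ("A3", "bornIn", "C1"),
    ("D1", "directed", "M1"),
]

def build_index(triples):
    """Index outgoing and incoming edges of every node."""
    out_edges = defaultdict(set)  # head -> {(relation, tail)}
    in_edges = defaultdict(set)   # tail -> {(relation, head)}
    for head, rel, tail in triples:
        out_edges[head].add((rel, tail))
        in_edges[tail].add((rel, head))
    return out_edges, in_edges

def reachable(entity, out_edges, in_edges):
    """Map each length-2 meta path (r1, r2) to the entities it reaches."""
    reach = defaultdict(set)
    for r1, mid in out_edges.get(entity, ()):
        for r2, other in in_edges.get(mid, ()):
            if other != entity:
                reach[(r1, r2)].add(other)
    return reach

def frequent_meta_paths(seeds, reach, min_support):
    """Keep meta paths linking at least min_support of ordered seed pairs."""
    pairs = list(permutations(seeds, 2))  # assumes at least two seeds
    counts = defaultdict(int)
    for a, b in pairs:
        for path, targets in reach[a].items():
            if b in targets:
                counts[path] += 1
    return {p: c / len(pairs) for p, c in counts.items()
            if c / len(pairs) >= min_support}

def expand(triples, seeds, min_support=0.5):
    """Rank non-seed entities by weighted meta-path links to the seeds."""
    out_edges, in_edges = build_index(triples)
    reach = {s: reachable(s, out_edges, in_edges) for s in seeds}
    weights = frequent_meta_paths(seeds, reach, min_support)
    scores = defaultdict(float)
    for s in seeds:
        for path, w in weights.items():
            for cand in reach[s].get(path, ()):
                if cand not in seeds:
                    scores[cand] += w / len(seeds)
    ranking = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return weights, ranking

if __name__ == "__main__":
    weights, ranking = expand(TRIPLES, ["A1", "A2", "A3"])
    print(weights)
    print(ranking)
```

In this toy graph the seeds are three actors, so the shared-film pattern is the kind of common trait a meta path is meant to capture. A real system would need longer paths, typed nodes, and a learned weighting step such as the PU learning approach the abstract mentions.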