Research on Chinese Address Matching Based on Knowledge Graph (基于知识图谱的中文地址匹配方法研究)
Cite this article: CHEN Yuhui, PI Zhou, JIANG Tengsheng, LI Xiang, WANG Zhen, XI Xuefeng, WU Hongjie, FU Baochuan. Research on Chinese Address Matching Based on Knowledge Graph[J]. Computer Engineering and Applications (计算机工程与应用), 2022, 58(14): 306-312.
Authors: CHEN Yuhui (陈雨晖)  PI Zhou (皮洲)  JIANG Tengsheng (姜滕圣)  LI Xiang (李响)  WANG Zhen (王震)  XI Xuefeng (奚雪峰)  WU Hongjie (吴宏杰)  FU Baochuan (付保川)
Affiliation: 1. School of Electronics and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China 2. Suzhou Public Security Bureau, Suzhou, Jiangsu 215009, China 3. Suzhou Smart City Research Institute, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
Abstract: With the rapid development of information technology, building new, efficient smart cities has become a trend. Smart cities involve many application scenarios built on geographic information: urban planning and construction, convenience services for residents, and fine-grained urban management all depend on it. Because Chinese addresses are complex and manual input is unreliable, address data is often non-standard, inconsistent, and ambiguous, which causes many difficulties both between and within business systems. A good Chinese address matching method is therefore urgently needed. Existing methods match on the address text alone and ignore the fact that an address, as an entity, carries rich geographic knowledge that can effectively assist the matching process. This paper therefore proposes an attention-based knowledge graph method for Chinese address matching, aimed at the low accuracy of matching complex Chinese addresses. First, a standard address knowledge graph and a POI knowledge graph are built by segmenting the addresses in a traditional standard address library and extracting their features. Second, a knowledge graph relation extraction method based on a selective attention mechanism is used to extract address features and thereby classify addresses. Finally, non-standard Chinese addresses are matched by computing the similarity of knowledge graph entities. The experimental results show that the method improves accuracy by 11.05%, 15.30%, 11.05%, and 0.95%, respectively, over address matching based on Jaccard similarity, address matching based on dynamic programming, full-text retrieval address matching based on Sorensen Dice, and address matching based on the bert4keras pre-trained model, and that it can effectively match complex Chinese addresses.

Keywords: knowledge graph  Chinese address  address matching
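
For readers unfamiliar with two of the baselines named in the abstract, the following is a minimal sketch of Jaccard and Sorensen Dice similarity applied to address strings. It is not the paper's implementation: the abstract does not say how the baselines tokenize addresses, so the use of character bigrams, the function names (`char_ngrams`, `jaccard`, `dice`, `best_match`), and the sample strings are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Set


def char_ngrams(text: str, n: int = 2) -> Set[str]:
    """Return the set of character n-grams of a string (assumed tokenization)."""
    if len(text) < n:
        # Strings shorter than n are kept whole; an empty string gives an empty set.
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dice(a: Set[str], b: Set[str]) -> float:
    """Sorensen Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def best_match(query: str,
               candidates: Iterable[str],
               score: Callable[[Set[str], Set[str]], float] = jaccard) -> str:
    """Return the candidate whose n-gram set scores highest against the query."""
    q = char_ngrams(query)
    return max(candidates, key=lambda c: score(q, char_ngrams(c)))


if __name__ == "__main__":
    # Hypothetical strings standing in for a non-standard address and a standard library.
    print(best_match("abcd", ["abce", "xyz"]))
    print(best_match("abcd", ["abce", "xyz"], score=dice))
```

Both scores depend only on surface overlap between the two strings, which is the limitation the abstract attributes to text-only matching methods.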