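Query Expansion Combining Copulas Theory and Association Rules Mining
Cite this article: HUANG Mingxuan, HU Xiaochun. Query Expansion Combining Copulas Theory and Association Rules Mining[J]. Pattern Recognition and Artificial Intelligence, 2021, 34(2): 176-188.
Authors: HUANG Mingxuan  HU Xiaochun
Affiliations: 1. Guangxi Key Laboratory of Cross-Border E-commerce Intelligent Information Processing, Guangxi University of Finance and Economics, Nanning 530003
2. School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003
Funding: Supported by the National Natural Science Foundation of China (No. 61762006)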
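Abstract (translated from the Chinese): Copulas theory is introduced into the mining of association patterns among text feature terms, and a query expansion algorithm that combines Copulas theory with association rule mining is proposed. The top n documents of the initial retrieval result are extracted to build a pseudo-relevance feedback document set or a user relevance feedback document set. Support and confidence measures based on Copulas theory are then used to mine, from the feedback document set, frequent itemsets of feature terms and association rule patterns that contain the original query terms, and expansion terms are drawn from these rule patterns to carry out query expansion. Experiments on the NTCIR-5 CLIR Chinese and English text corpora show that the algorithm effectively curbs query topic drift and term mismatch, improves retrieval performance, raises the quality of expansion terms, and reduces the number of invalid expansion terms.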

Keywords: Natural Language Processing  Query Expansion  Information Retrieval  Association Rule  Text Mining
Received: 2020-07-06

Query Expansion Combining Copulas Theory and Association Rules Mining
HUANG Mingxuan, HU Xiaochun. Query Expansion Combining Copulas Theory and Association Rules Mining[J]. Pattern Recognition and Artificial Intelligence, 2021, 34(2): 176-188.
Authors: HUANG Mingxuan  HU Xiaochun
Affiliation:1. Guangxi Key Laboratory of Cross-Border E-commerce Intelligent Information Processing, Guangxi University of Finance and Economics, Nanning 530003
2. School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003
Abstract: Copulas theory is introduced into the mining of association patterns among text feature terms, and a query expansion algorithm combining Copulas theory and association rule mining is proposed. First, the top n documents returned by the initial query are extracted to construct a pseudo-relevance feedback document set (PRFDS) or a user relevance feedback document set (URFDS). Then, support and confidence measures based on Copulas theory are applied to the PRFDS or URFDS to mine frequent itemsets of feature terms and association rule patterns that contain the original query terms, and expansion terms are drawn from these patterns to realize query expansion. Experimental results on the NTCIR-5 CLIR Chinese and English corpora show that the proposed algorithm effectively restrains query topic drift and word mismatch, improves information retrieval performance, raises the quality of expansion terms, and reduces the number of invalid expansion terms.
Keywords:Natural Language Processing  Query Expansion  Information Retrieval  Association Rule  Text Mining  
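
The pipeline described in the abstract (feedback documents, rule mining around the original query terms, extraction of expansion terms) can be illustrated with a minimal sketch. The abstract does not give the Copulas-based support and confidence formulas, so this sketch substitutes the classical definitions of support and confidence and mines only two-term rules of the form "query term -> candidate term". The function name `mine_expansion_terms`, the thresholds, and the toy documents are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def mine_expansion_terms(feedback_docs, query_terms,
                         min_support=0.5, min_confidence=0.6):
    """Return (term, confidence) pairs for candidate expansion terms.

    feedback_docs: list of term collections, one per top-n feedback document.
    NOTE: classical support/confidence are used here as a stand-in; the
    Copulas-based measures of the paper are not reproduced.
    """
    n_docs = len(feedback_docs)
    query = set(query_terms)
    term_count = Counter()   # number of documents containing each term
    pair_count = Counter()   # number of documents containing (query term, candidate)

    for doc in feedback_docs:
        terms = set(doc)
        term_count.update(terms)
        for q in query & terms:
            for t in terms - query:
                pair_count[(q, t)] += 1

    scores = {}
    for (q, t), count in pair_count.items():
        support = count / n_docs            # share of documents with both terms
        confidence = count / term_count[q]  # confidence of the rule q -> t
        if support >= min_support and confidence >= min_confidence:
            # Keep the best rule confidence seen for each candidate term.
            scores[t] = max(scores.get(t, 0.0), confidence)

    # Highest confidence first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy feedback set standing in for the top-n retrieved documents.
feedback_docs = [
    {"query", "expansion", "retrieval", "copula"},
    {"query", "expansion", "rules"},
    {"query", "retrieval", "copula"},
    {"weather", "forecast"},
]
expansion = mine_expansion_terms(feedback_docs, ["query"])
expanded_query = ["query"] + [term for term, _ in expansion]
```

Restricting candidates to rules that contain an original query term mirrors the constraint stated in the abstract; a fuller version would mine larger itemsets and replace the two measures with the Copulas-based ones.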