Study on model for plagiarism-detection of scientific papers based on sentence similarity (基于句子相似度的论文抄袭检测模型研究)

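Cite this article:	LENG Qiangkui (冷强奎), QIN Yuping (秦玉平), WANG Chunli (王春立). Study on model for plagiarism-detection of scientific papers based on sentence similarity [J]. Computer Engineering and Applications (计算机工程与应用), 2011, 47(24): 199-201.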

Authors:	LENG Qiangkui (冷强奎), QIN Yuping (秦玉平), WANG Chunli (王春立)

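Affiliation:	1. College of Information Science and Engineering, Bohai University, Jinzhou, Liaoning 121000, China; 2. College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026, China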

Funding:	National Natural Science Foundation of China; Key Laboratory Project of the Education Department of Liaoning Province

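Abstract:	A plagiarism-detection model for scientific papers based on sentence similarity is proposed. A local word-frequency fingerprint algorithm screens large document collections quickly and identifies suspected plagiarized documents. Sentence-to-sentence similarity is then computed with the longest sorted common subsequence algorithm, and the plagiarized passages are marked to provide supporting evidence. Experiments on the standard Chinese dataset SOGOU-T show that the model has a strong ability to mine local information and, to some extent, overcomes the low precision of existing plagiarism-detection algorithms for papers.
Keywords:	sentence similarity; plagiarism detection; local word frequency; longest sorted common subsequence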
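
The abstract names a local word-frequency fingerprint for the fast screening stage but does not give its definition. The following is only a minimal sketch of one plausible reading, not the authors' algorithm: each document is cut into fixed-size blocks of already segmented words, the most frequent words of each block form that block's fingerprint, and two documents are flagged when any pair of block fingerprints overlaps strongly. The names `local_fingerprints`, `jaccard`, `is_suspect` and the parameters `block_size`, `top_k`, `threshold` are all hypothetical.

```python
from collections import Counter


def local_fingerprints(tokens, block_size=50, top_k=5):
    """Return one fingerprint per block: the set of its top_k most frequent words.

    Assumption: `tokens` is a list of already segmented words (Chinese text
    would need a word segmenter first, which is outside this sketch).
    """
    prints = []
    for start in range(0, len(tokens), block_size):
        counts = Counter(tokens[start:start + block_size])
        # Sort by (-count, word) so that ties are broken deterministically.
        top = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        prints.append(frozenset(word for word, _ in top))
    return prints


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_suspect(doc_tokens, src_tokens, threshold=0.6, block_size=50, top_k=5):
    """Flag a document if any of its block fingerprints closely matches a source block."""
    src_prints = local_fingerprints(src_tokens, block_size, top_k)
    doc_prints = local_fingerprints(doc_tokens, block_size, top_k)
    return any(jaccard(p, q) >= threshold for p in doc_prints for q in src_prints)
```

Because only small sets are compared per block, a screen of this kind stays cheap on large collections; documents it flags would then go on to the slower sentence-level comparison.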
Study on model for plagiarism-detection of scientific papers based on sentence similarity

LENG Qiangkui, QIN Yuping, WANG Chunli. Study on model for plagiarism-detection of scientific papers based on sentence similarity [J]. Computer Engineering and Applications, 2011, 47(24): 199-201.

Authors:	LENG Qiangkui, QIN Yuping, WANG Chunli

Affiliation:	1. College of Information Science and Engineering, Bohai University, Jinzhou, Liaoning 121000, China; 2. College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026, China

Abstract:	A new model for detecting plagiarism in scientific papers based on sentence similarity is presented. Large-scale text collections are screened quickly with the Local Word-Frequency Fingerprint (LWFF) algorithm to find suspected plagiarized documents. Sentence similarity between source texts and destination texts is then computed from their Longest Sorted Common Subsequence (LSCS). The algorithm marks plagiarism details and shows the supporting evidence. Detection experiments on the SOGOU-T dataset show that the model has a strong capacity for mining local information and partly overcomes the low precision of existing plagiarism-detection methods for scientific papers.

Keywords:	sentence similarity; plagiarism detection; local word frequency; Longest Sorted Common Subsequence (LSCS)
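
The abstract does not define LSCS precisely or say how the subsequence length is normalized into a similarity score. As a rough illustration only, the sketch below assumes LSCS behaves like the classic order-preserving longest common subsequence over word tokens, and assumes a normalization of twice the common length divided by the total length of both sentences. The function names `lcs_length` and `sentence_similarity` are hypothetical and are not taken from the paper.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences.

    Standard dynamic programming that keeps only the previous row,
    so memory use is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sentence_similarity(a, b):
    """Similarity in [0, 1]; the 2*LCS/(|a|+|b|) normalization is an assumption."""
    if not a or not b:
        return 0.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


# Usage: compare a suspect sentence against a source sentence, token by token.
source = ["the", "model", "computes", "sentence", "similarity"]
suspect = ["the", "model", "quickly", "computes", "similarity"]
score = sentence_similarity(source, suspect)
```

Sentence pairs whose score exceeds a chosen threshold could then be marked as plagiarism evidence; the threshold itself would have to be tuned on real data.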