
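Design and Implementation of a Website Full-Text Search System Based on Web Page Feature Extraction
Cite this article: YANG Ru-xiang, ZENG Xian-hui. Design and Implementation of a Website Full-Text Search System Based on Web Page Feature Extraction[J]. Journal of Donghua University (Natural Science Edition), 2007, 33(5): 639-643
Authors: YANG Ru-xiang  ZENG Xian-hui
Affiliations: 1. Ningbo Zhendong Optoelectronics Co., Ltd., Ningbo, Zhejiang 315403
2. College of Information Science and Technology, Donghua University, Shanghai 201620
Abstract: A program framework for a full-text search system targeting a specific website is presented, and its working principle and implementation process are described. In building the full-text information database, a web page feature information extraction technique is proposed that exploits the characteristics of HTML documents and effectively reduces the amount of information stored. Finally, application results are given.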

Keywords: feature extraction  website  full-text search system  full-text information database  search agent
Article ID: 1671-0444(2007)05-0639-05
Revision date: 2006-02-22

Design and Realization of a Kind of Full-text Searching System for Website Based on the Feature Extraction of Web Pages
YANG Ru-xiang,ZENG Xian-hui. Design and Realization of a Kind of Full-text Searching System for Website Based on the Feature Extraction of Web Pages[J]. Journal of Donghua University, 2007, 33(5): 639-643
Authors:YANG Ru-xiang  ZENG Xian-hui
Affiliation: 1. Ningbo Zhendong Optoelectronics Co., Ltd., Ningbo, Zhejiang 315403, China; 2. College of Information Science and Technology, Donghua University, Shanghai 201620, China
Abstract: The program framework of a full-text search system for a target website is presented, and its working principle and implementation are described. For building the full-text information database, a technique for extracting feature information from web pages is proposed, based on the characteristics of HTML documents; it effectively reduces the amount of data that must be stored. Finally, some application results are given.
Keywords:feature extraction   website   full-text searching system   full-text information database   searching agent
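
The abstract does not say which page features the authors extract or how their database is organized. As a rough illustration only, the sketch below assumes that the features are the page title, the "keywords" and "description" meta tags, and the h1 to h3 headings, and that the full-text information database is a simple in-memory inverted index. The names FeatureExtractor, extract_features, build_index and search are hypothetical and do not come from the paper.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class FeatureExtractor(HTMLParser):
    """Collects a small set of features from one HTML page.

    Assumption: title, meta keywords/description and h1-h3 headings
    stand in for the paper's (unspecified) page features.
    """

    SKIP = {"script", "style"}           # content that is never indexed
    KEEP = {"title", "h1", "h2", "h3"}   # tags assumed to carry key terms

    def __init__(self):
        super().__init__()
        self.features = defaultdict(list)
        self._stack = []                 # open tags we are tracking

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "meta" and name in ("keywords", "description"):
            self.features[name].append(attrs.get("content") or "")
        elif tag in self.SKIP or tag in self.KEEP:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # Body text outside the KEEP tags is dropped, which is what
        # keeps the stored information small in this sketch.
        if text and self._stack and self._stack[-1] in self.KEEP:
            self.features[self._stack[-1]].append(text)


def extract_features(html):
    """Return a dict mapping feature name to its joined text."""
    parser = FeatureExtractor()
    parser.feed(html)
    parser.close()
    return {name: " ".join(parts) for name, parts in parser.features.items()}


def build_index(pages):
    """Map each lowercase term to the set of URLs whose features contain it."""
    index = defaultdict(set)
    for url, html in pages.items():
        for text in extract_features(html).values():
            for term in re.findall(r"\w+", text.lower()):
                index[term].add(url)
    return index


def search(index, query):
    """Return the URLs whose features contain every term of the query."""
    terms = re.findall(r"\w+", query.lower())
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results
```

In this sketch, build_index takes a dict that maps each URL to its HTML source, and search(index, "some query") returns the matching URLs. Indexing only the extracted features rather than the whole page trades recall for a smaller store; whether the paper makes the same trade-off cannot be determined from the abstract.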
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).