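With the development of Internet technology, the number of documents on the World Wide Web is growing exponentially. In such a vast information repository, users struggle to find the information they need, so processing this mass of documents automatically and efficiently has become an important research topic. This article builds separate classification models for the hypertext information extracted from the documents in a dataset (titles, hyperlinks, and tags) and for the document content itself. A neural network then integrates the individual models to produce the classification decision. The result is a meta-information-based ensemble classification algorithm for hypertext that makes better combined use of hypertext's multiple kinds of structured information. Experimental results show that, compared with methods that classify using a single kind of hypertext structural information, the meta-information-based ensemble algorithm achieves better classification performance.

Similar literature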

11.

A Structured Hypertext Data Model with Versioning for Engineering Documents

Ken C.K. Law, Yan Wang, Horace H.S. Ip 《Multimedia Tools and Applications》2003,19(3):241-258

12.

Outline of initial design of the Structured Hypertext Transfer Protocol

Sun Bin 《Journal of Computer Science and Technology (计算机科学技术学报)》2003,18(3):0-0

This paper presents an introduction to the initial design of the Structured Hypertext Transfer Protocol (STTP), a compatible extension to HTTP. It includes a new message set for the control of resource transmission, and the Structured Hypertext Markup Language (STML) for describing the structural information of Web pages. Experimental tests show that STTP can be significantly faster than HTTP, with the improvement in transmission time being around 70% to 400% and packet savings of the same magnitude, which is among the best performance improvements ever reported. The paper discusses the basic idea and major design considerations of these components, as well as a few important issues in developing STTP servers and clients.

13.

A Study of Approaches to Hypertext Categorization (total citations: 34; self-citations: 2; citations by others: 34)

Yiming Yang, Seán Slattery, Rayid Ghani 《Journal of Intelligent Information Systems》2002,18(2-3):219-241

Hypertext poses new research challenges for text classification. Hyperlinks, HTML tags, category labels distributed over linked documents, and meta data extracted from related Web sites all provide rich information for classifying hypertext documents. How to appropriately represent that information and automatically learn statistical patterns for solving hypertext classification problems is an open question. This paper seeks a principled approach to providing the answers. Specifically, we define five hypertext regularities which may (or may not) hold in a particular application domain, and whose presence (or absence) may significantly influence the optimal design of a classifier. Using three hypertext datasets and three well-known learning algorithms (Naive Bayes, Nearest Neighbor, and First Order Inductive Learner), we examine these regularities in different domains, and compare alternative ways to exploit them. Our results show that the identification of hypertext regularities in the data and the selection of appropriate representations for hypertext in particular domains are crucial, but seldom obvious, in real-world problems. We find that adding the words in the linked neighborhood to the page having those links (both inlinks and outlinks) was helpful for all our classifiers on one data set, but more harmful than helpful for two out of the three classifiers on the remaining datasets. We also observed that extracting meta data from related Web sites was extremely useful for improving classification accuracy in some of those domains. Finally, the relative performance of the classifiers being tested provided insights into their strengths and limitations for solving classification problems involving diverse and often noisy Web pages.
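
The "linked neighborhood" representation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `augment_with_neighbors`, the toy pages, and the link map are all assumptions made for illustration. It simply merges the words of a page's inlink and outlink neighbors into that page's bag of words.

```python
from collections import Counter

def augment_with_neighbors(page_id, words, outlinks):
    """Bag of words for a page plus the words of its in- and out-link neighbors.

    `words` maps page id -> list of tokens; `outlinks` maps page id -> list of
    linked page ids. Both structures are hypothetical, for illustration only.
    """
    # Pages this page links to (outlinks).
    neighbors = set(outlinks.get(page_id, []))
    # Pages that link to this page (inlinks).
    neighbors |= {src for src, dsts in outlinks.items() if page_id in dsts}
    neighbors.discard(page_id)  # ignore self-links

    bag = Counter(words[page_id])
    for n in neighbors:
        bag.update(words.get(n, []))
    return bag

# Toy data: page "a" links to "b", and page "c" links to "a".
words = {
    "a": ["course", "syllabus"],
    "b": ["faculty", "research"],
    "c": ["course", "homework"],
}
outlinks = {"a": ["b"], "c": ["a"]}
bag = augment_with_neighbors("a", words, outlinks)
```

As the abstract notes, this kind of augmentation helped on one dataset and hurt on others, so whether to apply it is a per-domain decision.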

14.

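A Scalable Classification Algorithm Implemented Using Database Technology (total citations: 9; self-citations: 0; citations by others: 9)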

Liu Hongyan, Lu Hongjun, Chen Jian 《Journal of Software (软件学报)》2002,13(6):1075-1081

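This paper focuses on efficient, scalable classification algorithms that tightly couple the classification techniques of data mining with database technology. It proposes a method for constructing a classifier based on a group-counting technique, using the database system's Structured Query Language (SQL) to carry out the main computational tasks. To improve execution efficiency, it also proposes optimization strategies and a pruning strategy for redundant rules, and integrates the discovery of classification rules with the selection of relevant attributes. With these methods and strategies, the algorithm can quickly discover a concise set of rules from large datasets. Besides offering accuracy comparable to existing classification algorithms and high execution efficiency, the algorithm scales well in both the number of training tuples and the number of attributes, and is easy to implement.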
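
The abstract does not give the algorithm's details, so the following is only a rough sketch of the general idea of pushing the counting work into SQL. The table `train`, its columns, and the toy rows are assumptions for illustration; the rule-building step (predict the majority label per attribute value) is a simple stand-in, not the paper's method.

```python
import sqlite3

# Hypothetical toy training table; column names and rows are illustrative.
rows = [
    ("sunny", "no"), ("sunny", "no"), ("overcast", "yes"),
    ("rain", "yes"), ("rain", "yes"), ("rain", "no"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE train (outlook TEXT, label TEXT)")
conn.executemany("INSERT INTO train VALUES (?, ?)", rows)

# Let the database do the counting: group by attribute value and class label.
counts = conn.execute(
    "SELECT outlook, label, COUNT(*) FROM train "
    "GROUP BY outlook, label ORDER BY outlook, label"
).fetchall()

# Turn the counts into one rule per attribute value: predict the majority label.
best = {}
for value, label, n in counts:
    if value not in best or n > best[value][1]:
        best[value] = (label, n)
rules = {value: label for value, (label, n) in best.items()}
conn.close()
```

Because the counts come from a single `GROUP BY` query, the training data never has to be loaded into application memory, which is the scalability argument the abstract makes.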

15.

Research on a Genetic-Algorithm-Based Optimization Algorithm for Relational Database Watermarking

Wang Chunfang, Cui Xinchun 《Computer Security (计算机安全)》2010,(2):14-17

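The paper proposes using a genetic algorithm to select the optimal tuples for watermark embedding from among the massive number of tuples in a relational database, achieving optimized embedding in a short time and greatly improving embedding efficiency. Error-correcting coding and a voting mechanism are also used to increase watermark robustness. Experimental results show that this optimization scheme greatly improves embedding efficiency and balances reasonably well the conflict between watermark robustness and database usability.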
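
The abstract does not describe its voting mechanism, so the sketch below shows only a generic majority vote over redundantly embedded watermark copies, a common way to make extraction tolerate a few corrupted bits. The function name `majority_vote` and the sample bit strings are assumptions for illustration.

```python
from collections import Counter

def majority_vote(copies):
    """Recover each watermark bit as the most common value across all copies."""
    length = len(copies[0])
    recovered = []
    for i in range(length):
        votes = Counter(copy[i] for copy in copies)
        recovered.append(votes.most_common(1)[0][0])
    return recovered

# Hypothetical watermark and three extracted copies, two of them damaged.
watermark = [1, 0, 1, 1, 0]
extracted = [
    [1, 0, 1, 1, 0],
    [1, 1, 1, 1, 0],  # bit at index 1 flipped
    [1, 0, 0, 1, 0],  # bit at index 2 flipped
]
recovered = majority_vote(extracted)
```

Using an odd number of copies avoids ties; each bit position here is wrong in at most one of the three copies, so the vote restores the original watermark.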

16.

A Database Search Algorithm Based on a P2P Topic Index Network

Ma Guangzhi, Yang Xi, Liao Jiaguo, Lu Yansheng 《Computer Engineering (计算机工程)》2007,33(19):72-74,8

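Traditional single-layer P2P networks have difficulty balancing search efficiency with high dynamism, and suffer from problems such as single points of failure and uneven load. This paper builds a system on a "two-layer topic index network" that combines the advantages of unstructured and structured networks and uses a multiple-hash-function strategy for joining nodes and publishing resources. High-priority nodes are selected for communication based on an interest-degree cache and relative distance, giving the model substantial improvements in search speed, precision, resilience to single points of failure, and load balancing.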

17.

A Study of Network Databases in University Digital Libraries

Hu Yurong 《Digital Community & Smart Home (数字社区&智能家居)》2006,(36)

By introducing the network databases used in university digital libraries, this paper offers suggestions for selecting and managing them.

18.

A Digital Watermarking Algorithm for Relational Databases Based on Information Hiding

Wang Zhiwei, Kong Xiangwei 《Computer Engineering (计算机工程)》2009,35(23):161-163

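This paper designs a digital watermarking algorithm for relational databases that embeds a watermark without modifying the attribute values of the relational data. Using information hiding techniques, modifications to the relational data are converted into modifications to an information-hiding carrier, so the embedding process avoids damaging the original data and preserves its authenticity and usefulness. Experiments show that the algorithm has good robustness.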

19.

Building a Hybrid Database Application for Structured Documents

Klemens Böhm, Karl Aberer, Wolfgang Klas 《Multimedia Tools and Applications》1999,8(1):65-90

In this article, we propose a database-internal representation for SGML/HyTime documents based on object-oriented database technology with the following features: documents of arbitrary type can be administered. The semantics of architectural forms is reflected by means of methods that are part of the database schema and by the database-internal representation of HyTime-specific characteristics. The framework includes mechanisms to ensure conformance of documents to the HyTime standard. Measures for improved performance of HyTime operations are also described. The database-internal representation of documents is a hybrid between a completely structured and a flat representation. The structured representation better supports the HyTime semantics and modifications of document components. On the other hand, most operations are faster for the flat representation, as will be shown.

20.

XML-Based Hypertext Functionalities for Software Engineering

Luca Bompani, Paolo Ciancarini, Fabio Vitali 《Annals of Software Engineering》2002,13(1-4):231-247

Hypertext functionalities represent a form of the distilled wisdom of the hypermedia community. Even though they were introduced and advocated in the pre-Web era, most of these functionalities are absent in current Web browsers. However, such functionalities can be very useful in some specific application fields, such as browsing complex software engineering documents using standard WWW components. We propose to exploit the advent of XML as a basic infrastructure for describing software engineering hypertexts. In fact, we describe XMLC, a prototype of an XML browser that, given its modular architecture and general scope, can be seen as the basis for implementing sophisticated hypertext functionalities for software engineering documentation to be maintained and browsed on the Web.