首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
一种基于DTD的XML索引方法   总被引:9,自引:0,他引:9  
路径查询是XML查询的一个主要特征,现已提出了多种XML索引方法.DTD的结构信息对于XML索引的建立及查询效率的提高很重要,但现有的大部分索引方法没有利用DTD这一有效资源,提出一种利用DTD的XML索引方法——DBXI(DTD-based XML indexing),该方法采用了新的编码方法,可使路径查询具备如下特征:对于由N个元素/属性组成的具有1个谓词约束的路径表达式,DBXI处理每个XML文档仅需0次或1次元素/属性结点集的结构连接操作;对于在XML文档中不存在匹配结构的路径查询,DBXI能够在比现有的XML索引方法较短的时间内给出无查询结果的判断.实验表明,与Lore,SphinX和XISS等索引方法相比,DBXI能够缩短路径查询的响应时间.  相似文献   

2.
提出了一种用于搜索XML文档的新的索引方法即RIST。通过采用代码化的结构序列(SES)来表示XML文档和XML查询,得出查询XML数据等同于查找子序列匹配。RIST采用树结构作为查询的基本单元,从而避免了代价高昂的连接操作。另外,RIST还在XML文档的内容和结构上提供了一个统一的索引,所以它的一个很明显的优势就是克服了仅仅根据内容或结构建立索引的弊端。实验表明RIST在支持结构查询上是一种高效的方法。  相似文献   

3.
摘要:本文提出了一种用于搜索XML文档的新的索引方法即RIST。通过采用代码化的结构序列(SES)来表示XML文档和XML查询,我们得出查询XML数据等同于查找子序列匹配。RIST采用树结构作为查询的基本单元,从而避免了代价高昂的连接操作。另外,RIST还在XML文档的内容和结构上提供了一个统一的索引,所以它的一个很明显的优势就是克服了仅仅根据内容或结构建立索引的弊端。实验表明RIST在支持结构查询上是一种高效的方法。  相似文献   

4.
基于编码的XML关系数据库存储   总被引:2,自引:0,他引:2  
在XML的发展过程中,如何有效地利用关系数据库技术存储和查询XML数据已经成为一个研究热点.提出了一种基于前、后序编码的XML关系数据库存储方法,该方法采用的模式映射方法能够使基于不同DTD(或schema)的XML文档保存在同一个关系表中,支持快速的XML路径查询,且具有较高的XML文档重组效率.对该方法中递归模式的处理技术也进行了讨论.实验表明,与XRel,Florescu和Kossman等人提出的XML关系数据库存储方法相比,该方法能够缩短复杂XML路径查询(如带条件谓词约束的路径查询)的响应时间.  相似文献   

5.
针对XML数据特有的树型结构模式,提出了一种将树型结构的XML数据和查询语句转化为特定格式的字符串,基于串匹配原理对结构复杂的XML数据进行查询的方法,避免了传统的基于路径的查询方式所必需的路径之间的连接(join)操作,从而提高查询效率。利用本文提出的编码方式,可以建立关于XML数据结构和数据内容舍为一体的索引。实验显示,本文使用的针对XML数据查询的方法比传统的基于连接操作的数据查询方式高效,且本方法具有良好的扩展性。  相似文献   

6.
7.
GML文档是XML技术在GIS方面的应用,成为空间数据在Internet上的实际表示、传输和交换的标准。目前,GML文档的查询是GIS领域的研究热点。对这一问题,研究了GML文档的数据特点和结构特点,设计了一种新的索引结构--GB树,GB树是专门针对GML文档中空间数据节点的索引结构。将XML Twig模式查询思想引入GML文档查询,借助GB树的索引特点,提出了GML文档的Twig模式查询算法--GMLTwigStackGB。GMLTwigStackGB算法保留了XML文档Twig模式查询算法的优势和特点,具有完整的空间查询功能。测试实验表明,该算法能够高效地满足GML文档上的各种数据查询。  相似文献   

8.
针对XML文档索引查询中非法路径查询响应时问过长的问题,提出一种利用DTD模式进行预处理的索引方法。建立索引DWBI,采用新的基十区域编码方式,有效地支持祖先一后代判断。查询时利用DTD模式对查询进行预处理,再查询带有DTD信息的XML索引树,从而提高查询的效率。  相似文献   

9.
We describe a method for generating queries for retrieving data from distributed heterogeneous semistructured documents, and its implementation in the metadata interface DDXMI (distributed document XML metadata interchange). The proposed system generates local queries appropriate to local schemas from a user query over the global schema. The system constructs mappings between global schema and local schemas (extracted from local documents if not given), path substitution, and node identification for resolving the heterogeneity among nodes with the same label that often exist in semistructured data. The system uses Quilt as its XML query language. An experiment is reported over three local semistructured documents: ‘thesis’, ‘reports’, and ‘journal’ documents with ‘article’ global schema. The prototype was developed under Windows system with Java and JavaCC.  相似文献   

10.
黎玲利  王宏志  高宏  李建中 《软件学报》2012,23(6):1561-1577
利用关键字可以在模式未知的情况下对XML数据进行查询.在当前的XML数据流上的关键字查询处理中,打分函数往往不能都满足各种用户不同的需求.提出了一种基于skyline的XML数据流上的Top-K关键字查询.对于这种查询,不需要考虑影响结果与查询相关性的复杂因素,只需利用skyline挑选与查询最相关的结果.提出了两种XML数据流上的有效的基于skyline的Top-K关键查询处理算法,包括对单查询和多查询的处理算法.通过扩展实验对两种算法的有效性和可扩展性进行了验证.经过实验验证,所提出的查询处理算法的效率几乎不受关键字个数、查询结果数量、查询数量等参数的影响,运行时间和文档大小大致呈线性关系.  相似文献   

11.
针对XML文档集的关键词检索结果排序   总被引:1,自引:0,他引:1       下载免费PDF全文
探讨了针对XML文档集中只与内容相关的关键词检索结果的排序问题,针对XML文档特征提出了一种新的排序模型,它不同于面向Web的XML网页的搜索结果的排序。设计了满足这种排序模型的倒排列表索引结构和搜索引擎的体系结构。  相似文献   

12.
采用索引技术,对输入的XML文档建立一个双索引结构来改进YFilter算法,优化XML文档过滤性能。藉助索引结构,该算法超前搜索元素结点在文档中的结构信息,预先排除不能保证得到任何匹配结果的元素结点,以避免大量不必要的查询处理。实验结果显示,当输入的XML文档较大时,该算法有较好的过滤性能。  相似文献   

13.
XML is a flexible and powerful tool that enables information and security sharing in heterogeneous environments. Scalable technologies are needed to effectively manage the growing volumes of XML data. A wide variety of methods exist for storing and searching XML data; the two most common techniques are conventional tree-based and relational approaches. Tree-based approaches represent XML as a tree and use indexes and path join algorithms to process queries. In contrast, the relational approach utilizes the power of a mature relational database to store and search XML. This method relationally maps XML queries to SQL and reconstructs the XML from the database results. To date, the limited acceptance of the relational approach to XML processing is due to the need to redesign the relational schema each time a new XML hierarchy is defined. We, in contrast, describe a relational approach that is fixed schema eliminating the need for schema redesign at the expense of potentially longer runtimes. We show, however, that these potentially longer runtimes are still significantly shorter than those of the tree approach. We use a popular XML benchmark to compare the scalability of both approaches. We generated large collections of heterogeneous XML documents ranging in size from 500 MB to 8 GB using the XBench benchmark. The scalability of each method was measured by running XML queries that cover a wide range of XML search features on each collection. We measure the scalability of each method over different query features as the collection size increases. In addition, we examine the performance of each method as the result size and the number of predicates increase. Our results show that our relational approach provides a scalable approach to XML retrieval by leveraging existing relational database optimizations. Furthermore, we show that the relational approach typically outperforms the tree-based approach while scaling consistently over all collections studied.
Ophir Frieder (Corresponding author)Email:
  相似文献   

14.
Effective support for temporal applications by database systems represents an important technical objective that is difficult to achieve since it requires an integrated solution for several problems, including (i) expressive temporal representations and data models, (ii) powerful languages for temporal queries and snapshot queries, (iii) indexing, clustering and query optimization techniques for managing temporal information efficiently, and (iv) architectures that bring together the different pieces of enabling technology into a robust system. In this paper, we present the ArchIS system that achieves these objectives by supporting a temporally grouped data model on top of RDBMS. ArchIS’ architecture uses (a) XML to support temporally grouped (virtual) representations of the database history, (b) XQuery to express powerful temporal queries on such views, (c) temporal clustering and indexing techniques for managing the actual historical data in a relational database, and (d) SQL/XML for executing the queries on the XML views as equivalent queries on the relational database. The performance studies presented in the paper show that ArchIS is quite effective at storing and retrieving under complex query conditions the transaction-time history of relational databases, and can also assure excellent storage efficiency by providing compression as an option. This approach achieves full-functionality transaction-time databases without requiring temporal extensions in XML or database standards, and provides critical support to emerging application areas such as RFID.  相似文献   

15.
Web-based databases are gaining increased popularity. This has positively influenced the availability of structured and semi-structured databases for access by a variety of users ranging from professionals to naive users. The number of users accessing online databases will continue to increase if the visual tools connected to web-based databases are flexible and user-friendly enough to meet the expectations of naive users and professionals. Further, XML is accepted as the standard for platform independent data exchange. This motivated for the development of the conversion tools between structured databases and XML. Realizing that such a need has not been well handled by the available tools, including Clio from IBM, we developed VIREX as a visual tool for converting relational databases into XML, and since then has been empowered with further capabilities to manipulate the produced XML schema including the maintenance of materialized views and schema evolution functions. VIREX provides an interactive approach for querying and integrating relational databases to produce XML documents and the corresponding XML schema(s). VIREX supports VRXQuery as a visual naive users-oriented query language that allows users to specify queries and define views directly on the interactive diagram as a sequence of mouse clicks with minimum keyboard input. As the query result, VIREX displays on the screen the XML schema that satisfies the specified characteristics and generates colored (easy to read) XML document(s). The main contribution described in this paper is the novel approach for turning query results into materialized views which are maintained to remain consistent with the underlying database. VIREX supports deferred update of XML views by keeping an ordered summary of the necessary and sufficient information required for the process. Each view has a corresponding marker in the ordered summary to indicate the start of the information to be reflected onto the view when it is accessed. When a view is accessed, its marker moves to the head of the list to mark for the next update. In addition, VIREX supports some basic schema evolution functions include renaming, adding and dropping of elements and attributes, among others. The supported schema evolution functions add flexibility to the view maintenance and materialization process.  相似文献   

16.
We consider data exchange for XML documents: given source and target schemas, a mapping between them, and a document conforming to the source schema, construct a target document and answer target queries in a way that is consistent with the source information. The problem has primarily been studied in the relational context, in which data-exchange systems have also been built. Since many XML documents are stored in relations, it is natural to consider using a relational system for XML data exchange. However, there is a complexity mismatch between query answering in relational and in XML data exchange. This indicates that to make the use of relational systems possible, restrictions have to be imposed on XML schemas and mappings, as well as on XML shredding schemes. We isolate a set of five requirements that must be fulfilled in order to have a faithful representation of the XML data-exchange problem by a relational translation. We then demonstrate that these requirements naturally suggest the in-lining technique for data-exchange tasks. Our key contribution is to provide shredding algorithms for schemas, documents, mappings and queries, and demonstrate that they enable us to correctly perform XML data-exchange tasks using a relational system.  相似文献   

17.
XML's increasing diffusion makes efficient XML query processing and indexing all the more critical. Given the semistructured nature of XML documents, however, general query processing techniques won't work. Researchers have proposed several specialized indexing methods that offer query processors efficient access to XML documents, although none are yet fully implemented in commercial products. In this article the classification of XML indexing techniques identifies current practices and trends, offering insight into how developers can improve query processing and select the best solution for particular contexts.  相似文献   

18.
XML is a new standard for exchanging and representing information on the Internet. Documents can be hierarchically represented by XML-elements. In this paper, we propose that an XML document collection be represented and indexed using a bitmap indexing technique. We define the similarity and popularity operations suitable for bitmap indexes. We also define statistical measurements in the BitCube: center, and radius. Based on these measurements, we describe a new bitmap indexing based technique to cluster XML documents. The techniques for clustering are motivated by the fact that the bitmap indexes are expected to be very sparse.Furthermore, a 2-dimensional bitmap index is extended to a 3-dimensional bitmap index, called the BitCube. Sophisticated querying of XML document collections can be performed using primitive operations such as slice, project, and dice. Experiments show that the BitCube can be created efficiently and the primitive operations can be performed more efficiently with the BitCube than with other alternatives.  相似文献   

19.
基于文档实例的中文信息检索   总被引:2,自引:0,他引:2  
传统的信息检索系统基于关键词建立索引并进行信息检索.这些系统存在查询返回文档集大、准确率低和普通用户不便于构造查询等不足.为此,该文提出基于文档实例的信息检索,即以已有文档作为样本,在文档库中检索与样本文档相似的所有文档.文中给出了基于文档实例的中文信息检索的解决方法和实现技术.初步实验结果表明该方法是行之有效的.  相似文献   

20.
Due to an explosive increase of XML documents, it is imperative to manage XML data in an XML data warehouse. XML warehousing imposes challenges, which are not found in the relational data warehouses. In this paper, we firstly present a framework to build an XML data warehouse schema. For the purpose of scalability due to the increase of data volume, we propose a number of partitioning techniques for multi-version XML data warehouses, including document based partitioning, schema based partitioning, and cascaded (mixed) partitioning model. Finally, we formulate cost models to evaluate various types of queries for an XML data warehouse.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号