Similar literature
1.
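In recent years, querying XML data has become an important research topic. Processing twig queries is the core operation in implementing XML queries. For twig pattern queries, an improved twig pattern matching algorithm is proposed. The algorithm prunes useless data streams to reduce the number of nodes to be processed, thereby saving processing time and improving query accuracy. Experimental results show that the algorithm can effectively improve query efficiency.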

2.
Matching twigs in fuzzy XML   Cited by: 2 (self-citations: 0, citations by others: 2)
A considerable number of twig pattern matching algorithms have been proposed to holistically process a twig query. Those algorithms mainly focus on twig pattern queries with AND-logic. However, there is often a need to process a twig query with OR-predicates. Furthermore, the existing algorithms fall short in their ability to support twig queries with OR-logic in fuzzy XML. To overcome this limitation, in this paper, we first introduce a novel encoding scheme to represent node information in fuzzy XML. Based on the encoding scheme, we then propose an effective algorithm for matching a twig pattern query with AND/OR-logic in fuzzy XML. Our approach adopts a compact stack technique to process complicated twig queries consisting of both AND-logic and OR-logic. More importantly, our method eliminates re-scanning unnecessary portions of XML documents and redundant intermediate results. Finally, the experimental results demonstrate the performance advantages of our approach.
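
To make the idea of a twig query with AND/OR predicates concrete, the following is a minimal sketch of a naive recursive matcher over an ordinary (non-fuzzy) XML tree. It is not the stack-based algorithm of the paper and ignores fuzzy membership degrees; the tuple-based query representation and the names `satisfies` and `match_twig` are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical query representation (not taken from the paper):
#   ("and", p1, p2, ...)  all sub-predicates must hold
#   ("or",  p1, p2, ...)  at least one sub-predicate must hold
#   ("has", name, sub)    some proper descendant named `name` satisfies `sub`
#   None                  no condition


def satisfies(elem, pred):
    """Check whether `elem` satisfies an AND/OR predicate tree."""
    if pred is None:
        return True
    kind = pred[0]
    if kind == "and":
        return all(satisfies(elem, p) for p in pred[1:])
    if kind == "or":
        return any(satisfies(elem, p) for p in pred[1:])
    if kind == "has":
        _, name, sub = pred
        # elem.iter(name) also yields elem itself when its tag matches,
        # so the element is excluded explicitly.
        return any(d is not elem and satisfies(d, sub)
                   for d in elem.iter(name))
    raise ValueError(f"unknown predicate kind: {kind!r}")


def match_twig(root, tag, pred):
    """Return all elements named `tag` that satisfy `pred`, in document order."""
    return [e for e in root.iter(tag) if satisfies(e, pred)]
```

This sketch re-scans subtrees for every candidate, which is exactly the kind of repeated work that holistic, stack-based approaches are designed to avoid.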

3.
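With the rapid development of the Internet, XML has become the common standard for data representation and exchange on the Web. How to query XML data effectively has therefore become an important research topic. In recent years, the twig pattern matching problem has been studied extensively, and many twig pattern matching algorithms have been proposed. Drawing on the strengths of these algorithms, a new twig pattern matching algorithm, TwigEN, is proposed. Based on the structure of the XML document, it can skip element nodes that are useless in structural joins, which not only reduces the number of nodes to be processed and shortens processing time but also saves memory.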

4.
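Query optimization for XML databases is currently a research hotspot in the database field, and twig pattern matching is one of its key topics. Based on a summary and analysis of various twig pattern matching algorithms, a new twig pattern matching method based on Extended Dewey encoding is proposed. The method first uses the TJFast algorithm to perform pre-matching on the JoinGuide index of the XML document, and then scans the leaf node sequences in the pre-matching results to find all matches. Finally, the method is compared experimentally with other algorithms, and the experimental results are analyzed.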

5.
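The wide use of XML data has made high-performance XQuery implementation an important topic in XML data processing, but the flexibility and complexity of XQuery pose great challenges for research on implementation techniques. A high-performance implementation of the XQuery language needs to exploit the query optimization methods provided by XML query algebras and also needs efficient holistic tree pattern matching algorithms. This paper presents a basic architecture for implementing the XQuery language, discusses the state of research in China and abroad on the key techniques of XQuery implementation in native XML database systems, namely query algebras and tree pattern queries, and outlines future research directions and the challenges ahead.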

6.
Indexing and querying XML using extended Dewey labeling scheme   Cited by: 1 (self-citations: 0, citations by others: 1)
Finding all the occurrences of a tree pattern in an XML database is a core operation for efficient evaluation of XML queries. The Dewey labeling scheme is commonly used to label an XML document to facilitate XML query processing by recording information on the path of an element. In order to improve the efficiency of XML tree pattern matching, we introduce a novel labeling scheme, called extended Dewey, which effectively extends the existing Dewey labeling scheme to combine the types and identifiers of elements in a label, and to avoid the scan of labels for internal query nodes to accelerate query processing (in I/O cost). Based on extended Dewey, we propose a series of holistic XML tree pattern matching algorithms. We first present TJFast to answer an XML twig pattern query. To efficiently answer a generalized XML tree pattern, we then propose GTJFast, an optimization that exploits the non-output nodes. In addition, we propose TJFastTL and GTJFastTL based on the tag + level data partition scheme to further reduce I/O costs by level pruning. Finally, we report our comprehensive experimental results to show that our set of XML tree pattern matching algorithms is superior to existing approaches in terms of the number of elements scanned, the size of intermediate results and query performance.
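
As a rough illustration of the plain Dewey labeling scheme that the abstract builds on, the sketch below assigns each element a label that records its path from the root, then decides ancestor/descendant and parent/child relationships by prefix comparison. It does not implement extended Dewey (which also encodes element types in the label) or TJFast; the function names and the choice of `(1,)` as the root label are assumptions.

```python
import xml.etree.ElementTree as ET


def dewey_labels(root):
    """Return (label, tag) pairs in document order; a label is a tuple such as (1, 2, 1)."""
    out = []

    def visit(elem, label):
        out.append((label, elem.tag))
        # The i-th child of a node labeled L gets the label L + (i,).
        for i, child in enumerate(elem, start=1):
            visit(child, label + (i,))

    visit(root, (1,))
    return out


def is_ancestor(a, d):
    """True if label `a` is a proper prefix of label `d`."""
    return len(a) < len(d) and d[:len(a)] == a


def is_parent(a, d):
    """True if `a` is a prefix of `d` and exactly one component shorter."""
    return len(a) + 1 == len(d) and d[:len(a)] == a
```

Because a label carries the whole root-to-node path, structural relationships can be checked from two labels alone, without visiting the document again.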

7.
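A high-performance implementation of the XQuery language needs to exploit the query optimization methods provided by XML query algebras and also needs efficient holistic tree pattern matching algorithms. To combine these two XML query processing techniques effectively in an XQuery processing system, a general system framework is proposed to support high-performance implementation of XQuery. Within this framework, open connections to XML data sources are provided, and FXQL, a functional query plan description language used as an intermediate language, supports the representation of various query algebra operators and tree query patterns, so that both various XML query algebras and various tree pattern query algorithms can be adopted. Furthermore, program transformations at this intermediate layer enable query rewriting based on various query algebras and separate independent tree pattern query computations from the query plan, so that the two query processing techniques are properly unified in a single system framework, effectively supporting XQuery implementation in a variety of environments.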

8.
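Complex Twig Pattern query processing over XML stream data   Cited by: 2 (self-citations: 0, citations by others: 2)
XML stream data processing has attracted wide interest among researchers. A new method is proposed for processing complex Twig Pattern queries with nested AND/OR predicates over XML stream data. To improve query processing performance, all Twig Patterns are merged into a single query tree with shared prefixes, in which AND/OR predicates are represented as separate abstract syntax trees. Complex Twig Patterns can thus be matched in document order in a single pass, avoiding the intermediate results that YFilter produces through post-processing of nested predicates. Experimental results show that the method effectively improves Twig Pattern processing performance, especially for large documents. Based on the existing...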
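
The idea of merging many queries into one query tree with shared prefixes can be sketched with a simple trie over path steps. This sketch covers only linear paths without predicates and is not the method of the paper; the dictionary layout and the names `build_query_trie` and `count_nodes` are hypothetical.

```python
def build_query_trie(paths):
    """Merge linear path queries such as '/a/b/c' into one prefix-sharing tree.

    Each trie node is a dict with its child steps and the ids of the
    queries that end at that node.
    """
    root = {"children": {}, "queries": []}
    for query_id, path in enumerate(paths):
        node = root
        for step in path.strip("/").split("/"):
            # Reuse the existing child for a shared prefix, else create it.
            node = node["children"].setdefault(
                step, {"children": {}, "queries": []}
            )
        node["queries"].append(query_id)
    return root


def count_nodes(node):
    """Count trie nodes, including the artificial root."""
    return 1 + sum(count_nodes(c) for c in node["children"].values())
```

With shared prefixes, a step common to several queries is evaluated once per incoming element rather than once per query.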

9.
Jian Liu, Z. M. Ma, Li Yan. World Wide Web, 2013, 16(3): 325-353
As the next generation language of the Internet, XML has been the de-facto standard of information exchange over the web. A core operation for XML query processing is to find all the occurrences of a twig pattern in an XML database. In addition, the study of probabilistic data has become an emerging topic for various applications on the Web. Therefore, researching the combination of XML twig pattern and probabilistic data is quite significant. In prior work of probabilistic XML, the answers of a given twig query are always complete. However, complete answers with low probabilities may be deemed irrelevant while incomplete answers with high probabilities are of great significance because incomplete answers may be the potential answers that interest the users. Different from complete evaluation, evaluating incomplete twigs in probabilistic XML introduces some new challenges. On one hand, incomplete queries do not only obtain complete matches, but also return answers that contain considerable incomplete matches. On the other hand, the processing of incomplete evaluation is more complicated. It is obvious that a ranking approach should be adopted along with evaluating incomplete answers. In this paper, we propose an efficient algorithm to handle the problem of querying incomplete twigs over the probabilistic XML database. We also present a novel algorithm for ranking the incomplete answers. The experimental results show that our proposed algorithms can improve the performance of querying and ranking incomplete twigs significantly.

10.
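Path expression queries are one of the core research problems in XML query processing, and researchers have done a great deal of work on them. However, this work has focused mostly on matching path expressions over XML data and has neglected the "contains" predicate. This paper studies methods for processing the "contains" predicate in XML query processing. Two methods are used. The first uses a skip list to read node data and match keywords dynamically during XML twig pattern matching. The second builds an inverted index over the words in the XML document to perform keyword matching. The two methods are analyzed and compared in terms of the path length of the twig pattern, the number of query keywords, and the type of node on which the "contains" predicate is evaluated.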
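
The second approach, an inverted index over the words of the document, can be sketched as follows. This is a minimal in-memory illustration under assumptions of my own (node ids are document-order positions, only an element's own text is indexed, and words are split on non-word characters); it is not the index structure evaluated in the paper, and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_inverted_index(root):
    """Map each lowercase word to the ids of the nodes whose own text contains it."""
    index = defaultdict(set)
    nodes = list(root.iter())  # node id = position in document order
    for node_id, elem in enumerate(nodes):
        for word in re.findall(r"\w+", (elem.text or "").lower()):
            index[word].add(node_id)
    return index, nodes


def contains(index, nodes, tag, keyword):
    """Return sorted ids of elements named `tag` whose text contains `keyword`."""
    candidates = index.get(keyword.lower(), ())
    return sorted(i for i in candidates if nodes[i].tag == tag)
```

The keyword lookup touches only the posting set for that word, so the cost of evaluating the predicate depends on how often the keyword occurs rather than on the size of the document.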

11.
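XML document querying is currently a research hotspot, and twig pattern matching is an important research direction within it, but most algorithms based on this idea can only handle queries containing ancestor/descendant relationships. To address this, a new twig pattern matching algorithm, TwigStackPC, is proposed that can effectively handle queries containing both ancestor/descendant and parent/child relationships.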

12.
Massive XML data are increasingly generated for the representation, storage and exchange of web information. Twig query processing over massive XML data has become a research focus. However, most traditional algorithms cannot be directly implemented in a distributed manner. Some of the existing distributed algorithms generate a lot of useless intermediate results and execute many join operations of partial results in most cases; others require a priori knowledge of the query pattern before XML partition, storage and query processing, which is impractical in the cases of large-scale data or frequent incoming new queries. To improve efficiency and scalability, in this paper, we propose a 3-phase distributed algorithm DisT3 based on a node distribution mechanism to avoid unnecessary intermediate results. Furthermore, we propose a lightweight local index ReP with an enhanced XML partitioning approach using an arbitrary partitioning strategy, and based on ReP we propose an improved 2-phase distributed algorithm DisT2ReP to further reduce the communication cost. After the performance guarantees are analyzed, extensive experiments are conducted to verify the efficiency and scalability of our proposed algorithms in distributed twig query applications.

13.
As huge volumes of data are organized or exported in tree-structured form, it is quite necessary to extract useful information from these data collections using effective and efficient query processing methods. A natural way of retrieving desired information from XML documents is using twig pattern (TP), which is, actually, the core component of existing XML query languages. Twig pattern possesses the inherent feature that query nodes on the same path have concrete precedence relationships. It is this featu...

14.
XML data broadcast is an efficient way to disseminate XML data to a large number of mobile clients in mobile wireless networks. Recently, several indexing methods have been proposed to improve the performance of XML query processing in terms of access time and tuning time over XML streams. However, existing indexing methods cannot process twig pattern XML queries. In this paper, we propose a novel structure for streaming XML data called PS+Pre/Post by integrating the path summary technique and the pre/post labeling scheme. Our proposed XML stream structure exploits the benefits of the path summary technique and the pre/post labeling scheme to efficiently process different types of XML queries over the broadcast stream. Experimental results show that our proposed XML stream structure improves the performance of access time and tuning time in processing different types of XML queries.
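
The pre/post labeling scheme mentioned above can be illustrated with a short sketch: every element receives its preorder and postorder rank from one depth-first traversal, and an element is an ancestor of another exactly when it comes earlier in preorder and later in postorder. The sketch covers only the labeling, not the path summary or the PS+Pre/Post stream structure; the function names are assumptions.

```python
import xml.etree.ElementTree as ET


def pre_post_labels(root):
    """Return (tag, pre, post) triples in preorder from one depth-first traversal."""
    labels = []        # entries are appended in preorder
    post_counter = 0

    def visit(elem):
        nonlocal post_counter
        entry = [elem.tag, len(labels), None]  # pre rank = position in the list
        labels.append(entry)
        for child in elem:
            visit(child)
        entry[2] = post_counter  # post rank is assigned after all children
        post_counter += 1

    visit(root)
    return [tuple(e) for e in labels]


def is_ancestor_prepost(a, d):
    """`a` and `d` are (pre, post) pairs; ancestors start earlier and finish later."""
    return a[0] < d[0] and a[1] > d[1]
```

Two integer comparisons suffice for the ancestor test, which is part of why such labels suit clients that must decide quickly which parts of a broadcast stream to read.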

15.
Query matching on XML streams is challenging work for querying efficiency when the amount of queried stream data is huge and the data can be streamed in continuously. In this paper, the method Syntactic Twig-Query Matching (STQM) is proposed to process queries on an XML stream and return the query results continuously and immediately. STQM matches twig queries on the XML stream in a syntactic manner by using a lexical analyzer and a parser, both of which are built from our lexical-rules and grammar-rules generators according to the user's queries and document schema, respectively. For query matching, the lexical analyzer scans the incoming XML stream and the parser recognizes XML structures for retrieving every twig-query result from the XML stream. Moreover, STQM obtains query results without a post-phase for excluding false positives, which are common in many streaming query methods. Through the experimental results, we found that STQM matches the twig query efficiently and also has good scalability both in the queried data size and the branch degree of the twig query. The proposed method takes less execution time than that of a sequence-based approach, which is widely accepted as a proper solution to the XML stream query.

16.
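Existing twig pattern matching algorithms for uncertain XML are all merge-based and tend to waste considerable space and time. To address this problem, a non-merging twig pattern matching algorithm for continuous uncertain XML based on the P-document model is proposed. The algorithm performs filtering and pruning when nodes are enqueued and dequeued, reducing the number of nodes to be processed, and the matching process stores intermediate results in interlinked lists, so no merging is needed. Theoretical analysis and experimental results show that the algorithm is an efficient query algorithm for continuous uncertain XML.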

17.
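缪丰羽 (Miao Fengyu), 王宏志 (Wang Hongzhi). 《计算机科学》 (Computer Science), 2016, 43(11): 284-290
A fuzzy XML document is an XML document that contains uncertain information. Little work exists on querying fuzzy XML documents, and all of it is based on tree-structured XML documents. Targeting the characteristics of fuzzy XML documents with a graph structure, a set of efficient pattern matching algorithms over graph-structured fuzzy XML documents is designed. Based on an indexing scheme suited to graph-structured documents, the algorithms match nodes bottom-up, which greatly reduces repeated node checks and requires neither merging of partial matching results nor extra filter functions for PC (parent-child) relationships. Theoretical analysis and experimental results show that the proposed pattern matching algorithms not only outperform existing related algorithms in twig query performance but also support DAG pattern matching queries well.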

18.
Various index structures have been proposed to speed up the evaluation of XML path expressions. However, existing XML path indices suffer from at least one of three limitations: they focus only on indexing the structure (relying on a separate index for node content), they are useful only for simple path expressions such as root-to-leaf paths, or they cannot be tightly integrated with a relational query processor. Moreover, there is no unified framework to compare these index structures. In this paper, we present a framework defining a family of index structures that includes most existing XML path indices. We also propose two novel index structures in this family, with different space–time tradeoffs, that are effective for the evaluation of XML branching path expressions (i.e., twigs) with value conditions. We also show how this family of index structures can be implemented using the access methods of the underlying relational database system. Finally, we present an experimental evaluation that shows the performance tradeoff between index space and matching time. The experimental results show that our novel indices achieve orders of magnitude improvement in performance for evaluating twig queries, albeit at a higher space cost, over the use of previously proposed XML path indices that can be tightly integrated with a relational query processor.

19.
RRSi: indexing XML data for proximity twig queries   Cited by: 2 (self-citations: 2, citations by others: 0)
Twig query pattern matching is a core operation in XML query processing. Indexing XML documents for twig query processing is of fundamental importance to supporting effective information retrieval. In practice, many XML documents on the web are heterogeneous and have their own formats; documents describing relevant information can possess different structures. Therefore some “user-interesting” documents having similar but non-exact structures against a user query are often missed out. In this paper, we propose the RRSi, a novel structural index designed for structure-based query lookup on heterogeneous sources of XML documents supporting proximate query answers. The index avoids the unnecessary processing of structurally irrelevant candidates that might show good content relevance. An optimized version of the index, oRRSi, is also developed to further reduce both space requirements and computational complexity. To our knowledge, these structural indexes are the first to support proximity twig queries on XML documents. The results of our preliminary experiments show that RRSi and oRRSi based query processing significantly outperform previously proposed techniques in XML repositories with structural heterogeneity.
Vincent T. Y. Ng

20.
Recursive queries are quite important in the context of XML databases. In addition, several recent papers have investigated a relational approach to store XML data and there is growing evidence that schema-conscious approaches are a better option than schema-oblivious techniques as far as query performance is concerned. However, the issue of recursive XML queries for such approaches has not been dealt with satisfactorily. In this paper we argue that it is possible to design a schema-oblivious approach that outperforms schema-conscious approaches for certain types of recursive queries. To that end, we propose a novel schema-oblivious approach, called Sucxent++ (Schema Unconscious XML Enabled System), that outperforms existing schema-oblivious approaches such as XParent by up to 15 times and schema-conscious approaches (Shared-Inlining) by up to eight times for recursive query execution. Our approach has up to two times smaller storage requirements compared to existing schema-oblivious approaches and 10% less than schema-conscious techniques. In addition Sucxent++ performs marginally better than Shared-Inlining and is 5.7–47 times faster than XParent as far as insertion time is concerned.
