首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 937 毫秒
1.
一个XML的数据模型及其存储策略   总被引:6,自引:0,他引:6  
XML是用于数据表示、交换的Internet标准。通过和DTD的连接可以用像XML-QL这样的语言来执行丰富的查询操作。近年来,很多人致力于半结构化数据模型和其查询语言的研究^[1,2,5],其重点逐渐转移到XML数据集的查询上来,其中两个重要问题是使XML查询语言正规化和如何将XML数据转换为底层存储格式以获得理想的效率^[4]。表述了一个XML的正规数据模型及其代数方法,并介绍基于RDBMS实现该模型的方法。  相似文献   

2.
Path queries have been extensively used to query semistructured data, such as the Web and XML documents. In this paper we introduce weighted path queries, an extension of path queries enabling several classes of optimization problems (such as the computation of shortest paths) to be easily expressed. Weighted path queries are based on the notion of weighted regular expression, i.e., a regular expression whose symbols are associated to a weight. We characterize the problem of answering weighted path queries and provide an algorithm for computing their answer. We also show how weighted path queries can be effectively embedded into query languages for XML data to express in a simple and compact form several meaningful research problems.  相似文献   

3.
The problem of answering XML queries using path-based indexes is to find efficient methods for accelerating the XML query with pre-designed index structures over the XML database. This problem received increasing interests and have been lucubrated in recent years. Regular path expression is the core of the XML query languages e.g., XPath and XQuery. Most of the state-of-the-art path-based XML indexes, therefore, hammer at how to efficiently answer the path-based XML queries. This paper surveys various approaches to indexing XML data proposed in the literature. We give a step by step analysis to show the evolution of index structures for XML path information, based on tree structures or more commonly, directed labeled graphs. For each approach, we first present the specific issue it aims to tackle, and then the proposed solution presented. Furthermore, construction, physical data storage and maintenance costs, are analyzed.  相似文献   

4.
基于XML的半结构数据查询语言研究   总被引:1,自引:0,他引:1  
半结构数据管理的核心问题之一是数据的有效查询问题。文章重点分析、比较了两种基于XML的半结构查询语言,即XQL和XML-QL。在此基础上总结出了XML查询语言的基本需求,并对目前的XML查询语言提出了四点扩充建议。  相似文献   

5.
目前XML查询语言及查询界面对Web用户过于复杂,该文描述了一种XML文档索引机制,在此基础上建立了一个通用的贝叶斯网络查询模型。用户只需在交互界面输入自然语言描述的查询,系统就能对其实现基于语义的构造,由它生成多个结构化查询;对这些查询建立贝叶斯网络,计算查询在给定文档下的概率,选择概率最大的前3个查询提交给系统执行。  相似文献   

6.
Flesca  Sergio  Furfaro  Filippo  Greco  Sergio 《World Wide Web》2002,5(2):125-157
In this paper we present a graphical query language for XML. The language, based on a simple form of graph grammars, permits us to extract data and reorganize information in a new structure. As with most of the current query languages for XML, queries consist of two parts: one extracting a subgraph and one constructing the output graph. The semantics of queries is given in terms of graph grammars. The use of graph grammars makes it possible to define, in a simple way, the structural properties of both the subgraph that has to be extracted and the graph that has to be constructed. We provide an example-driven comparison of our language w.r.t. other XML query languages, and show the effectiveness and simplicity of our approach.  相似文献   

7.
A number of indexing techniques have been proposed in recent times for optimizing the queries on XML and other semi-structured data models. Most of the semi-structured models use tree-like structures and query languages (XPath, XQuery, etc.) which make use of regular path expressions to optimize the query processing. In this paper, we propose two algorithms called Entry-point algorithm (EPA) and Two-point Entry algorithms that exploit different types of indices to efficiently process XPath queries. We discuss and compare two approaches namely, Root-first and Bottom-first in implementing the EPA. We present the experimental results of the algorithms using XML benchmark queries and data and compare the results with that of traditional methods of query processing with and without the use of indexes, and ToXin indexing approach. Our algorithms show improved performance results than the traditional methods and Toxin indexing approach.  相似文献   

8.
李求实  王秋月  王珊 《软件学报》2012,23(8):2002-2017
与纯文本文档集相比,使用语义标签标注的半结构化的XML文档集,有助于信息检索系统更好地理解待检索文档.同样,结构化查询,比如SQL,XQuery和Xpath,相对于纯关键词查询更加清晰地表达了用户的查询意图.这二者都能够帮助信息检索系统获得更好的检索精度.但关键词查询因其简单和易用性,仍被广泛使用.提出了XNodeRelation算法,以自动推断关键词查询的结构化信息(条件/目标节点类型).与已有的推断算法相比,综合了XML文档集的模式和统计信息以及查询关键词出现的上下文及其关联关系等推断用户的查询意图.大量的实验验证了该算法的有效性.  相似文献   

9.
Finding the occurrences of structural patterns in XML data is a key operation in XML query processing. Existing algorithms for this operation focus almost exclusively on path patterns or tree patterns. Current applications of XML require querying of data whose structure is complex or is not fully known to the user, or integrating XML data sources with different structures. These applications have motivated recently the introduction of query languages that allow a partial specification of path patterns in a query. In this paper, we consider partial path queries, a generalization of path pattern queries, and we focus on their efficient evaluation under the indexed streaming evaluation model. Our approach explicitly deals with repeated labels (that is, multiple occurrences of the same label in a query). We show that partial path queries can be represented as rooted dags for which a topological ordering of the nodes exists. We present three algorithms for the efficient evaluation of these queries. The first one exploits a structural summary of data to generate a set of path patterns that together are equivalent to a partial path query. To evaluate these path patterns, we extend a previous algorithm for path-pattern queries so that it can work on path patterns with repeated labels. The second one extracts a spanning tree from the query dag, uses a stack-based algorithm to find the matches of the root-to-leaf paths in the tree, and merge-joins the matches to compute the answer. Finally, the third one exploits multiple pointers of stack entries and a topological ordering of the query dag to apply a stack-based holistic technique. We analyze our algorithms and perform extensive experimental evaluations. Our experimental results show that the holistic algorithm outperforms the other ones. Our approaches are the first ones to efficiently evaluate this class of queries in the indexed streaming model.  相似文献   

10.
11.
《Information Systems》2005,30(6):467-487
Due to its flexibility, XML is becoming the de facto standard for exchanging and querying documents over the Web. Many XML query languages such as XQuery and XPath use label paths to traverse the irregularly structured XML data. Without a structural summary and efficient indexes, query processing can be quite inefficient due to an exhaustive traversal on XML data. To overcome the inefficiency, several path indexes have been proposed in the research community. Traditional indexes generally record all label paths from the root element in XML data and are constructed with the use of data only. Such path indexes may result in performance degradation due to large sizes and exhaustive navigations for partial matching path queries which start with the self-or-descendent axis(“//”). To improve the query performance, we propose an adaptive path index for XML data (termed APEX). APEX does not keep all paths starting from the root and utilizes frequently used paths on query workloads. APEX also has a nice property that it can be updated incrementally according to the changes of query workloads. Experimental results with synthetic and real-life data sets clearly confirm that APEX improves the query processing cost typically 2–69 times compared with the traditional indexes, with the performance gap increasing with the irregularity of XML data.  相似文献   

12.
基于关系数据库有效地实现RPE查询   总被引:5,自引:1,他引:5  
各种XML查询语言的共同特点就是利用正则路径表达式(RPE)来导航XML文档的查询。本文结合我们提出的一种新的XML数据的关系存储模式,对有效地实现RPE查询的相关研究工作进行了总结,并提出了两个有效地实现包含连接的索引改进归并连接算法。算法采用索引定位技术、短路技术和预侦技术来减少连接代价。因此,不仅能够在当前上下文计算环境下有效地实现包含连接的计算,而且能够大量地避免包含连接中不必要的扫描和搜索。  相似文献   

13.
对XML文档查询的常用方法有两种:一种是使用查询语言;另一种是使用关键字,而使用关键字查询XML文档比使用查询语言更为简单方便。给出了一种使用关键字查询XML文档的索引查找算法。该算法只需要扫描一次关键字对应的编码列,就可以找到需要的编码,提高了查询效率。实验表明该算法是可行的和有效的。  相似文献   

14.
One of the key technologies of XML data management is XQuery, the query language for both retrieving and transforming XML data. In the paper, limitations of the XQuery facilities for transforming XML data are discussed. It is shown that, for one important class of queries, XQuery expressions are too cumbersome and computationally inefficient. In the paper, it is suggested to extend the XQuery language by functional update expressions. From the syntax standpoint, such expressions are similar to expressions of XML update languages. However, they can be evaluated without side effects, which makes it possible to integrate them in XQuery in a natural way. The expressiveness of the extended language is demonstrated, and approaches to efficient implementation of the suggested extension are considered. In addition, the problem of arbitrary compositions of XML query and update expressions (the problem of nested update expressions) is discussed. The existing XML update languages are based, as a rule, on the XQuery language; however, the possibility of constructing arbitrary compositions of XQuery expressions and those of the XML update language is not provided. This impedes development of practical XML applications in the XQuery language. In the paper, an approach to solving the composition problem based on the use of functional update expressions is suggested. Possibilities of the implementation of the suggested extension are discussed.  相似文献   

15.
Web systems, Web services, and Web-based publish/subscribe systems communicate events as XML messages and in many cases, require composite event detection: it is not sufficient to react to single event messages, but events have to be considered in relation to other events that are received over time. This entails a need for expressive, high-level languages for querying composite events. Emphasizing language design and formal semantics, we describe the rule-based composite event query language XChangeEQ. XChangeEQ is designed to completely cover and integrate the four complementary querying dimensions: event data, event composition, temporal relationships, and event accumulation. Semantics are provided as a model theory with accompanying fixpoint theory, an approach that is established for rule languages but has not been applied to event queries so far. Because they are highly declarative, thus easy to understand and well suited for query optimization, such semantics are desirable for event queries.  相似文献   

16.
17.
一种基于DTD的XML索引方法   总被引:9,自引:0,他引:9  
路径查询是XML查询的一个主要特征,现已提出了多种XML索引方法.DTD的结构信息对于XML索引的建立及查询效率的提高很重要,但现有的大部分索引方法没有利用DTD这一有效资源,提出一种利用DTD的XML索引方法——DBXI(DTD-based XML indexing),该方法采用了新的编码方法,可使路径查询具备如下特征:对于由N个元素/属性组成的具有1个谓词约束的路径表达式,DBXI处理每个XML文档仅需0次或1次元素/属性结点集的结构连接操作;对于在XML文档中不存在匹配结构的路径查询,DBXI能够在比现有的XML索引方法较短的时间内给出无查询结果的判断.实验表明,与Lore,SphinX和XISS等索引方法相比,DBXI能够缩短路径查询的响应时间.  相似文献   

18.
While the information published in the form of XML-compliant documents keeps fast mounting up, efficient and effective query processing and optimization for XML have now become more important than ever. This article reports our recent advances in XML structured-document query optimization. In this article, we elaborate on a novel approach and the techniques developed for XML query optimization. Our approach performs heuristic-based algebraic transformations on XPath queries, represented as PAT algebraic expressions, to achieve query optimization. This article first presents a comprehensive set of general equivalences with regard to XML documents and XML queries. Based on these equivalences, we developed a large set of deterministic algebraic transformation rules for XML query optimization. Our approach is unique, in that it performs exclusively deterministic transformations on queries for fast optimization. The deterministic nature of the proposed approach straightforwardly renders high optimization efficiency and simplicity in implementation. Our approach is a logical-level one, which is independent of any particular storage model. Therefore, the optimizers developed based on our approach can be easily adapted to a broad range of XML data/information servers to achieve fast query optimization. Experimental study confirms the validity and effectiveness of the proposed approach.  相似文献   

19.
XML流数据在互联网领域有着广阔的应用,海量流数据的高性能处理与查询需求的多样性给对XML流数据的查询处理技术提出了更高的要求,针对XML流数据上的XPath查询,以下推转换机(Pushdown Transducer)为基础,提出一种新的查询处理方法。该方法支持包含PC轴、AD轴同时包含多重存在谓词、值谓词和嵌套谓词的XPath查询,覆盖XPath查询的核心部分。该方法能够满足用户复杂的查询需求,同时具有较高的性能。  相似文献   

20.
《Information Systems》2001,26(6):445-475
The rapid increase in end-user computing calls into question the suitability of existing database query languages (DBQLs). Because the typical DB end-user is not a DB specialist, it is essential that DBQLs use concepts that are as close as possible to those in the end-users’ cognitive mental model and adopt interface techniques that are suited to end-users’ abilities. Concept-based query languages are well suited for this. This realization has motivated further research in conceptual, or semantic, query approaches. However, the primary focus in this field has been on semantic query optimization, not on query formulation. In this study, we address ourselves to the problem of formulation of queries using concepts. We propose a concept-based query language, called the conceptual query language (CQL), which allows for the conceptual abstraction of database queries and exploits the rich semantics of data models to ease and facilitate query formulation.The CQL approach uses the relationship semantics of semantic data models to render transparent the technical complexities of existing DB query languages. Association semantics are also used to automatically construct query graphs and pseudo-natural language explanations of queries, and to generate SQL codes. A set theoretic formalism for conceptual queries is developed and used. This paper discusses the design of CQL, its expressive power, its implementation, and the strategies for CQL query processing. The implementation of a CQL prototype is briefly discussed in this paper. User experiments were carried out extensively and showed the advantage of CQL over alternative languages such as SQL.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号