20 similar documents found (search time: 15 ms)
1.
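Hu Xixiang (胡昔祥) 《Computer Engineering and Applications》2007,43(29):101-103
This paper presents a publish/subscribe middleware system for large-scale distributed applications. The system uses a fast matching algorithm for XPath subscriptions that combines a pushdown tree with a bottom-up tree automaton and supports XPath multi-predicate and branching features. Event and subscription messages are routed between the nodes of the event brokers' P2P network using an extended Chord routing protocol, together with optimizations such as subscription aggregation and covering. Experimental results show that the system achieves good efficiency and performance and meets the requirements of large-scale distributed applications.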
2.
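Using indexing, a dual-index structure is built over the input XML document to improve the YFilter algorithm and optimize XML document filtering performance. With the help of the index structure, the algorithm looks ahead at the structural information of element nodes in the document and eliminates in advance the element nodes that cannot be guaranteed to yield any match, avoiding a large amount of unnecessary query processing. Experimental results show that the algorithm filters well when the input XML document is large.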
3.
4.
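A sharing-based multi-query join algorithm for XML publish/subscribe data stream systems  Total citations: 1, self-citations: 0, citations by others: 1
Multi-query joins in XML publish/subscribe systems handle subscriptions that relate multiple XML documents. This involves evaluating XPath path patterns, comparing XML documents with one another, and managing system time. Through suitable organization, the sharing-based join algorithm lets multiple subscriptions reuse the results of identical variable join computations, which substantially reduces the expensive join computation and thereby considerably improves system efficiency. Experimental results show that the sharing-based algorithm performs well in practice and is suitable for settings with more than a million subscriptions.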
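The reuse idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the class `SharedJoinEvaluator`, the function `match_subscriptions`, the path names, and the set-intersection join are all assumptions made for illustration. Each distinct variable join is computed once and its result is shared by every subscription that contains it.

```python
from typing import Callable, Dict, List, Set, Tuple

# A join is identified by the pair of XPath variables it compares (hypothetical model).
JoinKey = Tuple[str, str]


class SharedJoinEvaluator:
    """Caches join results so identical joins are computed only once."""

    def __init__(self, join_fn: Callable[[JoinKey], bool]) -> None:
        self._join_fn = join_fn
        self._cache: Dict[JoinKey, bool] = {}
        self.evaluations = 0  # number of joins actually computed

    def evaluate(self, key: JoinKey) -> bool:
        if key not in self._cache:
            self.evaluations += 1
            self._cache[key] = self._join_fn(key)
        return self._cache[key]


def match_subscriptions(subscriptions: Dict[str, List[JoinKey]],
                        evaluator: SharedJoinEvaluator) -> List[str]:
    """Return the ids of subscriptions whose joins all succeed."""
    return [sub_id for sub_id, joins in subscriptions.items()
            if all(evaluator.evaluate(key) for key in joins)]


# Values that each path selects from two incoming documents (made-up data).
doc_values: Dict[str, Set[str]] = {
    "/order/isbn": {"111", "222"},
    "/catalog/book/isbn": {"222", "333"},
    "/order/customer": {"ann"},
    "/catalog/author": {"bob"},
}

# Assumed join semantics: the two paths share at least one value.
evaluator = SharedJoinEvaluator(
    lambda key: bool(doc_values[key[0]] & doc_values[key[1]]))

isbn_join: JoinKey = ("/order/isbn", "/catalog/book/isbn")
name_join: JoinKey = ("/order/customer", "/catalog/author")
subscriptions = {
    "s1": [isbn_join],
    "s2": [isbn_join, name_join],
    "s3": [isbn_join],
}

matched = match_subscriptions(subscriptions, evaluator)
```

In this toy run the three subscriptions mention four joins in total, but only two distinct joins are computed; the saving grows with the number of subscriptions that share a join.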
5.
Chen Songting, Li Hua-Gang, Tatemura Jun'ichi, Hsiung Wang-Pin, Agrawal Divyakant, Candan K. Selçuk 《Knowledge and Data Engineering, IEEE Transactions on》2008,20(12):1627-1640
An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex Generalized-Tree-Pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can 1) achieve polynomial time and space complexity for post processing, 2) avoid redundant predicate evaluations, 3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and 4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.
6.
7.
8.
《Information and Software Technology》2006,48(8):708-716
Providing efficient access to XML documents becomes crucial in XML database systems. More and more concurrency control protocols for XML database systems were proposed in the past few years. Being an important language for addressing data in XML documents, XPath expressions are the basis of several query languages, such as XQuery and XSLT. In this paper, we propose a lock-based concurrency control protocol, called XLP, for transactions accessing XML data by the XPath model. XLP is based on the XPath model and has the features of rich lock modes, low lock conflict and lock conversion. XLP is also proved to ensure conflict serializability. In sum, there are three major contributions in this paper. The proposed XLP supports most XPath axes, rather than simple path expressions only. Conflict conditions and rules in the XPath model are analyzed and derived. Moreover, a lightweight lock mode, P-lock, is invented and integrated into XLP for better concurrency.
9.
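To improve the time efficiency of matching events against subscriptions in existing semantic publish/subscribe systems, a semantic publish/subscribe system[2] based on MapReduce[1] is proposed. The processing flow of a semantic publish/subscribe system is examined and decomposed, the time efficiency of subscription-event matching is identified as a problem the system must solve, and a new matching module is designed on top of the semantic publish/subscribe model. Previously completed matching schemes are decomposed, and MapReduce-based parallel processing[3] is applied to the matching of events and subscriptions, improving the system's time efficiency. By adding several processors with identical processing capability, a runtime environment capable of parallel processing was built, and the correctness and effectiveness of the system were verified.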
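A rough sketch of how event/subscription matching can be split into a map phase and a reduce phase follows. It makes several assumptions that are not taken from the paper: matching is plain attribute equality rather than semantic matching, a local thread pool stands in for a MapReduce cluster, and the names `matches`, `map_phase`, `reduce_phase`, and `parallel_match` are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Event = Dict[str, str]
Subscription = Dict[str, str]


def matches(event: Event, subscription: Subscription) -> bool:
    """Assumed rule: every attribute required by the subscription equals the event's."""
    return all(event.get(attr) == value for attr, value in subscription.items())


def map_phase(task: Tuple[str, Event, List[Tuple[str, Subscription]]]) -> List[Tuple[str, str]]:
    """Match one event against one partition of subscriptions; emit (event, sub) pairs."""
    event_id, event, partition = task
    return [(event_id, sub_id) for sub_id, sub in partition if matches(event, sub)]


def reduce_phase(pairs: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group the emitted pairs by event id."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for event_id, sub_id in pairs:
        grouped[event_id].append(sub_id)
    return {event_id: sorted(subs) for event_id, subs in grouped.items()}


def parallel_match(events: Dict[str, Event],
                   subscriptions: Dict[str, Subscription],
                   workers: int = 2) -> Dict[str, List[str]]:
    items = sorted(subscriptions.items())
    # Split the subscriptions into one partition per worker.
    partitions = [items[i::workers] for i in range(workers)]
    tasks = [(eid, ev, part) for eid, ev in events.items() for part in partitions]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pairs = [pair for chunk in pool.map(map_phase, tasks) for pair in chunk]
    return reduce_phase(pairs)


events = {"e1": {"topic": "xml", "level": "info"}, "e2": {"topic": "sql"}}
subscriptions = {
    "s1": {"topic": "xml"},
    "s2": {"topic": "xml", "level": "error"},
    "s3": {"topic": "sql"},
}
result = parallel_match(events, subscriptions)
```

Because each map task touches only its own partition, the partitions can be matched independently; adding workers then shortens matching time only to the extent that matching, not grouping, dominates.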
10.
XML access control models proposed in the literature enforce access restrictions directly on the structure and content of an XML document. Therefore access authorization rules (authorizations, for short), which specify access rights of users on information within an XML document, must be revised if they no longer match the changed structure of the XML document. In this paper, we present two authorization translation problems. The first is a problem of translating instance-level authorizations for an XML document. The second is a problem of translating schema-level authorizations for a collection of XML documents conforming to a DTD. For the first problem, we propose an algorithm that translates instance-level authorizations of a source XML document into those for a transformed XML document by using instance-tree mapping from the transformed document instance to the source document instance. For the second problem, we propose an algorithm that translates value-independent schema-level authorizations of a non-recursive source DTD into those for a non-recursive target DTD by using schema-tree mapping from the target DTD to the source DTD. The goal of authorization translation is to preserve authorization equivalence at instance node level of the source document. The XML access control models use path expressions of XPath to locate data in XML documents. We define a property of path expressions (called node-reducible path expressions) under which we can transform schema-level authorizations of value-independent type by schema-tree mapping. To compute authorizations on instances of schema elements of the target DTD, we need to identify the schema elements whose instances are located by a node-reducible path expression of a value-independent schema-level authorization. We give an algorithm that carries out a path fragment containment test to identify the schema elements whose instances are located by a node-reducible path expression.
11.
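To address the drop in matching accuracy caused by message heterogeneity in content-based publish/subscribe systems, this paper defines semantic relationships between event attributes and subscription attributes and, using a semantic translation module, proposes a design method for a publish/subscribe system with semantic support, which is applied to the SINEA publish/subscribe system. Experimental results show that the method supports semantic heterogeneity to a certain extent and improves matching accuracy.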
12.
Recently, there has been growing interest in streaming XML data. Much of the work on streaming XML data has been focused on efficient filtering. Filtering systems deliver XML documents to interested users. The burden of extracting the XML fragments of interest from XML documents is placed on users. In this paper, we propose XTREAM which evaluates multiple queries in conjunction with the read-once nature of streaming data. In contrast to the previous work, XTREAM supports a wide class of XPath queries including tree shaped expressions, order based predicates, and nested predicates. In addition, to improve the efficiency and scalability of XTREAM, we devise an optimization technique called Query Compaction. Experimental results with real-life and synthetic XML data demonstrate the efficiency and scalability of XTREAM.
13.
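An analysis of XPath, XLink, and XPointer  Total citations: 2, self-citations: 0, citations by others: 2
XML is a new technology for web applications. As more and more information is stored as XML documents, a way of obtaining that information through an interface is needed. This calls for a method of determining the relationships between the parts of a document and of accessing the internal parts of a document that is related to other resources. The three languages XPath, XLink, and XPointer can all be used to access data. XPointer determines the location of individual parts of a document, XPath is used together with XSLT and XPointer to locate parts of an XML document, and XLink is used to link to XML documents.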
14.
Temporal XML: modeling, indexing, and query processing  Total citations: 1, self-citations: 0, citations by others: 1
Flavio Rizzolo Alejandro A. Vaisman 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(5):1179-1212
In this paper we address the problem of modeling and implementing temporal data in XML. We propose a data model for tracking
historical information in an XML document and for recovering the state of the document as of any given time. We study the
temporal constraints imposed by the data model, and present algorithms for validating a temporal XML document against these
constraints, along with methods for fixing inconsistent documents. In addition, we discuss different ways of mapping the abstract
representation into a temporal XML document, and introduce TXPath, a temporal XML query language that extends XPath 2.0. In
the second part of the paper, we present our approach for summarizing and indexing temporal XML documents. In particular we
show that by indexing continuous paths, i.e., paths that are valid continuously during a certain interval in a temporal XML graph, we can dramatically increase
query performance. To achieve this, we introduce a new class of summaries, denoted TSummary, that adds the time dimension to the well-known path summarization schemes. Within this framework, we present two new summaries:
LCP and Interval summaries. The indexing scheme, denoted TempIndex, integrates these summaries with additional data structures. We give a
query processing strategy based on TempIndex and a type of ancestor-descendant encoding, denoted temporal interval encoding.
We present a persistent implementation of TempIndex, and a comparison against a system based on a non-temporal path index,
and one based on DOM. Finally, we sketch a language for updates, and show that the cost of updating the index is compatible
with real-world requirements.
15.
Vasil Slavov Praveen Rao 《The VLDB Journal The International Journal on Very Large Data Bases》2014,23(1):51-76
In this paper, we address the problem of cardinality estimation of XPath queries over XML data stored in a distributed, Internet-scale environment such as a large-scale, data sharing system designed to foster innovations in biomedical and health informatics. The cardinality estimate of XPath expressions is useful in XQuery optimization, designing IR-style relevance ranking schemes, and statistical hypothesis testing. We present a novel gossip algorithm called XGossip, which given an XPath query estimates the number of XML documents in the network that contain a match for the query. XGossip is designed to be scalable, decentralized, and robust to failures, properties that are desirable in a large-scale distributed system. XGossip employs a novel divide-and-conquer strategy for load balancing and reducing the bandwidth consumption. We conduct theoretical analysis of XGossip in terms of accuracy of cardinality estimation, message complexity, and bandwidth consumption. We present a comprehensive performance evaluation of XGossip on Amazon EC2 using a heterogeneous collection of XML documents.
16.
17.
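Executing XPath queries efficiently over XML data streams is a key problem in XML stream management. DTD structural information helps considerably in improving XML query efficiency, yet most existing algorithms do not use it. A method for XML stream query processing that uses the DTD is proposed, with the following features: XPath is represented by a tree automaton; DTD nodes that do not match the query structure are marked in advance by matching the XPath tree automaton against the DTD tree; a DTD-based XML stream indexing method, DBXSI, is given; and during query execution, nodes and subtrees that do not match the query are skipped directly according to the stream index information. Experimental results show that the method supports XPath queries effectively and is more efficient than traditional algorithms.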
18.
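Chemical databases on the Internet are a valuable chemical information resource, and making effective use of their data is the problem the chemical deep web sets out to solve. This paper summarizes the characteristics of the chemical deep web and uses XML technology to extract data from the semi-structured HTML pages returned by database searches, turning it into data that programs can call directly for further computation. During extraction, JTidy is first used to normalize the HTML, producing well-formed XHTML documents with correct content; XSLT data transformation templates containing XPath path expressions then perform the data conversion and extraction. The quality of the XPath expressions determines whether an XSLT data transformation template can extract chemical data effectively over the long term. The paper focuses on how to write robust XPath expressions, emphasizing that they should locate data in the source tree using content and attribute features, keep the coupling between expressions as low as possible, and anticipate possible changes to chemistry sites, with corresponding measures taken in the XSLT templates to improve the long-term validity of the expressions. It provides methodological guidance for creating XSLT data extraction templates for the chemical deep web.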
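The advice to locate data by content and attribute features, not by position, can be shown with a small sketch. The paper works with JTidy and XSLT; this example instead uses Python's standard `xml.etree.ElementTree`, which supports a subset of XPath, and the two XHTML snippets, the `props` class name, and the helper `extract` are invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# A hypothetical, already normalized XHTML result page.
ORIGINAL = """<html><body>
<div>banner</div>
<div><table class="props">
<tr><th>Formula</th><td>H2O</td></tr>
<tr><th>Melting point</th><td>0 C</td></tr>
</table></div>
</body></html>"""

# The same data after an imagined redesign: an extra <div> and reordered rows.
REDESIGNED = """<html><body>
<div>banner</div>
<div>news</div>
<div><table class="props">
<tr><th>Melting point</th><td>0 C</td></tr>
<tr><th>Formula</th><td>H2O</td></tr>
</table></div>
</body></html>"""

# Position-based path: depends on the exact page layout.
BRITTLE = "./body/div[2]/table/tr[2]/td"
# Path anchored on an attribute value and on the row's label text.
ROBUST = ".//table[@class='props']/tr[th='Melting point']/td"


def extract(xhtml: str, path: str) -> Optional[str]:
    """Return the text of the first node selected by path, or None if nothing matches."""
    node = ET.fromstring(xhtml).find(path)
    return None if node is None else node.text


brittle_before = extract(ORIGINAL, BRITTLE)
brittle_after = extract(REDESIGNED, BRITTLE)
robust_before = extract(ORIGINAL, ROBUST)
robust_after = extract(REDESIGNED, ROBUST)
```

Both expressions find the melting point on the original page, but only the content-anchored one still finds it after the layout change; the positional one selects nothing.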
19.
Kirkegaard C., Møller A., Schwartzbach M.I. 《IEEE Transactions on Software Engineering》2004,30(3):181-192
XML documents generated dynamically by programs are typically represented as text strings or DOM trees. This is a low-level approach for several reasons: 1) traversing and modifying such structures can be tedious and error prone, 2) although schema languages, e.g., DTD, allow classes of XML documents to be defined, there are generally no automatic mechanisms for statically checking that a program transforms from one class to another as intended. We introduce XACT, a high-level approach for Java using XML templates as a first-class data type with operations for manipulating XML values based on XPath. In addition to an efficient runtime representation, the data type permits static type checking using DTD schemas as types. By specifying schemas for the input and output of a program, our analysis algorithm will statically verify that valid input data is always transformed into valid output data and that the operations are used consistently.