首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 156 毫秒
1.
建立高效的索引来快速定位满足要求的节点是提高XML数据查询效率的一个必要手段.文中以降低复杂度和提高查询效率为目标,以基于路径的XML索引原理为基础,提出了一种新型的基于Dewey编码的索引结构RTL-Index.RTL-Index通过对文档节点编码来表示结构信息,利用前缀路径匹配操作完成结构查询,支持含通配符" 和后代轴"//"的查询以及兄弟节点无序的模式树的查询.仿真实验结果表明RTL-Index索引具有较低的时间和空间复杂度,解决了XML文档分支路径查找问题,是一种较为有效的XML索引结构.  相似文献   

2.
针对照明系统故障诊断专家库中故障诊断信息的特点,提出用XML文档来构建专家库的思想。在研究无序树包含匹配的基础上,提出了一种改进的基于XML文档树型结构编码的XML树匹配算法。并通过实例阐述算法在城市照明故障诊断系统中的应用。实验结果表明,这种设计思想和算法在故障诊断信息查询匹配过程中具有较高的查全率和查准率,能够有效降低照明系统中的故障发生率。  相似文献   

3.
张剑妹  陶世群 《计算机应用》2008,28(11):2961-2963
树模式查询被广泛地应用XML数据查询中。树模式查询的一致性判断可以避免不必要的计算,节省查询时间,从而提高查询效率。给出了查询一致性的定义,基于子路径的概念,提出文档类型定义(DTD)约束下的树模式查询的一致性判断算法,并对算法的时间复杂度进行了分析。通过分析比较,该算法是有效的。  相似文献   

4.
针对当前XML文档信息查询算法的不足,提出一种基于有效路径权重的树匹配算法。在保持XML文档树有效结点和树结构的基础上,树根结点信息最重要,随着树深度增加,结点信息重要性逐渐减弱的特点,按照路径层次自动计算路径权重,并赋予相应路径,根据树结点的有效信息和树结构的有效路径计算树的匹配度。在大规模XML文档查询方面,实验验证了该算法在保证较高查准率和查全率的基础上,有效提高了查询效率。  相似文献   

5.
孙军梅 《计算机工程》2005,31(12):183-184
基于XML的数据查询问题可转化为无序树匹配问题,无序树匹配问题已经被证明是NP难问题,该文提出应用遗传算法对XML文档进行查询优化,并通过实验验证了遗传算法在提高查询效率中的可行性和有效性。  相似文献   

6.
李英俊  宗金良  孙志胜 《计算机应用》2006,26(10):2405-2407
提出了EXN-Tree的概念,将XML文档树的节点映射到EXN-Tree,依据EXN-Tree的节点编码生成XML文档树节点数据结构。基于此新型的节点编码结构,就无序无索引节点集和有序有索引节点集两种情况下的XML结构连接算法展开研究,提出了一系列的结构连接算法,解决了无序无索引节点集和有序有索引节点集两种情况下的XML结构连接。分析表明该算法的I/O复杂性优于已有算法,具有良好的性能。  相似文献   

7.
目前大部分XML查询语言都使用树模式来匹配待查询的XML文档树以得到所需要的、与模式树相吻合的查询结果,此效率在很大程度上取决于XML模式树的大小,那么尽可能快速地查找并删除查询模式树中的冗余节点就变得十分重要。重点讨论DTD约束下树模式的最小化问题,将DTD兄弟约束SC拓展成扩展兄弟约束ESC,使其能够表达DTD约束中的祖先-后代关系;并指出只包含{ESC,/,//,[],*}的查询树模式的最小化问题的复杂度是指数级的,且当模式树是分支受限的时候,其最小化问题的复杂度是多项式时间的;最后给出了一个多项式时间的受限分支的模式树最小化算法。  相似文献   

8.
汪万根 《计算机工程》2009,35(8):107-109
针对在XML文档树模型中进行后兄弟节点查询时内存消耗大、匹配效率低等缺陷,提出一种基于XML数据流与栈的后兄弟查询算法。采用SAX解析器与结构连接方法,对XML文档中所有已知节点与后兄弟节点进行精确匹配并输出。结果表明,该算法具有适用范围广、占用系统资源少、匹配效率高等优势。  相似文献   

9.
针对支持查询的XML数据压缩方法存在的路径和数据重复等问题,通过去除XML数据中的重复路径,简化XML数据结构,提出结构标记树的概念及其生成算法,设计一种基于结构标记树的可查询XML数据压缩方法SSTQC,对XML数据进行压缩和组织查询。SSTQC一次扫描XML文档,具有较好的的压缩性能和查询效率。  相似文献   

10.
基于XML的软件构件查询匹配算法研究   总被引:33,自引:0,他引:33       下载免费PDF全文
在研究无序树包含匹配的基础上,提出一种新的基于XML的软件构件查询匹配算法.该算法可以在保持较高构件查准率的前提下,显著地提高构件的查全率,并提供对布尔查询的支持.此外,通过合理地设定约束条件以及利用动态规划的方法,将计算查询匹配代价的算法时间复杂度限定为多项式级,确保构件查询具有足够的查询效率.最后,通过在构件库原型系统RCRS上进行的一系列实验,进一步证明了新的查询匹配算法在软件构件查询实际应用中的可行性和有效性.  相似文献   

11.
在XML的树模型基础上,提出查询是一个有序的带标记树、数据库是一个有序的带标记树集合的思想,对于查询的回答是一个或几个从查询树结点到数据库结点的同态映射;对一般意义下的XML树模型进行了形式化改造,并且基于改造后的XML树模型构造了查询;最后,阐述了这一工作的意义。  相似文献   

12.
In the past decade, XML has emerged as the standard language for information exchanging over the Internet. Due to its tree-structure paradigm, XML is superior for its capability of storing, querying, and manipulating complex data. Therefore, discovering frequent tree patterns over tree-structured data has become an interesting topic for XML data management. In this paper, we propose a tree mining algorithm, named BUXMiner, for finding a special class of frequent trees, called rooted unordered trees, from a tree-structured database. BUXMiner employs an efficient bottom-up approach to enumerate all candidate trees over a compact global tree guide and computes the frequent trees based on the tree guide. In addition to BUXMiner, we also propose a mining approach called BUMXMiner to discover the maximal frequent rooted unordered trees. We compare BUXMiner with previous tree-structure mining algorithms, namely XQPMinerTID and FastXMiner, which were also proposed to discover rooted unordered trees. The experimental results show that our algorithm outperforms XQPMinerTID and FastXMiner in terms of efficiency. The performance results from real-world applications also indicate the usefulness of our proposed tree mining algorithms in a variety of web applications, such as analysis of web page access patterns and mining frequent XML query patterns for caching.  相似文献   

13.
Storing and querying XML documents using a RDBMS is a challenging problem since one needs to resolve the conflict between the hierarchical, ordered nature of the XML data model and the flat, unordered nature of the relational data model. This conflict can be resolved by the following XML-to-Relational mappings: schema mapping, data mapping and query mapping. In this paper, we propose: (i) a lossless schema mapping algorithm to generate a database schema from a DTD, which makes several improvements over existing algorithms, (ii) two linear data mapping algorithms based on DOM and SAX, respectively, to map ordered XML data to relational data. To our best knowledge, there is no published linear schema-based data mapping algorithm for mapping ordered XML data to relational data. Experimental results are presented to show that our algorithms are efficient and scalable.  相似文献   

14.
XML tree structures can conveniently be represented using ordered unranked trees. Due to the repetitiveness of XML markup these trees can be compressed effectively using dictionary-based methods, such as minimal directed acyclic graphs (DAGs) or straight-line context-free (SLCF) tree grammars. While minimal SLCF tree grammars are in general smaller than minimal DAGs, they cannot be computed in polynomial time unless P=NPP=NP. Here, we present a new linear time algorithm for computing small SLCF tree grammars, called TreeRePair, and show that it greatly outperforms the best known previous algorithm BPLEX. TreeRePair is a generalization to trees of Larsson and Moffat's RePair string compression algorithm. SLCF tree grammars can be used as efficient memory representations of trees. Using TreeRePair, we are able to produce the smallest queryable memory representation of ordered trees that we are aware of. Our investigations over a large corpus of commonly used XML documents show that tree traversals over TreeRePair grammars are 14 times slower than over pointer structures and 5 times slower than over succinct trees, while memory consumption is only 1/43 and 1/6, respectively. With respect to file compression we are able to show that a Huffman-based coding of TreeRePair grammars gives compression ratios comparable to the best known XML file compressors.  相似文献   

15.
XML documents are often viewed as trees (basically the parse tree of the document), and queries over such documents typically test for ancestor relationships among tree nodes. Search engines process such queries using an index structure summarizing the ancestor relations. In the index, each document item (tree node) is identified using some logical id (node label), such that, given two labels, the engine can determine the ancestor relationship between the corresponding nodes. The length of the labels is a main factor of the index size. Therefore, reducing this length, even by a constant factor, is a critical issue. In this work we consider the following problem. Given a rooted XML tree T, label the nodes of T in the most compact way such that given the labels of two nodes, one can determine in constant time, by looking at the labels only, whether one node is an ancestor of the other. Labelings currently being used are all variants of the following interval scheme. Number the leaves say from left to right and label each node with a pair consisting of the numbers of its smallest and largest leaf descendants. An ancestor query then amounts to an interval containment test on the labels. The maximum label length using this scheme is 2 log n, where n is the number of nodes in the tree. (All logarithms in this paper are to base 2.) The focus of this work is finding a scheme that works best in practice on real XML data. We suggest an orthogonal prefix-based approach, where the labeling is such that an ancestor query roughly amounts to testing whether one label is a prefix of the other. We present several new labeling schemes based on this approach and analyze their performance both theoretically and empirically.  相似文献   

16.
针对连续不确定XML数据概率阈值范围查询,提出一种新的CUXI索引树。该索引树的构建方法是借鉴U树对空间数据自顶向下递归构建索引树的思想,将连续不确定XML文档中具有相同父亲的叶子节点构建二维数据矩形,在聚类的基础上来构建相应的CUXI索引树,其中叶子节点存储连续不确定数据辅助信息。为了提高查询效率,对连续不确定数据制定了过滤策略,通过遍历索引树过滤掉不满足查询范围的子树。理论和实验结果表明,此索引技术可提高查询处理的性能。  相似文献   

17.
改造XML树模型是提高XML查询效率的重要方法。通过分析现有的索引算法,对XML树模型进行了改造,提出了基于Signature的索引策略(s-DOM)。采用该策略预处理XML文档可以大大缩小搜索范围,从而提高了查询的效率。  相似文献   

18.
Modern applications face the challenge of dealing with structured and semi-structured data. They have to deal with complex objects, most of them presenting some kind of internal structure, which often forms a hierarchy. Though XML documents are the most known, chemical compounds, CAD drawings, web-sites and many other applications have to deal with similar problems. In such environments, ordered and unordered tree pattern matching are the fundamental search operations. One of the main thrusts of research activities for tree pattern matching is the class of holistic approaches. Their ultimate goal is to evaluate a query twig as a whole by relying on sequential access patterns and non trivial auxiliary storage structures, typically stored in main memory. Based on the pre/post-order ranks of individual tree nodes, we establish strong theoretical bases as a foundation for correct and efficient holistic pattern matching algorithms. In particular, we define and prove sufficient and necessary conditions to minimize the amount of data retained in memory, thus introducing a correct and complete framework on which different holistic solutions can be compared. We also show how these rules can be applied for building algorithms for ordered and unordered tree-pattern matching. Thanks to the above theoretical achievements, each holistic algorithm gains in efficiency as it is directly implemented on the adopted numbering scheme, avoids expensive matching refinements and keeps memory requirements stable. An experimental analysis and comparison with previous approaches confirms the superiority of our approach tested on synthetic as well as real-life data sets.  相似文献   

19.
T. Matsui 《Algorithmica》1997,18(4):530-543
In this paper we propose an algorithm for generating all the spanning trees in undirected graphs. The algorithm requires O (n+m+ τ n) time where the given graph has n vertices, m edges, and τ spanning trees. For outputting all the spanning trees explicitly, this time complexity is optimal. Our algorithm follows a special rooted tree structure on the skeleton graph of the spanning tree polytope. The rule by which the rooted tree structure is traversed is irrelevant to the time complexity. In this sense, our algorithm is flexible. If we employ the depth-first search rule, we can save the memory requirement to O (n+m). A breadth-first implementation requires as much as O (m+ τ n) space, but when a parallel computer is available, this might have an advantage. When a given graph is weighted, the best-first search rule provides a ranking algorithm for the minimum spanning tree problem. The ranking algorithm requires O (n+ m + τ n) time and O (m+ τ n) space when we have a minimum spanning tree. Received January 21, 1995; revised February 19, 1996.  相似文献   

20.
We investigate the typechecking problem for XML transformations: statically verifying that every answer to a transformation conforms to a given output schema, for inputs satisfying a given input schema. As typechecking quickly turns undecidable for query languages capable of testing equality of data values, we return to the limited framework where we abstract XML documents as labeled ordered trees. We focus on simple top-down recursive transformations motivated by XSLT and structural recursion on trees. We parameterize the problem by several restrictions on the transformations (deleting, non-deleting, bounded width) and consider both tree automata and DTDs as input and output schemas. The complexity of the typechecking problems in this scenario ranges from PTIME to EXPTIME.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号