Similar Literature
 Found 20 similar documents; search time: 156 ms
1.
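By comparing how probabilistic data based on the possible-worlds model is represented in the relational data model and in the XML data model, probabilistic relational schemas are classified into 1NF and 3NF according to the relationship between probabilistic attributes and ordinary attributes, and probabilistic XML schemas are likewise classified into 1NF and 3NF according to the relationship between distribution nodes and ordinary nodes. Taking an extended probabilistic DTD file as an example, a conversion algorithm between probabilistic relational schemas and probabilistic XML schemas is designed. Case analysis shows that the algorithm is effective and that it offers a workable schema conversion method between existing probabilistic relational data and probabilistic XML data.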

2.
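Queries over continuous probabilistic XML data currently rely mostly on discretization, which requires processing a large number of histogram segments and yields low query efficiency. This paper proposes a query processing technique for continuous probabilistic XML data based on the p-document model. First, the p-document model is extended with cont nodes to support arbitrary continuous distributions; each cont node encodes a probability density function and its parameters. Second, twig pattern matching is used to find the paths that satisfy the user's request. Then, depending on the type of continuous distribution being queried, the probabilistic query is evaluated by the symbolic representation method, the integration method, or the histogram approximation method: for standard continuous distributions, results are computed from the parameters in the symbolic representation or from the (possibly complex) cumulative distribution function; non-standard continuous distributions that satisfy the integrability conditions use the integration method; all other cases use histogram approximation. Experimental results show that the method outperforms existing methods in both the accuracy and the response time of probabilistic queries.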
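
The choice between a symbolic (closed-form) evaluation and a histogram approximation can be illustrated with a small sketch. It is not the paper's implementation: the function names are hypothetical, a normal distribution stands in for a "standard continuous distribution", and the histogram method is approximated by a simple midpoint rule over equal-width bins.

```python
import math

def normal_cdf(x, mu, sigma):
    """Closed-form CDF of a normal distribution N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def range_prob_symbolic(lo, hi, mu, sigma):
    """P(lo <= X <= hi), computed directly from the distribution parameters."""
    return normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)

def normal_pdf(x, mu, sigma):
    """Probability density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def range_prob_histogram(lo, hi, pdf, bins=1000):
    """Approximate P(lo <= X <= hi) by summing equal-width bins (midpoint rule)."""
    width = (hi - lo) / bins
    return sum(pdf(lo + (i + 0.5) * width) for i in range(bins)) * width

# Hypothetical query: probability that a value stored as N(0, 1) lies in [-1, 1].
exact = range_prob_symbolic(-1.0, 1.0, 0.0, 1.0)
approx = range_prob_histogram(-1.0, 1.0, lambda x: normal_pdf(x, 0.0, 1.0))
```

The symbolic route needs one CDF evaluation per bound, while the histogram route costs one density evaluation per bin, which is one plausible reading of why the abstract prefers the symbolic method whenever it applies.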

3.
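A Non-Merging Twig Pattern Query Algorithm for Uncertain XML   Total citations: 1 (self-citations: 1, citations by others: 0)
Existing twig pattern queries over uncertain XML must store and merge a large number of intermediate results. To avoid this, a non-merging twig pattern query algorithm for uncertain XML, ProTwigList, is proposed. Before querying, the algorithm prunes with Tag+Level streams to reduce the number of nodes to be processed. It extends region encoding to encode the ordinary nodes that remain after pruning, and labels distribution nodes according to a set of rules. During querying, distribution nodes are handled through common distribution-node paths, and the probability of a query result is finally computed using the probability of the lowest common ancestor node. Theoretical analysis and experimental results demonstrate the query efficiency of ProTwigList.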
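
For readers unfamiliar with region (interval) encoding, the sketch below shows the plain, non-probabilistic scheme that the abstract says ProTwigList extends. The extension itself and the labeling of distribution nodes are not described in the abstract and are not reproduced here; the tree format, the `Region` type, and the function names are assumptions made for illustration.

```python
from typing import NamedTuple

class Region(NamedTuple):
    start: int   # counter value when the node is entered
    end: int     # counter value when the node is left
    level: int   # depth in the tree, root = 0

def encode(node, level=0, counter=None, out=None):
    """Assign (start, end, level) to every node of a (tag, children) tree.

    Tags are assumed to be unique so they can serve as dictionary keys.
    """
    if counter is None:
        counter, out = [0], {}
    tag, children = node
    counter[0] += 1
    start = counter[0]
    for child in children:
        encode(child, level + 1, counter, out)
    counter[0] += 1
    out[tag] = Region(start, counter[0], level)
    return out

def is_ancestor(a, d):
    """a is an ancestor of d iff a's interval strictly contains d's."""
    return a.start < d.start and d.end < a.end

def is_parent(a, d):
    """Parent-child additionally requires adjacent levels."""
    return is_ancestor(a, d) and a.level + 1 == d.level

# Hypothetical document: a has children b and d; b has child c.
codes = encode(("a", [("b", [("c", [])]), ("d", [])]))
```

With such labels, structural relationships in a twig pattern can be checked by comparing integers, without walking the tree.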

4.
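A Possibilistic Fuzzy Clustering Method Based on Weighted Features   Total citations: 1 (self-citations: 1, citations by others: 0)
Using the probabilistic constraint relationship and the possibility distribution of data-point feature weights, two improved possibilistic fuzzy clustering models are proposed, built on probabilistic and possibilistic feature weighting respectively. The improved PCM algorithm built on the possibilistic constraint extends the original algorithm and is more widely applicable. Experimental results show that the algorithms can perform fuzzy clustering under different probabilistic weights or possibility-distributed features, and that they cluster noticeably better than PCM and its improved variants.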

5.
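A Simplification Algorithm for Probabilistic XML Data Trees   Total citations: 2 (self-citations: 2, citations by others: 0)
To address redundant distribution nodes in probabilistic XML data trees, an algorithm for simplifying such trees is proposed. By analyzing the path types in a probabilistic XML data tree, the tree structures are divided into a sparse form and a compact form; the former is transformed into the latter by eliminating probability cascades and computing the sets of compatible classes and equivalence classes of absolute paths. Theoretical study and case analysis show that the simplification algorithm is effective and solves the simplification problem for probabilistic XML data trees.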

6.
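To support efficient top-k queries over continuous uncertain XML, the CProTJFast algorithm is proposed. Based on the p-document model, it extends PEDewey (probabilistic extended Dewey) encoding to encode nodes of continuous distribution types, filters nodes using a lower bound on path probability, and defines filtering strategies for continuous probability density functions, so that nodes that cannot contribute to the result are filtered out before the probabilities of continuous nodes are computed. Experimental results show that CProTJFast with the continuous-node filtering strategy effectively improves the efficiency of top-k queries over continuous uncertain XML.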

7.
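A D-S (Dempster-Shafer) fusion algorithm with evidence weights, based on mobile agents, is proposed for wireless sensor networks. Evidence weights are introduced to correct the evidence and reduce the impact of conflicting data on the fusion result. A three-level D-S combination rule is used for fusion decisions: node-level fusion computes the time-domain fused detection probability of a single node; intra-cluster fusion computes the spatial-domain fused detection probability among the nodes of a cluster to obtain a local decision; inter-cluster fusion computes the fused detection probability across clusters to obtain the final global decision. Simulation results show that the algorithm obtains accurate fusion results at a small energy cost and effectively reduces the impact of conflicting data on the fusion result.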
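
As background for the combination step, here is a minimal sketch of Dempster's rule of combination together with classical Shafer discounting. The abstract does not say how its evidence weights are defined, so the `discount` function is a stand-in assumption rather than the paper's correction scheme; the frame of discernment and the mass values are invented for illustration.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dicts: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + pa * pb
            else:
                conflict += pa * pb          # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Normalize by the non-conflicting mass.
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

def discount(m, weight, frame):
    """Shafer discounting: scale masses by `weight`, move the rest to the frame."""
    out = {s: weight * v for s, v in m.items() if s != frame}
    out[frame] = 1.0 - weight + weight * m.get(frame, 0.0)
    return out

# Hypothetical two-hypothesis frame: target present (T) or absent (N).
T, N = frozenset({"T"}), frozenset({"N"})
FRAME = T | N
node1 = {T: 0.8, FRAME: 0.2}
node2 = {T: 0.6, N: 0.3, FRAME: 0.1}
fused = dempster_combine(node1, node2)
softened = discount(node2, 0.5, FRAME)
```

Discounting a less reliable source before combining moves part of its mass to the whole frame, which lowers the conflict term and is one common way to limit the influence of conflicting readings.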

8.
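Probabilistic Reachability Queries over Uncertain Graphs   Total citations: 1 (self-citations: 0, citations by others: 1)
Graph reachability queries are widely used in biological networks, social networks, ontology networks, RDF databases, XML databases, and similar settings. Noise and errors introduced when operating on the data make such graph data uncertain, and uncertain RDF and XML databases have already been studied extensively. This paper uses the possible-worlds semantic model to construct uncertain graphs and, based on that model, studies the probabilistic reachability (PR) query. Processing PR queries is #P-complete. The paper first gives a basic randomized algorithm that quickly estimates the reachability probability with high accuracy. It then introduces a conditional distribution into the randomized algorithm (the "conditional randomized algorithm"), using sets of disjoint paths and cut sets of the graph as the conditional probability distribution, so that the improved randomized algorithm can process queries accurately and in polynomial time. Finally, extensive experiments on real uncertain graph data validate the design.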
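
The basic randomized algorithm is not specified in the abstract; the sketch below shows the textbook Monte Carlo estimator under possible-worlds semantics, assuming directed edges that exist independently with the given probabilities. The function name, the edge format, and the example graph are hypothetical, and the conditional variant based on disjoint paths and cut sets is not attempted.

```python
import random
from collections import deque

def estimate_reachability(edges, source, target, samples=10000, seed=0):
    """Estimate P(target reachable from source) by sampling possible worlds.

    edges maps (u, v) -> existence probability of the directed edge u -> v.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Sample one possible world: keep each edge independently.
        adj = {}
        for (u, v), p in edges.items():
            if rng.random() < p:
                adj.setdefault(u, []).append(v)
        # Breadth-first search in the sampled world.
        seen = {source}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node == target:
                break
            for nxt in adj.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        hits += target in seen
    return hits / samples

# Hypothetical graph: a direct edge s -> t and a two-hop path s -> a -> t.
graph = {("s", "a"): 0.5, ("a", "t"): 0.5, ("s", "t"): 0.5}
estimate = estimate_reachability(graph, "s", "t", samples=20000)
```

For this small graph the exact value is 1 - (1 - 0.5)(1 - 0.25) = 0.625, so the estimate can be checked by hand; on large graphs exact computation is what the #P-completeness result rules out in general.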

9.
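To accurately infer the user's search target in keyword search over Extensible Markup Language (XML) data, a target node inference method is proposed. When obtaining target nodes, it considers the frequency of XML nodes under the corresponding type and the influence that the different positions of the user's keywords have on the target node type, assigns different weight parameters to term frequencies, and introduces the hierarchical information of the XML document tree from XReal to infer target nodes. Experimental results show that the method obtains more accurate target nodes and improves query accuracy.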

10.
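To address information overload in XML queries, a method for ranking multiple XML query results based on conditional preferences is proposed. The method treats the content query predicates specified by the user as context conditions, and then applies a probabilistic information retrieval model to the original XML data and the query history to infer the current user's preferences. It evaluates the association between the attribute unit values specified by the query and the unspecified attribute unit values in a result element, as well as the relevance of the unspecified attribute unit values to the user's preferences, and from these builds a scoring function for query result elements. The scoring function is then used to compute ranking scores for the result elements, and the query results are ranked accordingly. Experimental results show that the proposed method ranks accurately and satisfies user needs and preferences well.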

11.
The flexibility of the XML data model allows a more natural representation of uncertain data than the relational model. Matching twig patterns against XML data is a fundamental problem in querying information from XML documents. For a probabilistic XML document, each twig answer has a probability value because of the uncertainty of the data. Twig answers with small probability values are useless to users, who usually want only the answers with the k largest probability values. Existing algorithms for ordinary XML documents are not directly applicable, because they must handle probability distribution nodes and compute the top-k probabilities of answers in probabilistic XML efficiently. In this paper, we address the problem of directly finding the twig answers with the top-k probability values in probabilistic XML documents. We propose a new encoding scheme for probabilistic XML called PEDewey. Based on this encoding scheme, we then design two algorithms for finding the answers with the top-k probabilities for twig queries. One, called ProTJFast, processes probabilistic XML data based on element streams in document order; the other, called PTopKTwig, is based on element streams ordered by path probability value. Experiments have been conducted to study the performance of these algorithms.
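
The idea of keeping only the k most probable answers and discarding weak candidates early can be sketched with a bounded min-heap. This is a generic illustration, not ProTJFast or PTopKTwig: each candidate answer is assumed to come with the list of probability factors along its path, and a candidate is abandoned as soon as its running product drops below the current k-th best, which is safe because every factor is at most 1.

```python
import heapq

def top_k_by_probability(candidates, k):
    """Return the k most probable answers as (probability, answer), best first.

    candidates: iterable of (answer, factors), where the answer's probability
    is the product of its factors (each in [0, 1]).
    """
    if k <= 0:
        return []
    heap = []  # min-heap holding the current top-k as (probability, answer)
    for answer, factors in candidates:
        threshold = heap[0][0] if len(heap) == k else 0.0
        prob = 1.0
        for f in factors:
            prob *= f
            if prob < threshold:
                break  # product can only shrink: cannot enter the top-k
        else:
            if len(heap) < k:
                heapq.heappush(heap, (prob, answer))
            elif prob > heap[0][0]:
                heapq.heapreplace(heap, (prob, answer))
    return sorted(heap, reverse=True)

# Hypothetical twig answers with their path probability factors.
answers = [("a", [0.9, 0.8]), ("b", [0.5, 0.5]), ("c", [1.0, 0.95]), ("d", [0.1])]
best = top_k_by_probability(answers, 2)
```

Processing candidates in descending order of path probability, as the abstract describes for PTopKTwig, would raise the threshold sooner and let more candidates be skipped.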

12.
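According to how probabilistic data is described, probabilistic data models are divided into relation-based and XML-based models. The relation-based model introduces a probability-tagging attribute for each tuple to express uncertainty, which complicates tuple storage and query processing. The XML-based model adds nodes that represent probability attributes to an ordinary XML tree and can express probabilistic information at multiple granularities. Two PDTD-independent storage schemas, PXRel and PXParent, are designed for mapping probabilistic XML data to relations, and their effectiveness is verified experimentally.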

13.
Jian Liu, Z. M. Ma, Li Yan. World Wide Web, 2013, 16(3): 325-353
As the next-generation language of the Internet, XML has become the de facto standard for information exchange over the web. A core operation in XML query processing is to find all occurrences of a twig pattern in an XML database. In addition, the study of probabilistic data has become an emerging topic for various applications on the Web. Research that combines XML twig patterns with probabilistic data is therefore significant. In prior work on probabilistic XML, the answers to a given twig query are always complete. However, complete answers with low probabilities may be deemed irrelevant, while incomplete answers with high probabilities are of great significance because they may be the potential answers that interest users. Unlike complete evaluation, evaluating incomplete twigs in probabilistic XML introduces new challenges. On the one hand, incomplete queries not only obtain complete matches but also return answers that contain a considerable number of incomplete matches. On the other hand, processing incomplete evaluation is more complicated. A ranking approach should clearly be adopted when evaluating incomplete answers. In this paper, we propose an efficient algorithm for querying incomplete twigs over a probabilistic XML database. We also present a novel algorithm for ranking the incomplete answers. The experimental results show that the proposed algorithms significantly improve the performance of querying and ranking incomplete twigs.

14.
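Most current XML query languages use tree patterns to match the XML document tree being queried and obtain the required results that conform to the pattern tree. The efficiency of this matching depends largely on the size of the XML pattern tree, so finding and removing redundant nodes in the query pattern tree as quickly as possible becomes very important. This paper focuses on tree pattern minimization under DTD constraints. It extends the DTD sibling constraint (SC) to an extended sibling constraint (ESC) that can express the ancestor-descendant relationships in DTD constraints. It shows that minimizing query tree patterns that contain only {ESC, /, //, [], *} has exponential complexity, and that the minimization problem is polynomial-time when the pattern tree is branch-restricted. Finally, it gives a polynomial-time minimization algorithm for branch-restricted pattern trees.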

15.
When data sources are virtually integrated, there is no common, centralized method for maintaining global consistency, so inconsistencies with regard to global integrity constraints are very likely to occur. In this paper, we consider the problem of defining and computing consistent query answers when queries are posed to virtual XML data integration systems that are specified following the local-as-view approach. We propose a powerful XML constraint model for defining global constraints, which can express keys and functional dependencies and which also extends the newly introduced conditional functional dependencies to XML. We provide an approach to defining XML views that supports not only edge-path mappings but also data-value bindings to express the join operator. We give formal definitions of repairs and consistent query answers in the XML data integration setting. Given a query on the global system, we present a two-step method to compute consistent query answers. First, the given query is transformed using the global constraints, such that running the transformed query on the original global system generates exactly the consistent query answers. Because the global instance is not materialized, the query on the global instance is then rewritten as queries on the underlying data sources by reversing the rules in the view definitions. We illustrate that the XPath query transformations can be implemented in XQuery. Finally, we implement prototypes of our method and evaluate our algorithms experimentally.

16.
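Building on the XML tree model, this paper proposes the idea that a query is an ordered labeled tree and a database is a set of ordered labeled trees; an answer to a query is one or more homomorphic mappings from the nodes of the query tree to nodes of the database. The XML tree model in its general sense is formally reworked, and queries are constructed on the reworked model. Finally, the significance of this work is discussed.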

17.
Keyword search is the most popular technique for searching information in XML (eXtensible Markup Language) documents. It enables users to access XML data easily without learning a structured query language or studying complex data schemas. Existing keyword query methods are mainly based on LCA (lowest common ancestor) semantics, in which the returned results match all keywords at the granularity of elements. In many practical applications, information is often uncertain and vague. As a result, identifying useful information in fuzzy data is becoming an important research topic. In this paper, we focus on keyword querying over fuzzy XML data at the granularity of objects. By introducing the concept of an "object tree", we propose query semantics for keyword queries at the object level. We find the minimum whole-matching result object trees, which contain all keywords, and the partial-matching result object trees, which contain some of the keywords, and return the root nodes of these result object trees as query results. To identify the top-K answers with the highest scores effectively and accurately, we propose a scoring mechanism that takes into account tf*idf document relevance, users' preferences, and the possibilities of results. We propose a stack-based algorithm named object-stack to obtain the top-K answers with the highest scores. Experimental results show that the object-stack algorithm significantly outperforms traditional XML keyword query algorithms and obtains high-quality query results with high search efficiency on fuzzy XML documents.

18.
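Probabilistic XML documents are a standard for exchanging and representing probabilistic data on the web, and querying and computing element values and their probabilities is an important research topic for such documents. The probabilistic XML document tree is an effective data model for probabilistic XML documents. This paper defines the basic paths and extended paths of a probabilistic XML document tree. It proposes an algorithm that decomposes the tree into a set of ordinary XML subtrees based on the possible-worlds principle, an algorithm that decomposes it into a set of probabilistic XML subtrees based on path analysis, and corresponding algorithms for querying and computing the probabilities of nodes and node sets, and compares them experimentally. The results show that both methods are effective; compared with the former, the latter suits larger probabilistic XML document trees and queries on the probabilities of nodes and node sets, and its computation is simpler.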
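
A path-based probability computation can be illustrated as follows. The sketch assumes a tree in which every edge carries the conditional probability of the child given its parent (1.0 for ordinary edges, the branch probability below a distribution node) and in which choices at different distribution nodes are independent. Under those assumptions a node's existence probability is the product of the edge probabilities on its path from the root. The data layout and names are hypothetical and the paper's basic and extended paths are not modeled.

```python
def node_probability(parents, node):
    """Probability that `node` exists: product of edge probabilities up to the root.

    parents maps each non-root node to (parent, edge_probability).
    """
    prob = 1.0
    while node in parents:
        parent, edge_prob = parents[node]
        prob *= edge_prob
        node = parent
    return prob

# Hypothetical document: a person whose name is "name_a" (0.7) or "name_b" (0.3),
# chosen by a mutually exclusive distribution node; "phone" exists under
# "name_a" with probability 0.5 via an independent distribution node.
tree = {
    "person": ("root", 1.0),
    "mux1": ("person", 1.0),
    "name_a": ("mux1", 0.7),
    "name_b": ("mux1", 0.3),
    "ind1": ("name_a", 1.0),
    "phone": ("ind1", 0.5),
}
phone_prob = node_probability(tree, "phone")
```

Walking a single path costs time proportional to the tree depth, whereas enumerating possible worlds grows exponentially with the number of distribution nodes, which is consistent with the abstract's remark that the path-based method suits larger trees.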

19.
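The wide use of massive uncertain data storage and recording, and its extension to XML, has made data models for the probabilities of correlated events in XML a research hotspot. Aiming at a probabilistic data model that can describe complex events, and building on existing probabilistic models, this paper proposes the concept of a multi-dimensional uncertain probabilistic model space. It performs unified modeling over multiple probabilistic models, extends single-dimensional XML probabilistic nodes to a multi-dimensional space, and then defines a unified spatial query method, offering a novel theoretical approach to modeling complex probabilistic data and optimizing queries over it.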

20.
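In recent years, querying XML data has become an important research topic. Processing twig queries is the core operation in implementing XML queries. For twig pattern queries, an improved twig pattern matching algorithm is proposed. The algorithm prunes useless data streams to reduce the number of nodes to be processed, which saves processing time and improves query accuracy. Experimental results show that the algorithm effectively improves query efficiency.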
