Similar literature
 20 similar documents found (search time: 46 ms)
1.
Uncertainty in deductive databases and logic programming has been modeled using a variety of numeric and non-numeric formalisms, including probabilistic, possibilistic, and fuzzy set-theoretic approaches, and many-valued logic programming. In this paper, we consider a hybrid approach to the modeling of uncertainty in deductive databases. Our model, called deductive IST (DIST), is based on an extension of the Information Source Tracking (IST) model, recently proposed for relational databases. The DIST model permits uncertainty to be modeled and manipulated in essentially qualitative terms, with an option to convert qualitative expressions of uncertainty into numeric form (e.g., probabilities). An uncertain deductive database is modeled as a Horn clause program in the DIST framework, where each fact and rule is annotated with an expression indicating the “sources” contributing to this information and the nature of their contribution. (1) We show that positive DIST programs enjoy the least model/least fixpoint semantics analogous to classical logic programs. (2) We show that top-down (e.g., SLD-resolution) and bottom-up (e.g., magic sets rewriting followed by semi-naive evaluation) query processing strategies developed for datalog can be easily extended to DIST programs. (3) Results and techniques for handling negation as failure in classical logic programming can be easily extended to DIST. As an illustration of this, we show how stratified negation can be so extended. We next study the problem of query optimization in such databases and establish the following results. (4) We formulate query containment in qualitative as well as quantitative terms. Intuitively, our qualitative sense of containment says that a query Q1 is contained in a query Q2 provided that, for every input database D and every tuple t, t ∈ Q2(D) holds in every “situation” in which t ∈ Q1(D) is true. The quantitative notion of containment says that Q1 is contained in Q2 provided that, on every input, the certainty associated with any tuple computed by Q1 is no more than the certainty associated with the same tuple by Q2 on the given input. We also prove that the qualitative and quantitative notions of containment (both absolute and uniform versions) coincide. (5) We establish necessary and sufficient conditions for the qualitative containment of conjunctive queries. (6) We extend the well-known chase technique to develop a test for uniform containment and equivalence of positive DIST programs. (7) Finally, we prove that the complexity of testing containment of conjunctive DIST queries remains the same as in the classical case when the number of information sources is regarded as a constant (so it is NP-complete in the size of the queries). We also show that testing containment of conjunctive queries is co-NP-complete in the number of information sources.
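For orientation, the classical (source-free) case that result (7) compares against can be sketched with the standard homomorphism test: Q1 is contained in Q2 exactly when some mapping of Q2's variables sends Q2's head to Q1's head and every body atom of Q2 into the body of Q1. The sketch below is a brute-force illustration of that classical test only, not the DIST procedure from the abstract; the query encoding (a head tuple plus a list of `(predicate, args)` atoms, with capitalized strings as variables) and the names `is_var` and `contained_in` are assumptions made for this example.

```python
from itertools import product

def is_var(term):
    # Assumption for this sketch: variables are capitalized strings.
    return isinstance(term, str) and term[:1].isupper()

def contained_in(q1, q2):
    """Classical set-semantics test: is q1 contained in q2?

    Each query is (head, body): head is a tuple of terms, body is a list
    of atoms (predicate, args_tuple). Exponential brute force over all
    candidate homomorphisms from q2 into q1.
    """
    head1, body1 = q1
    head2, body2 = q2
    facts = set(body1)  # body of q1, treated as a frozen canonical database
    terms1 = sorted({t for _, args in body1 for t in args} | set(head1), key=str)
    vars2 = sorted({t for _, args in body2 for t in args if is_var(t)}
                   | {t for t in head2 if is_var(t)})
    for image in product(terms1, repeat=len(vars2)):
        h = dict(zip(vars2, image))
        sub = lambda t: h.get(t, t)  # constants map to themselves
        if tuple(map(sub, head2)) != tuple(head1):
            continue
        if all((p, tuple(map(sub, args))) in facts for p, args in body2):
            return True
    return False

# ans(X) :- e(X, Y), e(Y, X)   versus   ans(X) :- e(X, Y)
q_cycle = (("X",), [("e", ("X", "Y")), ("e", ("Y", "X"))])
q_edge = (("X",), [("e", ("X", "Y"))])
print(contained_in(q_cycle, q_edge), contained_in(q_edge, q_cycle))
```

The exhaustive search over mappings reflects why the problem is NP-complete in the size of the queries: a candidate homomorphism is easy to verify, but there are exponentially many candidates.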

2.
This paper studies the most similar maximal clique query (MSMCQ). Given a graph G and a set of nodes Q, MSMCQ is to find the maximal clique of G having the largest similarity with Q. MSMCQ has many real applications, including the advertising industry, public security, task crowdsourcing, and social networks. MSMCQ can be studied as a special case of the general set similarity query (SSQ). However, the maximal cliques (MCs) of G have several properties that distinguish them from general sets. Based on these properties, we propose a novel index, namely MCIndex. MCIndex significantly outperforms the state-of-the-art SSQ method in terms of the number of candidates and the query time. Specifically, we first construct an inverted index I for all the MCs of G. Since the MCs in a posting list often overlap heavily, MCIndex selects some pivots to cluster the MCs with a small radius. Given a query Q, we compute the distance from the pivots to Q. Clusters whose pivots guarantee that they contain no answer can be pruned by our distance-based pruning rule. Since it is NP-hard to construct a minimum MCIndex, we propose to construct a minimal MCIndex on I(v) with an approximation ratio of 1 + ln|I(v)|. Since the MCs have properties that are inherent to the graph structure, we further propose an S Index within each cluster of an MCIndex and a structure-based pruning rule. S Index can significantly reduce the number of candidates. Since the sizes of intersections between Q and many MCs need to be computed during query evaluation, we also propose a binary representation of MCs to improve the efficiency of the intersection size computation. Our extensive experiments confirm the effectiveness and efficiency of our proposed techniques on several real-world datasets.

3.
The importance of query processing over uncertain data has recently arisen due to its wide usage in many real-world applications. In the context of uncertain databases, previous works have studied many query types such as nearest neighbor query, range query, top-k query, skyline query, and similarity join. In this paper, we focus on another important query, namely, probabilistic group nearest neighbor (PGNN) query, in the uncertain database, which also has many applications. Specifically, given a set, Q, of query points, a PGNN query retrieves data objects that minimize the aggregate distance (e.g., sum, min, and max) to query set Q. Due to the inherent uncertainty of data objects, previous techniques to answer group nearest neighbor (GNN) query cannot be directly applied to our PGNN problem. Motivated by this, we propose effective pruning methods, namely, spatial pruning and probabilistic pruning, to reduce the PGNN search space, which can be seamlessly integrated into our PGNN query procedure. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed approach, in terms of the wall clock time and the speed-up ratio against linear scan.
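The aggregate distance mentioned above is easiest to see in the deterministic setting. The sketch below is a plain linear-scan GNN over exact points, i.e. the baseline that the abstract's pruning methods are measured against, and it does not model uncertainty at all. The function names and the tuple representation of points are assumptions made for this illustration.

```python
import math

def aggregate_distance(p, queries, agg=sum):
    # agg may be sum, min, or max, matching the aggregates named in the abstract.
    return agg(math.dist(p, q) for q in queries)

def group_nearest_neighbor(points, queries, agg=sum):
    # Linear scan: return the data point with the smallest aggregate
    # distance to the whole query set.
    return min(points, key=lambda p: aggregate_distance(p, queries, agg))

points = [(0, 0), (5, 5), (10, 0)]
queries = [(4, 4), (6, 6)]
print(group_nearest_neighbor(points, queries))
```

In the probabilistic variant each data object would be a distribution rather than a single point, so the answer becomes a set of objects with qualifying probabilities instead of one minimizer.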

4.
K.  Wen-Syan  M.   《Data & Knowledge Engineering》2000,35(3):259-298
Since media-based evaluation yields similarity values, the result of a multimedia database query, Q(Y1,…,Yn), is defined as an ordered list SQ of n-tuples of the form ⟨X1,…,Xn⟩. The query Q itself is composed of a set of fuzzy and crisp predicates, constants, variables, and conjunction, disjunction, and negation operators. Since many multimedia applications require partial matches, SQ includes results which do not satisfy all predicates. Due to the ranking and partial match requirements, traditional query processing techniques do not apply to multimedia databases. In this paper, we first focus on the problem of “given a multimedia query which consists of multiple fuzzy and crisp predicates, providing the user with a meaningful final ranking”. More specifically, we study the problem of merging similarity values in queries with multiple fuzzy predicates. We describe the essential multimedia retrieval semantics, compare these with the known approaches, and propose a semantics which captures the requirements of the multimedia retrieval problem. We then build on these results in answering the related problem of “given a multimedia query which consists of multiple fuzzy and crisp predicates, finding an efficient way to process the query.” We develop an algorithm to efficiently process queries with unordered fuzzy predicates (sub-queries). Although this algorithm can work with different fuzzy semantics, it benefits from the statistical properties of the semantics proposed in this paper. We also present experimental results for evaluating the proposed algorithm in terms of quality of results and search space reduction.

5.
The aim of this paper is a modification of Minker's Generalized Closed World Assumption that would allow application of the “negation as failure rule” with respect to a set P of (not necessarily all) predicates of a database DB. A careful closure procedure is introduced which, when applied to a database DB, produces a new database DB* that is used to answer queries about predicates from DB. It is shown that DB* is consistent iff DB is consistent. If P is the set of all predicates from DB and DB does not contain function symbols, then DB* coincides with Minker's GCWA. The soundness and completeness of the careful closure procedure with respect to a minimal-model-style semantics is shown. As an inference engine associated with DB* we propose a query evaluation procedure QEP*, which is a combination of a method of splitting an indefinite database DB into a disjunction of Horn databases and Clark's query evaluation procedure QEP. Soundness of QEP* with respect to DB* is shown for a broad class of databases.

6.
Fragmentation has been used to distribute the contents of a database across the sites of a distributed database system. During run time, the system must determine which fragments can be used to answer each query. This process requires solving the predicate implication problem. In order to speed processing, it is desirable to do as much preprocessing as possible on the prestored fragments, without knowledge of the run-time query. In this paper, performing preprocessing on database fragments to speed later run-time implication checking is investigated. The investigation is based on a new concept, separation among predicates. When two predicates are properly separated, their union cannot be implied by any other conjunctive predicate unless one of them is implied by the conjunctive predicate. A polynomial time algorithm for checking the pair-wise separation among a collection of fragment predicates is introduced and its complexity is theoretically analyzed. The separation checking algorithm is accompanied by a query processing algorithm which makes use of the result of the separation properties of the fragments to speed real time query processing. The two algorithms presented are scalable according to available preprocessing time in the sense that the preprocessing algorithm can be run for shorter periods to produce partial preprocessing that can still be used by the query processing algorithm.

7.
The problem of finding a rectilinear minimum bend path (RMBP) between two designated points inside a rectilinear polygon has applications in robotics and motion planning. In this paper, we present efficient algorithms to solve the query version of the RMBP problem for special classes of rectilinear polygons given their visibility graphs. Specifically, given an unweighted graph G = (V, E), with |V| = N and |E| = M, we present algorithms that preprocess G in linear space and time such that shortest distance queries (queries asking for the distance between any pair of nodes in the graph) can be answered in constant time and space. For the case of a chordal graph G, our algorithms give a distance which is at most one away from the actual shortest distance. When G is a K-chordal graph, our algorithm produces an exact shortest distance in O(K) time. We also present a non-trivial parallel implementation of the sequential preprocessing algorithm for the CREW-PRAM model which runs in O(log² N) time using O(N + M) processors. After the preprocessing, we can answer the queries in constant time using a single processor.

8.
Intensional answers are conditions that tuples of values must satisfy to belong to the usual extensional answer of a query addressed to a deductive database. The authors review the concept of intensional answers and introduce a general method for generating them as logical consequences of the query and of deduction rules. The authors show how integrity constraints can filter out inadequate answers and produce simpler and more informative answers. An efficient organization for the combination of answers and constraints is described. The introduction of negation in queries and in the body of deduction rules is discussed. Beyond the mechanics of answer generation, the interest of the approach also depends on a strategy for selecting answers to a user submitting a query. This requires techniques for user modeling and dialogue management similar to those required for expert systems.

9.
10.
Semijoin is a relational operator used in many relational query processing algorithms. Semijoins can be used to “reduce” the database by delimiting portions of the database that contain data relevant to a given query. For some queries, there exist sequences of semijoins that delimit the exact portions of the database needed to answer the query. Such sequences are called full reducers.

This paper considers a class of queries called natural inequality queries (NI queries), and characterizes a subclass for which full reducers exist. We also present an efficient algorithm that decides whether an NI query lies within this subclass, and constructs a full reducer for the query. The NI queries are a subset of the aggregate-free, conjunctive queries of QUEL, and permit join clauses to include <, ≤, =, ≥, >.
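As a reminder of what a single semijoin step does, here is a minimal sketch of the ordinary equality semijoin: it keeps the tuples of one relation that have a join partner in another, without adding any columns. It does not cover the inequality join clauses of NI queries or the paper's full-reducer construction. Relations are modeled as lists of dicts, and the names `semijoin`, `orders`, and `customers` are assumptions made for this illustration.

```python
def semijoin(r, s, attrs):
    """Return the tuples of r that match at least one tuple of s on attrs."""
    keys = {tuple(t[a] for a in attrs) for t in s}
    return [t for t in r if tuple(t[a] for a in attrs) in keys]

orders = [
    {"cust": 1, "item": "pen"},
    {"cust": 2, "item": "ink"},
    {"cust": 3, "item": "pad"},
]
customers = [{"cust": 1, "city": "Oslo"}, {"cust": 3, "city": "Rome"}]

# Reduce orders to the tuples relevant to a join with customers.
print(semijoin(orders, customers, ["cust"]))
```

A full reducer would be a sequence of such steps after which every remaining tuple in every relation participates in the final answer.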


11.
Under the bag-theoretic semantics, relations are bags of tuples; that is, a tuple may have any number of duplicates. Under this semantics, a conjunctive query Q is bag-contained in a conjunctive query Q′ if, for all databases D, Q(D), the result of applying Q to D, is a subbag of Q′(D). It is not known whether testing bag-containment is decidable. In this paper we prove that bag-containment of Q in Q′ can be tested on a finite set of canonical databases built from the body of Q. Using that result we give a procedure that decides the bag-containment problem of conjunctive queries in a large number of cases. Received: 27 September 1995 / 19 June 1996

12.
In this paper we study queries over relational databases with integrity constraints (ICs). The main problem we analyze is OWA query answering, i.e., query answering over a database with ICs under the open-world assumption. The kinds of ICs that we consider are inclusion dependencies and functional dependencies, in particular key dependencies; the query languages we consider are conjunctive queries and unions of conjunctive queries. We present results about the decidability of OWA query answering under ICs. In particular, we study OWA query answering both over finite databases and over unrestricted databases, and identify the cases in which such a problem is finitely controllable, i.e., when OWA query answering over finite databases coincides with OWA query answering over unrestricted databases. Moreover, we are able to easily turn the above results into new results about implication of ICs and query containment under ICs, due to the deep relationship between OWA query answering and these two classical problems in database theory. In particular, we close two long-standing open problems in query containment, since we prove finite controllability of containment of conjunctive queries both under arbitrary inclusion dependencies and under key and foreign key dependencies. The results of our investigation are very relevant in many research areas which have recently dealt with databases under an incomplete information assumption: e.g., data integration, data exchange, view-based information access, ontology-based information systems, and peer data management systems.

13.
In this paper new methods of discretization (integer approximation) of algebraic spatial curves in the form of intersecting surfaces P(x, y, z) = 0 and Q(x, y, z) = 0 are analyzed.

The use of homogeneous cubical grids G(h3) to discretize a curve is the essence of the method. Two new algorithms of discretization (on 6-connected grid G6c(h3) and 26-connected grid G26(h3)) are presented based on the method above. Implementation of the algorithms for algebraic spatial curves is suggested. The elaborated algorithms are adjusted for application in computer graphics and numerical control of machine tools.


14.
Nonrecursive incremental evaluation of Datalog queries
We consider the problem of repeatedly evaluating the same (computationally expensive) query to a database that is being updated between successive query requests. In this situation, it should be possible to use the difference between successive database states and the answer to the query in one state to reduce the cost of evaluating the query in the next state. We use nonrecursive Datalog programs (which are unions of conjunctive queries) to compute the differences, and call this process incremental query evaluation using conjunctive queries. After formalizing the notion of incremental query evaluation using conjunctive queries, we give an algorithm that constructs, for each regular chain query (including transitive closure as a special case), a nonrecursive Datalog program to compute the difference between the answer after an update and the answer before the update. We then extend this result to weakly regular queries, which are regular chain programs augmented with conjunctive queries having the so-called Cartesian-closed increment property, and to the case of unbounded-set insertions where the sets are binary Cartesian products. Finally, we show that the class of conjunctive queries with the Cartesian-closed increment property is decidable. Parts of the results in this paper appeared as extended abstracts in the Proceedings of the 1992 International Conference on Database Theory (LNCS 646, Springer-Verlag), and in the Proceedings of the 1993 International Workshop on Database Programming Languages (Workshops in Computing, Springer-Verlag). Guozhu Dong gratefully acknowledges support of the Australian Research Council through research grants, and the Centre for Intelligent Decision Systems. Work by Jianwen Su was supported in part by NSF Grants IRI-9109520 and IRI-9117094.
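The transitive closure special case gives a feel for why a nonrecursive difference query can suffice. When a single edge (a, b) is inserted, every new reachable pair goes from a node that already reached a (or a itself) to a node already reachable from b (or b itself), which is a plain join over the old answer with no recursion. The sketch below shows this textbook single-edge delta; it is not the paper's general construction for regular chain queries, and the name `tc_delta_on_insert` and the set-of-pairs representation are assumptions made for this example.

```python
def tc_delta_on_insert(tc, edge):
    """Pairs added to the transitive closure tc when edge (a, b) is inserted.

    tc is the closure before the update, as a set of (source, target) pairs.
    The result is computed from tc and the new edge only, without recursion.
    """
    a, b = edge
    sources = {a} | {x for (x, y) in tc if y == a}  # nodes that reach a
    targets = {b} | {y for (x, y) in tc if x == b}  # nodes reachable from b
    return {(x, y) for x in sources for y in targets} - tc

old_tc = {(1, 2), (3, 4)}
print(sorted(tc_delta_on_insert(old_tc, (2, 3))))
```

The new closure is simply the union of the old closure and the returned delta, so the expensive recursive query never has to be re-run from scratch.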

15.
In data applications such as information integration, there can be limited access patterns to relations, i.e., binding patterns require values to be specified for certain attributes in order to retrieve data from a relation. As a consequence, we cannot retrieve all tuples from these relations. In this article we study the problem of computing the complete answer to a query, i.e., the answer that could be computed if all the tuples could be retrieved. A query is stable if for any instance of the relations in the query, its complete answer can be computed using the access patterns permitted by the relations. We study the problem of testing stability of various classes of queries, including conjunctive queries, unions of conjunctive queries, and conjunctive queries with arithmetic comparisons. We give algorithms and complexity results for these classes of queries. We show that stability of datalog programs is undecidable, and give a sufficient condition for stability of datalog queries. Finally, we study data-dependent computability of the complete answer to a nonstable query, and propose a decision tree for guiding the process to compute the complete answer. Received: 6 December 2001, Accepted: 25 November 2002, Published online: 3 April 2003. Chen Li: This article combines and integrates some content in the technical report at Stanford University [25] and the paper presented in the 8th International Conference on Database Theory (ICDT), London, UK, January, 2001 [28]. In addition to the prior materials, this article contains more results and complete proofs that were not included in the original reports.

16.
The problem of Proximity Searching in Metric Spaces consists in finding the elements of a set which are close to a given query under some similarity criterion. In this paper we present a new methodology to solve this problem, which uses a t-spanner G′(V, E) as the representation of the metric database. A t-spanner is a subgraph G′(V, E) of a graph G(V, A), such that E ⊆ A and G′ approximates the shortest path costs over G within a precision factor t.

Our key idea is to regard the t-spanner as an approximation to the complete graph of distances among the objects, and to use it as a compact device to simulate the large matrix of distances required by successful search algorithms such as AESA. The t-spanner properties imply that we can use shortest paths over G′ to estimate any distance with bounded-error factor t.

To this end, several t-spanner construction, updating, and search algorithms are proposed and experimentally evaluated. We show that our technique is competitive against current approaches. For example, in a metric space of documents our search time is only 9% over AESA, yet we need just 4% of its space requirement. Similar results are obtained in other metric spaces.

Finally, we conjecture that the essential metric space property to obtain good t-spanner performance is the existence of clusters of elements, and enough empirical evidence is given to support this claim. This property holds in most real-world metric spaces, so we expect that t-spanners will display good behavior in most practical applications. Furthermore, we show that t-spanners have a great potential for improvements.
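The bounded-error estimate that a t-spanner provides can be sketched directly from its definition. If est is the shortest-path cost between two objects over the spanner, then the true distance d satisfies est / t ≤ d ≤ est, because spanner paths never undercut the real distance and overshoot it by at most a factor t. The code below is a minimal sketch of that bound using Dijkstra's algorithm, assuming a connected spanner stored as an adjacency dict; it is not one of the construction or search algorithms from the abstract, and the function names are assumptions made for this example.

```python
import heapq

def shortest_paths(adj, source):
    """Dijkstra over adj: {node: [(neighbor, weight), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def distance_bounds(adj, u, v, t):
    """Lower and upper bounds on the true distance d(u, v) from a t-spanner."""
    est = shortest_paths(adj, u)[v]  # assumes v is reachable from u
    return est / t, est

spanner = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0)],
}
print(distance_bounds(spanner, "a", "c", t=2.0))
```

A search algorithm can use the lower bound to discard candidates that are certainly too far from the query without computing their actual distance.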


17.
APEX, the method for query evaluation in deductive databases presented in this work, is based on discovering the axioms and facts relevant to a given query. The notion of relevancy and migration of facts is derived from an analysis of data flow in the system. APEX is complete, and incorporates efficient query evaluation heuristics. Operation of APEX is illustrated by sample databases involving non-linear recursive axioms and cyclic relations. The main virtues of the method are its generality and adaptivity: it imposes no restrictions on the structure of axioms or the contents of relations, and it employs the knowledge of the actual data acquired at each step of a query evaluation.

18.
Let H be a fixed undirected graph. An H-colouring of an undirected graph G is a homomorphism from G to H. If the vertices of G are partially ordered then there is a generic non-deterministic greedy algorithm which computes all lexicographically first maximal H-colourable subgraphs of G. We show that the complexity of deciding whether a given vertex of G is in a lexicographically first maximal H-colourable subgraph of G is NP-complete, if H is bipartite, and Σ₂ᵖ-complete, if H is non-bipartite. This result complements Hell and Nešetřil's seminal dichotomy result that the standard H-colouring problem is in P, if H is bipartite, and NP-complete, if H is non-bipartite. Our proofs use the basic techniques established by Hell and Nešetřil, combinatorially adapted to our scenario.
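The definition of an H-colouring as a homomorphism is easy to check for a given candidate mapping: every edge of G must land on an edge of H. The sketch below verifies a proposed mapping only; it does not search for one, and it is unrelated to the greedy algorithm or the hardness proofs in the abstract. The edge-list representation and the name `is_h_colouring` are assumptions made for this illustration.

```python
def is_h_colouring(g_edges, h_edges, mapping):
    """Check that mapping (vertex of G -> vertex of H) preserves edges."""
    h = {frozenset(e) for e in h_edges}  # undirected edges of H
    return all(frozenset((mapping[u], mapping[v])) in h for u, v in g_edges)

path = [(1, 2), (2, 3)]   # G: a path on three vertices
k2 = [(0, 1)]             # H: a single edge, so H-colouring = 2-colouring

print(is_h_colouring(path, k2, {1: 0, 2: 1, 3: 0}))
print(is_h_colouring(path, k2, {1: 0, 2: 0, 3: 1}))
```

With H equal to the complete graph on k vertices, the same check verifies an ordinary proper k-colouring.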

19.
Partial information in databases can arise when information from several databases is combined. Even if each database is complete for some “world”, the combined databases will not be, and answers to queries against such combined databases can only be approximated. In this paper we describe various situations in which a precise answer cannot be obtained for a query asked against multiple databases. Based on an analysis of these situations, we propose a classification of constructs that can be used to model approximations.

The main goal of the paper is to study several formal models of approximations and their semantics. In particular, we obtain universality properties for these models of approximations. Universality properties suggest syntax for languages with approximations based on the operations which are naturally associated with them. We prove universality properties for most of the approximation constructs. Then we design languages built around datatypes given by the approximation constructs. A straightforward approach results in languages that have a number of limitations. In an attempt to overcome those limitations, we explain how all the languages can be embedded into a language for conjunctive and disjunctive sets from Libkin and Wong (1996) and demonstrate its usefulness in querying independent databases. We also discuss the semantics of approximation constructs and the relationship between them.


20.
When answering queries using external information sources, the contents of the queries can be described by views. To answer a query, we must rewrite it using the set of views presented by the sources. When the external information sources also have the ability to answer some (perhaps limited) sets of queries that require performing operations on their data, the set of views presented by the source may be infinite (albeit encoded in some finite fashion). Previous work on answering queries using views has only considered the case where the set of views is finite. In order to exploit the ability of information sources to answer more complex queries, we consider the problem of answering conjunctive queries using infinite sets of conjunctive views. Our first result is that an infinite set of conjunctive views can be partitioned into a finite number of equivalence classes, such that picking one view from every nonempty class is sufficient to determine whether the query can be answered using the views. Second, we show how to compute the set of equivalence classes for sets of conjunctive views encoded by a datalog program. Furthermore, we extend our results to the case when the query and the views use the built-in predicates <, ⩽, =, and ≠, and they are interpreted over a dense domain. Finally, we extend our results to conjunctive queries and views with the built-in predicates <, ⩽, and = interpreted over the integers. In doing so we present a result of independent interest, namely, an algorithm to minimize such queries.
