An important feature of database technology of the nineties is the use of parallelism for speeding up the execution of complex queries. This technology is being tested in several experimental database architectures and a few commercial systems for conventional select-project-join queries. In particular, hash-based fragmentation is used to distribute data to disks under the control of different processors in order to perform selections and joins in parallel. With the development of new query languages, and in particular with the definition of transitive closure queries and of more general logic programming queries, the new dimension of recursion has been added to query processing. Recursive queries are complex; at the same time, their regular structure is particularly suited for parallel execution, and parallelism may give a high efficiency gain. We survey the approaches to parallel execution of recursive queries that have been presented in the recent literature. We observe that research on parallel execution of recursive queries is separated into two distinct subareas, one focused on the transitive closure of Relational Algebra expressions, the other one focused on optimization of more general Datalog queries. Though the subareas seem radically different because of the approach and formalism used, they have many common features. This is not surprising, because most typical Datalog queries can be solved by means of the transitive closure of simple algebraic expressions. We first analyze the relationship between the transitive closure of expressions in Relational Algebra and Datalog programs. We then review sequential methods for evaluating transitive closure, distinguishing iterative and direct methods. We address the parallelization of these methods, by discussing various forms of parallelization. Data fragmentation plays an important role in obtaining parallel execution; we describe hash-based and semantic fragmentation. Finally, we consider Datalog queries, and present general methods for parallel rule execution; we recognize the similarities between these methods and the methods reviewed previously, when the former are applied to linear Datalog queries. We also provide a quantitative analysis that shows the impact of the initial data distribution on the performance of methods.
Recommended by: Patrick Valduriez 相似文献
Various relation-based systems, concerned with the qualitative representation and processing of spatial knowledge, have been developed in numerous application domains. In this article, we identify the common concepts underlying qualitative spatial knowledge representation, we compare the representational properties of the different systems, and we outline the computational tasks involved in relation-based spatial information processing. We also describesymbolic spatial indexes, relation-based structures that combine several ideas in spatial knowledge representation. A symbolic spatial index is an array that preserves only a set of spatial relations among distinct objects in an image, called the modeling space; the index array discards information, such as shape and size of objects, and irrelevant spatial relations. The construction of a symbolic spatial index from an input image can be thought of as a transformation that keeps only a set of representative points needed to define the relations of the modeling space. By keeping the relative arrangements of the representative points in symbolic spatial indexes and discarding all other points, we maintain enough information to answer queries regarding the spatial relations of the modeling space without the need to access the initial image or an object database. Symbolic spatial indexes can be used to solve problems involving route planning, composition of spatial relations, and update operations. 相似文献
We introduce a semantic data model to capture the hierarchical, spatial, temporal, and evolutionary semantics of images in pictorial databases. This model mimics the user's conceptual view of the image content, providing the framework and guidelines for preprocessing to extract image features. Based on the model constructs, a spatial evolutionary query language (SEQL), which provides direct image object manipulation capabilities, is presented. With semantic information captured in the model, spatial evolutionary queries are answered efficiently. Using an object-oriented platform, a prototype medical-image management system was implemented at UCLA to demonstrate the feasibility of the proposed approach. 相似文献
This research investigates and approach to query processing in a multidatabase system that uses an objectoriented model to capture the semantics of other data models. The object-oriented model is used to construct a global schema, defining an integrated view of the different schemas in the environment. The model is also used as a self-describing model to build a meta-database for storing information about the global schema. A unique aspect of this work is that the object-oriented model is used to describe the different data models of the multidatabase environment, thereby extending the meta database with semantic information about the local schemas. With the global and local schemas all represented in an object-oriented form, structural mappings between the global schema and each local schema are then easily supported. An object algebra then provides a query language for expressing global queries, using the structural mappings to translate object algebra queries into SQL queries over local relational schema. The advantage of using an object algebra is that the object-oriented database can be viewed as a blackboard for temporary storage of local data and for establishing relationships between different databases. The object algebra can be used to directly retrieve temporarily-stored data from the object-oriented database or to transparently retrieve data from local sources using the translation process described in this paper. 相似文献
The combination of SGML and database technology allows to refine both declarative and navigational access mechanisms for
structured document collection: with regard to declarative access, the user can formulate complex information needs without
knowing a query language, the respective document type definition (DTD) or the underlying modelling. Navigational access is
eased by hyperlink-rendition mechanisms going beyond plain link-integrity checking. With our approach, the database-internal
representation of documents is configurable. It allows for an efficient implementation of operations, because DTD knowledge
is not needed for document structure recognition. We show how the number of method invocations and the cost of parsing can
be significantly reduced.
Edited by Y.C. Tay. Received April 22, 1996 / Accepted March 16, 1997 相似文献
LDL is one of the recently proposed logical query languages, which incorporate set, for data and knowledge base systems. Since LDL programs can simulate negation, they are not monotonic in general. On the other hand, there are monotonic LDL programs. This paper addresses the natural question of “When are the generally nonmonotonic LDL programs monotonic?” and investigates related topics such as useful applications for monotonicity. We discuss four kinds of monotonicity, and examine two of them in depth. The first of the two, called “ω-monotonicity”, is shown to be undecidable even when limited to single-stratum programs. The second, called “uniform monotonicity”, is shown to implyω-monotonicity. We characterize the uniform monotonicity of a program (i) by a relationship between its Bancilhon-Khoshafian semantics and its LDL semantics, and (ii) with a useful property called subset completion independence. Characterization (ii) implies that uniformly monotonie programs can be evaluated more efficiently by discarding dominated facts. Finally, we provide some necessary and/or sufficient, syntactic conditions for uniform monotonicity. The conditions pinpoint (a) enumerated set terms, (b) negations of membership and inclusion, and (c) sharing of set terms as the main source for nonuniform monotonicity. 相似文献
This paper concerns the following problem: given a set of multi-attribute records, a fixed number of buckets and a two-disk system, arrange the records into the buckets and then store the buckets between the disks in such a way that, over all possible orthogonal range queries (ORQs), the disk access concurrency is maximized. We shall adopt the multiple key hashing (MKH) method for arranging records into buckets and use the disk modulo (DM) allocation method for storing buckets onto disks. Since the DM allocation method has been shown to be superior to any other allocation methods for allocating an MKH file onto a two-disk system for answering ORQs, the real issue is knowing how to determine an optimal way for organizing the records into buckets based upon the MKH concept.
A performance formula that can be used to evaluate the average response time, over all possible ORQs, of an MKH file in a two-disk system using the DM allocation method is first presented. Based upon this formula, it is shown that our design problem is related to a notoriously difficult problem, namely the Prime Number Problem. Then a performance lower bound and an efficient algorithm for designing optimal MKH files in certain cases are presented. It is pointed out that in some cases the optimal MKH file for ORQs in a two-disk system using the DM allocation method is identical to the optimal MKH file for ORQs in a single-disk system and the optimal average response time in a two-disk system is slightly greater than one half of that in a single-disk system. 相似文献