Similar Documents
20 similar documents retrieved (search time: 0 ms)
1.
Vertical partitioning is a design technique for reducing the number of disk accesses needed to execute a given set of queries by minimizing the number of irrelevant instance variables accessed. This is accomplished by grouping the frequently accessed instance variables into vertical class fragments. The complexity of object-oriented database models due to subclass hierarchy and class composition hierarchy complicates the definition and representation of vertical partitioning of the classes, which makes the problem of vertical partitioning in OODBs very challenging. In this paper, we develop a comprehensive analytical cost model for processing queries on vertically partitioned OODB classes. A set of analytical evaluation results is presented to show the effect of vertical partitioning, and to study the trade-off between the projection ratio and the selectivity factor vis-a-vis sequential versus index access. Furthermore, an empirical experimental prototype supporting vertical class partitioning has been implemented on a commercial OODB toolkit to validate our analytical cost model.
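As a concrete illustration of the fragmentation idea sketched in this abstract, the following is a minimal sketch (not the paper's cost model or algorithm) of affinity-based vertical fragmentation; the attribute names, workload frequencies, and threshold are hypothetical.

```python
# Greedy vertical fragmentation: group instance variables that are frequently
# accessed together by the query workload into the same vertical fragment.
from itertools import combinations
from collections import defaultdict

queries = [
    {"attrs": {"name", "salary"}, "freq": 40},
    {"attrs": {"name", "dept"},   "freq": 35},
    {"attrs": {"photo", "cv"},    "freq": 5},
]

# Attribute-affinity measure: how often two attributes co-occur in queries.
affinity = defaultdict(int)
attrs = set()
for q in queries:
    attrs |= q["attrs"]
    for a, b in combinations(sorted(q["attrs"]), 2):
        affinity[(a, b)] += q["freq"]

def fragment(attrs, affinity, threshold=10):
    """Greedily merge attribute groups whose mutual affinity exceeds a threshold."""
    frags = [{a} for a in sorted(attrs)]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(frags)), 2):
            link = sum(affinity.get((min(a, b), max(a, b)), 0)
                       for a in frags[i] for b in frags[j])
            if link >= threshold:
                frags[i] |= frags[j]
                del frags[j]
                merged = True
                break
    return frags

print(fragment(attrs, affinity))
# -> [{'cv'}, {'dept', 'name', 'salary'}, {'photo'}] (set ordering may vary)
```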

2.
Object-oriented database management systems (OODBMSs) provide rich facilities for the modeling and processing of structural as well as behavioral properties of complex application objects. However, due to their inherent generality and continuously evolving functionality, efficient implementations are important if these OODBMSs are to support present and future applications, particularly when the databases are very large. In this paper, we present several parallel, multi-wavefront algorithms based on two processing approaches, the identification approach and the elimination approach, to verify association patterns specified in queries. Both approaches allow more processors to operate concurrently on a query than the traditional tree-structured query processing approach, thus introducing a higher degree of parallelism in query processing. A heuristic method is presented for partitioning an object-oriented database (OODB). The main consideration for partitioning the database is load balancing. This method also tries to reduce the communication time by reducing the length of the path along which wavefronts must be propagated. Multi-wavefront algorithms based on the two approaches for tree-structured queries have been implemented on an nCUBE 2 parallel computer. The implementation of the query processor allows multiple queries to be executed simultaneously. This implementation provides an environment for evaluating the algorithms and the heuristic method for partitioning the database. The evaluation results are presented in this paper.

3.
Complex object-oriented queries generally consist of path expressions and explicit join operations. Since explicit join operations are acknowledged to be the most expensive operations, query execution normally starts from the path expressions. Each path expression may form a sub-query. There are two existing strategies for sub-query processing: ‘serial’ and ‘parallel’ execution scheduling. Serial sub-query execution runs the sub-queries one by one, whereas parallel sub-query execution runs them simultaneously. When a sub-query is being processed, parallelization techniques may be applied. In this paper, we focus on the scheduling of the sub-queries, rather than on the parallelization of the sub-queries themselves. Rules are formulated to guide the parallel query execution process. Our analysis shows that when there is no load skew, the serial scheduling strategy is preferred; otherwise, the parallel scheduling strategy should be used.
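A minimal sketch of the decision rule summarized above, with hypothetical per-partition cost estimates and an arbitrary skew threshold (the paper's actual rules are more detailed):

```python
# With no load skew, run sub-queries serially, each using all processors;
# with skew, run them in parallel on disjoint processor groups.
def estimated_skew(partition_costs):
    """Relative spread of per-partition costs; 0 means perfectly balanced."""
    return (max(partition_costs) - min(partition_costs)) / max(partition_costs)

def choose_schedule(subquery_partition_costs, skew_threshold=0.2):
    skews = [estimated_skew(c) for c in subquery_partition_costs]
    return "serial" if max(skews) <= skew_threshold else "parallel"

# Two sub-queries, each with per-partition cost estimates on a 4-node system.
balanced = [[10, 11, 10, 9], [20, 19, 21, 20]]
skewed   = [[10, 40, 10, 10], [20, 19, 21, 20]]
print(choose_schedule(balanced))  # serial
print(choose_schedule(skewed))    # parallel
```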

4.
Advanced application domains such as computer-aided design, computer-aided software engineering, and office automation are characterized by their need to store, retrieve, and manage large quantities of data having complex structures. A number of object-oriented database management systems (OODBMSs) are currently available that can effectively capture and process such complex data. The existing implementations of OODBMSs outperform relational systems by maintaining and querying cross-references among related objects. However, the existing OODBMSs still do not meet the efficiency requirements of advanced applications that require the execution of complex queries involving the retrieval of a large number of data objects and the relationships among them. Parallel execution can significantly improve the performance of complex OO queries. In this paper, we analyze the performance of parallel OO query processing algorithms for various benchmark application domains. The application domains are characterized by specific mixes of queries of different semantic complexities. The performance of the application domains has been analyzed for various system and data parameters by running parallel programs on a 32-node transputer-based parallel machine developed at the IBM Research Center at Yorktown Heights. The parallel processing algorithms, data routing techniques, and query management and control strategies have been implemented to obtain accurate estimates of control and processing overheads. However, generation of large complex databases for the study was impractical; hence, the data used in the simulation have been parameterized. The parallel OO query processing algorithms analyzed in this study are based on a query graph approach rather than the traditional query tree approach. Using the query graph approach, a query is processed by simultaneously initiating execution at several object classes, thereby improving parallelism. During processing, the algorithms avoid the execution of time-consuming join operations by making use of the object references among the objects. Further, the algorithms do not generate any temporary data, thereby reducing disk accesses. This is accomplished by marking the selected objects and by employing a two-phase query processing strategy.
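The "mark objects, then follow references instead of joining" strategy can be illustrated with a minimal sketch; the schema, predicates, and data below are hypothetical and are not the benchmark workload used in the paper.

```python
# Phase 1: mark objects that satisfy their class-local predicates.
# Phase 2: follow stored object references between marked objects instead of joining.
parts     = {1: {"weight": 5, "supplier": 10}, 2: {"weight": 50, "supplier": 11}}
suppliers = {10: {"city": "Paris"}, 11: {"city": "Oslo"}}

# Phase 1: local selections on each class, done independently (and potentially in parallel).
marked_parts     = {oid for oid, p in parts.items() if p["weight"] < 10}
marked_suppliers = {oid for oid, s in suppliers.items() if s["city"] == "Paris"}

# Phase 2: keep only part/supplier pairs connected by an existing reference;
# no temporary join result is materialized, only marks are combined.
result = [(pid, parts[pid]["supplier"])
          for pid in marked_parts
          if parts[pid]["supplier"] in marked_suppliers]
print(result)   # [(1, 10)]
```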

5.
Currently, relational databases are widely used, while object-oriented databases are emerging as a new generation of database technology. This paper presents a methodology to provide effective sharing of information between object-oriented databases and relational databases. The object-oriented data model is selected as the common data model to build an integrated view of the diverse databases. An object-oriented query language is used as the standard query language. A method is developed to transform a relational data definition into an equivalent object-oriented data definition and to integrate local data definitions. Two distributed query processing methods are derived: one for general queries and the other for a special class of restricted queries. Using the methods developed, it is possible to access distributed object-oriented databases and relational databases such that the locations and the structural differences of the databases are transparent to users.
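One step of such a methodology, mapping a relational definition to an object-oriented one, can be sketched roughly as follows; the schema and the rule "each relation becomes a class, each foreign key becomes an object-valued attribute" are a simplified assumption, not the paper's full transformation.

```python
# Hypothetical relational schema: attributes plus foreign keys pointing to other relations.
relational_schema = {
    "Dept": {"attrs": ["dno", "dname"], "fks": {}},
    "Emp":  {"attrs": ["eno", "ename", "dno"], "fks": {"dno": "Dept"}},
}

def to_object_schema(rel):
    """Each relation becomes a class; foreign-key attributes become object references."""
    classes = {}
    for name, r in rel.items():
        props = {a: r["fks"].get(a, "literal") for a in r["attrs"]}
        classes[name] = props
    return classes

print(to_object_schema(relational_schema))
# {'Dept': {'dno': 'literal', 'dname': 'literal'},
#  'Emp':  {'eno': 'literal', 'ename': 'literal', 'dno': 'Dept'}}
```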

6.
With the trend toward cloud computing, outsourcing databases to third-party service providers is becoming a common practice for data owners to decrease the cost of managing and maintaining databases in-house. In conjunction, due to the popularity of location-based services (LBS), the need for spatial data (e.g., gazetteers, vector data) is increasing dramatically. Consequently, there is a noticeable new trend of data collectors outsourcing spatial datasets. The two main challenges with outsourcing datasets are to keep the data private (from the service provider) and to ensure the integrity of the query results (for the clients). Unfortunately, most of the techniques proposed for privacy and integrity do not extend to spatial data in a straightforward manner. Hence, recent studies have proposed various techniques to support either privacy or integrity (but not both) on spatial datasets. In this paper, for the first time, we propose a technique that can ensure both privacy and integrity for outsourced spatial data. In particular, we first use a one-way spatial transformation method based on Hilbert curves, which encrypts the spatial data before outsourcing and hence ensures its privacy. Next, by probabilistically replicating a portion of the data and encrypting it with a different encryption key, we devise a technique for the client to audit the trustworthiness of the query results. We show the applicability of our approach for both k-nearest-neighbor queries and spatial range queries, the building blocks of any LBS application. We also design solutions to guarantee the freshness of outsourced spatial databases. Finally, we evaluate the validity and performance of our algorithms with security analyses and extensive simulations.
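As a rough illustration of the Hilbert-curve transformation idea (a toy sketch, not the paper's construction), the curve parameters act as a secret key and only the resulting one-dimensional index is outsourced; the key, grid order, and helper names below are hypothetical.

```python
def xy2d(n, x, y):
    """Standard mapping of (x, y) in an n-by-n grid (n a power of two) to its Hilbert index."""
    d, s = 0, n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

def encode_point(x, y, key=(7, 13), order=10):
    """Hypothetical one-way spatial transform: shift by a secret offset, then linearize."""
    n = 1 << order
    return xy2d(n, (x + key[0]) % n, (y + key[1]) % n)

print(encode_point(120, 44))   # the server only ever sees this 1-D value
```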

7.
Constraints play an important role in efficient query evaluation in deductive databases. Constraint-based query evaluation in deductive databases is investigated, with emphasis on linear recursions with function symbols. Constraints are grouped into three classes: rule constraints, integrity constraints, and query constraints. Techniques are developed for the maximal use of different kinds of constraints in rule compilation and query evaluation. The study of the roles of the different classes of constraints in set-oriented evaluation of linear recursions shows the following: rule constraints should be integrated with their corresponding deduction rules in the compilation of recursions; integrity constraints, including finiteness constraints and monotonicity constraints, should be used in the analysis of finite evaluability and termination for specific queries; and query constraints, which are often useful for search space reduction and termination, should be transformed, when necessary, and pushed into the compiled chains as deeply as possible for efficient evaluation. The constraint-based query-processing technique integrates query-independent compilation and chain-based query evaluation methods and demonstrates great promise in deductive query evaluation.

8.
We describe a framework for supporting arbitrarily complex SQL queries with “uncertain” predicates. The query semantics is based on a probabilistic model, and the results are ranked, much like in Information Retrieval. Our main focus is query evaluation. We describe an optimization algorithm that can efficiently compute most queries. We show, however, that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods. For these queries we describe both an approximation algorithm and a Monte-Carlo simulation algorithm.
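The Monte-Carlo fallback for #P-hard queries can be sketched as follows: estimate the probability of an answer whose lineage is a DNF formula over independent tuples by sampling possible worlds. The tuple probabilities and lineage below are hypothetical, and this is not the paper's optimized evaluation algorithm.

```python
import random

# Hypothetical lineage: the answer holds if (t1 and t2) or (t3 and t4).
tuple_prob = {"t1": 0.9, "t2": 0.5, "t3": 0.3, "t4": 0.8}
lineage = [("t1", "t2"), ("t3", "t4")]

def estimate(prob, dnf, trials=100_000):
    """Sample possible worlds and count how often the DNF lineage is satisfied."""
    hits = 0
    for _ in range(trials):
        world = {t: random.random() < p for t, p in prob.items()}
        if any(all(world[t] for t in clause) for clause in dnf):
            hits += 1
    return hits / trials

print(round(estimate(tuple_prob, lineage), 3))
# close to the exact value 1 - (1 - 0.45) * (1 - 0.24) = 0.582
```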

9.
Two types of parallel processing and optimization algorithms for processing object-oriented databases are the hybrid-hash pointer-based (HHP) algorithms and the multi-wavefront (MWF) algorithms. We analyze these two algorithms and develop analytical formulas to capture their main performance features. We study their performance in three application environments, characterized by large databases having many object classes, each of which, respectively, (1) contains a large number of instances; (2) contains a relatively small number of instances; and (3) is of varying size. A horizontal data partitioning strategy is used in (1). A class-per-node assignment strategy is used in (2). In (3), object classes are partitioned horizontally and assigned to a varying number of processors depending on their sizes. The MWF algorithm has three distinguishing features which contribute to its better performance: (a) a two-phase processing strategy, (b) vertical partitioning of horizontal segments, and (c) dynamic determination of the collision point in MWF propagation, which results in an optimized query execution plan. If these features are adopted by an HHP algorithm, its performance is comparable with that of the MWF algorithm, because the difference in CPU time between them is negligible. The computing environment is a network of workstations having a shared-nothing architecture. The schema and some queries selected from the OO7 benchmark are used in the performance analyses and comparisons. The queries are modified slightly in the different data environments in order to reflect the features of diverse database applications.

10.
A minimal framework for an object-oriented query language standard should (1) include a formal definition of a high-level data model and the syntax and semantics of associated query languages, (2) provide the functionality of relational query languages, and (3) support proofs of correctness of transformations for logical query optimization. In this paper, a high-level conceptual model for object-oriented query processing is discussed; the model includes widely-used structural abstractions such as the isa relationship, associations (properties) between complex objects and complex objects/values, and inheritance of properties. A formal, algebraic query language for the model, inspired by relational algebra, is presented. Operators of the algebra allow queries based on values, queries that manipulate entire objects, and queries that construct new objects from existing objects/values. All queries retain connections to existing database objects, providing logical access paths to data. Each query result is a class, so the algebra has the closure property. The intensional and extensional results of query operators are summarized. Two forms of logical query optimization supported by the query algebra are outlined: algebraic transformations and classifier-based optimizations (optimizations which employ inclusion and exclusion dependencies between classes).

11.
Medical information systems are becoming popular, and their use is expected to expand further in the future. In order to extract information easily from the underlying databases, easy-to-use man-machine interfaces are required. This paper concerns the doctor interface of the ARPIA medical information system, which is based on a relational database management system. The whole system is currently used in the Outpatient Pediatric Clinic of the Catholic University of Rome, Italy. The interface provides logical data independence with respect to the underlying relational database. The paper provides some background on this kind of interface and sketches the software architecture of the implemented prototype.

12.
Dataflow query execution in a parallel main-memory environment
In this paper, the performance and characteristics of the execution of various join-trees on a parallel DBMS are studied. The results of this study are a step toward the design of a query optimization strategy that is fit for the parallel execution of complex queries. Among other findings, synchronization issues are identified that limit the performance gain from parallelism. A new hash-join algorithm is introduced that has fewer synchronization constraints than the known hash-join algorithms. Also, the behavior of individual join operations in a join-tree is studied in a simulation experiment. The results show that the introduced Pipelining hash-join algorithm yields better performance for multi-join queries. The format of the optimal join-tree appears to depend on the size of the operands of the join: a multi-join between small operands performs best with a bushy schedule, whereas larger operands are better off with a linear schedule. The results from the simulation study are confirmed with an analytic model for dataflow query execution.
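A minimal single-process sketch of a symmetric ("pipelining") hash join, in the spirit of the algorithm discussed above: each arriving tuple is inserted into its own side's hash table and immediately probed against the other side's, so results flow before either input is exhausted. The relation names and data are hypothetical, and the parallel main-memory aspects are omitted.

```python
from collections import defaultdict

def pipelining_hash_join(left_stream, right_stream, key):
    """Symmetric hash join: insert each tuple into its side's table, probe the other side."""
    left_table, right_table = defaultdict(list), defaultdict(list)
    # Interleave the two inputs to mimic tuples arriving from both producers.
    tagged = [("L", t) for t in left_stream] + [("R", t) for t in right_stream]
    for side, tup in tagged:
        if side == "L":
            left_table[tup[key]].append(tup)
            for match in right_table.get(tup[key], []):
                yield {**tup, **match}
        else:
            right_table[tup[key]].append(tup)
            for match in left_table.get(tup[key], []):
                yield {**match, **tup}

orders    = [{"cid": 7, "oid": 1}, {"cid": 9, "oid": 2}]
customers = [{"cid": 7, "name": "Ann"}]
print(list(pipelining_hash_join(customers, orders, "cid")))
# [{'cid': 7, 'name': 'Ann', 'oid': 1}]
```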

13.
System-guided view integration for object-oriented databases
This work simultaneously attacks several shortcomings of current view integration methodologies: a low emphasis on full-scale automated systems, a lack of algorithmic specifications of the integration activities, inattention to the design of databases with new properties (such as databases for computer-aided design), and insufficient experience with data models offering a rich set of type and abstraction mechanisms. The focus is on design databases for software engineering applications. The approach relies on a semantic model based on structural object-orientation with various features tailored to these applications. The expressiveness of the model is used to take the first steps toward algorithmic solutions, and it is demonstrated how the corresponding tools could be embedded methodically within the view integration process and technically within a database design environment. The central idea is to compute so-called assumption predicates that express suggested similarities between structures in the two schemas to be integrated, and then have a human integrator confirm or reject them. The basic method is exemplified for the CERM data model, which includes molecular aggregation, generalization, and versioning.
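The assumption-predicate idea can be illustrated with a toy heuristic (not the paper's CERM-based method): propose correspondences between two schemas from name similarity and let a human integrator confirm or reject each one. The schema names and threshold are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import product

schema_a = ["Module", "Procedure", "SourceFile"]
schema_b = ["Modul", "Routine", "File"]

def assumption_predicates(a, b, threshold=0.6):
    """Yield (structure_a, structure_b, score) pairs suggested as similar."""
    for x, y in product(a, b):
        score = SequenceMatcher(None, x.lower(), y.lower()).ratio()
        if score >= threshold:
            yield (x, y, round(score, 2))

for pred in assumption_predicates(schema_a, schema_b):
    print("suggested correspondence:", pred)   # a human integrator confirms or rejects
```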

14.
Volcano - an extensible and parallel query evaluation system
To investigate the interactions of extensibility and parallelism in database query processing, we have developed a new dataflow query execution system called Volcano. The Volcano effort provides a rich environment for research and education in database systems design, heuristics for query optimization, parallel query execution, and resource allocation. Volcano uses a standard interface between algebra operators, allowing easy addition of new operators and operator implementations. Operations on individual items, e.g., predicates, are imported into the query processing operators using support functions. The semantics of support functions is not prescribed; any data type, including complex objects, and any operation can be realized. Thus, Volcano is extensible with new operators, algorithms, data types, and type-specific methods. Volcano includes two novel meta-operators. The choose-plan meta-operator supports dynamic query evaluation plans that allow delaying selected optimization decisions until run-time, e.g., for embedded queries with free variables. The exchange meta-operator supports intra-operator parallelism on partitioned datasets and both vertical and horizontal inter-operator parallelism, translating between demand-driven dataflow within processes and data-driven dataflow between processes. All operators, with the exception of the exchange operator, have been designed and implemented in a single-process environment and parallelized using the exchange operator. Even operators not yet designed can be parallelized using this new operator if they use and provide the iterator interface. Thus, the issues of data manipulation and parallelism have become orthogonal, making Volcano the first implemented query execution engine that effectively combines extensibility and parallelism.
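A minimal sketch of a demand-driven open/next/close operator interface in the style described above; the class names and data are hypothetical, and the exchange and choose-plan meta-operators are not reproduced here.

```python
class Scan:
    """Leaf operator that produces rows one at a time on demand."""
    def __init__(self, rows): self.rows = rows
    def open(self): self.it = iter(self.rows)
    def next(self): return next(self.it, None)     # None signals end-of-stream
    def close(self): self.it = None

class Select:
    """Applies an imported support function (the predicate) to each input item."""
    def __init__(self, child, predicate): self.child, self.predicate = child, predicate
    def open(self): self.child.open()
    def next(self):
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row
        return None
    def close(self): self.child.close()

plan = Select(Scan([{"id": 1, "x": 3}, {"id": 2, "x": 9}]), lambda r: r["x"] > 5)
plan.open()
while (row := plan.next()) is not None:
    print(row)          # {'id': 2, 'x': 9}
plan.close()
```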

15.
Considers the applicability of algorithm-based fault tolerance (ABFT) to massively parallel scientific computation. Existing ABFT schemes can provide effective fault tolerance at a low cost for computation on matrices of moderate size; however, the methods do not scale well to floating-point operations on large systems. This short note proposes the use of a partitioned linear encoding scheme to provide scalability. Matrix algorithms employing this scheme are presented and compared to current ABFT schemes. It is shown that the partitioned scheme provides scalable linear codes with improved numerical properties, with only a small increase in hardware and time overhead.
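The basic checksum idea behind algorithm-based fault tolerance can be sketched as follows (a toy row/column-checksum matrix multiplication, not the partitioned linear encoding proposed in the note):

```python
import numpy as np

def abft_matmul(A, B):
    """Multiply with checksum rows/columns and verify consistency afterwards."""
    Ac = np.vstack([A, A.sum(axis=0)])                  # column-checksum version of A
    Br = np.hstack([B, B.sum(axis=1, keepdims=True)])   # row-checksum version of B
    C_full = Ac @ Br                                     # full checksum product
    C = C_full[:-1, :-1]
    # The appended row/column must still equal the column/row sums of C.
    row_ok = np.allclose(C_full[-1, :-1], C.sum(axis=0))
    col_ok = np.allclose(C_full[:-1, -1], C.sum(axis=1))
    return C, row_ok and col_ok

A = np.arange(9, dtype=float).reshape(3, 3)
B = np.eye(3) * 2
C, ok = abft_matmul(A, B)
print(ok)     # True when no fault occurred during the multiplication
```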

16.
We introduce a fuzzy set-theoretic approach for dealing with uncertainty in images in the context of the spatial and topological relations existing among the objects in an image. We propose an object-oriented, graph-theoretic model for representing an image, and this model allows us to assess the similarity between images using the concept of (fuzzy) graph matching. Sufficient flexibility has been provided in the similarity algorithm so that different features of an image may be independently focused upon.

17.
An experimental image-processing library for multiprocessor computers, SSCC_PIPL, is described in this paper. The principles behind its design, the architectural solutions adopted, and the results of test experiments are presented.

18.
We present a technique for transferring query optimization techniques, developed for relational databases, into object databases. We demonstrate this technique for ODMG database schemas defined in ODL and object queries expressed in OQL. The object schema is represented using a logical representation (Datalog). Semantic knowledge about the object data model, e.g., class hierarchy information and relationships between objects, as well as semantic knowledge about a particular schema and application domain, is expressed as integrity constraints. An OQL object query is represented as a logic query, and query optimization is performed in the Datalog representation. We obtain equivalent (optimized) logic queries, and subsequently obtain equivalent (optimized) OQL queries for each equivalent logic query. We present one optimization technique for semantic query optimization (SQO) based on the residue technique of U. Chakravarthy et al. (1990; 1986; 1988). We show that our technique generalizes previous research on SQO for object databases. We handle a large class of OQL queries, including queries with constructors and methods. We demonstrate how SQO can be used to eliminate queries which contain contradictions and to simplify queries, e.g., by eliminating joins or by reducing the access scope for evaluating a query to some specific subclass(es). We also demonstrate how the definition of a method, or integrity constraints describing the method, can be used in optimizing a query with a method.
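One SQO effect mentioned above, eliminating a query that contradicts an integrity constraint, can be sketched with a toy range check standing in for the Datalog residues; the constraint, class, and attribute below are hypothetical.

```python
# Integrity constraint: valid range of an attribute for a class.
constraints = {("Employee", "salary"): (0, 200_000)}

def contradicts(cls, attr, op, value):
    """Return True if the query predicate can never be satisfied under the constraint."""
    lo, hi = constraints.get((cls, attr), (float("-inf"), float("inf")))
    if op == ">" and value >= hi:   # nothing can satisfy attr > value
        return True
    if op == "<" and value <= lo:
        return True
    return False

print(contradicts("Employee", "salary", ">", 500_000))  # True: answer is empty without evaluation
print(contradicts("Employee", "salary", ">", 50_000))   # False: the query must be evaluated
```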

19.
This paper presents a query processing strategy for the content-based video query language named CVQL. With CVQL, users can flexibly specify query predicates in terms of the spatial and temporal relationships of the content objects. The query processing strategy evaluates the predicates and returns qualified videos or frames as results. Before the evaluation of the predicates, a preprocessing step is performed to avoid unnecessary access to videos that cannot possibly be answers. The preprocessing checks the existence of the content objects specified in the predicates to eliminate unqualified videos. For the evaluation of the predicates, an M-index is designed based on an analysis of the behaviors of the content objects. The M-index is employed to avoid frame-by-frame evaluation of the predicates. Experimental results are presented to illustrate the performance of this approach.
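The existence-checking preprocessing step can be sketched as follows, with hypothetical per-video object metadata (this is a toy illustration, not the M-index itself):

```python
# Which content objects appear anywhere in each video (hypothetical metadata).
video_objects = {
    "v1": {"car", "person", "dog"},
    "v2": {"car"},
    "v3": {"person", "dog"},
}

def candidate_videos(query_objects):
    """Keep only videos that contain every content object named in the predicates."""
    return [v for v, objs in video_objects.items() if query_objects <= objs]

print(candidate_videos({"car", "person"}))   # ['v1'] -- only v1 needs frame-level evaluation
```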

20.
We consider the problem of retrieving consistent answers over databases that might be inconsistent with respect to a set of integrity constraints. In particular, we concentrate on sets of constraints that consist of key dependencies, and we give an algorithm that computes the consistent answers for a large and practical class of conjunctive queries. Given a query q, the algorithm returns a first-order query Q (called a query rewriting) such that for every (potentially inconsistent) database I, the consistent answers for q can be obtained by evaluating Q directly on I.
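A standard textbook-style illustration of such a rewriting for a key dependency (not the paper's full algorithm), using an in-memory SQLite table whose intended key is violated:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp(name TEXT, dept TEXT);          -- intended key: name
    INSERT INTO Emp VALUES ('ann', 'Sales'), ('ann', 'HR'), ('bob', 'Sales');
""")

q = "SELECT name FROM Emp WHERE dept = 'Sales'"       -- the original query q, kept as a string

# First-order rewriting Q: a name is a consistent answer only if *every* tuple
# sharing that key value agrees with the query condition.
Q = """
    SELECT DISTINCT name FROM Emp e
    WHERE dept = 'Sales'
      AND NOT EXISTS (SELECT 1 FROM Emp e2
                      WHERE e2.name = e.name AND e2.dept <> 'Sales')
"""
print(conn.execute(q).fetchall())   # [('ann',), ('bob',)]  possibly inconsistent answers
print(conn.execute(Q).fetchall())   # [('bob',)]            consistent answers only
```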
