
1.

Solving Local Cost Estimation Problem for Global Query Optimization in Multidatabase Systems

Qiang Zhu Per-Åke Larson 《Distributed and Parallel Databases》1998,6(4):373-421

To meet users' growing needs for accessing pre-existing heterogeneous databases, multidatabase systems (MDBSs), which integrate multiple databases, have recently attracted many researchers. A key feature of an MDBS is local autonomy. For a query retrieving data from multiple databases, global query optimization should be performed to achieve good system performance. There are a number of new challenges for global query optimization in an MDBS. A major one is that some local optimization information, such as local cost parameters, may not be available at the global level because of local autonomy. This makes it difficult to find a good decomposition of a global query during query optimization. To tackle this challenge, a new query sampling method is proposed in this paper. The idea is to group component queries into homogeneous classes, draw a sample of queries from each class, and use the observed costs of the sample queries to derive a cost formula for each class by multiple regression. The derived formulas can be used to estimate the cost of a query during query optimization. The relevant issues, such as query classification rules, sampling procedures, and cost model development and validation, are explored in this paper. To verify the feasibility of the method, experiments were conducted on three commercial database management systems supported in an MDBS. Experimental results demonstrate that the proposed method is quite promising in estimating local cost parameters in an MDBS.
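The core of the query sampling idea (classify, sample, regress) can be sketched as follows. This is a minimal illustration, not the paper's procedure: the class names, the choice of explanatory variables (operand and result sizes), and the observed costs are all hypothetical, and ordinary least squares via the normal equations stands in for whatever regression setup the authors used.

```python
from collections import defaultdict


def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: sample too small or degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_cost_formula(samples):
    """Least-squares fit of cost = b0 + b1*x1 + ... from (features, cost) pairs."""
    rows = [[1.0] + list(features) for features, _ in samples]
    costs = [cost for _, cost in samples]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, costs)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def estimate_cost(coefficients, features):
    """Evaluate a fitted cost formula for a new query's features."""
    return coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], features))


# Hypothetical sample queries: (class, explanatory variables, observed cost in seconds).
observations = [
    ("unary_scan", (1000, 100), 2.6),
    ("unary_scan", (2000, 100), 4.6),
    ("unary_scan", (1000, 500), 3.0),
    ("unary_scan", (4000, 800), 9.3),
    ("join", (1_000_000,), 2.0),
    ("join", (4_000_000,), 5.0),
    ("join", (2_000_000,), 3.0),
]

# Group sample queries by class, then derive one cost formula per class.
samples_by_class = defaultdict(list)
for query_class, features, cost in observations:
    samples_by_class[query_class].append((features, cost))

formulas = {cls: fit_cost_formula(s) for cls, s in samples_by_class.items()}
scan_estimate = estimate_cost(formulas["unary_scan"], (3000, 200))
join_estimate = estimate_cost(formulas["join"], (3_000_000,))
```

At optimization time, a global optimizer would look up the formula for a component query's class and evaluate it with that query's estimated sizes, without needing any internal cost parameters from the local DBMS.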

2.

Fuzzy Statistics Estimation in Supporting Multidatabase Query Optimization

Chih-Ping Wei Olivia R. Liu Sheng Paul Jen-Hwa Hu 《Electronic Commerce Research》2002,2(3):287-316

Advances in networking and database technology have made global information sharing a reality. Multidatabase systems (MDBSs) represent a promising approach to addressing the challenges of achieving interoperability among multiple pre-existing databases that are highly autonomous and possibly heterogeneous. The performance of an MDBS is greatly dependent on the effectiveness of multidatabase query optimization (MQO). However, the unavailability of and uncertainty in the statistics essential to query optimization have made MQO significantly more challenging than distributed query optimization. This research undertook to develop a fuzzy statistics-based MQO approach to addressing statistics estimation and uncertainty problems in an MDBS environment. We analyzed the statistics needed in an MDBS environment and classified them into three categories: point-based, distribution-function-based, and dependency-based. Fuzzy numbers were adopted to represent point-based statistics, and a fuzzy polynomial regression method was developed for estimating distribution-function-based statistics (i.e., attribute or join selectivity) from a set of subquery results. For dependency-based statistics, a fuzzy regression method was employed for estimating logical-parameter-based local cost functions. Furthermore, methods for ranking the fuzzy numbers that are fundamental to fuzzy-statistics-based MQO were also discussed. The proposed fuzzy statistics estimation methods were illustrated using examples to demonstrate their applicability in supporting MQO.

3.

Cost Estimation for Queries Experiencing Multiple Contention States in Dynamic Multidatabase Environments

Qiang Zhu Satyanarayana Motheramgari Yu Sun 《Knowledge and Information Systems》2003,5(1):26-49

Accurate query cost estimation is crucial to query optimization in a multidatabase system. Several estimation techniques for a static environment have been suggested in the literature. To develop a cost model for a dynamic environment, we recently introduced a multistate query-sampling method. It has been shown that this technique is promising in estimating the cost of a query run in any given contention state for a dynamic environment. In this paper, we study a new problem: how to estimate the cost of a large query that may experience multiple contention states. Following a discussion of the limitations of two simple approaches, i.e., single state analysis and average cost analysis, we propose two novel techniques to tackle this challenge. The first one, called fractional analysis, is suitable for a gradually and smoothly changing environment, while the second one, called the probabilistic approach, is developed for a rapidly and randomly changing environment. The former estimates a query cost by analyzing its fractions, and the latter estimates a query cost based on Markov chain theory. The related issues, including cost formula development, error analysis, and comparison among different approaches, are discussed. Experiments demonstrate that the proposed techniques are quite promising in solving the new problem. Received 5 January 2001 / Revised 6 June 2001 / Accepted in revised form 9 July 2001. Correspondence and offprint requests to: Qiang Zhu, Department of Computer and Information Science, The University of Michigan – Dearborn, Dearborn, MI 48128, USA. Email: qzhu@umich.eduau
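To make the Markov-chain idea concrete, here is a small sketch under strong simplifying assumptions that are not taken from the paper: the query is split into equal fractions, the contention state may change between fractions according to a fixed transition matrix, and `full_cost_by_state[s]` (a hypothetical input) is the estimated cost of running the whole query in state `s`. The paper's actual cost formulas may differ substantially.

```python
def step_distribution(dist, transition):
    """Advance a probability distribution over states by one Markov step."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def expected_cost(initial_dist, transition, full_cost_by_state, num_fractions):
    """Expected query cost when each equal fraction runs in a random state."""
    dist = list(initial_dist)
    total = 0.0
    for _ in range(num_fractions):
        # Each fraction contributes 1/num_fractions of the state's full cost,
        # weighted by the probability of being in that state.
        total += sum(p * c for p, c in zip(dist, full_cost_by_state)) / num_fractions
        dist = step_distribution(dist, transition)
    return total


# Two hypothetical contention states: light load (cost 10) and heavy load (cost 30).
costs = [10.0, 30.0]
static_env = [[1.0, 0.0], [0.0, 1.0]]       # state never changes
alternating_env = [[0.0, 1.0], [1.0, 0.0]]  # state flips after every fraction

static_estimate = expected_cost([1.0, 0.0], static_env, costs, 4)
alternating_estimate = expected_cost([1.0, 0.0], alternating_env, costs, 4)
```

With the identity matrix the estimate collapses to single state analysis; with frequent state changes it moves toward an average over states, which is the regime the probabilistic approach targets.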

4.

Evolutionary techniques for updating query cost models in a dynamic multidatabase environment

Amira Rahal Qiang Zhu Per-Åke Larson 《The VLDB Journal The International Journal on Very Large Data Bases》2004,13(2):162-176

Deriving local cost models for query optimization in a dynamic multidatabase system (MDBS) is a challenging issue. In this paper, we study how to evolve a query cost model to capture a slowly-changing dynamic MDBS environment so that the cost model is kept up-to-date all the time. Two novel evolutionary techniques, i.e., the shifting method and the block-moving method, are proposed. The former updates a cost model by taking up-to-date information from a new sample query into consideration at each step, while the latter considers a block (batch) of new sample queries at each step. The relevant issues, including derivation of recurrence updating formulas, development of efficient algorithms, analysis and comparison of complexities, and design of an integrated scheme to apply the two methods adaptively, are studied. Our theoretical and experimental results demonstrate that the proposed techniques are quite promising in maintaining accurate cost models efficiently for a slowly changing dynamic MDBS environment. Besides the application to MDBSs, the proposed techniques can also be applied to the automatic maintenance of cost models in self-managing database systems. Received: 25 November 2002, Accepted: 20 May 2003, Published online: 30 September 2003. Edited by: L. Liu. Research supported by the US National Science Foundation under Grant # IIS-9811980 and The University of Michigan under OVPR and UMD grants.
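The flavor of a one-sample-at-a-time update can be illustrated with a sliding-window regression. This is only a sketch in the spirit of the shifting method: it assumes a single explanatory variable and maintains running sums so that adding a new sample query and dropping the oldest takes constant time. The class name `ShiftingCostModel` is hypothetical, and the paper's own recurrence formulas (for multiple regression) are not reproduced here.

```python
from collections import deque


class ShiftingCostModel:
    """Simple linear cost model (cost = a + b*x) over the most recent samples."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def _apply(self, x, y, sign):
        # Add (sign=+1) or remove (sign=-1) one sample from the running sums.
        self.n += sign
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.sxy += sign * x * y

    def add_sample(self, x, y):
        """Shift the window: take in the new sample and drop the oldest one."""
        self.window.append((x, y))
        self._apply(x, y, +1)
        if len(self.window) > self.window_size:
            old_x, old_y = self.window.popleft()
            self._apply(old_x, old_y, -1)

    def coefficients(self):
        """Return (intercept, slope) of the least-squares line for the window."""
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two samples with distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope


model = ShiftingCostModel(window_size=3)
for x in (1, 2, 3):                 # old environment: cost = 1 + 2x
    model.add_sample(x, 1 + 2 * x)
before = model.coefficients()
for x in (4, 5, 6):                 # environment has changed: cost = 2 + 3x
    model.add_sample(x, 2 + 3 * x)
after = model.coefficients()
```

A block-moving variant would call `_apply` for a whole batch of new samples before recomputing the coefficients, trading update frequency for fewer recomputations.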

5.

Query optimization in multidatabase systems considering schema conflicts

Chiang Lee Chia-Jung Chen 《Knowledge and Data Engineering, IEEE Transactions on》1997,9(6):941-955

In a multidatabase system, the participating databases are autonomous. The schemas of these databases may differ in various ways even though they represent the same information. A global query issued against the global database needs to be translated to a proper form before it can be executed in a local database. Since data requested by a query (or a part of a query) is sometimes available at multiple sites, the site (database) that processes the query with the least cost is the desired query processing site. The authors study the effect of differences in schemas on the cost of query processing in a multidatabase environment. They first classify schema conflicts into different types. For each type of conflict, they show how much more or less complex a translated query can become in comparison with the original user-issued global query. Based on this observation, they propose an analytical method that considers the conflicts between local databases and finds the database(s) that incur the least execution cost in processing a global query. This research introduces a new level of query optimization (termed schema-level optimization) in multidatabase environments. The results provide a new dimension of enhancement for the capability of a query optimizer in multidatabase systems.

6.

Multidatabase Query Optimization

Cem Evrendilek Asuman Dogac Sena Nural Fatma Ozcan 《Distributed and Parallel Databases》1997,5(1):77-114

A multidatabase system (MDBS) allows users to simultaneously access heterogeneous and autonomous databases using an integrated schema and a single global query language. The query optimization problem in MDBSs is quite different from the query optimization problem in distributed homogeneous databases due to schema heterogeneity and autonomy of local database systems. In this work, we consider the optimization of query distribution in case of data replication and the optimization of intersite joins, that is, the join of the results returned by the local sites in response to the global subqueries. The algorithms presented for the optimization of intersite joins try to maximize the parallelism in execution and take the federated nature of the problem into account. It has also been shown through a comparative performance study that the proposed intersite join optimization algorithms are efficient. The approach presented can easily be generalized to any operation required for intersite query processing. The query optimization scheme presented in this paper is being implemented within the scope of a multidatabase system which is based on OMG's object management architecture.

7.

Semantic query optimization for query plans of heterogeneous multidatabase systems

Chun-Nan Hsu Knoblock C.A. 《Knowledge and Data Engineering, IEEE Transactions on》2000,12(6):959-978

New applications of information systems need to integrate a large number of heterogeneous databases over computer networks. Answering a query in these applications usually involves selecting relevant information sources and generating a query plan to combine the data automatically. As significant progress has been made in source selection and plan generation, the critical issue has been shifting to query optimization. This paper presents a semantic query optimization (SQO) approach to optimizing query plans of heterogeneous multidatabase systems. This approach provides global optimization for query plans as well as local optimization for subqueries that retrieve data from individual database sources. An important feature of our local optimization algorithm is that we prove necessary and sufficient conditions to eliminate an unnecessary join in a conjunctive query of arbitrary join topology. This feature allows our optimizer to utilize more expressive relational rules to provide a wider range of possible optimizations than previous work in SQO. The local optimization algorithm also features a new data structure called AND-OR implication graphs to facilitate the search for optimal queries. These features allow the global optimization to effectively use semantic knowledge to reduce the data transmission cost. We have implemented this approach in the PESTO (Plan Enhancement by SemanTic Optimization) query plan optimizer as a part of the SIMS information mediator. Experimental results demonstrate that PESTO can provide significant savings in query execution cost over query plan execution without optimization.

8.

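Data Schema Integration and Query Processing in Multidatabase Systems (total citations: 2; self-citations: 0; citations by others: 2)

陶世群 《电脑开发与应用》2003,16(12):27-28,34

Based on an analysis of the data schema architecture of multidatabase systems, this paper discusses the problems of multidatabase query processing: query decomposition, query translation, and query optimization. A global query decomposition algorithm and a global query optimization algorithm are given.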

9.

Overview of multidatabase transaction management (total citations: 8; self-citations: 0; citations by others: 8)

Yuri Breitbart Ph.D. Hector Garcia-Molina Ph.D. Avi Silberschatz Ph.D. 《The VLDB Journal The International Journal on Very Large Data Bases》1992,1(2):181-239

A multidatabase system (MDBS) is a facility that allows users access to data located in multiple autonomous database management systems (DBMSs). In such a system, global transactions are executed under the control of the MDBS. Independently, local transactions are executed under the control of the local DBMSs. Each local DBMS integrated by the MDBS may employ a different transaction management scheme. In addition, each local DBMS has complete control over all transactions (global and local) executing at its site, including the ability to abort at any point any of the transactions executing at its site. Typically, no design or internal DBMS structure changes are allowed in order to accommodate the MDBS. Furthermore, the local DBMSs may not be aware of each other and, as a consequence, cannot coordinate their actions. Thus, traditional techniques for ensuring transaction atomicity and consistency in homogeneous distributed database systems may not be appropriate for an MDBS environment. The objective of this article is to provide a brief review of the most current work in the area of multidatabase transaction management. We first define the problem and argue that the multidatabase research will become increasingly important in the coming years. We then outline basic research issues in multidatabase transaction management and review recent results in the area. We conclude with a discussion of open problems and practical implications of this research.

10.

Query Optimization in Multidatabase Systems

D.K. Subramanian K. Subramanian 《Distributed and Parallel Databases》1998,6(2):183-210

Global query execution in a multidatabase system can be done in parallel, as all the local databases are independent. In this paper, a cost model that considers parallel execution of subqueries for a global query is developed. To obtain maximum parallelism in query execution, a query execution plan must be found that takes the form of a bushy tree, and this query tree should be balanced to the maximal possible extent with respect to execution time. A new bottom-up approach called the Agglomerative Approach (AA) is proposed to construct balanced bushy trees with respect to execution time. Because this approach is deterministic, it generates locally optimal solutions. This local minima problem is severe in the case of graph queries, i.e., queries that are represented with a graph structure. A Simulated Annealing Approach (SA) is employed to obtain a (near) optimal solution. These approaches (AA and SA) are suitable for handling on-line and off-line queries respectively. A Hybrid Approach (HA), an integration of AA and SA, is proposed to optimize queries for which the estimated time to be spent on optimization is known a priori. Results obtained with AA and SA on both tree and graph structured queries are presented.
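A bottom-up construction of a time-balanced bushy tree can be sketched with a priority queue. The sketch below makes assumptions that the abstract does not state: every pair of intermediate results is joinable, each join adds a fixed `join_time`, and the two inputs of a join are produced in parallel, so a join finishes at the later input's finish time plus `join_time`. It greedily merges the two subtrees that finish earliest, which is one plausible agglomerative rule and not necessarily the one the authors use.

```python
import heapq
from itertools import count


def build_bushy_tree(subquery_times, join_time=1.0):
    """Greedily merge the two earliest-finishing subtrees until one tree remains.

    subquery_times maps a subquery name to its estimated execution time.
    Returns (tree, finish_time), where tree is a nested tuple of names.
    """
    if not subquery_times:
        raise ValueError("at least one subquery is required")
    tie_breaker = count()  # avoids comparing tree nodes when times are equal
    heap = [(t, next(tie_breaker), name) for name, t in subquery_times.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        t_left, _, left = heapq.heappop(heap)
        t_right, _, right = heapq.heappop(heap)
        # Inputs run in parallel, so the join starts when the slower one ends.
        finish = max(t_left, t_right) + join_time
        heapq.heappush(heap, (finish, next(tie_breaker), (left, right)))
    finish, _, tree = heap[0]
    return tree, finish


# Four hypothetical subqueries with equal estimated execution times.
tree, finish_time = build_bushy_tree({"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0})
```

For four equal subqueries this yields the balanced tree `(("A", "B"), ("C", "D"))` finishing at time 3.0, whereas a left-deep plan over the same inputs would finish at 4.0 under the same assumptions. Being greedy and deterministic, such a procedure can get stuck in a local optimum, which is the motivation the abstract gives for adding simulated annealing.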

11.

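Query Optimization Techniques for Shared-Nothing Parallel Database Systems (total citations: 15; self-citations: 0; citations by others: 15)

文继荣 陈红 王珊 《计算机学报》2000,23(1):28-38

Query optimization is a core technology of parallel database systems. This paper introduces the distinctive two-phase optimization strategy of PBASE/2, a shared-nothing parallel database system developed by the authors. To shrink the huge search space of parallel query optimization, PBASE/2 divides parallel query optimization into two phases: sequential optimization and parallelization. In the sequential optimization phase, the communication cost after parallelization is estimated in advance and added to the cost model of sequential optimization. At the same time, the dynamic programming search algorithm is modified and extended, ensuring that the minimum-cost plan obtained in the sequential optimization phase ...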

12.

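Adaptivity-Based Optimized Node Scheduling of Subqueries in Grid Databases

胡乃静 赵亮 胡金华 《计算机科学》2007,34(9):95-98

Grid databases are a new research field arising from the combination of database technology and grid technology, and the dynamically changing nature of the grid places adaptivity requirements on database query optimization techniques. This paper proposes TNSN, a subquery plan model described with Petri nets. By extending the description of the data association relationships between subqueries and their nodes, it establishes a query plan model for adaptive optimized scheduling of subqueries. It further proposes a cost estimation model that takes changing parameters into account and, on the basis of TNSN and the cost model, an adaptive optimization algorithm that allows a query to be adjusted during processing according to changes in grid parameters. Finally, experimental validation is given.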

13.

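Key Technologies in Multidatabase Systems (total citations: 11; self-citations: 0; citations by others: 11)

韩伟红 贾焰 《计算机工程与科学》1999,21(6):49-52

This paper mainly introduces several key technologies in multidatabase systems (MDBSs), including MDBS design principles and architecture, resolution of heterogeneous schemas, query processing, and transaction processing.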

14.

Data model and query evaluation in global information systems (total citations: 2; self-citations: 1; citations by others: 1)

Alon Y. Levy Divesh Srivastava Thomas Kirk 《Journal of Intelligent Information Systems》1995,5(2):121-143

15.

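Research and Implementation of Concurrency Control in Multidatabases (total citations: 2; self-citations: 0; citations by others: 2)

吴婷婷 贾焰 《计算机工程与科学》2001,23(3):51-54

A multidatabase is a collection of autonomous database systems distributed over multiple nodes, and each local database system may use a different concurrency control protocol. Autonomy and heterogeneity make it difficult to guarantee global serializability in a multidatabase system. This paper first describes the global serializability problem in multidatabases, then introduces the features of the concurrency control service (CCS) system and its logical flow, and finally discusses the design and implementation of each interface object in the system.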

16.

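Correctness Criteria for Transaction Execution in Hierarchical Multidatabase Systems (total citations: 2; self-citations: 1; citations by others: 1)

陈国宁 李陶深 《计算机工程》2005,31(6):52-54

This paper studies the correctness of transaction execution in hierarchical multidatabases. It gives the definition and structure of a hierarchical multidatabase and the transaction structure built on it, proposes a correctness criterion for transaction execution in hierarchical multidatabases based on the characteristics of multidatabases, and illustrates its application with an example. Finally, it evaluates the criterion and discusses its application prospects.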

17.

An Object Algebra Approach to Multidatabase Query Decomposition in Donají

Lavariega Juan C. Urban Susan D. 《Distributed and Parallel Databases》2002,12(1):27-71

This paper presents an approach to query decomposition in a multidatabase environment. The unique aspect of this approach is that it is based on performing transformations over an object algebra that can be used as the basis for a global query language. In the paper, we first present our multidatabase environment and semantic framework, where a global conceptual schema based on the Object Data Management Group standard encompasses the information from heterogeneous data sources that include relational databases as well as object-oriented databases and flat file sources. The meta-data about the global schema is enhanced with information about virtual classes as well as virtual relationships and inheritance hierarchies that exist between multiple sources. The AQUA object algebra is used as the formal foundation for manipulation of the query expression over the multidatabase. AQUA is enhanced with distribution operators for dealing with data distribution issues. During query decomposition we perform an extensive analysis of traversals for path expressions that involve virtual relationships and hierarchies for access to several heterogeneous sources. The distribution operators defined in algebraic terms enhance the global algebra expression with semantic information about the structure, distribution, and localization of the data sources relevant to the solution of the query. By using an object algebra as the basis for query processing, we are able to define algebraic transformations and exploit rewriting techniques during the decomposition phase. Our use of an object algebra also provides a formal and uniform representation for dealing with an object-oriented approach to multidatabase query processing. 
As part of our query processing discussion, we include an overview of a global object identification approach for relating semantically equivalent objects from diverse data sources, illustrating how knowledge about global object identity is used in the decomposition and assembly processes.

18.

Global committability in multidatabase systems (total citations: 1; self-citations: 0; citations by others: 1)

Elmagarmid A.K. Jin Jing Won Kim Bukhres O. Zhang A. 《Knowledge and Data Engineering, IEEE Transactions on》1996,8(5):816-824

Develops a formal basis for research into the reliability aspects of transaction processing in multidatabase systems (MDBSs). We define a new correctness notion called 'global committability' for the correct unilateral commit and the retry recovery of global transactions in an autonomous MDBS environment. This notion makes it easier to ensure the isolation property of global transactions when the retry approach is applied. The formalization work illustrates that the conventional serializability and recoverability notions are not sufficient to specify the correct execution (i.e. isolated execution and recovery) of global transactions when the unilateral commit and the retry recovery are used to ensure the atomicity of global transactions. This work is significant because the unilateral commit and the retry recovery are an attractive complementary means to the undo recovery (whose correct schedule is specified by the conventional recoverability notion) for advanced transaction applications with the characteristics of site autonomy and long-lived execution.

19.

Dynamic and fast processing of queries on large-scale RDF data

Pingpeng Yuan Changfeng Xie Hai Jin Ling Liu Guang Yang Xuanhua Shi 《Knowledge and Information Systems》2014,41(2):311-334

As RDF data continue to gain popularity, we witness fast growth in both the number of RDF repositories and the size of RDF datasets. Many known RDF datasets contain billions of RDF triples (subject, predicate and object). One of the grand challenges in managing such huge RDF data is how to execute RDF queries efficiently. In this paper, we address the query processing problems posed by the billion-triple challenge. We first identify some causes of the problems of existing query optimization schemes, such as large intermediate results and initial query cost estimation errors. Then, we present our block-oriented dynamic query plan generation approach powered by pipelined execution. Our approach consists of two phases. In the first phase, a near-optimal execution plan for queries is chosen by identifying the processing blocks of queries. We group the join patterns sharing a join variable into building blocks of the query plan, since executing them first provides opportunities to reduce the size of the intermediate results generated. In the second phase, we further optimize the initial pipelining for a given query plan. We employ optimization techniques, such as sideways information passing and semi-join, to further reduce the size of intermediate results, improve the query processing cost estimation, and speed up query execution. Experimental results on several RDF datasets of over a billion triples demonstrate that our approach outperforms existing RDF query engines that rely on dynamic programming based static query processing strategies.
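The first-phase grouping step can be illustrated with a few lines of code. This is a simplified sketch, not the authors' implementation: triple patterns are represented as plain tuples of strings, variables are recognized by a leading `?`, and the predicate and entity names are invented for the example.

```python
from collections import defaultdict


def group_by_join_variable(patterns):
    """Group triple patterns that share a variable into candidate blocks.

    Returns a dict mapping each join variable (one that occurs in at least
    two patterns) to the list of patterns containing it, in input order.
    """
    by_variable = defaultdict(list)
    for pattern in patterns:
        # Use a set so a variable repeated within one pattern is counted once.
        for variable in {term for term in pattern if term.startswith("?")}:
            by_variable[variable].append(pattern)
    return {v: ps for v, ps in by_variable.items() if len(ps) > 1}


# Hypothetical query: people, their employers, and where the employers are.
query_patterns = [
    ("?p", "type", "Person"),
    ("?p", "worksAt", "?c"),
    ("?c", "locatedIn", "Berlin"),
    ("?p", "name", "?n"),
]
blocks = group_by_join_variable(query_patterns)
```

Here `?p` forms a block of three patterns and `?c` a block of two, while `?n` appears only once and so defines no join. A planner in the style described above would then decide which block to execute first, for example the one expected to shrink intermediate results the most.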

20.

An intelligent query processing for distributed ontologies

Jihyun Lee Jun-Ki Min 《Journal of Systems and Software》2010,83(1):85-95

In this paper, we propose an intelligent distributed query processing method that takes into account the characteristics of a distributed ontology environment. Compared with previous work, we suggest more general models of the distributed ontology query and of the semantic mapping among distributed ontologies. Our approach rewrites a distributed ontology query into multiple distributed ontology queries using the semantic mapping, and the integrated answer is obtained by executing these queries. Furthermore, we propose a distributed ontology query processing algorithm with several query optimization techniques: pruning rules to remove unnecessary queries, a cost model considering site load balancing and caching, and a heuristic strategy for scheduling plans to be executed at a local site. Finally, experimental results show that our optimization techniques are effective in reducing the response time.