Similar articles
1.
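洪晓光 (Hong Xiaoguang), 鲁平 (Lu Ping) 《计算机工程与应用》 (Computer Engineering and Applications) 2003, 39(20): 185-188, 200
Based on the characteristics of data distribution on the Internet, the concept of the fusion query (FusionQuery) has been proposed, together with corresponding query plans and query optimization algorithms. These plans, however, do not take the distribution of data volume into account, so large amounts of redundant data are passed back and forth among the data sources, which greatly reduces execution efficiency. Most current research focuses on optimization under uniformly distributed data, but in practice data is mostly non-uniformly distributed and heavily skewed. This paper therefore presents a query optimization scheme for non-uniform data distributions and analyzes and compares its performance.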

2.
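Modeling of Shared Integration of Web-Based Heterogeneous Databases   Total citations: 7, self-citations: 0, citations by others: 7
Web-based heterogeneous databases consist of a number of naturally distributed, autonomously managed member databases with heterogeneous schemas, and aim to give users a complete database view and a consistent access interface. Based on an analysis of the characteristics and functional requirements of a shared integration system for heterogeneous databases, this paper proposes a Web-based four-tier architecture comprising a presentation layer, a WWW server layer, a transaction management layer, and a database layer, and analyzes and discusses key issues such as the global data dictionary, a consistent naming service, and transaction scheduling. The modeling method and implementation techniques proposed here have been applied in practice.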

3.
Query Processing and Optimization on the Web   Total citations: 2, self-citations: 0, citations by others: 2
The advent of the Internet and the Web and their subsequent ubiquity have brought forth opportunities to connect information sources across all types of boundaries (local, regional, organizational, etc.). Examples of such information sources include databases, XML documents, and other unstructured sources. Uniformly querying those information sources has been extensively investigated. A major challenge relates to query optimization. Indeed, querying multiple information sources scattered on the Web raises several barriers for achieving efficiency. This is due to the characteristics of Web information sources that include volatility, heterogeneity, and autonomy. Those characteristics impede a straightforward application of classical query optimization techniques. They add new dimensions to the optimization problem such as the choice of objective function, selection of relevant information sources, limited query capabilities, and unpredictable events. In this paper, we survey the current research on fundamental problems to efficiently process queries over Web data integration systems. We also outline a classification for optimization techniques and a framework for evaluating them.

4.
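XQuery-Based Query Processing for Heterogeneous Data Sources   Total citations: 2, self-citations: 0, citations by others: 2
严小泉 (Yan Xiaoquan), 刘渊 (Liu Yuan) 《计算机工程》 (Computer Engineering) 2009, 35(14): 87-89
The integration of heterogeneous data sources is a current research focus in data processing; it allows information resources to be used more effectively and data to be shared more easily. This paper introduces a framework for a heterogeneous data source integration system based on a Mediator-Wrapper middle layer, studies the XQuery query processing procedure and its key issues, such as query decomposition and optimization techniques, in depth, and uses an example to further illustrate how query decomposition and optimization are implemented over heterogeneous data sources.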

5.
Query Decomposition for a Distributed Object-Oriented Mediator System   Total citations: 2, self-citations: 0, citations by others: 2
The mediator-wrapper approach to integrate data from heterogeneous data sources has usually been centralized in the sense that a single mediator system is placed between a number of data sources and applications. As the number of data sources increases, the centralized mediator architecture becomes an administrative and performance bottleneck. This paper presents a query decomposition algorithm for a distributed mediation architecture where the communication among the mediators is on a higher level than the communication between a mediator and a data source. Some of the salient features of the proposed approach are: (i) exploring query execution schedules that contain data flow to the sources, necessary when integrating object-oriented sources that provide services (programs) and not only data; (ii) handling of functions with multiple implementations at more than one mediator or source; (iii) multi-phase query decomposition using a combination of heuristics and cost-based strategies; (iv) query plan tree rebalancing by distributed query recompilation.

6.
In this paper we describe two optimization techniques that are specially tailored for information gathering. The first is a greedy minimization algorithm that minimizes an information gathering plan by removing redundant and overlapping information sources without loss of completeness. We then discuss a set of heuristics that guide the greedy minimization algorithm so as to remove costlier information sources first. In contrast to previous work, our approach can handle recursive query plans that arise commonly in the presence of constrained sources. Second, we present a method for ordering the access to sources to reduce the execution cost. This problem differs significantly from the traditional database query optimization problem as sources on the Internet have a variety of access limitations and the execution cost in information gathering is affected both by network traffic and by the connection setup costs. Furthermore, because of the autonomous and decentralized nature of the Web, very few cost statistics about the sources may be available. In this paper, we propose a heuristic algorithm for ordering source calls that takes these constraints into account. Specifically, our algorithm takes both access costs and traffic costs into account, and is able to operate with very coarse statistics about sources (i.e., without depending on full source statistics). Finally, we will discuss implementation and empirical evaluation of these methods in Emerac, our prototype information gathering system.
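
The greedy minimization idea in the abstract above can be illustrated with a much-simplified sketch. It is not the Emerac algorithm: it assumes each source can be modeled as a plain set of answer items with a single scalar cost, and it ignores recursive plans and access limitations. The names `greedy_minimize`, `sources`, and `costs` are hypothetical.

```python
from typing import Dict, Set


def greedy_minimize(coverage: Dict[str, Set[str]], cost: Dict[str, float]) -> Set[str]:
    """Drop redundant sources, costliest first, while every item stays covered."""
    kept = set(coverage)
    # Everything the full plan can return; removal must never shrink this set.
    required = set().union(*coverage.values())
    # Heuristic from the abstract: try to remove the costlier sources first.
    for name in sorted(coverage, key=lambda s: cost[s], reverse=True):
        others = kept - {name}
        covered = set().union(*(coverage[s] for s in others))
        if covered >= required:  # source is redundant, completeness is preserved
            kept = others
    return kept


# Toy data (assumed for illustration): C overlaps entirely with A and B.
sources = {"A": {"x", "y"}, "B": {"y", "z"}, "C": {"x", "y", "z"}}
costs = {"A": 1.0, "B": 2.0, "C": 5.0}
plan = greedy_minimize(sources, costs)
```

With this toy data the expensive source C is dropped because A and B together still cover every item, while neither A nor B can be removed afterwards.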

7.
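李剑 (Li Jian) 《软件学报》 (Journal of Software) 2008, 19(2): 369-378
In Web application environments, RDF(S) can be used to describe the semantics of information resources distributed within an enterprise domain, improving the accuracy of information queries. This paper proposes a distributed RDF(S) model for describing distributed, heterogeneous RDF(S) and, based on this model, gives a method for querying distributed RDF(S) that supports queries at both the instance level and the concept level. With this method, users can query and retrieve relevant information resources in a uniform way, and distributed RDF(S) can also be integrated.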

8.
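Implementation of Query Decomposition and Optimization in a Heterogeneous Data Source Integration System   Total citations: 54, self-citations: 0, citations by others: 54
王宁 (Wang Ning), 王能斌 (Wang Nengbin) 《软件学报》 (Journal of Software) 2000, 11(2): 222-228
A general-purpose heterogeneous data source integration system must integrate many kinds of data sources, including the WWW. Some of these sources have neither a regular schema structure nor powerful query capabilities, which makes decomposing and optimizing global queries difficult. The heterogeneous data source integration system Versatile, on the one hand, uses template operations on local dynamic dictionaries to construct the system's global dynamic dictionary, which serves as the basis for query decomposition and optimization. On the other hand, it adopts a query decomposition and optimization strategy based on caching and data source capabilities, so as to make full use of the sources' query capabilities, simplify wrapper design, and achieve high query efficiency.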

9.
Much information is nowadays stored electronically in document bases. Users retrieve information from these document bases by browsing and querying. While a large number of tools are available nowadays, not much work has been done on tools that support queries involving all the characteristics of documents as well as the use of domain knowledge during the search for information. In this paper we propose a query language that allows for querying documents using content information, information about the logical structure of the documents as well as information about properties of the documents. Domain knowledge is taken into account during the search as well. We also present an architecture for a system supporting such a language and we describe a prototype implementation together with test results.

10.
Flexible distributed query processing capabilities are an important prerequisite for building scalable Internet applications, such as electronic Business-to-Business (B2B) market places. Architecting an electronic market place in a conventional data warehouse-like approach by integrating all the data from all participating enterprises in one centralized repository incurs severe problems: stale data, data security threats, administration overhead, inflexibility during query processing, etc. In this paper we present a new framework for dynamic distributed query processing based on so-called HyperQueries which are essentially query evaluation sub-plans sitting behind hyperlinks. Our approach facilitates the pre-materialization of static data at the market place whereas the dynamic data remains at the data sources. In contrast to traditional data integration systems, our approach executes essential (dynamic) parts of the data-integrating views at the data sources. The other, more static parts of the data are integrated a priori at the central portal, e.g., the market place. The portal serves as an intermediary between clients and data providers which execute their sub-queries referenced via hyperlinks. The hyperlinks are embedded as attribute values within data objects of the intermediary's database. Retrieving such a virtual object will execute the referenced HyperQuery in order to materialize the missing data. We illustrate the flexibility of this distributed query processing architecture in the context of B2B electronic market places with an example derived from the car manufacturing industry. Based on these HyperQueries, we propose a reference architecture for building scalable and dynamic electronic market places. All administrative tasks in such a distributed B2B market place are modeled as Web services and are initiated decentrally by the participants. Thus, sensitive data remains under the full control of the data providers.
We describe optimization and implementation issues to obtain an efficient and highly flexible data integration platform for electronic market places. All proposed techniques have been fully implemented in our QueryFlow prototype system which served as the platform for our performance evaluation.
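
The idea of hyperlinks stored as attribute values and resolved on retrieval can be sketched in a few lines. This is only an illustration under stated assumptions, not the QueryFlow implementation: the remote providers are replaced by an in-process registry, and `HyperLink`, `PROVIDERS`, `materialize`, and the pricing rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class HyperLink:
    """Placeholder attribute value pointing at a sub-query held by a data provider."""
    provider: str
    query_id: str


# Hypothetical stand-in for remote providers: (provider, query_id) -> sub-plan.
# In a real deployment the sub-plan would run at the provider, not in this process.
PROVIDERS: Dict[Tuple[str, str], Callable[[Dict[str, Any]], Any]] = {
    ("supplier-1", "current_price"): lambda obj: 100.0 + 5.0 * len(obj["part"]),
}


def materialize(obj: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of obj with every HyperLink replaced by its sub-query result."""
    # Static attributes are pre-materialized at the portal and passed to sub-queries.
    static = {k: v for k, v in obj.items() if not isinstance(v, HyperLink)}
    resolved = dict(static)
    for key, value in obj.items():
        if isinstance(value, HyperLink):
            sub_plan = PROVIDERS[(value.provider, value.query_id)]
            resolved[key] = sub_plan(static)  # dynamic data is computed on retrieval
    return resolved


# A "virtual object": the part name is static, the price stays with the supplier.
catalog_entry = {"part": "brake", "price": HyperLink("supplier-1", "current_price")}
result = materialize(catalog_entry)
```

The stored object is left unchanged, so the dynamic attribute is recomputed on every retrieval; this matches the abstract's point that dynamic data remains at the data sources.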

11.
12.
13.
《Real》1996,2(3):139-152
Images are being generated at an ever increasing rate by diverse military and civilian sources. A content-based image retrieval system is required to utilize information from the image repositories effectively. Content-based retrieval is characterized by several generic query classes. With the existence of the information superhighway, image repositories are evolving in a decentralized fashion on the Internet. This necessitates network transparent distributed access in addition to the content-based retrieval capability. Images stored in low-level formats such as vector and raster are referred to as physical images. Constructing interactive responses to user queries using physical images is neither practical nor robust. To overcome this problem, we introduce the notion of logical features and describe various features to enable content-based query processing in a distributed environment. We describe a tool named SemCap for extracting the logical features semi-automatically. We also propose an architecture and an application level communication protocol for distributed content-based retrieval. We describe the prototype implementation of the architecture and demonstrate its versatility on two distributed image collections.

14.
Mediator systems integrate distributed, heterogeneous and autonomous data sources, but their effective use requires the solution of hard query optimization problems. This is usually done in two phases: the selection of a set of data sources is similar to a set covering problem, and their ordering into a feasible and efficient query is a capability restricted join order problem. However, a two-phase approach is unlikely to find optimum queries. We describe a new single-phase approach that, under a simple cost model, can be encoded and solved as a SAT problem. Results on artificial benchmarks indicate that this is an interesting problem from the encoding and search viewpoints, and we use them to address three of the ten SAT challenges posed by Selman, Kautz and McAllester in 1997.

15.
We describe the Enosys XML integration platform, focusing on the query language, algebra, and architecture of its query processor. The platform enables the development of eBusiness applications in customer relationship management, e-commerce, supply chain management, and decision support. These applications often require that data be integrated dynamically from multiple information sources. The Enosys platform allows one to build (virtual and/or materialized) integrated XML views of multiple sources, using XML queries as view definitions. During run-time, the application issues XML queries against the views. Queries and views are translated into the XCQL algebra and are combined into a single algebra expression/plan. Query plan composition and query plan decomposition challenges are faced in this process. Finally, the query processor lazily evaluates the result, using an appropriate adaptation of relational database iterator models to XML. The paper describes the platform architecture and components, the supported XML query language and the query processor architecture. It focuses on the underlying XML query algebra, which differs from the algebras that have been considered by W3C in that it is particularly tuned to semistructured data and to optimization and efficient evaluation in a system that follows the conventional architecture of database systems.

16.
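Query Processing in Web Information Integration Systems   Total citations: 1, self-citations: 0, citations by others: 1
To enable uniform query processing over heterogeneous data sources on the Web, this paper proposes OBIISM, an ontology-based model for heterogeneous data source integration systems. Ontologies are introduced to resolve semantic-level heterogeneity among the data sources, and two-level query rewriting translates user-submitted queries into queries against the data sources, providing a semantically unified interface for querying heterogeneous data sources.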

17.
18.
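With the goals of correctness, transparency, and optimization of distributed queries, and in response to the distributed query processing requirements of a grain reserve management system, this paper systematically studies the overall design of a distributed query processor, thread control, message communication, distributed query optimization, and related implementation techniques, compensating for the limitations of distributed query functionality in the SQL Server database.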

19.
20.
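The geographic dispersion of multi-campus colleges makes data sharing inconvenient: data cannot be aggregated, tabulated, and queried in a timely manner, unified planning and deployment are difficult, and a basis for decisions cannot be provided quickly, so information management becomes a thorny problem for the institution. To facilitate unified management, a university management information system based on a distributed database architecture was designed and developed, combining information management, statistical analysis, and online information publishing and collection. The paper describes the distributed database techniques used in the design of the multi-campus information management system, including the design principles of the system's distributed database, the system structure, data replication and data update techniques, and distributed queries.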
