Similar literature
20 similar documents found (search time: 31 ms)
1.
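Effective Top-k Queries in Pure Peer-to-Peer Environments   Total citations: 19, self-citations: 2, citations by others: 19
He Yingjie, Wang Shan, Du Xiaoyong. Journal of Software (《软件学报》), 2005, 16(4): 540-552
Most current peer-to-peer (P2P) systems support only search by file identifier; users cannot search by file content. Top-k queries are widely used in search engines and have been very successful. However, because a P2P system is dynamic and decentralized, top-k querying in a pure P2P environment is challenging. This paper proposes a histogram-based hierarchical top-k query algorithm. First, a hierarchical method implements distributed top-k queries, spreading the merging and ranking of results across the nodes of the P2P network and making full use of the network's resources. Second, a histogram is built for each node from the results it returns; the histogram is used to estimate an upper bound on the node's possible scores, and nodes are selected accordingly, which improves query efficiency. Experiments show that top-k queries improve query effectiveness, while the histograms improve query efficiency.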
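
The two ideas in this abstract (merging top-k lists up a hierarchy, and using a per-node histogram to bound a node's best possible score) can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the `(score, doc_id)` tuple layout, and the equal-width histogram are all assumptions made for illustration.

```python
import heapq

def local_top_k(scored_docs, k):
    """Return the k highest-scoring (score, doc_id) pairs held by one node."""
    return heapq.nlargest(k, scored_docs)

def merge_top_k(own_results, child_results, k):
    """Merge this node's results with its children's top-k lists, keeping only k."""
    merged = list(own_results)
    for results in child_results:
        merged.extend(results)
    return heapq.nlargest(k, merged)

def upper_bound_from_histogram(bucket_edges, counts):
    """Assumed estimator: upper edge of the highest non-empty score bucket."""
    for i in range(len(counts) - 1, -1, -1):
        if counts[i] > 0:
            return bucket_edges[i + 1]
    return 0.0

# Two leaf nodes report their local top-2; the parent merges them with its own result.
leaf_a = local_top_k([(0.9, "a1"), (0.4, "a2"), (0.7, "a3")], k=2)
leaf_b = local_top_k([(0.8, "b1"), (0.95, "b2")], k=2)
root = merge_top_k([(0.5, "r1")], [leaf_a, leaf_b], k=2)

# Histogram of scores previously returned by some peer (hypothetical data).
bound = upper_bound_from_histogram([0.0, 0.25, 0.5, 0.75, 1.0], [3, 1, 2, 0])
```

Under these assumptions, a peer whose estimated bound falls below the current k-th best score could be skipped, which is one way to read the peer selection step described above.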

2.
The advent of the World Wide Web has made an enormous amount of information available to everyone, and the widespread use of digital equipment enables end-users (peers) to produce their own digital content. This vast amount of information requires scalable data management systems. Peer-to-peer (P2P) systems have so far been well established in several application areas, with file-sharing being the most prominent. The next challenge that needs to be addressed is (more complex) data sharing, management and query processing, thus facilitating the delivery of a wide spectrum of novel data-centric applications to the end-user, while providing high Quality-of-Service. In this paper, we propose a self-organizing P2P system that is capable of identifying peers with similar content and intentionally assigning them to the same super-peer. During content retrieval, fewer super-peers need to be contacted, so similarity search is supported efficiently, with reduced network traffic and fewer contacted peers. Our approach increases the responsiveness and reliability of a P2P system, and we demonstrate its advantages using large-scale simulations.

3.
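Xu Linhao, Qian Weining, Zhou Aoying. Journal of Software (《软件学报》), 2007, 18(6): 1443-1455
An important problem in peer-to-peer data management is how to effectively support similarity search over multidimensional data spaces. Existing unstructured P2P data-sharing systems support only simple query processing, namely match queries. Combining approximation techniques with routing indices, this paper designs a simple and effective index structure, EVARI (extended approximate-vector routing index). With EVARI, each node can not only process range queries over its locally shared data set but also forward queries to the neighbour nodes most likely to yield results. To build EVARI, each node summarizes its locally shared content using a space-partitioning technique and exchanges the summary with its neighbours. In addition, each node can reconfigure its neighbours so that related nodes are close to one another, which optimizes the allocation of system resources and improves system performance. Simulation experiments demonstrate the good performance of the method.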
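
The routing decision described here (forward a range query only to neighbours whose summary may contain results) can be sketched as follows. This is a deliberate simplification, not EVARI itself: each neighbour is summarized by a single bounding box rather than by approximation vectors, and all names are hypothetical.

```python
def bounding_box(points):
    """Summarize a set of points by its per-dimension minimum and maximum."""
    dims = range(len(points[0]))
    low = tuple(min(p[d] for p in points) for d in dims)
    high = tuple(max(p[d] for p in points) for d in dims)
    return low, high

def intersects(box, query_low, query_high):
    """True if the box overlaps the axis-aligned query range in every dimension."""
    low, high = box
    return all(l <= qh and ql <= h
               for l, h, ql, qh in zip(low, high, query_low, query_high))

def neighbours_to_forward(routing_index, query_low, query_high):
    """Pick the neighbours whose summary overlaps the range query."""
    return [name for name, box in routing_index.items()
            if intersects(box, query_low, query_high)]

# Hypothetical summaries exchanged with two neighbours.
routing_index = {
    "n1": bounding_box([(0, 0), (2, 2)]),
    "n2": bounding_box([(5, 5), (8, 9)]),
}
narrow = neighbours_to_forward(routing_index, (1, 1), (3, 3))
wide = neighbours_to_forward(routing_index, (2, 2), (6, 6))
```

A single box is a coarse summary; a finer space partition would prune more neighbours at the cost of a larger index to exchange.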

4.
In recent years there has been significant interest in peer-to-peer (P2P) environments in the data management community. However, almost all work so far has focused on exact query processing in current P2P data systems. The autonomy of peers has also not been considered sufficiently. In addition, the system cost is very high because shared data is published per document rather than per document set. In this paper, abstract indices (AbIx) are presented to implement content-based approximate queries in centralized, distributed and structured P2P data systems. They make it possible to search as few peers as possible while obtaining as many results satisfying users' queries as possible, and they preserve a high degree of peer autonomy. Abstract indices also have low system cost, improve query processing speed, and support very frequent updates and set-based information publishing. To verify the effectiveness of abstract indices, a simulator with 10,000 peers and over 3 million documents was built, and several metrics are proposed. The experimental results show that abstract indices work well in various P2P data systems.

5.
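Most current P2P systems provide only file sharing and lack data management capabilities. Building on keyword search over relational databases, this paper proposes a new framework for sharing databases in a P2P environment, in which the database on each node is treated as a document collection. Users need not consider the schema structure of the databases, which simplifies schema mapping between databases on different nodes and better suits the decentralized, dynamic nature of P2P. The histogram-based hierarchical top-k query algorithm is extended to database management systems in a P2P environment, so that queries over document collections and databases are unified and handled uniformly. During query processing the histograms are updated automatically, and neighbour nodes adjust themselves according to query results, making the system adaptive. Experimental results show that keyword-based database sharing goes beyond the traditional database-sharing model and simplifies data access, while the histogram-based top-k query algorithm improves query efficiency.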

6.
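P2P Data Management   Total citations: 14, self-citations: 1, citations by others: 14
Yu Min, Li Zhanhuai, Zhang Longbo. Journal of Software (《软件学报》), 2006, 17(8): 1717-1730
P2P (peer-to-peer) technology is a key technology for restructuring distributed architectures in the future and has broad application prospects. Most problems in P2P systems can be reduced to data placement and retrieval, so P2P data management has become an active research topic in the database field. At present, P2P data management has three main subfields (information retrieval, database queries, and continuous queries) and has produced many research results. After introducing the advantages of P2P technology, this paper states the goals of P2P data management research. It then surveys the state of research in the three subfields, focusing on index construction strategies, solutions to semantic heterogeneity, query semantics, query processing strategies, query types, and query optimization techniques for P2P database queries. By comparison, it identifies the gap between the current state and the goals and proposes problems that need further study.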

7.
Peer-to-Peer (P2P) computing has recently attracted a great deal of research attention. In a P2P system, a large number of nodes can potentially be pooled together to share their resources, information, and services. However, existing unstructured P2P systems lack support for content-based search over data objects, which are generally represented by high-dimensional feature vectors. In this paper, we propose an efficient and effective indexing mechanism to facilitate high-dimensional similarity queries in unstructured P2P systems, named Linking Identical Neighborly Partitions (LINP), which combines a space partitioning technique with a routing index technique. With the aid of LINP, each peer can not only process similarity queries efficiently over its local data, but also route a query to the promising peers that may contain the desired data. In the proposed scheme, each peer summarizes its local data using the space partitioning technique, and exchanges the summarized index with its neighboring peers to construct routing indices. Furthermore, to improve system performance under peer updates, we propose an extension of LINP, named LINP+, in which each peer can reconfigure its neighboring peers to keep relevant peers nearby. The performance of our proposed scheme is evaluated over both synthetic and real-life high-dimensional datasets, and experimental results show the superiority of our proposed scheme.

8.
With the increasing popularity of the peer-to-peer (P2P) computing paradigm, many general range query schemes for distributed hash table (DHT)-based P2P systems have been proposed in recent years. Although those schemes can provide range query capability without modifying the underlying DHTs, their query delay depends on both the scale of the system and the size of the query space or the specific query, and thus they cannot guarantee to return the query results within a bounded delay. In this paper, we propose Armada, an efficient range query processing scheme to support delay-bounded single-attribute and multiple-attribute range queries. It is the first delay-bounded general range query scheme on constant-degree DHTs, and can return the results for any range query within 2logN hops in a P2P system with N peers. Results of analysis and simulations show that the average delay in Armada is less than logN, and the average message cost of single-attribute range queries is about logN+2n-2 (n is the number of peers that intersect with the query). These results are very close to the lower bounds on delay and message cost of range queries over constant-degree DHTs.

9.
A sliding-window k-NN query (k-NN/w query) continuously monitors incoming data stream objects within a sliding window to identify the k closest objects to a query. It enables effective filtering of data objects streaming in at high rates from potentially distributed sources, and offers a means to control the rate of object insertions into result streams. Therefore k-NN/w processing systems may be regarded as one of the prospective solutions for the information overload problem in applications that require processing of structured data in real time, such as the Sensor Web. Existing k-NN/w processing systems are mainly centralized and cannot cope with multiple data streams, where data sources are scattered over the Internet. In this paper, we propose a solution for distributed continuous k-NN/w processing of structured data from distributed streams. We define a k-NN/w processing model for such a setting, and design a distributed k-NN/w processing system on top of the Content-Addressable Network (CAN) overlay. An extensive evaluation using both real and synthetic data sets demonstrates the feasibility of the proposed solution because it balances the load among the peers, while the messaging overhead within the P2P network remains reasonable. Moreover, our results clearly show the solution is scalable for an increasing number of queries and peers.
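
The k-NN/w semantics defined in the first sentence can be sketched on a single machine. The sketch below assumes a count-based window over one-dimensional values and recomputes the answer on each arrival; the class name and these choices are illustrative assumptions, and the distributed CAN-based design of the paper is not modeled.

```python
from collections import deque
import heapq

class SlidingWindowKNN:
    """Keep the last `window_size` stream values and report the k closest to `query`."""

    def __init__(self, query, k, window_size):
        self.query = query
        self.k = k
        # A bounded deque drops the oldest value automatically (count-based window).
        self.window = deque(maxlen=window_size)

    def insert(self, value):
        """Add a new stream value and return the current k nearest neighbours."""
        self.window.append(value)
        return heapq.nsmallest(self.k, self.window,
                               key=lambda v: abs(v - self.query))

knn = SlidingWindowKNN(query=10, k=2, window_size=3)
results = [knn.insert(v) for v in (1, 9, 12, 30, 11)]
```

After the fifth insertion the window holds 12, 30 and 11, so the value 9 has expired even though it was once the closest object; this expiry is what distinguishes the windowed query from a plain k-NN over the whole stream.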

10.
While the Peer-to-Peer (P2P) model is gaining significant attention in distributed computing, it is also expected to be a powerful model for information sharing. P2P systems are expected to provide exhaustive, reliable computational resources and scalable accessibility. Data management and distribution in such systems require storage, replication, data modeling, indexing, querying, retrieval, streaming, and topology management. Although many data management strategies have been proposed in recent years, these strategies have not been investigated with respect to a common model for P2P systems. However, since the services provided by P2P systems are so diverse, it is very challenging to come up with a common layer-based model for all P2P systems. In this paper, we first propose a conceptual model for P2P systems, and then provide a classification and summary of data management and distribution strategies by referring to this model. The horizontal layers of the model correspond to modules of a P2P system, whereas the columns are related to the services provided. The modules include base P2P service, storage, indexing, logical, service, and application modules. The services include security, querying, publish, join/leave, collaboration, and streaming. The paper concludes by providing a comprehensive list of data management and distribution strategies used in existing P2P systems.

11.
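The key problem for privacy-preserving query processing in cloud computing is how to protect both data privacy and users' query privacy while still returning query results that meet users' needs. This paper presents a comprehensive survey of several key issues in privacy-preserving query processing in cloud computing, including database indexing and query optimization, encryption-based privacy protection, privacy protection based on secure multi-party computation, and integrity verification of query results. It analyzes the open challenges of privacy-preserving query processing in cloud computing and points out future research directions.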

12.
Peer-to-Peer Desktop Grid (P2PDG) has emerged as a pervasive cyber-infrastructure tackling many large-scale applications with high impact. As a burgeoning research area, P2PDG can support numerous applications, including scientific computing, file sharing, web services, and virtual organization for collaborative activities and projects. To handle the trustworthiness issues of these services, trust and reputation schemes have been proposed to establish trust among peers in P2PDG. In this paper, we propose a robust group trust management system, called H-Trust, inspired by the H-index aggregation technique. Leveraging the robustness of the H-index algorithm under incomplete and uncertain circumstances, H-Trust offers a robust personalized reputation evaluation mechanism for both individual and group trust with minimal communication and computation overheads. We present the H-Trust scheme in five phases: trust recording, local trust evaluation, trust query, spatial-temporal update, and group reputation evaluation. The rationale for its design and the analysis of the algorithm are further investigated. To validate the performance of the H-Trust scheme, we designed the H-Trust simulator HTrust-Sim to conduct multi-agent-based simulations. Simulation results demonstrate that H-Trust is robust and can identify and isolate malicious peers in large-scale systems even when a large portion of peers are malicious.
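
The H-index aggregation that inspires H-Trust is the standard bibliometric measure: the largest h such that at least h values are each at least h. A minimal sketch follows; how H-Trust maps trust records onto the integer ratings used here is not stated in the abstract, so the sample inputs are purely hypothetical.

```python
def h_index(ratings):
    """Return the largest h such that at least h ratings are >= h."""
    ordered = sorted(ratings, reverse=True)
    h = 0
    for rank, value in enumerate(ordered, start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h

steady = h_index([10, 8, 5, 4, 3])      # consistently good ratings
one_spike = h_index([100, 0, 0, 0, 0])  # a single extreme rating
```

The second call shows why this kind of aggregate resists manipulation: one extreme rating cannot raise the result above 1, whereas an average would be pulled up sharply.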

13.
P2P streaming systems, such as PPLive and PPStream, have become popular services with the widespread deployment of broadband networks. However, P2P streaming systems still face free-riding problems, similar to those that have been observed in P2P file sharing systems. Thus, one important problem in providing streaming services is that of providing appropriate incentives for peers to contribute their upload capacity. To this end, we propose the use of advertisements as an incentive for peers to contribute upload capacity. In the proposed framework, peers enjoy the same quality of streamed media, with the difference in quality of service being achieved through different amounts of advertisements viewed, based on the resource contributions to the system. Moreover, since calculating peers’ contributions accurately is important to successfully deploying such systems, we design a token-based framework to address this problem. An extensive simulation-based study is performed to evaluate the proposed approach. The results demonstrate that our approach provides appropriate incentives for peers to contribute their resources. Furthermore, we explore several characteristics of the token-based mechanism which can provide system developers with insight into efficient development of such systems.

14.
Active XML (AXML), as intensional data, aims to exploit the potential computing power of XML, Web services and P2P architecture. It is considered a powerful extension of XML for dealing with dynamic XML data from autonomous and heterogeneous data sources on a very large scale via Web services. However, AXML is still at an immature stage, and various issues need to be investigated before it can be widely accepted. This paper focuses on two issues facing the current AXML system, namely representation and query processing. We propose an improved representation and improved query evaluation for AXML. To justify them, we compare our proposed algorithms with the existing algorithms.

15.
This paper proposes a two-level P2P caching strategy for Web search queries. The design is suitable for a fully distributed service platform based on managed peer boxes (set-top boxes or DSL/cable modems) located at the edge of the network, where both the boxes and the access bandwidth to those boxes are controlled and managed by an ISP. Our solution significantly reduces the user query traffic that leaves the ISP to get query results from the respective Web search engine. Web users are usually very reactive to worldwide events, which causes highly dynamic query traffic patterns leading to load imbalance across peers. Our solution contains a strategy to quickly ease imbalance on peers and spread communication flow among participating peers. Each peer maintains a local result cache used to keep the answers for queries originated in the peer itself and for queries for which the peer is responsible, which it resolves by contacting the Web search engine on demand. When query traffic is predominantly routed to a few responsible peers, our strategy replicates the role of “being responsible for” to neighboring peers so that they can absorb query traffic. This is a fairly slow and adaptive process that we call mid-term load balancing. To achieve a short-term fair distribution of queries, we introduce a location cache in each peer which keeps pointers to peers that have already requested the same queries in the recent past. This lets these peers share their query answers with newly requesting peers. This process is fast, as these popular queries are usually cached in the first DHT hop of a requesting peer, which quickly tends to redistribute load among more and more peers.

16.
This paper looks at the processing of skyline queries on peer-to-peer (P2P) networks. We propose Skyframe, a framework for efficient skyline query processing in P2P systems, which addresses the challenges of quick response time, low network communication cost and query load balancing among peers. Skyframe consists of two querying methods: one is optimized for network communication while the other focuses on query response time. These methods differ in the way the query search space is defined. In particular, the first method uses a high dominating point that has a large dominating region to prune the search space and achieve a low network communication cost. The second method, on the other hand, relaxes the search space in order to allow parallel query processing and speed up query response. Skyframe achieves query load balancing through both query-load-conscious data space splitting/merging during the join/departure of nodes and dynamic load migration. We further show how to apply Skyframe both to P2P systems supporting multi-dimensional indexing and to P2P systems supporting single-dimensional indexing. Finally, we have conducted extensive experiments on both real and synthetic data sets over two existing P2P systems: CAN (Ratnasamy in A scalable content-addressable network. In: Proceedings of SIGCOMM Conference, pp. 161–172, 2001) and BATON (Jagadish et al. in A balanced tree structure for peer-to-peer networks. In: Proceedings of VLDB Conference, pp. 661–672, 2005) to evaluate the effectiveness and scalability of Skyframe.
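
For readers unfamiliar with skyline queries, the dominance test and the pruning idea mentioned above can be sketched centrally. This is a naive quadratic computation under the assumption that smaller values are better in every dimension; it is not Skyframe's distributed method, and the function names and sample points are hypothetical.

```python
def dominates(p, q):
    """p dominates q if p is no worse in every dimension and strictly better in one."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))

def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def prune_with(dominating_point, points):
    """Discard every point that falls in the dominating region of one chosen point."""
    return [p for p in points if not dominates(dominating_point, p)]

points = [(1, 9), (3, 3), (2, 8), (5, 5), (9, 1), (4, 4)]
result = skyline(points)
remaining = prune_with((3, 3), points)
```

Here the single point (3, 3) already removes (5, 5) and (4, 4) before any skyline computation, which illustrates why a point with a large dominating region is useful for narrowing the search space.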

17.
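Effective routing of multi-keyword queries is a key problem in P2P Web search. This paper proposes a query processing method based on the benefit-cost ratio. Built on a DHT-based P2P overlay network, the method mines the correlation among keywords and the coverage and overlap among nodes. It uses min-wise independent permutations to detect overlap, thereby avoiding redundant routing for the same records. Experiments show that the method significantly reduces query time while improving recall and precision.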
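
Overlap detection with min-wise independent permutations is commonly implemented as MinHash signatures, where the fraction of matching signature positions estimates the Jaccard overlap of two record sets. The sketch below assumes records are identified by non-negative integers and approximates the permutations with random linear hash functions; the parameter choices and names are illustrative and not taken from the paper.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used as the hash modulus

def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]

def minhash_signature(record_ids, params):
    """Signature = minimum hash value per function; record_ids must be non-empty."""
    return [min((a * x + b) % PRIME for x in record_ids) for a, b in params]

def estimated_overlap(sig_a, sig_b):
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)

params = make_hash_params(128, seed=42)
sig_a = minhash_signature(set(range(0, 100)), params)
sig_b = minhash_signature(set(range(50, 150)), params)
estimate = estimated_overlap(sig_a, sig_b)
```

Two nodes only need to exchange the fixed-length signatures, not their record lists, to judge whether routing a query to both would mostly return the same records.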

18.
Together with advanced positioning and mobile technologies, P2P query processing has attracted growing interest for a number of location-aware applications, such as answering kNN queries in mobile ad hoc networks. It not only overcomes drawbacks of centralized systems, for example single points of failure and bottlenecks, but more importantly harnesses the power of peers’ collaboration. In this research, we propose a pure mobile P2P query processing scheme which primarily focuses on the search and validation algorithm for kNN queries. The proposed scheme is designed for pure mobile P2P environments without base station support. Simulation results show that, compared with centralized and hybrid systems, our system can reduce energy consumption by a factor of more than six through data sharing among peers, with a reasonable mean processing latency, in networks with a high density of moving objects.

19.
The increasing use of mobile communications has raised many issues of decision support and resource allocation. A crucial problem is how to answer Reverse Nearest Neighbour (RNN) queries. An RNN query returns all objects that consider the query object as their nearest neighbour. Existing methods mostly rely on a centralised base station. However, mobile P2P systems offer many benefits, including self-organisation, fault-tolerance and load-balancing. In this study, we propose and evaluate three distinct P2P algorithms focusing on bichromatic RNN queries, in which mobile query peers and static objects of interest are of two different categories, based on a time-out mechanism and a boundary polygon around the mobile query peers. The Brute-Force Search Algorithm provides a naive approach to exploit shared information among peers, whereas the two Boundary Search Algorithms filter the peers involved in query processing. The algorithms are evaluated in the MiXiM simulation framework with both real and synthetic datasets. The results show the practical feasibility of the P2P approach for solving bichromatic RNN queries in mobile networks.
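
The bichromatic RNN definition can be made concrete with a centralized brute-force sketch: an object of interest belongs to the answer if no other peer is closer to it than the query peer. The function name, the tie-breaking rule, and the coordinates are assumptions for illustration; the time-out mechanism and boundary polygon of the paper are not modeled.

```python
import math

def bichromatic_rnn(query_peer, other_peers, objects):
    """Return the objects whose nearest peer is query_peer (ties go to the query peer)."""
    result = []
    for obj in objects:
        d_query = math.dist(obj, query_peer)
        # The object is a reverse nearest neighbour if no other peer is strictly closer.
        if all(d_query <= math.dist(obj, peer) for peer in other_peers):
            result.append(obj)
    return result

# One query peer, one competing peer, three static objects of interest.
answer = bichromatic_rnn(
    query_peer=(0, 0),
    other_peers=[(10, 0)],
    objects=[(1, 0), (9, 0), (4, 1)],
)
```

The scan compares every object with every peer, which is the cost that filtering approaches aim to reduce by limiting the set of peers that must be consulted.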

20.
Clustering is one of the important data mining issues, especially for large and distributed data analysis. Distributed computing environments such as Peer-to-Peer (P2P) networks involve separated/scattered data sources, distributed among the peers. Owing to the unpredictable growth and dynamic nature of P2P networks, the data of peers are constantly changing. Because of the high volume of computation and communication and because of privacy concerns, these types of data should be processed in a distributed way and without central management. Today, most applications of P2P systems focus on unstructured P2P systems. In unstructured P2P networks, spreading gossip is a simple and efficient method of communication, which can adapt to dynamic conditions in these networks. Recently, some algorithms with different pros and cons have been proposed for data clustering in P2P networks. In this paper, by combining a novel method for extracting the representative data, a gossip-based protocol and a new centralized clustering method, a Gossip Based Distributed Clustering algorithm for P2P networks called GBDC-P2P is proposed. The GBDC-P2P algorithm is suitable for data clustering in unstructured P2P networks and it adapts to the dynamic conditions of these networks. In the GBDC-P2P algorithm, peers perform the data clustering operation in a distributed way, only through communication with their neighbours. GBDC-P2P does not need to rely on a central server and it runs asynchronously. Evaluation results demonstrate the superior performance of the GBDC-P2P algorithm. Also, a comparative analysis with other well-established methods illustrates the efficiency of the proposed method.
