Similar literature
1.
Clustering is an important data mining task, especially for the analysis of large and distributed data. Distributed computing environments such as Peer-to-Peer (P2P) networks involve separate, scattered data sources distributed among the peers. Because of the unpredictable growth and dynamic nature of P2P networks, the data held by peers change constantly. Given the high volume of computation and communication and the associated privacy concerns, such data should be processed in a distributed way, without central management. Today, most applications of P2P systems focus on unstructured P2P systems. In unstructured P2P networks, gossiping is a simple and efficient method of communication that adapts to the dynamic conditions of these networks. Recently, several algorithms with different pros and cons have been proposed for data clustering in P2P networks. In this paper, by combining a novel method for extracting representative data, a gossip-based protocol, and a new centralized clustering method, a Gossip Based Distributed Clustering algorithm for P2P networks called GBDC-P2P is proposed. The GBDC-P2P algorithm is suitable for data clustering in unstructured P2P networks and adapts to the dynamic conditions of these networks. In the GBDC-P2P algorithm, peers cluster data in a distributed fashion solely by communicating with their neighbours. GBDC-P2P does not rely on a central server and runs asynchronously. Evaluation results demonstrate the superior performance of the GBDC-P2P algorithm. A comparative analysis with other well-established methods also illustrates the efficiency of the proposed method.

2.
There are two basic concerns in supporting multi-dimensional range queries in P2P overlay networks. The first is to preserve data locality in the process of data space partitioning, and the second is to maintain data locality among data ranges with an exponentially expanding and extending rate. The first problem has been well addressed by recursive decomposition schemes such as the Quad-tree, K-d tree, Z-order, and Hilbert curve. The second problem has recently been addressed by our novel data structure, HD Tree. In this paper, we explore how data locality can be easily maintained, and how range queries can be efficiently supported, in HD Tree. This is done by introducing two basic routing strategies: hierarchical routing and distributed routing. Although hierarchical routing can be applied to any two nodes in the P2P system, it generates high-volume traffic toward nodes near the root and has very limited options for coping with node failure. Distributed routing, on the other hand, concerns source and destination pairs only at the same depth, but traffic load is bound to some nodes at two neighboring depths, and multiple options can be found to redirect a routing request. Because HD Tree supports multiple routes between any two nodes in the P2P system, routing in HD Tree is very flexible; it can be designed for many purposes, such as fault tolerance or dynamic load balancing. The distributed routing oriented combined routing (DROCR) algorithm is one such routing strategy implemented so far. It is a hybrid algorithm combining the advantages of both hierarchical routing and distributed routing. The experimental results show that the DROCR algorithm achieves a considerable performance gain over the equivalent tree routing at the highest depth examined. For multi-dimensional range queries, the experimental results indicate that the exponentially expanding and extending rates are effectively controlled and minimized by the HD Tree overlay structure and DROCR routing.
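
The Z-order curve that this abstract lists among the recursive decomposition schemes can be illustrated with a minimal sketch. It is not HD Tree and not the authors' code: it assumes 2-D non-negative integer coordinates, and the function name `morton2d` and the 16-bit width are illustrative choices.

```python
def morton2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return key


# Sorting grid cells by their key visits each 2x2 block in a "Z" pattern,
# so cells that are close in the plane tend to receive nearby keys.
cells = [(x, y) for y in range(2) for x in range(2)]
z_order = sorted(cells, key=lambda c: morton2d(*c))
```

Locality is only approximate: cells on either side of a quadrant boundary can be adjacent in the plane yet far apart in key order.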

3.
Efficient and scalable search on scale-free P2P networks   Cited by: 1 (self-citations: 0, citations by others: 1)
Unstructured peer-to-peer (P2P) systems (e.g. Gnutella) are characterized by uneven distributions of node connectivity and file sharing. The existence of “hub” nodes that have a large number of connections and “generous” nodes that share many files significantly influences the performance of information search over P2P file-sharing networks. In this paper, we present a novel Scalable Peer-to-Peer Search (SP2PS) method with low maintenance overhead for resource discovery in scale-free P2P networks. Unlike existing search methods, which employ a single heuristic to direct searches, SP2PS achieves better performance by considering both the number of shared files and the connectivity of each neighbouring node. SP2PS enables peer nodes to forward queries to the neighbours that are more likely to have the requested files and that can also help in finding the requested files in future hops. The proposed method has been simulated in different power-law networks with different forwarding degrees and distances. Our analytic and simulation results show that SP2PS achieves better performance than other related methods.
David Webster

4.
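This paper analyzes the characteristics of high-volume downloading by intranet users on a local area network and applies data mining techniques to discover large-scale P2P downloading by users. A model for detecting large-scale P2P downloads is built. Domain knowledge is used to overcome the limitations of association algorithms in this field, and the association algorithm used in data mining is optimized; experiments show that the optimized algorithm improves detection efficiency. Finally, the mined rules are linked with the firewall system to block large-scale downloading on the LAN, which improves the utilization of the campus network.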

5.
Peer-to-peer (P2P) networks are beginning to form the infrastructure of future applications. Heavy network traffic limits the scalability of P2P networks. Indexing is one way to reduce this traffic, but indexes tend to become large as the network grows, and limiting their size causes loss of indexing information. In this paper we introduce a novel ontology-based index (OI) which limits the size of the indexes without sacrificing indexing information. We show that the method can be employed by many P2P networks. The OI sits on top of the routing and maintenance modules of a P2P network and enhances them. The OI prunes branches of search trees that have no chance of leading to a response. The OI also guarantees that an enhanced routing algorithm and its basic version return the same result set for a given search query. This means that the OI reduces traffic without reducing quality of service. To measure the performance of the OI we apply it to Chord (DHT based) and HyperCup (non-DHT based) P2P networks and show that it reduces the networks’ traffic significantly.

6.
It is common nowadays for business groups to own different companies that operate autonomously. Nevertheless, these companies are required to provide the headquarters with summarized information for decision-making. An architecture for cooperative interchange of decision-making information seems to be a natural solution to this problem. We propose the use of a peer-to-peer (P2P) architecture for processing OLAP data in a distributed environment, in a way that lets all companies involved maintain full autonomy over the use of their own data resources. In such a scenario, data exchange between peers occurs when one of them, in the role of a local peer, receives a query and, to answer it, requests data available at other nodes, called acquaintances. No global schema is assumed to exist for any data under this computing paradigm. Hence, data provided by an acquaintance of a local peer must be adapted so that answers to queries posed by local peer users conform to the view those users have of their data. Because multidimensional data normally consist of a collection of views of aggregated data, a careful translation process is needed in this case to transform any summary concept that appears in a peer acquaintance into a summary concept meaningful to the requesting peer. We first present a model for multidimensional data distributed in a P2P network, and a query rewriting technique that allows a local peer to propagate OLAP queries among its acquaintances and obtain a meaningful and correct answer. Mappings are performed using a novel technique called revise and map, based on belief revision concepts. Revising a dimension instance makes it possible to produce consistent aggregations when an OLAP query is answered at more than one node. We then describe an implementation of a P2P system for answering OLAP queries over a network of data warehouses. We apply our proposal to a real-world case study of an insurance group. Finally, we report the results of an experimental evaluation of our implementation, and discuss the issues that must be accounted for in this setting.

7.
Nowadays, as mobile services become widely used, there is a strong demand for mobile support in P2P search techniques. In this paper, we introduce a new cost model for searching multi-dimensional data in a mobile P2P environment and propose a novel multi-dimensional mobile P2P search framework called MIME. MIME models the physical node layout in a two-dimensional plane and keeps records of the locations of the nodes to construct a proximity-aware P2P overlay. MIME can employ two different split schemes for the construction of the overlay. We propose query processing techniques for such a P2P overlay. In addition, we employ a novel expanding method for tuning the performance of KNN queries in MIME. We also discuss two adaptive features incorporated into MIME to support mobility: an update algorithm that makes dynamic updates to the overlay, and a cache mechanism that reduces the load of data migration during the updates. The experimental results show that the proposed techniques are effective, and that MIME achieves significant performance improvements in Point, Range, and KNN queries compared to the conventional system.

8.
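P2P data management systems have become a research focus in the field of peer-to-peer computing. Semantic heterogeneity is the primary problem for P2P data management systems. To address it, each data source node defines a set of keywords for the table names and attribute names of its shared tables to serve as the medium for semantic mapping, and mappings are established automatically between heterogeneous data sources that share the same keywords. These keywords form the external schema of the shared data. Inside a node, however, the external schema is not actually materialized as a view. The defined keywords are distributed across the whole network in external schema description files. During query processing, once an external schema description file is found, all aliases in the query request are immediately converted to the real table names and attribute names. This makes it easy to locate the required table by any of its names, and it also reduces the number of data copies, simplifies the query algorithm, and improves system efficiency.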

9.
This paper analyzes reciprocation strategies in peer-to-peer networks from the point of view of the resulting resource allocation. Our stated aim is to achieve, through decentralized interactions, a weighted proportionally fair allocation. We analyze the desirable properties of such an allocation, as well as an ideal proportional reciprocity algorithm to achieve it, using tools of convex optimization. We then seek suitable approximations to the ideal allocation that respect the practical constraints of the problem: the number of open connections per peer, with transport layer-induced bandwidth sharing, and the need for random exploration of the peer-to-peer swarm. Our solution, a Gibbs sampler dynamics characterized by a suitable energy function, is implemented in simulation and compares favorably with a number of alternatives.
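
For reference, the usual textbook statement of weighted proportional fairness is sketched below. The abstract gives no formulas, so the notation is an assumption: $x_i$ is the rate allocated to peer $i$, $w_i > 0$ is its weight, and $\mathcal{X}$ is the convex set of feasible allocations.

```latex
% Weighted proportionally fair allocation as a convex program
x^{*} \;=\; \arg\max_{x \in \mathcal{X}} \; \sum_{i} w_i \log x_i

% Equivalent characterization: no feasible deviation from x^* has a
% positive weighted sum of relative changes
\sum_{i} w_i \, \frac{x_i - x_i^{*}}{x_i^{*}} \;\le\; 0
\qquad \text{for all } x \in \mathcal{X}
```

The second line is the first-order optimality condition of the first, which is why convex optimization tools apply.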

10.
XML instances are not necessarily self-contained but may have connections to remote XML data residing on other servers. In this paper, we show that, in spite of its minor support and use in the XML world, the XLink language provides a powerful mechanism for expressing such links, both from the modeling point of view and for actually querying interlinked XML data: in our dbxlink approach, the links are not seen as explicit links (where the users must be aware of the links and traverse them explicitly in their queries), but define views that combine into a logical, transparent XML model which serves as an external schema and can be queried by XPath/XQuery. We motivate the underlying modeling and give a concise and declarative specification as an XML-to-XML mapping. We also describe the implementation of the model as an extension of the eXist [eXist: an Open Source Native XML Database, http://exist-db.org/] XML database system. The approach can be applied both for distribution of data and for integration of data from autonomous sources.

11.
Scalable search and retrieval over numerous web document collections distributed across different sites can be achieved by adopting a peer-to-peer (P2P) communication model. Terms and their document frequencies are the main components of text information retrieval and as such need to be computed, aggregated, and distributed throughout the system. This is a challenging problem in unstructured P2P networks, since skews in the distribution of documents to peers mean that local document collections may not accurately reflect the global collection. Moreover, central assembly of the total information is not a scalable solution, because of the excessive cost of storage and maintenance and because of issues related to digital rights management. In this paper, we present an efficient hybrid approach for aggregation of document frequencies using a hierarchical overlay network for a carefully selected set of the most important terms, together with gossip-based aggregation for the remaining terms in the collections. Furthermore, we present a cost analysis to compute the communication cost of hybrid aggregation. We conduct experiments on three document collections in order to evaluate the quality of the proposed hybrid aggregation.
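
Gossip-based aggregation in general can be sketched with pairwise averaging. This is a generic textbook scheme, not the hybrid protocol of the paper; the function name `gossip_average`, the fully connected peer set, and the sample frequencies are assumptions made for illustration.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Pairwise-averaging gossip over a fully connected set of peers.

    Each round, two distinct peers are picked at random and both replace
    their estimate with the mean of the two. The sum of all estimates is
    preserved, so every estimate drifts toward the global mean.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    estimates = [float(v) for v in values]
    for _ in range(rounds):
        i, j = rng.sample(range(len(estimates)), 2)
        mean = (estimates[i] + estimates[j]) / 2.0
        estimates[i] = estimates[j] = mean
    return estimates


# Hypothetical document frequencies of one term at five peers.
local_df = [3, 0, 7, 2, 8]
estimates = gossip_average(local_df, rounds=500)
# Any single peer can now approximate the global frequency as
# (its estimate) * (number of peers), assuming the peer count is known.
global_df_at_peer0 = estimates[0] * len(local_df)
```

In an unstructured network the partner would be drawn from a peer's neighbours rather than from all peers, which slows convergence but leaves the sum-preserving argument unchanged.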

12.
Modeling and querying moving objects in networks   Cited by: 11 (self-citations: 0, citations by others: 11)
Moving objects databases have become an important research issue in recent years. For modeling and querying moving objects, there exists a comprehensive framework of abstract data types to describe objects moving freely in the 2D plane, providing data types such as moving point or moving region. However, in many applications people or vehicles move along transportation networks. It makes sense to model the network explicitly and to describe movements relative to the network rather than to unconstrained space, because it is then much easier to formulate relationships between moving objects and the network in queries. Moreover, such models can be better supported in indexing and query processing. In this paper, we extend the ADT approach by modeling networks explicitly and providing data types for static and moving network positions and regions. In a highway network, example entities corresponding to these data types are motels, construction areas, cars, and traffic jams. The network model is not too simplistic; it allows one to distinguish simple roads and divided highways and to describe the possible traversals of junctions precisely. The new types and operations are integrated seamlessly into the ADT framework to achieve a relatively simple, consistent and powerful overall model and query language for constrained and unconstrained movement.

13.
There are two major building blocks in operating a peer-to-peer (P2P) video-on-demand (VOD) network: supplier discovery and content delivery. Supplier discovery concerns the discovery of peer nodes in the network that can provide the streaming data blocks a local node needs for playback. The more suppliers one can discover, the higher the chance of locating quality suppliers that deliver content smoothly and ensure uninterrupted playback. The key to supplier discovery is to establish and track the supply-demand relationship among the peers. For P2P VOD, the supply-demand relationship is determined by the buffer contents of the peers. Unfortunately, the buffer contents change rapidly as peers play the video, especially under VCR operations. The challenge is to track all the dynamic relationships efficiently. In this paper, we propose an Overlapping Relation Network (ORN). The idea is to track the dynamic supply-demand relationship by tracking the overlap of peers’ buffer contents. As long as peers play the video at the same rate, the overlapping relationship is stable and can be used for low-cost supplier discovery. Extensive analyses and simulation experiments show that in most cases the ORN can discover more than 96% of the suppliers in the network, resulting in a streaming continuity that is superior to that of other approaches.
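
The notion of overlapping buffer contents can be made concrete with a small sketch. The representation is an assumption, not taken from the paper: each peer's buffer is modelled as a half-open range `(first_block, end_block)` of block indices, and `buffer_overlap` is a hypothetical helper name.

```python
def buffer_overlap(a, b):
    """Return the range of blocks held by both buffers, or None if disjoint.

    Buffers are half-open (first_block, end_block) ranges of block indices.
    """
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start < end else None


# If two peers play at the same rate, both windows slide forward together,
# so the size of their overlap stays constant over time.
peer_a = (100, 160)
peer_b = (140, 200)
shared = buffer_overlap(peer_a, peer_b)
shifted = buffer_overlap((110, 170), (150, 210))  # both advanced by 10 blocks
```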

14.
Free riding is a common phenomenon in peer-to-peer (P2P) file sharing networks. Although several mechanisms have been proposed to handle free riding, mostly by excluding free riders, few of them have been adopted in a practical system. This may be attributed to the fact that the mechanisms are often nontrivial, and that completely eliminating free riders could jeopardize the sheer power of the network, which arises from the huge number of its participants. Rather than excluding free riders, we incorporate and utilize them to provide a global index service for the files shared in the network, as well as to relay messages in the search process. The simulation results indicate that our mechanism not only shifts the query processing load from non-free riders to free riders, but also significantly boosts the search efficiency of plain Gnutella. Moreover, the mechanism is quite resilient to a high free-riding ratio.

15.
File replication is a widely used technique for high performance in peer-to-peer content delivery networks. A file replication technique should be efficient and at the same time facilitate efficient file consistency maintenance. However, most traditional methods do not consider nodes’ available capacity and physical location in file replication, leading to high overhead for both file replication and consistency maintenance. This paper presents a proactive low-overhead file replication scheme, namely Plover. By placing file replicas among physically close nodes based on the nodes’ available capacities, Plover not only achieves high efficiency in file replication but also supports low-cost and timely consistency maintenance. It also includes an efficient file query redirection algorithm for load balancing between replica nodes. Theoretical analysis and simulation results demonstrate the effectiveness of Plover in comparison with other file replication schemes. It dramatically reduces the overhead of both file replication and consistency maintenance compared to other schemes. In addition, it significantly reduces the number of overloaded nodes.

16.
Visualized cognitive knowledge map integration for P2P networks   Cited by: 2 (self-citations: 0, citations by others: 2)
This study proposes a visualized cognitive knowledge map integration system, called VisCog, to facilitate knowledge management on P2P networks. By using a SOM (self-organized map)-like model, the Egocentric SOM (ESOM), VisCog can merge other peers' knowledge artifacts (e.g., documents) under a focal peer's knowledge structure and visually present the cognitive knowledge map of the P2P network. The experimental results from evaluating VisCog's performance show that VisCog can retain an individual peer's knowledge structure while articulating with those of other peers to build its cognitive knowledge map.

17.
In this paper, we propose a novel decentralized resource maintenance strategy for peer-to-peer (P2P) distributed storage networks. Our strategy relies on the Wuala overlay network architecture (The WUALA Project). While the latter distributes resources among peers using erasure codes, e.g., Reed–Solomon codes, here we investigate the system behavior when a simple randomized network coding strategy is applied. We propose to replace the regular, centralized Wuala strategy for resource maintenance with a decentralized strategy in which users regenerate new fragments sporadically, namely every time a resource is retrieved. Both strategies are analyzed, analytically and through simulations, in the presence of either erasure or network coding. It is shown that the novel sporadic maintenance strategy, when used with randomized network coding, leads to a fully decentralized solution with management complexity much lower than that of common centralized solutions.

18.
Liu Tianpeng, Zhou Ya. Journal of Computer Applications (《计算机应用》), 2008, 28(1): 162-164
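Based on an analysis of the operating mechanisms of existing distributed data mining algorithms and of the decentralized, asynchronous nature of P2P technology, this paper designs a distributed data mining algorithm for P2P networks by extending the iterative process of the classic K-means algorithm. The algorithm only needs to pass data between directly connected nodes, and it lets the data on each node aggregate according to the global clustering result. Simulation experiments verify the effectiveness of the algorithm.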

19.
An effective trust model is needed to build trust relationships between peers in a peer-to-peer (P2P) network and to enhance the security and reliability of P2P systems. Current trust models focus only on the consumer's evaluation of a transaction, which malicious peers may abuse to deliberately exaggerate or slander the provider. In this paper, we propose a novel trust model based on mutual evaluation, called METrust, to suppress peers' malicious behavior, such as dishonest evaluation and strategic attack. METrust considers factors including mutual evaluation, similarity risk, time window, incentive, and punishment mechanism. The trust value is composed of the direct trust value and the recommendation trust value. To inhibit dishonest evaluation, both participants should give evaluation information based on their own experience of the transaction when the direct trust value is computed. To this end, the mutual evaluation consistency factor and its time decay function are proposed. In addition, to reduce the risk of computing the recommendation trust from the recommendations of friend peers, the similarity risk is introduced to measure the uncertainty of the similarity computation, while similarity is used to measure credibility. The experimental results show that METrust is effective and that it has advantages in inhibiting various malicious behaviors.

20.
This paper describes a technique for clustering homogeneously distributed data in a peer-to-peer environment such as a sensor network. The proposed technique is based on the principles of the K-Means algorithm. It works in a localized, asynchronous manner by communicating with the neighboring nodes. The paper offers an extensive theoretical analysis of the algorithm that bounds the error of the distributed clustering process relative to the centralized approach, which requires downloading all the observed data to a single site. Experimental results show that, in contrast to the case where all the data is transmitted to a central location for application of the conventional clustering algorithm, the communication cost of the proposed approach (an important consideration in sensor networks, which are typically equipped with limited battery power) is significantly smaller. At the same time, the accuracy of the obtained centroids is high and the number of incorrectly labeled samples is small.
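
One way a K-Means step can run on neighbour communication alone is sketched below. This is an illustration under stated assumptions, not the algorithm analyzed in the paper: data are 1-D, each node shares only per-cluster sums and counts, and the names `local_stats` and `merge_and_update` are hypothetical.

```python
def local_stats(points, centroids):
    """Assign a node's 1-D points to the nearest centroid.

    Returns per-cluster (sums, counts); only these small summaries,
    never the raw points, need to be sent to neighbours.
    """
    sums = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for p in points:
        k = min(range(len(centroids)), key=lambda c: abs(p - centroids[c]))
        sums[k] += p
        counts[k] += 1
    return sums, counts


def merge_and_update(stats_list, centroids):
    """Combine the summaries of a node and its neighbours into new centroids."""
    updated = []
    for k, old in enumerate(centroids):
        total = sum(sums[k] for sums, _ in stats_list)
        n = sum(counts[k] for _, counts in stats_list)
        updated.append(total / n if n else old)  # keep old centroid if empty
    return updated


# Two neighbouring nodes with hypothetical readings and shared start centroids.
centroids = [0.0, 8.0]
node_a = local_stats([1.0, 2.0, 9.0], centroids)
node_b = local_stats([3.0, 10.0, 11.0], centroids)
new_centroids = merge_and_update([node_a, node_b], centroids)
```

Because only `k` sums and `k` counts cross the network per exchange, the message size is independent of how many points each node holds, which is the kind of saving the abstract attributes to avoiding central collection.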
