Similar Literature
20 similar documents found.
1.
Small-world (SW) networks possess two properties, namely low diameter and high clustering coefficient, which are often desired by large-scale peer-to-peer networks. Prior studies have shown that the construction of an SW network can be based on a d-regular graph, where each node in the graph maintains d local neighbors and a small constant number of long-distance contacts. However, it is commonly understood that it is difficult to construct a short route in an SW network, given source (s) and target (t) nodes, even though an SW network guarantees that a short route from s to t exists. Prior work in [1] proposed a navigable SW network for a d-dimensional lattice such that a simple localized routing algorithm can route a message from s to t using O(log^2 X) hops, where X is the number of nodes in the network. In this paper, we present a novel navigable SW network based on a hierarchical model. Compared to previous efforts, the novelty of our study is that 1) our network construction based on a hierarchical model is decentralized, 2) routing a message between any two nodes in our SW network takes a logarithmic number of hops in expectation, 3) our SW network has a high clustering coefficient, and 4) the performance of our proposal is mathematically provable. We support these claims through rigorous performance analysis and extensive simulations.
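As an illustration of the kind of localized routing such navigable constructions enable, the sketch below builds a Kleinberg-style small-world overlay on a one-dimensional ring (local neighbours plus harmonically distributed long-range contacts) and routes greedily toward the target. It is a generic illustration of greedy small-world routing, not the hierarchical construction proposed in the paper; all function and parameter names (`build_small_world`, `long_links`, etc.) are assumptions made for this example.

```python
import random

def ring_dist(a, b, n):
    """Distance between node ids a and b on a ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)

def build_small_world(n, long_links=1, seed=0):
    """Each node keeps its two ring neighbours plus a few long-range contacts
    drawn with probability proportional to 1/distance (harmonic distribution)."""
    rng = random.Random(seed)
    nbrs = {u: {(u + 1) % n, (u - 1) % n} for u in range(n)}
    for u in range(n):
        weights = [1.0 / ring_dist(u, v, n) if v != u else 0.0 for v in range(n)]
        total = sum(weights)
        for _ in range(long_links):
            r, acc = rng.random() * total, 0.0
            for v, w in enumerate(weights):
                acc += w
                if acc >= r:
                    nbrs[u].add(v)
                    break
    return nbrs

def greedy_route(nbrs, s, t, n):
    """Localized routing: always forward to the neighbour closest to t."""
    hops, cur = 0, s
    while cur != t:
        cur = min(nbrs[cur], key=lambda v: ring_dist(v, t, n))
        hops += 1
    return hops

if __name__ == "__main__":
    n = 1024
    g = build_small_world(n)
    print(greedy_route(g, 0, n // 2, n))
```

Because each node always has a ring neighbour one step closer to the target, the greedy rule never gets stuck; the long-range contacts are what bring the expected hop count down to polylogarithmic.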

2.
Popular distributed graph processing frameworks, such as Pregel and GraphLab, are based on the vertex-centric computation model, where users write a customized Compute function for each vertex to process the data iteratively. Vertices are evenly partitioned among the compute nodes, and the granularity of parallelism of the graph algorithm is normally tuned by adjusting the number of compute nodes. The vertex-centric model splits the computation into phases. Inside one specific phase, the computation proceeds as an embarrassingly parallel process, because no communication between compute nodes occurs. By default, current graph engines handle only one iteration of the algorithm per phase. In this paper, however, we find that we can also tune the granularity of parallelism by aggregating the computation of multiple iterations into one phase, which has a significant impact on the performance of the graph algorithm. In the ideal case, if all computations are handled in one phase, the whole algorithm turns into an embarrassingly parallel algorithm and the benefit of parallelism is maximized. Based on this observation, we propose two approaches, a function-based approach and a parameter-based approach, to automatically transform a Pregel algorithm into a new one with tunable granularity of parallelism. We study the cost of such transformation and the trade-off between the granularity of parallelism and the performance, providing a new direction for tuning the performance of parallel algorithms. Finally, the approaches are implemented in our graph processing system, N2, and we illustrate their performance using popular graph algorithms.
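For readers unfamiliar with the vertex-centric model, the following minimal sketch shows a Pregel-style Compute function and a synchronous superstep loop, using single-source shortest paths as the example; by default each superstep performs one iteration, which is exactly the granularity the paper proposes to tune. The class and function names are illustrative, not the N2 system described above.

```python
from collections import defaultdict

class Vertex:
    def __init__(self, vid, edges):
        self.id, self.edges = vid, edges          # edges: {neighbour id: weight}
        self.value = float("inf")                 # current shortest distance

def compute(vertex, messages, out):
    """User-defined Compute(): relax the distance and notify neighbours.
    One call corresponds to one iteration of SSSP inside a superstep."""
    best = min(messages, default=vertex.value)
    if best < vertex.value:
        vertex.value = best
        for nbr, w in vertex.edges.items():
            out[nbr].append(best + w)

def run(vertices, source):
    inbox = defaultdict(list, {source: [0.0]})
    while inbox:                                  # one superstep per loop iteration
        out = defaultdict(list)
        for vid, msgs in inbox.items():
            compute(vertices[vid], msgs, out)
        inbox = out
    return {v.id: v.value for v in vertices.values()}

if __name__ == "__main__":
    g = {0: Vertex(0, {1: 1.0, 2: 4.0}),
         1: Vertex(1, {2: 1.0}),
         2: Vertex(2, {})}
    print(run(g, source=0))   # {0: 0.0, 1: 1.0, 2: 2.0}
```

Aggregating several such iterations into one phase, as the paper proposes, amounts to letting each compute node run the inner loop locally several times before the next global exchange.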

3.
To facilitate data mining and integration (DMI) processes in a generic way, we investigate a parallel pipeline streaming model. We model a DMI task as a streaming data-flow graph: a directed acyclic graph (DAG) of Processing Elements (PEs). The composition mechanism links PEs via data streams, which may be in memory, buffered via disks, or inter-computer data-flows. This makes it possible to build arbitrary DAGs with pipelining and both data and task parallelism, which provide room for performance enhancement. We have applied this approach to a real DMI case in the life sciences and implemented a prototype. To demonstrate the feasibility of the modelled DMI task and assess the efficiency of the prototype, we have also built a performance evaluation model. The experimental evaluation results show that a linear speedup has been achieved as the number of distributed computing nodes increases in this case study.
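A minimal sketch of the pipeline idea: Processing Elements connected by in-memory streams, each running in its own thread so that the stages overlap. It shows a linear pipeline only (the model above supports arbitrary DAGs with data and task parallelism); the names and the toy per-item functions are illustrative, not the authors' prototype.

```python
import threading, queue

SENTINEL = object()   # marks the end of the stream

def make_pe(func, inq, outq):
    """Wrap a per-item function into a Processing Element (PE) that streams
    items from inq to outq until the upstream PE signals completion."""
    def loop():
        while True:
            item = inq.get()
            if item is SENTINEL:
                outq.put(SENTINEL)
                break
            outq.put(func(item))
    return threading.Thread(target=loop)

if __name__ == "__main__":
    q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
    stages = [make_pe(lambda x: x * x, q0, q1),           # stand-in "mining" stage
              make_pe(lambda x: {"result": x}, q1, q2)]   # stand-in "integration" stage
    for s in stages:
        s.start()
    for i in range(5):
        q0.put(i)
    q0.put(SENTINEL)
    while True:
        out = q2.get()
        if out is SENTINEL:
            break
        print(out)
```

Replacing the in-memory queues with disk buffers or sockets gives the other two stream types mentioned in the abstract without changing the PE code.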

4.
Research and Implementation of a Data-Distributed Sort-First Parallel Graphics Rendering System
The sort-first architecture is commonly used to build high-performance parallel graphics rendering systems. Data-centralized sort-first systems based on immediate mode depend heavily on network bandwidth, so network bandwidth and primitive-assignment computation easily become system bottlenecks. This paper proposes a data-distributed parallel rendering system based on retained mode. Its principle is to distribute the geometry data across the rendering nodes and to exploit frame-to-frame coherence to dynamically adjust the data distribution on the rendering nodes as the viewpoint changes, which effectively reduces the transmission overhead of data distribution. The system uses a cell structure to control the granularity of parallelism. Experimental results show that high-resolution display and parallel speedup can be achieved with relatively low parallel overhead.

5.
A linear-dynamic network consists of a directed graph in which the nodes represent subsystems and the arcs model dynamic couplings. The local state of each subsystem evolves according to discrete linear dynamics that depend on the local state, local control signals, and control signals of upstream subsystems. Such networks appear in the model predictive control (MPC) of geographically distributed systems such as urban traffic networks and electric power grids. In this correspondence, we propose a decomposition of the quadratic MPC problem into a set of local subproblems that are solved iteratively by a network of agents. A distributed algorithm based on the method of feasible directions is developed for the agents to iterate toward a solution of the subproblems. The local iterations require relatively little effort to arrive at a solution, but at the expense of heavy communication among neighboring agents and a slower convergence rate.

6.
周纯英  曾诚  何鹏  张龑 《软件学报》2023,34(6):2509-2525
Researchers treat the key classes of a software system as the starting point for understanding and maintaining it, and defects in key classes pose great security risks to the system. Identifying key classes therefore improves software reliability and stability. A common identification approach abstracts the software system into a class dependency network and then computes an importance score for each node according to predefined metrics and calculation rules; key classes obtained from such non-trained frameworks do not make full use of the structural information of the software network. To address this problem, this paper proposes a supervised key class identification method based on graph neural networks. First, the software system is abstracted into a class-granularity software network, the network embedding method node2vec is used to obtain a representation vector for each class node, and a fully connected layer converts each node's representation vector into a concrete score. Then, an improved graph neural network model aggregates node scores while jointly considering the direction and weight of the dependencies between class nodes. Finally, the model outputs the final score of each class node and sorts the nodes in descending order to identify the key classes. Experiments on eight open-source Java software systems against baseline methods validate the effectiveness of the proposed approach. The results show that, among the top-10 candidate key classes, the proposed method improves recall by 6.4% and precision by 3.5% over the state-of-the-art methods.
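To make the aggregation step concrete, the sketch below shows a heavily simplified, untrained score-aggregation pass over a weighted, directed class dependency network: each class keeps part of its own score and collects score from the classes that depend on it, and candidates are ranked by the final score. The paper's actual method learns node2vec embeddings and a trained graph neural network; this toy version, including the matrix layout and all names, is only an illustration of direction- and weight-aware aggregation.

```python
import numpy as np

def aggregate_scores(adj, scores, alpha=0.5):
    """One direction- and weight-aware aggregation step (an untrained
    simplification, not the paper's trained GNN). adj[i, j] > 0 means class i
    depends on class j with weight adj[i, j]; each class blends its own score
    with the weighted scores of the classes that depend on it."""
    incoming = adj.T @ scores                 # weighted sum over dependents, respecting edge direction
    peak = incoming.max()
    if peak > 0:
        incoming = incoming / peak            # keep values bounded across rounds
    return alpha * scores + (1 - alpha) * incoming

def rank_key_classes(adj, init_scores, rounds=3, top_k=10):
    """Iterate the aggregation a few times and return the top-k candidates."""
    s = init_scores.astype(float)
    for _ in range(rounds):
        s = aggregate_scores(adj, s)
    return np.argsort(-s)[:top_k]

if __name__ == "__main__":
    # toy 4-class dependency network; classes 0, 1 and 3 all depend on class 2
    adj = np.array([[0, 0, 1, 0],
                    [0, 0, 2, 0],
                    [0, 0, 0, 1],
                    [0, 0, 3, 0]], dtype=float)
    print(rank_key_classes(adj, np.ones(4), top_k=4))   # class 2 ranks first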

7.
This paper describes a biased random-key genetic algorithm for a real-world wireless backhaul network design problem. This is a novel problem, closely related to variants of the Steiner tree problem and the facility location problem. Given a parameter h, we want to build a forest where each tree has at most h hops from the demand nodes, where traffic originates, to the root nodes where each tree is rooted. Candidate Steiner nodes do not have any demand but represent locations where we can install cellsites to cover the traffic and equipment to backhaul the traffic to the cellular core network. Each Steiner node can cover demand nodes within a given distance, subject to a capacity constraint. The aggregate set of constraints may make it impossible to cover or backhaul all demands. A revenue function computes the revenue associated with the total amount of traffic covered and backhauled to the root nodes. The objective of the problem is to build a forest that maximizes the difference between the total revenue and the cost associated with the installed equipment. Although we will have a forest when we consider only the backhaul links and root nodes, the addition of demand vertices can induce undirected cycles, resulting in a directed acyclic graph. We consider instances of this problem with several additional constraints that are motivated by the requirements of real-world telecommunication networks.

8.
In this paper we propose a parallel coordinate descent algorithm for solving smooth convex optimization problems with separable constraints that may arise, e.g., in distributed model predictive control (MPC) for linear network systems. Our algorithm is based on block coordinate descent updates performed in parallel and has a very simple iteration. We prove a (sub)linear rate of convergence for the new algorithm under standard assumptions for smooth convex optimization. Further, our algorithm uses local information and thus is suitable for distributed implementations. Moreover, it has low iteration complexity, which makes it appropriate for embedded control. An MPC scheme based on this new parallel algorithm is derived, for which every subsystem in the network can compute feasible and stabilizing control inputs using distributed and cheap computations. To ensure stability of the MPC scheme, we use a terminal cost formulation derived from a distributed synthesis. Preliminary numerical tests show better performance for our optimization algorithm than for other existing methods.
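A minimal sketch of the idea, assuming a quadratic objective with simple box (separable) constraints: every block computes a projected-gradient step from the same iterate in parallel, and the steps are combined with 1/N damping, a standard safeguard that keeps a Jacobi-style parallel update convergent for smooth convex problems. This is a generic illustration, not the paper's algorithm or its MPC scheme; all names and the damping choice are assumptions.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_bcd(Q, c, lower, upper, blocks, iters=200):
    """Jacobi-style parallel block coordinate descent for
        min 0.5*x'Qx + c'x   subject to  lower <= x <= upper
    (the box constraint is separable across blocks)."""
    n_blocks = len(blocks)
    x = np.clip(np.zeros(len(c)), lower, upper)

    def block_step(x_k, blk):
        g = Q[blk] @ x_k + c[blk]                                 # gradient restricted to the block
        L = max(np.linalg.norm(Q[np.ix_(blk, blk)], 2), 1e-12)    # block Lipschitz constant
        y = np.clip(x_k[blk] - g / L, lower[blk], upper[blk])     # projected step for this block alone
        return blk, y - x_k[blk]

    with ThreadPoolExecutor(max_workers=n_blocks) as pool:
        for _ in range(iters):
            steps = list(pool.map(lambda b: block_step(x, b), blocks))
            for blk, delta in steps:
                x[blk] += delta / n_blocks                        # damped combination keeps the parallel update stable
    return x

if __name__ == "__main__":
    Q = np.array([[2.0, 0.5], [0.5, 1.0]])
    c = np.array([-1.0, -1.0])
    bounds_lo, bounds_hi = np.zeros(2), np.ones(2)
    blocks = [np.array([0]), np.array([1])]
    print(parallel_bcd(Q, c, bounds_lo, bounds_hi, blocks))       # approx [0.286, 0.857]
```

Since the new iterate is a convex combination of per-block projected steps, it stays feasible and the objective decreases monotonically, which is the intuition behind the convergence guarantee.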

9.
Network-based concurrent computing and interactive data visualization are two important components in industrial applications of high-performance computing and communication. We propose an execution framework to build interactive remote visualization systems for real-world applications on heterogeneous parallel and distributed computers. Using the dataflow model of the commercial visualization software AVS in three case studies, we demonstrate a simple, effective, and modular approach to couple parallel simulation modules into an interactive remote visualization environment. The applications described in this paper are drawn from our industrial projects in financial modeling, computational electromagnetics, and computational chemistry.

10.
Wei  Xing  Hu  Huiqi  Duan  Huichao  Qian  Weining  Zhou  Aoying 《World Wide Web》2019,22(6):2561-2587

To support large-scale analytics for Web applications, the backend distributed data management system must provide the service for accessing massive data, so the scan operation becomes a critical step. To improve the performance of the scan operation, modern data management systems usually rely on simple partitioned parallelism: tables consist of several partitions, and each scan operation can access multiple partitions separately. This is a simple and effective solution for a single scan operation. In this paper, we consider managing multiple scan operations together, where the situation is no longer straightforward. To address the problem, we propose a parallel strategy that schedules batched scan operations together, going beyond simple partitioned parallelism. First, we utilize replication to increase parallelism and propose an effective load balancing strategy over replication nodes based on linear programming. Second, we propose an effective chunk-based scheduling algorithm for multi-threaded parallelism on each node, which guarantees that all threads have even workloads under a suitable cost model. Finally, we integrate our parallel scan strategy into an open-source distributed data management system. Experimental evaluation shows that our parallel scan strategy significantly improves the performance of the scan operation.
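A minimal sketch of chunk-based scheduling for multi-threaded scans: each partition is cut into fixed-size chunks that go into a shared queue, and worker threads pull chunks until the queue drains, which evens out per-thread workloads even when partitions are skewed. The chunk size, the stand-in predicate, and the function names are illustrative, not the cost model or the system described in the paper.

```python
import threading, queue

def chunked_scan(partitions, n_threads=4, chunk_rows=1000):
    """Split every partition into fixed-size chunks and let worker threads
    pull chunks from a shared queue, so the threads finish with roughly even
    workloads even when the partitions have very different sizes."""
    chunks = queue.Queue()
    for part in partitions:                      # each partition is a list of rows
        for lo in range(0, len(part), chunk_rows):
            chunks.put(part[lo:lo + chunk_rows])

    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                chunk = chunks.get_nowait()
            except queue.Empty:
                return
            matched = [row for row in chunk if row % 7 == 0]   # stand-in scan predicate
            with lock:
                results.extend(matched)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads: t.start()
    for t in threads: t.join()
    return results

if __name__ == "__main__":
    parts = [list(range(0, 10_000)), list(range(10_000, 12_000))]  # skewed partition sizes
    print(len(chunked_scan(parts)))
```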


11.
In this study, a two‐node‐connected star problem (2NCSP) is introduced. We are given a simple graph and internal and external costs for each link of the graph. The goal is to find the minimum‐cost spanning subgraph whose core is two‐node‐connected and whose remaining external nodes are connected to the core. First, we show that the 2NCSP belongs to the class of NP‐hard computational problems. Therefore, a greedy randomized adaptive search procedure (GRASP) heuristic is developed, enriched with a variable neighborhood descent (VND). The neighborhood structures include exact integer linear programming models to find the best paths and two‐node‐connected replacements, as well as a shaking operation to avoid being trapped in a local minimum. The ring star problem (RSP) is a relevant model in network optimization in which the core is a ring instead of an arbitrary two‐node‐connected graph. We contrast our GRASP/VND methodology with a previous reference work on the RSP in order to highlight the effectiveness of our heuristic. The heuristic is competitive, producing the best results reported so far for several instances. A discussion of the results and directions for future work is provided.

12.
The problem of load balancing when executing parallel programs on computational systems with distributed memory is currently of great interest. The most general statement of this problem is that of a single parallel loop: execution of a heterogeneous loop on a heterogeneous computational system. When stated in this way, the problem is NP-complete even in the case of two nodes, and no acceptable heuristics for solving it have been found. Since the development of heuristics is a rather complicated task, we decided to examine the problem by elementary methods in order to refine (and, possibly, simplify) the original problem statement. The results of our studies are discussed in this paper. Estimates of the efficiency of parallel loop execution as functions of the number of nodes of homogeneous and heterogeneous parallel computational systems are obtained. These estimates show that the use of heterogeneous parallel systems reduces the efficiency even in the case when their communication subsystems are scalable (see the definition in Section 4). The use of local networks (heterogeneous parallel computational systems with non-scalable communication subsystems) for parallel computations with heavy data exchange is not advantageous and is possible only for a small number of nodes (about five). An algorithm for the optimal distribution of data between the nodes of a homogeneous or heterogeneous computational system is suggested. Results of numerical experiments substantiate the conclusions obtained.
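As a concrete (and deliberately simple) illustration of distributing a parallel loop over heterogeneous nodes, the sketch below splits iterations in proportion to relative node speeds and hands the rounding remainder to the nodes with the largest fractional shares. It is not the authors' optimal distribution algorithm, which also accounts for communication behavior; the function and parameter names are assumptions.

```python
def distribute_iterations(n_iters, speeds):
    """Split n_iters loop iterations across heterogeneous nodes in proportion
    to their relative speeds, so each iteration is assigned exactly once."""
    total = sum(speeds)
    shares = [n_iters * s / total for s in speeds]
    counts = [int(sh) for sh in shares]
    remainder = n_iters - sum(counts)
    # give leftover iterations to the nodes with the largest fractional parts
    order = sorted(range(len(speeds)), key=lambda i: shares[i] - counts[i], reverse=True)
    for i in order[:remainder]:
        counts[i] += 1
    # turn counts into contiguous [start, end) iteration ranges per node
    ranges, start = [], 0
    for c in counts:
        ranges.append((start, start + c))
        start += c
    return ranges

if __name__ == "__main__":
    # 3 nodes, the first twice as fast as the others
    print(distribute_iterations(1000, speeds=[2.0, 1.0, 1.0]))
    # -> [(0, 500), (500, 750), (750, 1000)]
```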

13.
An efficient distributed algorithm for constructing small dominating sets
The dominating set problem asks for a small subset D of nodes in a graph such that every node is either in D or adjacent to a node in D. This problem arises in a number of distributed network applications, where it is important to locate a small number of centers in the network such that every node is near at least one center. Finding a dominating set of minimum size is NP-complete, and the best known approximation is logarithmic in the maximum degree of the graph and is provided by the same simple greedy approach that gives the well-known logarithmic approximation result for the closely related set cover problem. We describe and analyze new randomized distributed algorithms for the dominating set problem that run in polylogarithmic time, independent of the diameter of the network, and that return a dominating set of size within a logarithmic factor from optimal, with high probability. In particular, our best algorithm runs in a polylogarithmic number of rounds (in the number of nodes n and in one plus the maximum degree of any node) with high probability, where each round involves a constant number of message exchanges between any two neighbors; the size of the dominating set obtained is within a logarithmic factor of the optimal, both in expectation and with high probability. We also describe generalizations to the weighted case and the case of multiple covering requirements. Received: January 2002 / Accepted: August 2002. Supported by NSF CAREER award NSF CCR-9983901.
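For reference, the "simple greedy approach" mentioned above looks like the following sequential sketch: repeatedly pick the node that covers the most not-yet-dominated nodes. The paper's contribution is a randomized distributed algorithm with polylogarithmic round complexity; this sketch only illustrates the classic centralized greedy baseline, with illustrative names.

```python
def greedy_dominating_set(adj):
    """Classic greedy approximation: repeatedly pick the node that dominates
    (covers) the most not-yet-covered nodes. adj maps each node to the set of
    its neighbours."""
    uncovered = set(adj)
    dominators = set()
    while uncovered:
        # coverage of v = v itself plus its neighbours, restricted to uncovered nodes
        best = max(adj, key=lambda v: len((adj[v] | {v}) & uncovered))
        dominators.add(best)
        uncovered -= adj[best] | {best}
    return dominators

if __name__ == "__main__":
    star_plus_edge = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}
    print(greedy_dominating_set(star_plus_edge))   # e.g. {0, 3}
```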

14.
Peer-to-peer live media streaming over the Internet is becoming increasingly popular, though it is still a challenging problem. Nodes should receive the stream subject to intrinsic timing constraints, while the overlay should adapt to changes in the network and the nodes should be incentivized to contribute their resources. In this work, we meet these contradictory requirements simultaneously by introducing a distributed market model to build an efficient overlay for live media streaming. Using our market model, we construct two different overlay topologies, tree-based and mesh-based, which are the two dominant approaches to media distribution. First, we build an approximately minimal-height multiple-tree data dissemination overlay, called Sepidar. Next, we extend our model, in GLive, to make it more robust in dynamic networks by replacing the tree structure with a mesh. We show in simulation that the mesh-based overlay outperforms the multiple-tree overlay. We compare the performance of our two systems with the state-of-the-art NewCoolstreaming and observe that they provide better playback continuity and lower playback latency under a variety of experimental scenarios. Although our distributed market model can be run against a random sample of nodes, we improve its convergence time by executing it against a sample of nodes taken from the Gradient overlay. The evaluations show that the streaming overlays converge faster when our market model works on top of the Gradient overlay.

15.
We demonstrate the feasibility of a distributed implementation of the Goldberg-Tarjan algorithm for finding the maximum flow in a network. Unlike other parallel implementations of this algorithm, where the network graph is partitioned among many processors, we partition the algorithm among processors arranged in a pipeline. The network graph data are distributed among the processors according to local requirements. The partitioned algorithm is implemented on six processors within a 15-processor pipelined message-passing multicomputer operating at 5 MHz. We used randomly generated networks with integer capacities as examples. Performance estimates based upon the six-processor pipelined implementation indicated a speedup between 3.8 and 5.9 over a single processor.
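For context, a compact sequential version of the Goldberg-Tarjan push-relabel algorithm is sketched below; the paper's contribution is partitioning exactly this kind of computation across a processor pipeline rather than partitioning the graph. The adjacency-matrix representation and the simple active-node list are illustrative choices, not the authors' implementation.

```python
def push_relabel_max_flow(capacity, s, t):
    """Sequential push-relabel: push excess flow along admissible residual
    edges (height difference of exactly one) and relabel a node when no
    admissible push is possible. capacity[u][v] is the edge capacity (0 if absent)."""
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]
    excess, height = [0] * n, [0] * n
    height[s] = n
    for v in range(n):                      # saturate all edges leaving the source
        flow[s][v] = capacity[s][v]
        flow[v][s] = -capacity[s][v]
        excess[v] = capacity[s][v]

    def residual(u, v):
        return capacity[u][v] - flow[u][v]

    active = [u for u in range(n) if u not in (s, t) and excess[u] > 0]
    while active:
        u, pushed = active[0], False
        for v in range(n):
            if residual(u, v) > 0 and height[u] == height[v] + 1:
                delta = min(excess[u], residual(u, v))
                flow[u][v] += delta
                flow[v][u] -= delta
                excess[u] -= delta
                excess[v] += delta
                if v not in (s, t) and v not in active:
                    active.append(v)
                pushed = True
                if excess[u] == 0:
                    break
        if not pushed:                       # relabel: lift u just above its lowest residual neighbour
            height[u] = 1 + min(height[v] for v in range(n) if residual(u, v) > 0)
        if excess[u] == 0:
            active.pop(0)
    return sum(flow[s])                      # net flow out of the source

if __name__ == "__main__":
    cap = [[0, 3, 2, 0],
           [0, 0, 1, 2],
           [0, 0, 0, 2],
           [0, 0, 0, 0]]
    print(push_relabel_max_flow(cap, 0, 3))  # 4
```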

16.
On Migrating Threads
Based on the notion of persistent threads in Tycoon (Matthes and Schmidt, 1994), we investigate thread migration as a programming construct for building activity-oriented distributed applications. We first show how a straightforward extension of a higher-order persistent language can be used to define activities that span multiple (semi-)autonomous nodes in heterogeneous networks. In particular, we discuss the intricate binding issues that arise in systems where threads are first-class language citizens that may access local and remote, mobile and immobile resources. We also describe how our system model can be understood as a promising generalization of the more static architecture of first-order and higher-order distributed object systems. Finally, we give some insight into the implementation of persistent and migrating threads, and we explain how to represent bindings to ubiquitous resources present at each node visited by a migrating thread so as to avoid excessive communication or storage costs.

17.
The bulk synchronous parallel (BSP) model is very user-friendly for coding and debugging parallel graph algorithms. However, existing BSP-based distributed graph-processing frameworks, such as Pregel, GPS and Giraph, routinely suffer from high communication costs, which mainly stem from the fine-grained message-passing communication model. In order to address this problem, we propose a new computation model with low communication costs, called LCC-BSP. We use this model to design and implement a high-performance distributed graph-processing framework called LCC-Graph. This framework eliminates the high communication costs of existing distributed graph-processing frameworks. Moreover, LCC-Graph balances the computation workloads among all compute nodes by optimizing graph partitioning, significantly reducing the computation time for each superstep. Evaluation of LCC-Graph on a 32-node cluster, driven by real-world graph datasets, shows that it significantly outperforms existing distributed graph-processing frameworks in terms of runtime, particularly when the system is supported by a high-bandwidth network. For example, LCC-Graph achieves an order of magnitude performance improvement over GPS and GraphLab.

18.
Processing and analyzing large graph data has become a hot research topic in recent years. Distributed graph computing is currently the mainstream technology for processing large graphs, but it suffers from several unavoidable problems; for example, load balancing of the distributed computation and debugging and tuning of the distributed implementation remain very difficult. On the other hand, recent studies have shown that, with well-designed data structures and processing models, large-graph computation on a single PC backed by a large-capacity disk can often achieve processing performance comparable to distributed graph computing. Reference [14] shows that the processing performance of GraphChi on a single machine is close to that of Spark on 50 nodes. Combining accumulative iterative computation with single-machine parallel processing, this paper proposes ASP, a streaming asynchronous computation model. It achieves fully sequential disk access and allows structural data to be loaded sequentially in a streaming fashion while asynchronous update computation proceeds. Based on the ASP model, we propose S-Maiter, a streaming asynchronous graph processing framework that realizes efficient out-of-core large-graph processing on a single machine; optimization techniques such as I/O thread optimization, memory resource monitoring, and shard-level priority scheduling greatly improve the system's performance on large graph data. Experimental results show that, when processing a large graph (13 million vertices, 500 million edges), S-Maiter, which needs the computing resources of only one PC, performs almost as well as distributed Maiter running on 16 PCs, and S-Maiter is 1.5 times faster than GraphChi, another popular single-machine large-graph processing system.
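A minimal sketch of accumulative (delta-based) iterative computation in the spirit of Maiter, using PageRank as the example: each vertex keeps a rank plus a pending delta, and deltas can be applied and propagated in any order, which is what makes asynchronous, streaming execution possible. This toy in-memory version is not S-Maiter itself; the damping factor, threshold, and names are illustrative.

```python
def accumulative_pagerank(out_links, damping=0.85, eps=1e-10):
    """Delta-based accumulative iteration: for graphs without dangling
    vertices the ranks converge to the PageRank fixed point
        r(v) = (1 - damping) + damping * sum_{u -> v} r(u) / outdeg(u)."""
    ranks = {v: 0.0 for v in out_links}
    delta = {v: 1.0 - damping for v in out_links}      # initial "mass" at every vertex
    pending = set(out_links)
    while pending:
        v = pending.pop()                              # any order works: the update is accumulative
        d, delta[v] = delta[v], 0.0
        ranks[v] += d                                  # apply the pending delta to the rank
        if out_links[v] and d > eps:                   # propagate only non-negligible deltas
            share = damping * d / len(out_links[v])
            for w in out_links[v]:
                delta[w] += share
                pending.add(w)
    return ranks

if __name__ == "__main__":
    g = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    pr = accumulative_pagerank(g)
    print({v: round(r, 4) for v, r in pr.items()})
```

Because each vertex only accumulates deltas, updates can be processed out of order and streamed from disk shard by shard, which is the property the ASP model exploits.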

19.
丁尚  童鑫  陈艳  叶保留 《软件学报》2017,28(8):1940-1951
To guarantee reliability, distributed storage systems adopt storage redundancy strategies such as replication and erasure coding. Compared with replication, erasure codes have the advantage of lower storage overhead, but node repair incurs high network overhead. To optimize the repair network overhead, regenerating codes and locally repairable codes, represented by simple regenerating codes, have been proposed, significantly reducing repair network traffic. However, most existing coding-based fault-tolerant distributed storage schemes assume that nodes reside in a star-shaped logical network and ignore the actual physical network topology and bandwidth information. To achieve topology-aware fault-tolerant storage optimization, related work has incorporated link bandwidth into the repair process of erasure codes and regenerating codes and built tree-shaped repair paths, further improving repair efficiency. However, because of differences in the encoding and repair processes, these approaches are not suitable for repairing simple regenerating codes. To address this problem, this paper incorporates the actual physical network topology and link bandwidth into the repair process of simple regenerating codes and studies bandwidth-aware repair optimization techniques for them. The paper establishes a bandwidth-aware node repair latency model, proposes parallel repair tree construction algorithms based on optimal bottleneck paths and optimal repair trees, and evaluates the proposed algorithms experimentally. The results show that, compared with star-shaped repair, the proposed algorithms effectively reduce node repair latency and improve repair efficiency.
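The optimal-bottleneck-path building block can be sketched as a widest-path variant of Dijkstra's algorithm: among all paths between two storage nodes, find the one whose smallest link bandwidth is largest. The paper goes further and assembles such paths into parallel repair trees; the graph encoding and names below are illustrative assumptions.

```python
import heapq

def widest_path_bandwidth(links, src, dst):
    """Maximum-bottleneck ("widest") path: a modified Dijkstra that maximizes
    the minimum link bandwidth along the path from src to dst."""
    best = {src: float("inf")}
    heap = [(-float("inf"), src)]          # max-heap on achievable bottleneck bandwidth
    while heap:
        neg_bw, u = heapq.heappop(heap)
        bw = -neg_bw
        if u == dst:
            return bw
        if bw < best.get(u, 0):
            continue                       # stale heap entry
        for v, link_bw in links.get(u, []):
            cand = min(bw, link_bw)        # bottleneck if we extend the path over (u, v)
            if cand > best.get(v, 0):
                best[v] = cand
                heapq.heappush(heap, (-cand, v))
    return 0.0

if __name__ == "__main__":
    # link bandwidths (e.g. MB/s) between storage nodes
    links = {"n1": [("n2", 100), ("n3", 40)],
             "n2": [("n4", 30)],
             "n3": [("n4", 80)]}
    print(widest_path_bandwidth(links, "n1", "n4"))   # 40, via n1 -> n3 -> n4
```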

20.
Multicasting is an important issue for numerous applications in parallel and distributed computing. In multicasting, the same message is delivered from a source node to an arbitrary number of destination nodes. The star graph interconnection network has been recognized as an attractive alternative to the popular hypercube network. In this paper, we propose an efficient and deadlock-free tree-based multicast routing scheme for wormhole-routed star graph networks based on a Hamiltonian path. In our proposed routing scheme, the router uses an input-buffer-based asynchronous replication mechanism, which requires extra hardware and allows the router to send incoming flits on more than one outgoing channel simultaneously. We perform simulation experiments measuring network latency and network traffic. Experimental results show that the proposed scheme reduces multicast latency more efficiently than other schemes.
