首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
The AQuA architecture provides adaptive fault tolerance to CORBA applications by replicating objects and providing a high-level method that an application can use to specify its desired level of dependability. This paper presents the algorithms that AQUA uses, when an application's dependability requirements can change at runtime, to tolerate both value faults in applications and crash failures simultaneously. In particular, we provide an active replication communication scheme that maintains data consistency among replicas, detects crash failures, collates the messages generated by replicated objects, and delivers the result of each vote. We also present an adaptive majority voting algorithm that enables the correct ongoing vote while both the number of replicas and the majority size dynamically change. Together, these two algorithms form the basis of the mechanism for tolerating and recovering from value faults and crash failures in AQuA  相似文献   

2.
Synchronous atomic broadcast for redundant broadcast channels   总被引:4,自引:3,他引:1  
We propose a synchronous atomic broadcast protocol for distributed real-time systems based on redundant broadcast channels. The protocol can tolerate a finite number f of concurrent processor crash failures, channel adapter performance failures and channel omission failures. Its message cost is optimal: when no failures occur only f+1 messages are sent per broadcast. The cost implications of providing tolerance to other failure classes are also investigated.  相似文献   

3.
Self-stabilizing depth-first token circulation in arbitrary rooted networks   总被引:1,自引:0,他引:1  
Abstract. We present a deterministic distributed depth-first token passing protocol on a rooted network. This protocol uses neither the processor identifiers nor the size of the network, but assumes the existence of a distinguished processor, called the root of the network. The protocol is self-stabilizing, meaning that starting from an arbitrary state (in response to an arbitrary perturbation modifying the memory state), it is guaranteed to reach a state with no more than one token in the network. Our protocol implements a strictly fair token circulation scheme. The proposed protocol has extremely small state requirement – only states per processor, i.e., bits per processor, where is the degree of the network. The protocol can be used to implement a strictly fair distributed mutual exclusion in any rooted network. This protocol can also be used to construct a DFS spanning tree. Received: July 1998 / Accepted: April 2000  相似文献   

4.
基于遗传算法的实时组播通信路由算法   总被引:8,自引:0,他引:8  
陈明  李志杰 《软件学报》2001,12(5):721-728
组播通信路由技术是视频广播、计算机会议、CSCW()等新型分布式计算的关键技术.提出了基于分布式遗传算法的共享树组播路由算法,包括包交换的网络组播树的建立、组播树的动态维护和计算满足特定时延和时延抖动限制的近似斯坦利最小树算法等.利用它可以实现在给定网络和组播需求的情况下,在组成员间寻找动态的组播树,并使该树覆盖所有的成员,并约束网络费用达到最小.进而解决树状路由的建立以及树状路由的动态维护等问题.  相似文献   

5.
A ubiquitous information environment can be achieved by the mobile computing technologies. In this environment, users carrying their portable computers can retrieve local or remote information anywhere and at anytime. Data broadcast, with its advantages, has become a powerful means to disseminate data in wireless communications. Indexing methods for the broadcast data have been proposed to speedup access time and reduce power consumption. However, the influence of access failures has not been discussed. For the error-prone mobile environment, the occurrence of access failures is often due to disconnections, handoffs, and communication noises. Based on the distributed indexing scheme, we propose an adaptive access method which tolerates the access failures. The basic idea is to use index replication to recover from the access failures. One mechanism, named search range, is provided to dynamically record the range where the desired data item may exist. According to the search range, an unfinished search can be efficiently resumed by finding an available index replicate. A performance analysis is given to show the benefits of the method. Also, the concept of version bits is applied to deal with the updates of the broadcast data  相似文献   

6.
Atomic broadcast is a fundamental problem of distributed systems: It states that messages must be delivered in the same order to their destination processes. This paper describes a solution to this problem in asynchronous distributed systems in which processes can crash and recover. A consensus-based solution to atomic broadcast problem has been designed by Chandra and Toueg for asynchronous distributed systems where crashed processes do not recover. We extend this approach: it transforms any consensus protocol suited to the crash-recovery model into an atomic broadcast protocol suited to the same model. We show that atomic broadcast can be implemented requiring few additional log operations in excess of those required by the consensus. The paper also discusses how additional log operations can improve the protocol in terms of faster recovery and better throughput. To illustrate the use of the protocol, the paper also describes a solution to the replica management problem in asynchronous distributed systems in which processes can crash and recover. The proposed technique makes a bridge between established results on weighted voting and recent results on the consensus problem.  相似文献   

7.
High performance computing requires high quality load distribution of processes of a parallel application over processors in a parallel computer at runtime such that both the maximum load and dilation are minimized. The performance of a simple randomized tree embedding algorithm that dynamically supports tree-structured parallel computations on arbitrary static networks is analyzed in this paper. The algorithm spreads newly created tree nodes to neighboring processors, which actually provides randomized dilation-1 tree embedding in static networks. We develop a linear system of equations that characterizes expected loads on all processors under the reproduction tree model, which can generate trees of arbitrary size and shape. It is shown that as the tree size becomes large, the asymptotic performance ratio of the randomized tree embedding algorithm is the ratio of the maximum processor degree to the average processor degree. This implies that the simple randomized tree embedding algorithm is able to generate high quality load distributions on virtually all static networks commonly employed in parallel and distributed computing.  相似文献   

8.
This paper introduces a mobility-aware medium access control protocol for multi-hop wireless sensor networks (MA-MAC). The protocol evaluates the RSSI values of acknowledgement packets and determines whether a gradual deterioration in the RSSI values eventually leads to a disconnection. If it does, it initiates a handover by switching transmission from a unicast to a broadcast mode and by embedding neighbour discovery requests in the broadcast data packets. While the mobile node continues transmitting data packets via the existing link, the neighbour discovery requests enable it to discover new nodes that can serve as intermediate nodes. Once an intermediate node is found, the mobile node establishes a link with it and switches transmission back to unicast. Conceptually, MA-MAC's handover feature can be implemented by extending any of the existing transmitter initiated, energy-efficient protocols such as XMAC or BMAC. Our present implementation is based on the XMAC protocol. The paper reports how the protocol performs as the speed of mobility, handover threshold, and sending interval vary.  相似文献   

9.
Auction mechanisms are nowadays widely used in electronic commerce Web sites for buying and selling items among different users. The increasing importance of auction protocols in the negotiation phase is not limited to online marketplaces. In fact, the wide applicability of auctions as resource‐allocation and negotiation mechanisms have also led to a great deal of interest in auctions within the agent community. A challenging issue for agents operating in open Multiagent Systems (such as the emerging semantic Web infrastructure) concerns the specification of declarative communication rules which could be published and shared allowing agents to dynamically engage well‐known and trusted negotiation protocols. To cope with real‐world applications, these rules should also specify fault tolerant patterns of interaction, enabling negotiating agents to interact with each other tolerating failures, for instance terminating an auction process even if some bidding agents dynamically crash. In this paper, we propose an approach to specify fault tolerant auction protocols in open and dynamic environments by means of communication rules dealing with crash failures of agents. We illustrate these concepts considering a case study about the specification of an English Auction protocol which tolerate crashes of bidding agents and we discuss its properties.  相似文献   

10.
为了解决含分布式电源的配电网的故障定位问题,提出一种基于差分进化的改进细菌觅食算法进行配电网故障定位,首先针对分布式电源投切问题,构建能够动态适应多个分布式电源投切的开关函数,然后结合区域划分思想,通过将各个配网支路分为有源树和无源树进行故障信息筛选,降低解空间,提高故障定位的速度;同时针对细菌觅食算法精度不高和全局搜索能力较差的问题,借鉴差分进化中的变异和交叉机制,通过多样性控制与交叉操作协调来实现细菌觅食算法在细化搜索与扩展新区之间的协调,提高算法的寻优精度和全局寻优能力,适用于复杂的含分布式电源的配电网络。通过算例对该故障定位方法进行仿真,结果表明该算法能准确定位,并具有一定的有效性和容错性。  相似文献   

11.
This paper considers a system of autonomous mobile robots that can move freely in a two-dimensional plane, and where a number of them can fail by crashing. The crash of a robot can be either permanent or temporary, that is, after its crash the robot either executes no action or it recovers from its failure. These robots repeatedly go through a succession of activation cycles during which they observe the environment, compute a destination and move. In particular, we assume weak robots, in the sense that robots cannot communicate explicitly between each other. Also, they cannot remember their past computations (i.e., they are oblivious). Finally, robots do not agree on a common coordinate system.In this paper, we address the fault-tolerant flocking problem under the crash-recovery model. That is, starting from any initial configuration, a group of non-faulty robots are required to form a desired pattern, and move together while following a robot leader moving on some trajectory, and keeping such a pattern in movement. Specifically, we propose a fault-tolerant flocking algorithm in the semi-synchronous model that allows correct robots to dynamically form a regular polygon in finite time, and maintain it in movement infinitely often. Our algorithm relies on the existence of two devices, namely an eventually perfect failure detector device to ensure failure detection, and also an eventual leader device to handle leader election. The algorithm tolerates permanent crash failures, and also crash-recovery failures of robots due to its oblivious feature. The proposed algorithm ensures the necessary restrictions on the movement of robots in order to avoid collisions between them. In addition, it is robust with respect to changes in the environment.  相似文献   

12.
广播通信广泛应用于分布式应用或并行计算环境,文章充分利用交换式以太网路由所使用的生成树协议,提出了一种新的基于交换式以太网的可靠顺序广播协议。  相似文献   

13.
在实时的非对称通讯环境下,自适应混合广播策略能够根据数据的请求模式以及时间限制等特征来动态分配周期广播和按需广播的带宽比例.将这一策略推广到基于事务的多数据项广播调度中,同时引入"分布式广播"思想,以更精细的周期广播粒度来解决过长的周期广播与事务及数据的实时要求之间的矛盾,并且动态分配时间槽.实验结果表明:改进的调度策略更适合于实时环境下的数据广播,具有更低的事务失败率以及更小的上行信道负荷.  相似文献   

14.
As group applications are becoming widespread, efficient network utilization becomes a growing concern. Multicast transmission represents a necessary lower network service for the wide diffusion of new multimedia network applications. Multicast transmission may use network resources more efficiently than multiple point-to-point messages; however, creating optimal multicast trees (Steiner Tree Problem in networks) is prohibitively expensive. This paper proposes a distributed algorithm for the heuristic solution of the Steiner Tree Problem, allowing the construction of effective distribution trees using a coordination protocol among the network nodes. Furthermore, we propose a novel distributed technique for dynamically updating the multicast tree. The approach proposed has been implemented and extensively tested both in simulation, and on experimental networks. Performance evaluation indicates that the distributed algorithm performs as well as the centralized version, providing good levels of convergence time and communication complexity.  相似文献   

15.
广播操作是无线网络中一种常用的、重要的操作,通常采用泛洪来实现.无控制的泛洪会引起严重的竞争、冲突和拥塞,称为广播风暴问题.鉴此,本文提出了一种高效的无线网络广播协议.该协议通过根据网络拓扑图的最多叶子最短生成树来确定路由选择.分析和仿真结果说明,本文所提出的广播协议不仅能够避免冲突而且能够大大减少冗余的广播消息和减小广播延时.  相似文献   

16.
In this paper we present an algebra of actors extended with mechanisms to model crash failures and their detection. We show how this extended algebra of actors can be successfully used to specify distributed software architectures. The main components of a software architecture can be specified following an object-oriented style and then they can be composed using asynchronous message passing or more complex interaction patterns. This formal specification can be used to show that several requirements of a software system are satisfied at the architectural level despite failures. We illustrate this process by means of a case study: the specification of a software architecture for intelligent agents which supports a fault tolerant anonymous interaction protocol.  相似文献   

17.
To date, there is little evidence that modular reasoning about fault-tolerant systems can simplify the verification process in practice. This question is studied using a prominent example from the fault tolerance literature: the problem of reliable broadcast in point-to-point networks subject to crash failures of processes. The experiences from this case study show how modular specification techniques and rigorous proof re-use can indeed help in such undertakings.  相似文献   

18.
简要叙述了PIM-SM中两种组播转发树特点及其相关研究,给出了一种PIM-SM组播转发树从SPT变迁到ST的设计,这一变迁是PIM-SM组播转发算法的扩展。通过这一扩展使得PIM-SM组播转发可以动态地在ST和SPT组播转发树之间进行切换,从而可以综合利用两种组播转发树的优势,来适应动态变化的网络,使得PIM-SM组播转发算法能够在扩展性和性能之间达到较好的平衡和优化。通过在实验室组播试验平台上的实际测试,证明这种协议扩展的设计能够较好地满足设计的要求。  相似文献   

19.
Concurrent broadcast involves the dissemination of a database, consisting of messages initially distributed among the nodes of a network, so that a copy of the entire database eventually resides at each node. One application is the dissemination of network status information for adaptive routing in a communications network. This paper examines the time complexity and communication complexity of several distributed procedures for concurrent broadcast. The procedures do not use information depending on the network topology. The worst-case time complexity of a flooding procedure for concurrent broadcast is shown to be linear in the number of nodes plus the number of messages, and no other procedure for concurrent broadcast has a better worst-case time complexity. A variant of flooding is proposed to eliminate redundant message receipts from the flooding process by real-time signaling between neighbors concerning messages residing at each. This variant can reduce communication complexity, while having a worst-case time complexity similar in form to that of the flooding procedure. Special properties of concurrent broadcast in a tree are also given. The present time complexity results can be used to bound the time during which inconsistent databases may reside at different nodes, to evaluate and compare procedures for (or including) concurrent broadcast, and to schedule a sequence of instances of concurrent broadcast so that the instances do not overlap and there is no need for sequence numbers.  相似文献   

20.
A control abstraction called atomic action is a powerful general mechanism for ensuring consistent behavior of a system in spite of failures of individual computations running in the system, and in spite of system crashes. However, because of the ``all-or-nothing' property of atomic actions, an important amount of work might be abandoned needlessly when an internal error is encountered. This paper discusses how implementation of resilient distributed systems can be supported using a combination of nested atomic actions and stable checkpoints. Nested atomic actions form a tree structure. When an internal atomic action terminates, its results are not made permanent until the outermost atomic action commits, but they survive local node failures. Each subtree of atomic actions is recoverable individually. A checkpoint is established in stable storage as part of a remote request so that results of such a request can be reclaimed if the requesting node fails in the meantime, The paper shows how remote procedure call primitives with ``at-most-once' semantics and recovery blocks can be built with these mechanisms.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号