Similar literature
Found 20 similar documents; search took 297 ms.
1.
Adaptive mapping and navigation by teams of simple robots   Total citations: 1 (self-citations: 0, citations by others: 1)
We present a technique for mapping an unknown environment and navigating through it using a team of simple robots. Minimal assumptions are made about the abilities of the robots on a team. We assume only that robots can explore the environment using a random walk, detect the goal location, and communicate among themselves by transmitting a single small integer over a limited distance and in a direct line of sight; additionally, one designated robot, the navigator, can track toward a team member when it is nearby and in a direct line of sight. We do not assume that robots can determine their absolute (x, y) positions in the environment to be mapped, determine their positions relative to other team members, or sense anything other than the goal location and the transmissions of their teammates. In spite of these restrictive assumptions, we show that for moderate-sized teams in complex environments the time needed to construct a map and then navigate to a goal location can be competitive with the time needed to navigate to the goal along an optimal path formed with perfect knowledge of the environment. In other words, collective mapping enables navigation in an unmapped environment with only modest overhead. This basic result holds over a wide range of assumptions about robot reliability, sensor range, and tracking ability.

We then describe an extended mapping algorithm that allows an existing map to be efficiently corrected when a goal location changes. We show that a robot team using the algorithm is adaptive, in the sense that its performance will improve over time, whenever navigation goals follow certain regular patterns.


2.
Consistency of preferences is related to rationality, which is associated with the transitivity property. Many properties suggested to model transitivity of preferences are inappropriate for reciprocal preference relations. In this paper, a functional equation is put forward to model the “cardinal consistency in the strength of preferences” of reciprocal preference relations. We show that under the assumptions of continuity and monotonicity properties, the set of representable uninorm operators is characterized as the solution to this functional equation. Cardinal consistency with the conjunctive representable cross ratio uninorm is equivalent to Tanino's multiplicative transitivity property. Because any two representable uninorms are order isomorphic, we conclude that multiplicative transitivity is the most appropriate property for modeling cardinal consistency of reciprocal preference relations. Results toward the characterization of this uninorm consistency property based on a restricted set of $(n-1)$ preference values, which can be used in practical cases to construct perfectly consistent preference relations, are also presented.
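As an illustration of the last point, the following is a minimal sketch, not the authors' construction. It assumes the $(n-1)$ given values are the adjacent preferences r[i][i+1], all strictly between 0 and 1, and it fills in the rest of the relation with the cross ratio uninorm U(x, y) = xy / (xy + (1-x)(1-y)). The function names are hypothetical. The check at the end tests Tanino's multiplicative transitivity, r_ij r_jk r_ki = r_ik r_kj r_ji, over all triples.

```python
from itertools import permutations


def cross_ratio_uninorm(x, y):
    """Cross ratio uninorm; inputs are assumed to lie strictly inside (0, 1)."""
    return x * y / (x * y + (1 - x) * (1 - y))


def build_consistent_relation(chain):
    """Complete a reciprocal relation from chain[i] = r[i][i+1] (assumed input layout)."""
    n = len(chain) + 1
    r = [[0.5] * n for _ in range(n)]
    for i in range(n):
        for k in range(i + 1, n):
            if k == i + 1:
                r[i][k] = chain[i]
            else:
                # Extend the chain one step: r[i][k] = U(r[i][k-1], r[k-1][k]).
                r[i][k] = cross_ratio_uninorm(r[i][k - 1], chain[k - 1])
            r[k][i] = 1 - r[i][k]  # reciprocity
    return r


def is_multiplicatively_transitive(r, tol=1e-9):
    """Check r_ij * r_jk * r_ki == r_ik * r_kj * r_ji for all distinct i, j, k."""
    for i, j, k in permutations(range(len(r)), 3):
        lhs = r[i][j] * r[j][k] * r[k][i]
        rhs = r[i][k] * r[k][j] * r[j][i]
        if abs(lhs - rhs) > tol:
            return False
    return True


relation = build_consistent_relation([0.6, 0.7, 0.2])
print(is_multiplicatively_transitive(relation))
```

The uninorm multiplies preference odds r/(1-r), which is why chaining it along adjacent pairs yields a relation that passes the check.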

3.
Florin   《Performance Evaluation》2003,51(2-4):171-190
Large deviations papers like that of Ignatyuk et al. [Russ. Math. Surv. 49 (1994) 41–99] have shown that asymptotically, the stationary distribution of homogeneous regulated networks is of the form
with the coefficient being different in various “boundary influence domains” and, in some of these domains, also depending on n. In this paper, we focus on the case of constant exponents and on a subclass of networks we call “strongly skip-free” (which includes all Jackson and all two-dimensional skip-free networks). We conjecture that an asymptotic exponent is constant iff it corresponds to a large deviations escape path which progresses gradually (from the origin to the interior) through boundary facets whose dimension always increases by one. Solving the corresponding large deviations problem for our subclass of networks leads to a family of “local large deviation systems” (LLDSs) (for the constant exponents), which are expressed entirely in terms of the cumulant generating function of the network. In this paper, we show that at least for “strongly skip-free” Markovian networks with independent transition processes, the LLDS is closely related to some “local boundary equilibrium systems” (LESs) obtained by retaining from the equilibrium equations only those valid in neighborhoods of the boundary.

Since asymptotic results require typically only that the cumulant generating function is well-defined over an appropriate domain, it is natural to conjecture that these LLDSs will provide the asymptotic constant exponents regardless of any distributional assumptions made on the network.

Finally, we outline a practical recipe for combining the local approximations to produce a global large deviations approximation, with the coefficients K_j determined numerically.


4.
We investigate the computational complexity of finding optimal bribery schemes in voting domains where the candidate set is the Cartesian product of a set of variables and voters use CP-nets, an expressive and compact way to represent preferences. To do this, we generalize the traditional bribery problem to take into account several issues over which agents vote, and their inter-dependencies. We consider five voting rules, three kinds of bribery actions, and five cost schemes. For most of the combinations of these parameters, we find that bribery in this setting is computationally easy.

5.
This paper gives a combinatorial proof of a “yes” answer to an open question presented by Vidyasagar (ibid. vol.40, 1995), stated as follows: “Given a multilinear polynomial $E(x): [0,1]^n \to \mathbb{R}$, is it true that $E_b(x) = E(x) - b^t x$ has a strict local minimum over the discrete set $\{0,1\}^n$ for almost all $b$ of sufficiently small norm?” The combinatorial proof is completed directly by providing a sufficient condition under which a conjecture on the strict local minima of multilinear polynomials, also postulated by Vidyasagar, holds. In addition, a simple counterexample is presented to demonstrate that the conjecture may fail if the provided sufficient condition is not satisfied.
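To make the question concrete, here is a small brute-force sketch, not taken from the paper. It assumes that a strict local minimum over the discrete set means a vertex whose value is strictly smaller than the value at every vertex differing in exactly one coordinate. The dictionary encoding of the polynomial and the function names are hypothetical.

```python
from itertools import product


def evaluate(poly, x):
    """Evaluate a multilinear polynomial given as {tuple_of_indices: coefficient}."""
    total = 0.0
    for indices, coeff in poly.items():
        term = coeff
        for i in indices:
            term *= x[i]
        total += term
    return total


def strict_local_minima(poly, b):
    """List the vertices of {0,1}^n where E_b(x) = E(x) - b.x beats all one-bit flips."""
    n = len(b)

    def e_b(x):
        return evaluate(poly, x) - sum(bi * xi for bi, xi in zip(b, x))

    minima = []
    for x in product((0, 1), repeat=n):
        # Neighbours are the n vertices at Hamming distance one (assumed definition).
        neighbours = [x[:i] + (1 - x[i],) + x[i + 1:] for i in range(n)]
        if all(e_b(y) > e_b(x) for y in neighbours):
            minima.append(x)
    return minima


poly = {(0, 1): 1.0}  # E(x) = x0 * x1
print(strict_local_minima(poly, (0.0, 0.0)))    # no strict minimum without perturbation
print(strict_local_minima(poly, (0.01, 0.02)))  # a small generic b creates strict minima
```

For E(x) = x0 x1 the unperturbed polynomial has ties between neighbouring vertices, so no vertex is a strict minimum; a small linear perturbation breaks the ties.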

6.
An extensive fuzzy behavior-based architecture is proposed for the control of mobile robots in a multiagent environment. The behavior-based architecture decomposes the complex multirobotic system into smaller modules of roles, behaviors and actions. Fuzzy logic is used to implement individual behaviors, to coordinate the various behaviors, to select roles for each robot, and for robot perception, decision-making, and speed control. The architecture is implemented on a team of three soccer robots performing different roles interchangeably. The robot behaviors and roles are designed to be complementary to each other, so that a coherent team of robots exhibiting good collective behavior is obtained.

7.
Soccer is a competitive and collective sport in which teammates try to combine the execution of basic actions (cooperative behavior) to lead their team to more advantageous situations. The ability to recognize, extract and reproduce such behaviors can prove useful to improve the performance of a team in future matches. This work describes a methodology for achieving just that, which makes use of a plan definition language to abstract the representation of relevant behaviors in order to promote their reuse. Experiments were conducted based on a set of game log files generated by the Soccer Server simulator which supports the RoboCup 2D simulated robotic soccer league. The effectiveness of the proposed approach was verified by focusing primarily on the analysis of behaviors which started from set-pieces and led to the scoring of goals while the ball possession was kept. One of the results obtained showed that a significant part of the total goals scored was based on this type of behavior, demonstrating the potential of conducting this analysis. Other results allowed us to assess the complexity of these behaviors and infer meaningful guidelines to consider when defining plans from scratch. Some possible extensions to this work include assessing which plans have the ability to maximize the creation of goal opportunities by countering the opponent’s team strategy and how the effectiveness of plans can be improved using optimization techniques.

8.
Colearning in Differential Games   Total citations: 1 (self-citations: 0, citations by others: 1)
Sheppard  John W. 《Machine Learning》1998,33(2-3):201-233
Game playing has been a popular problem area for research in artificial intelligence and machine learning for many years. In almost every study of game playing and machine learning, the focus has been on games with a finite set of states and a finite set of actions. Further, most of this research has focused on a single player or team learning how to play against another player or team that is applying a fixed strategy for playing the game. In this paper, we explore multiagent learning in the context of game playing and develop algorithms for co-learning in which all players attempt to learn their optimal strategies simultaneously. Specifically, we address two approaches to colearning, demonstrating strong performance by a memory-based reinforcement learner and comparable but faster performance with a tree-based reinforcement learner.

9.
We propose a generalization of Paillier’s probabilistic public-key system, in which the expansion factor is reduced and which allows the block length of the scheme to be adjusted even after the public key has been fixed, without losing the homomorphic property. We show that the generalization is as secure as Paillier’s original system and propose several ways to optimize implementations of both the generalized and the original scheme. We construct a threshold variant of the generalized scheme as well as zero-knowledge protocols to show that a given ciphertext encrypts one of a set of given plaintexts, and protocols to verify multiplicative relations on plaintexts. We then show how these building blocks can be used for applying the scheme to efficient electronic voting. This dramatically reduces the work needed to compute the final result of an election, compared to the previously best known schemes. We show how the basic scheme for a yes/no vote can be easily adapted to casting a vote for up to t out of L candidates. The same basic building blocks can also be adapted to provide receipt-free elections, under appropriate physical assumptions. The scheme for 1 out of L elections can be optimized such that for a certain range of the other parameter values, the ballot size is logarithmic in L.
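The homomorphic property behind the yes/no tally can be sketched with a toy version of Paillier's original system only; this is not the generalized scheme, the threshold variant, or the zero-knowledge protocols from the abstract. The sketch assumes the common choice g = n + 1 and uses tiny illustrative primes that offer no security. All function names are hypothetical, and Python 3.8 or later is assumed for the modular inverse via `pow`.

```python
import math
import secrets


def keygen(p, q):
    """Toy key generation from two distinct primes; returns (n, (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because g = n + 1 gives L(g^lam mod n^2) = lam mod n
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt m as (1+n)^m * r^n mod n^2 with a fresh random r coprime to n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def tally(n, ciphertexts):
    """Multiply ciphertexts; the product encrypts the sum of the plaintexts."""
    acc = 1
    for c in ciphertexts:
        acc = (acc * c) % (n * n)
    return acc


n, private_key = keygen(1009, 1013)      # toy primes, for illustration only
ballots = [encrypt(n, v) for v in (1, 0, 1, 1, 0)]
print(decrypt(n, private_key, tally(n, ballots)))  # number of "yes" votes
```

Only the final product is decrypted, so individual ballots are never opened; this is the property the abstract builds on.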

10.
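Using a SONY EV-D31 camera and a camera control module developed in-house, an active vision subsystem was built and applied to the RIRA-Ⅱ mobile robot, enabling the robot to track moving targets automatically. The RIRA-Ⅱ mobile robot adopts a behavior-based distributed control architecture composed of a set of distributed behavior modules and a centralized command arbiter. Each behavior module generates votes reactively on the basis of domain knowledge, the arbiter produces action commands, and the robot carries out the corresponding actions. Moving-target tracking experiments were conducted in a complex environment containing obstacles, narrow passages, and simulated walls. The experiments show that the tracking system runs reliably and is highly robust.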

11.
Resilience, the ability to adapt or absorb disturbance, disruption, and change, may be increased by team processes in a complex, socio-technical system. In particular, collaborative cross-checking is a strategy where at least two individuals or groups with different perspectives examine the others’ assumptions and/or actions to assess validity or accuracy. With this strategy, erroneous assessments or actions can be detected quickly enough to mitigate or eliminate negative consequences. In this paper, we seek to add to the understanding of the elements that are needed in effective cross-checking and the limitations of the strategy. We define collaborative cross-checking, describe in detail three healthcare incidents where collaborative cross-checks played a key role, and discuss the implications of emerging patterns.

12.
An algorithm is proposed for the design of "on-line" learning controllers to control a discrete stochastic plant. The subjective probabilities of applying control actions from a finite set of allowable actions using random strategy, after any plant-environment situation (called an "event") is observed, are modified through the algorithm. The subjective probability for the optimal action is proved to approach one with probability one for any observed event. The optimized performance index is the conditional expectation of the instantaneous performance evaluations with respect to the observed events and the allowable actions. The algorithm is described through two transformations, T1 and T2. After the "ordering transformation" T1 is applied on the estimates of the performance indexes of the allowable actions, the "learning transformation" T2 modifies the subjective probabilities. The cases of discrete and continuous features are considered. In the latter, the Potential Function Method is employed. The algorithm is compared with a linear reinforcement scheme and computer simulation results are presented.
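The abstract does not specify T1 and T2, so the sketch below does not implement them. It only illustrates the kind of baseline the algorithm is compared against, using a standard linear reward-inaction update as an assumed example of a linear reinforcement scheme. The function names, learning rate, and reward probabilities are hypothetical.

```python
import random


def lri_update(probs, action, rewarded, rate=0.1):
    """Linear reward-inaction: shift probability toward a rewarded action, else no change."""
    if not rewarded:
        return list(probs)
    updated = [p * (1.0 - rate) for p in probs]
    updated[action] = probs[action] + rate * (1.0 - probs[action])
    return updated


def run(reward_probs, steps=2000, rate=0.05, seed=0):
    """Simulate one event with a fixed, unknown reward probability per action."""
    rng = random.Random(seed)
    n = len(reward_probs)
    probs = [1.0 / n] * n  # subjective probabilities start uniform
    for _ in range(steps):
        action = rng.choices(range(n), weights=probs)[0]
        rewarded = rng.random() < reward_probs[action]
        probs = lri_update(probs, action, rewarded, rate)
    return probs


print(lri_update([0.5, 0.5], action=0, rewarded=True, rate=0.1))
print(run([0.2, 0.8]))
```

Each update keeps the probabilities summing to one, since the mass removed from the other actions is exactly what the rewarded action gains.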

13.
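Trust relationships in open systems are, in essence, among the most complex social relationships. They involve many factors, including assumptions, expectations, behavior, and environment, and are difficult to represent and predict quantitatively with accuracy. Building on an existing dynamic trust model based on behavior monitoring, this paper combines rough set theory with information entropy theory and applies them to the trust measurement and prediction module. Experiments show that the new weight determination method based on conditional information entropy addresses the poor adaptivity of the original weight determination method and its poor scalability with the size of the behavior data.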

14.
This technical note addresses the discrete optimization of stochastic discrete event systems for which both the performance function and the constraint function are not known but can be evaluated by simulation and the solution space is either finite or unbounded. Our method is based on random search in a neighborhood structure called the most promising area proposed in and a moving observation area. The simulation budget is allocated dynamically to promising solutions. Simulation-based constraints are taken into account in an augmented performance function via an increasing penalty factor. We prove that under some assumptions, the algorithm converges with probability 1 to a set of true local optimal solutions. These assumptions are restrictive and difficult to verify but we hope that the encouraging numerical results would motivate future research exploiting ideas of this technical note.   相似文献   

15.
Recent advances in man–machine interaction include attempts to infer operator intentions from operator actions, to better anticipate and support system performance. This capability has been investigated in contexts such as intelligent interface designs and operation support systems. While some progress has been demonstrated, efforts to date have focused on a single operator. In large and complex artefacts such as power plants or aircraft, however, a team generally operates the system, and team intention is not reducible to mere summation of individual intentions. It is therefore necessary to develop a team intention inference method for sophisticated team–machine communication. In this paper a method is proposed for team intention inference in process domains. The method uses expectations of the other members as clues to infer a team intention and describes it as a set of individual intentions and beliefs of the other team members. We applied it to the operation of a plant simulator operated by a two-person team, and it was shown that, at least in this context, the method is effective for team intention inference.

16.
The proposed method is implemented in three steps: first, when a variation in environment is perceived, agents take appropriate actions. Second, the behaviors are stimulated and controlled through communication with other agents. Finally, the most frequently stimulated behavior is adopted as a group behavior strategy. In this paper, two different reward models, reward model 1 and reward model 2, are applied. Each reward model is designed to consider the reinforcement or constraint of behaviors. In competitive agent environments, the behavior considered to be advantageous is reinforced by adding reward values. Conversely, the behavior considered to be disadvantageous is constrained by reducing the reward values. The validity of this strategy is verified through simulation.

17.
H∞ methods for the analysis and design of robust feedback control systems have sometimes been criticized for an apparent conservatism. However, recent results have shown that they can provide the least conservative results possible under a particular set of assumptions. For example, a ball in the ν-gap metric is the largest set of plants that can be guaranteed to be stabilized a priori by a controller known only to satisfy a bound on the induced norm of a particular closed-loop operator. Nevertheless, there are examples of uncertainty which, whilst large when measured by the ν-gap metric, would be regarded as relatively benign by an experienced designer of control systems. The present paper examines the possibility that in arriving at this judgment, such a designer is implicitly using the knowledge that he will always choose the least complex controller necessary to do the job. It is shown that, given an appropriate bound on the complexity of the controller, significantly stronger a priori robustness results can be obtained.

18.
We apply the Markov Game formalism to develop a context-aware approach to valuing player actions, locations, and team performance in ice hockey. The Markov Game formalism uses machine learning and AI techniques to incorporate context and look-ahead. Dynamic programming is applied to learn value functions that quantify the impact of actions on goal scoring. Learning is based on a massive new dataset, from SportLogiq, that contains over 1.3M events in the National Hockey League. The SportLogiq data include the location of an action, which has previously been unavailable in hockey analytics. We give examples showing how the model assigns context and location aware values to a large set of 13 action types. Team performance can be assessed as the aggregate value of actions performed by the team’s players, or the aggregate value of states reached by the team. Model validation shows that the total team action and state value both provide a strong predictor of team success, as measured by the team’s average goal ratio.

19.
A team control problem is considered whose information structure is partially nested and is characterized by the existence of a common past information set shared by the team members after a finite delay. Under LQG assumptions, it is shown that the optimal control strategy can take on a time-invariant recursive form based on suitable sufficient statistics.

20.
In this paper, we first discuss the meaning of physical embodiment and the complexity of the environment in the context of multi-agent learning. We then propose a vision-based reinforcement learning method that acquires cooperative behaviors in a dynamic environment. We use the robot soccer game initiated by RoboCup (Kitano et al., 1997) to illustrate the effectiveness of our method. Each agent works with other team members to achieve a common goal against opponents. Our method estimates the relationships between a learner's behaviors and those of other agents in the environment through interactions (observations and actions) using a technique from system identification. In order to identify the model of each agent, Akaike's Information Criterion is applied to the results of Canonical Variate Analysis to clarify the relationship between the observed data in terms of actions and future observations. Next, reinforcement learning based on the estimated state vectors is performed to obtain the optimal behavior policy. The proposed method is applied to a soccer playing situation. The method successfully models a rolling ball and other moving agents and acquires the learner's behaviors. Computer simulations and real experiments are shown and a discussion is given.
