
This paper presents a practical time-optimal and smooth trajectory planning algorithm and applies it to robot manipulators. The proposed algorithm uses time-optimal theory based on the dynamics model to plan the robot’s motion trajectory, constructs the trajectory optimization model under the constraints of the geometric path and joint torque, and dynamically selects the optimal trajectory parameters during the solving process to markedly improve the robot’s motion speed. Moreover, the proposed algorithm uses input shaping instead of a jerk constraint in the trajectory optimization model to achieve a smooth trajectory. Input shaping of the trajectory parameters during postprocessing not only suppresses the residual vibration of the robot but also takes into account the signal delay caused by traditional input shaping. Combining these algorithms allows the proposed time-optimal and smooth trajectory planning algorithm to ensure absolute time optimality and achieve a smooth trajectory. The results of an experiment on a six-degree-of-freedom industrial robot indicate the validity of the proposed algorithm.


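When a harvesting manipulator grips a flexible fruit stem and transports the fruit, the acceleration and deceleration of the end effector make the fruit swing during motion, which can easily cause it to drop and the harvest to fail. This paper takes a single tomato as the payload and approximates the fruit stem as a flexible link. Because the mass of each fruit differs, an adaptive input shaping control method is proposed for suppressing vibration while the manipulator moves a variable flexible payload. When the system model changes because of payload uncertainty, the conventional input shaping algorithm cannot suppress the vibration of the flexible link during motion. An adaptive input shaping algorithm is therefore adopted, which computes the impulse amplitudes and times in real time. A quadratic performance index is constructed, and zero residual vibration is pursued by iterating over real-time data of the manipulator's acceleration and the payload's swing angle. Simulation results show that, under varying payloads, the adaptive input shaping algorithm suppresses end-point vibration well and achieves satisfactory control performance.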

This paper introduces a new model for robot behavior categorization. Correlation based adaptive resonance theory (CobART) networks are integrated hierarchically in order to develop an adequate categorization and to elicit the various behaviors performed by the robot. The proposed model is developed by adding a second-layer CobART network, which receives the first-layer CobART network categories as input and back-propagates the matching information to the first-layer networks. The first-layer CobART networks categorize the self-behavior data of a robot or of an object in the environment, while the second-layer CobART network categorizes the robot's behavior with respect to its effect on the object. Experiments show that the proposed model generates a reasonable categorization of the behaviors tested. Moreover, it can learn different forms of the behaviors and detect the relations between them. In essence, the model has an expandable architecture and contains reusable parts: the first-layer CobART networks can be integrated with other CobART networks for another categorization task. Hence, the model presents a way to reveal all behaviors performed by the robot at the same time.

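Imitation learning is one of the main topics in research on bio-inspired mechanisms for robots: a robot acquires bio-inspired characteristics by observing, understanding, learning, and imitating demonstrated behavior. Gaussian processes are used to represent both the demonstration trajectory, formed from sampled discrete demonstration signals, and the imitation trajectory, generated by a policy with unknown parameters. On this basis an imitation learning framework is built that introduces probabilistic model matching into imitation learning: the KL divergence serves as the cost function for comparing the probability distributions of the two trajectories, gradient descent is used to find the optimal imitation control policy that minimizes the KL divergence, and the policy is applied to the imitating robot to complete the same task as the demonstration. Simulations with the arm-swinging behavior of an articulated robot as the learning task show that imitation learning based on probabilistic trajectory matching can reproduce the arm-swinging behavior, with a simpler learning process than traditional methods and good learning results.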
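The idea of minimizing a KL divergence by gradient descent can be illustrated with a deliberately small sketch. It is not the Gaussian process formulation described in the abstract: it assumes two univariate Gaussians, takes the divergence in the direction KL(demonstration ‖ imitation), and uses hypothetical names (`kl_gauss`, `fit_by_gradient_descent`) and step sizes chosen only for illustration.

```python
import math

def kl_gauss(mu_p, sig_p, mu_q, sig_q):
    """KL(p || q) for univariate Gaussians p = N(mu_p, sig_p^2), q = N(mu_q, sig_q^2)."""
    return (math.log(sig_q / sig_p)
            + (sig_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sig_q ** 2)
            - 0.5)

def fit_by_gradient_descent(mu_p, sig_p, mu_q=0.0, sig_q=1.0, lr=0.1, steps=2000):
    """Adjust (mu_q, sig_q) by gradient descent so that KL(p || q) shrinks.

    sig_q is parameterized through its logarithm so it stays positive.
    """
    log_sig_q = math.log(sig_q)
    for _ in range(steps):
        var_q = math.exp(2.0 * log_sig_q)
        # Analytic gradients of KL(p || q) with respect to mu_q and log(sig_q).
        grad_mu = (mu_q - mu_p) / var_q
        grad_log_sig = 1.0 - (sig_p ** 2 + (mu_p - mu_q) ** 2) / var_q
        mu_q -= lr * grad_mu
        log_sig_q -= lr * grad_log_sig
    return mu_q, math.exp(log_sig_q)

# "Demonstration" distribution (assumed values) and the fitted "imitation" one.
mu, sig = fit_by_gradient_descent(mu_p=2.0, sig_p=0.5)
print(mu, sig, kl_gauss(2.0, 0.5, mu, sig))
```

In the toy case the minimizer is simply the demonstration distribution itself; in a policy-learning setting the imitation distribution would instead depend on policy parameters, and the gradient would be taken with respect to those.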

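To address the problem that a quadruped robot cannot keep operating autonomously and effectively after leg damage, an adaptive control model based on hierarchical learning is proposed. The model consists of an upper-level state policy controller (SDC) and a lower-level basic motion controller (BDC). The SDC makes decisions based on the robot's leg condition and posture and selects a motion sub-policy; each BDC sub-policy expresses the robot's motion behavior in that state. A quadruped robot with reverse-jointed, multi-degree-of-freedom legs is built in Unity3D, and BDC sub-policies are trained for several leg-damage conditions. Once the BDC is mature, the SDC is trained with a random leg damage applied every 20 s. In the control flow, the SDC monitors the robot state and activates a BDC policy, the BDC outputs the desired joint angles, and a PD controller finally performs velocity control. This lets the robot adapt and keep operating after leg damage. Simulation and experimental results show that the control model enables the robot to adjust its motion policy quickly and stably after damage while keeping its motion continuous and smooth.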
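The control flow described above (upper level selects a sub-policy, the sub-policy outputs a desired joint angle, a PD law tracks it) might be sketched as follows. Everything here is a hypothetical stand-in: the sub-policies are hand-written functions rather than learned ones, the joint is a unit-inertia double integrator, and the PD law outputs an acceleration command rather than the velocity command mentioned in the abstract.

```python
import math

def pd_control(q_des, q, dq, kp=25.0, kd=10.0):
    """PD law: command proportional to the angle error, damped by joint velocity."""
    return kp * (q_des - q) - kd * dq

# Hypothetical lower-level sub-policies: gait phase -> desired joint angle (rad).
BDC_POLICIES = {
    "intact": lambda phase: 0.4 * math.sin(phase),
    "front_left_damaged": lambda phase: 0.2 * math.sin(phase) + 0.1,
}

def sdc_select(damaged_legs):
    """Hypothetical upper-level selector: map the damage state to a sub-policy name."""
    return "front_left_damaged" if "front_left" in damaged_legs else "intact"

def simulate_joint(damaged_legs, phase, steps=5000, dt=0.001):
    """Track the selected sub-policy's target angle on a unit-inertia joint."""
    q_des = BDC_POLICIES[sdc_select(damaged_legs)](phase)
    q, dq = 0.0, 0.0
    for _ in range(steps):
        ddq = pd_control(q_des, q, dq)  # unit inertia: acceleration equals command
        dq += ddq * dt                  # semi-implicit Euler integration
        q += dq * dt
    return q_des, q

q_des, q = simulate_joint({"front_left"}, phase=math.pi / 2)
print(q_des, q)
```

The gains `kp=25`, `kd=10` give a critically damped response for the assumed unit inertia; a real robot would need gains tuned to its own dynamics.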

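Slope gait control method for biped robots based on deep reinforcement learning   Total citations: 1, self-citations: 0, citations by others: 1
To improve the stability of quasi-passive biped robots walking on slopes, this paper proposes a gait control method based on deep reinforcement learning. By analyzing the hybrid dynamics model and the stable walking process of the quasi-passive biped robot, the state space, action space, episode process, and reward function are established. After continual learning with Ape-X DPG, an algorithm improved from DDPG, the quasi-passive biped robot can walk stably over a wide range of slopes. Simulation experiments show that Ape-X DPG outperforms PER-based DDPG in both learning ability and convergence speed. Moreover, compared with energy shaping control, the gait of the robot using Ape-X DPG converges faster and has a larger region of convergence, demonstrating that Ape-X DPG can effectively improve the walking stability of quasi-passive biped robots.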

In this paper we describe a machine learning approach for acquiring a model of a robot behaviour from raw sensor data. We are interested in automating the acquisition of behavioural models to provide a robot with an introspective capability. We assume that the behaviour of a robot in achieving a task can be modelled as a finite stochastic state transition system. Beginning with data recorded by a robot in the execution of a task, we use unsupervised learning techniques to estimate a hidden Markov model (HMM) that can be used both for predicting and explaining the behaviour of the robot in subsequent executions of the task. We demonstrate that it is feasible to automate the entire process of learning a high quality HMM from the data recorded by the robot during execution of its task. The learned HMM can be used both for monitoring and controlling the behaviour of the robot. The ultimate purpose of our work is to learn models for the full set of tasks associated with a given problem domain, and to integrate these models with a generative task planner. We want to show that these models can be used successfully in controlling the execution of a plan. However, this paper does not develop the planning and control aspects of our work, focussing instead on the learning methodology and the evaluation of a learned model. The essential property of the models we seek to construct is that the most probable trajectory through a model, given the observations made by the robot, accurately diagnoses, or explains, the behaviour that the robot actually performed when making these observations. In the work reported here we consider a navigation task. We explain the learning process, the experimental setup and the structure of the resulting learned behavioural models. We then evaluate the extent to which explanations proposed by the learned models accord with a human observer's interpretation of the behaviour exhibited by the robot in its execution of the task.
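The "most probable trajectory through a model, given the observations" is conventionally computed with the Viterbi algorithm. The sketch below is a generic log-domain Viterbi decoder, not the authors' implementation; the two states, three observation symbols, and all probabilities are invented purely to illustrate a navigation-style example.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence of an HMM (log domain)."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = []  # back[t-1][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev, score = max(
                ((r, best[-1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            scores[s] = score + log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Backtrack from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical model: the robot is either in a corridor or at a doorway and
# observes a coarse clearance reading.
STATES = ["corridor", "doorway"]
START_P = {"corridor": 0.6, "doorway": 0.4}
TRANS_P = {"corridor": {"corridor": 0.7, "doorway": 0.3},
           "doorway": {"corridor": 0.4, "doorway": 0.6}}
EMIT_P = {"corridor": {"wide": 0.5, "medium": 0.4, "narrow": 0.1},
          "doorway": {"wide": 0.1, "medium": 0.3, "narrow": 0.6}}

print(viterbi(["wide", "medium", "narrow"], STATES, START_P, TRANS_P, EMIT_P))
# → ['corridor', 'corridor', 'doorway']
```

Working in log probabilities avoids numerical underflow on long observation sequences, which matters when the sequence is a full task execution rather than three readings.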

For many applications such as compliant, accurate robot tracking control, dynamics models learned from data can help to achieve both compliant control performance and high tracking quality. Online learning of these dynamics models allows the robot controller to adapt itself to changes in the dynamics (e.g., due to time-variant nonlinearities or unforeseen loads). However, online learning in real-time applications, as required in control, cannot be realized by straightforward usage of off-the-shelf machine learning methods such as Gaussian process regression or support vector regression. In this paper, we propose a framework for online, incremental sparsification with a fixed budget designed for fast real-time model learning. The proposed approach employs a sparsification method based on an independence measure. In combination with an incremental learning approach such as incremental Gaussian process regression, we obtain a model approximation method which is applicable in real-time online learning. It exhibits competitive learning accuracy when compared with standard regression techniques. Implementation on a real Barrett WAM robot demonstrates the applicability of the approach in real-time online model learning for real world systems.
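The abstract does not spell out its independence measure. One common choice in kernel sparsification, assumed here, is the squared residual of a new point's feature vector after projecting onto the span of the current dictionary: δ = k(x, x) − kᵀK⁻¹k. The sketch below uses an RBF kernel and hypothetical names, thresholds, and budget; because the abstract does not give a replacement rule for a full dictionary, the sketch simply stops admitting points once the budget is reached.

```python
import math

def rbf(x, y, length_scale=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2.0 * length_scale ** 2))

def solve(a, b):
    """Solve the linear system a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def independence(dictionary, x, jitter=1e-10):
    """delta = k(x, x) - k^T K^{-1} k; near 0 means x is well represented already."""
    if not dictionary:
        return rbf(x, x)
    gram = [[rbf(p, q) + (jitter if i == j else 0.0) for j, q in enumerate(dictionary)]
            for i, p in enumerate(dictionary)]
    k = [rbf(p, x) for p in dictionary]
    coeffs = solve(gram, k)
    return rbf(x, x) - sum(c * ki for c, ki in zip(coeffs, k))

def maybe_add(dictionary, x, threshold=0.1, budget=50):
    """Admit x only if it is sufficiently independent and the budget allows it."""
    if len(dictionary) < budget and independence(dictionary, x) > threshold:
        dictionary.append(x)
        return True
    return False

dictionary = []
for point in [(0.0,), (0.0,), (3.0,), (0.01,)]:
    print(point, maybe_add(dictionary, point))
```

Re-solving the full linear system for every candidate costs O(n³) in the dictionary size; a real-time implementation would instead update the inverse Gram matrix incrementally as points are added or removed.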

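To address the danger and unsatisfactory results of evacuating dense crowds from public places in emergencies, a motion planning algorithm for a crowd evacuation robot based on the deep Q-network (DQN) is proposed. First, a human-robot social force model is built by adding a human-robot interaction force to the original social force model, so that the force the robot exerts on pedestrians influences the crowd's motion. Then a robot motion planning algorithm is designed on the basis of DQN: images of the original pedestrian motion state are fed into the network, which outputs the robot's motion behavior, and the designed reward function is fed back to the network so that the robot learns autonomously in a closed "environment-behavior-reward" loop. Finally, after many iterations, the robot learns the optimal motion policy for different initial positions, maximizing the total number of people evacuated. The algorithm is trained and evaluated in a purpose-built simulation environment. Experimental results show that, compared with crowd evacuation without a robot, the DQN-based motion planning algorithm increases evacuation efficiency by 16.41%, 10.69%, and 21.76% for three different initial robot positions, indicating that it markedly increases the number of people evacuated per unit time and is both flexible and effective.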

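Research on a neural-network-based method for combined behaviors of evolutionary robots   Total citations: 2, self-citations: 0, citations by others: 2
To overcome the limitations of traditional robot design methods and improve robot adaptability, a neural network approach is used to realize learning of obstacle avoidance, approaching, and their combined behavior in an evolutionary robot. First, a new robot simulation environment and robot model are proposed, together with a method for implementing the evolutionary learning system with neural networks. Second, the learning system for the basic and combined behaviors of a robot with an evolutionary learning mechanism is simulated; the simulation shows that the new model does not require complete knowledge of the environment, that the robot can learn to adapt to its environment, and that the model is structurally simple and easy to extend. Finally, the simulation results are analyzed and discussed, and directions for further research are proposed.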

This paper examines characteristics of interactive learning between human tutors and a robot having a dynamic neural-network model, which is inspired by human parietal cortex functions. A humanoid robot, with a recurrent neural network that has a hierarchical structure, learns to manipulate objects. The robot learns tasks in repeated self-trials with the assistance of human interaction, which provides physical guidance until the tasks are mastered and learning is consolidated within the neural networks. Experimental results and analyses showed the following: 1) codevelopmental shaping of task behaviors stems from interactions between the robot and a tutor; 2) dynamic structures for articulating and sequencing of behavior primitives are self-organized in the hierarchically organized network; and 3) such structures can afford both generalization and context dependency in generating skilled behaviors.

It is envisioned that in the near future personal mobile robots will be assisting people in their daily lives. An essential characteristic shaping the design of personal robots is the fact that they must be accepted by human users. This paper explores the interactions between humans and mobile personal robots by focusing on the psychological effects of robot behavior patterns during task performance. These behaviors include the personal robot approaching a person, avoiding a person while passing, and performing non-interactive tasks in an environment populated with humans. The level of comfort the robot causes human subjects is analyzed according to the effects of robot speed, robot distance, and robot body design, as these parameters are varied in order to present a variety of behaviors to human subjects. The information gained from surveys taken by 40 human subjects can be used to obtain a better understanding of which characteristics make personal robot behaviors most acceptable to human users.

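In multi-robot cooperative transport, traditional reinforcement learning algorithms rely only on numerical analysis and neglect reasoning. To address this, independent reinforcement learning for multiple robots is combined with the belief-desire-intention (BDI) model, giving the multi-robot system logical reasoning ability. Using a nearest-distance rule, the robot closest to an obstacle is designated the leader robot and directs the follower robots. An evaluation function that varies with the positions of the multi-robot system and of the nearest obstacle is proposed and used together with behavior weights based on reinforcement learning, so that the behavior weights gradually approach their optimal values as the robots keep interacting with the environment. Simulation experiments show that the method is feasible and successfully accomplishes cooperative transport.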

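Lin Qian, Yu Chao, Wu Xiawei, Dong Yinzhao, Xu Xin, Zhang Qiang, Guo Xian. Journal of Software (《软件学报》), 2024, 35(2): 711-738
In recent years, reinforcement learning based on interaction with the environment has achieved great success in robotics applications, providing a practical and feasible solution for optimizing robot behavior control policies. However, collecting interaction samples in the real world is costly and inefficient, so simulation environments are widely used in robot reinforcement learning. Acquiring large numbers of training samples at low cost in a virtual simulation environment, training the policy there, and transferring the learned policy to the real environment can effectively alleviate the safety, reliability, and real-time problems of training on real robots. Yet because simulation and reality differ, a policy trained in simulation and transferred directly to a real robot often fails to perform as desired. To address this, sim-to-real transfer reinforcement learning methods have been proposed to narrow the gap between environments and thereby achieve effective policy transfer. Based on the direction of information flow during transfer reinforcement learning and the different objects that the intelligent methods act on, a process framework for sim-to-real transfer reinforcement learning systems is proposed. Under this framework, existing work is divided into three categories: model optimization methods based on the real environment, knowledge transfer methods based on the simulation environment, and iterative policy improvement methods based on both simulated and real environments. Representative techniques and related work in each category are described. Finally, the opportunities and challenges facing research on sim-to-real transfer reinforcement learning are discussed.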

This paper presents several novel methods that improve current input shaping techniques for vibration suppression in multi-degree-of-freedom industrial robots. Three different techniques, namely the optimal S-curve trajectory, the robust zero-vibration shaper, and the dynamic zero-vibration shaper, are proposed. These methods can suppress multiple vibration modes of a flexible joint robot under computed torque control based on a rigid model. The time delays for each method are quantified and compared. The optimal S-curve trajectory finds the maximum jerk to obtain the minimum vibration. The robust zero-vibration shaper can suppress multiple modes without an accurate model. The delay of the dynamic zero-vibration shaper is smaller than that of existing input shaping techniques. Our analysis is verified both by simulation and by experiment with a six-degree-of-freedom commercial industrial robot.
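For orientation, the classical two-impulse zero-vibration (ZV) shaper that such methods build on can be written down in a few lines. This is the textbook single-mode shaper, not the robust or dynamic variants proposed in the paper; the function names and the example natural frequency and damping ratio are assumptions made for illustration.

```python
import math

def zv_shaper(omega_n, zeta):
    """Two-impulse ZV shaper for one mode: returns [(time, amplitude), ...].

    omega_n: undamped natural frequency (rad/s); zeta: damping ratio in [0, 1).
    """
    if not 0.0 <= zeta < 1.0:
        raise ValueError("zeta must satisfy 0 <= zeta < 1")
    root = math.sqrt(1.0 - zeta ** 2)
    k = math.exp(-zeta * math.pi / root)
    half_period = math.pi / (omega_n * root)  # half of the damped period
    return [(0.0, 1.0 / (1.0 + k)), (half_period, k / (1.0 + k))]

def residual_vibration(impulses, omega_n, zeta):
    """Residual vibration amplitude of an impulse sequence relative to a unit impulse."""
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)
    t_last = impulses[-1][0]
    c = sum(a * math.exp(zeta * omega_n * t) * math.cos(omega_d * t) for t, a in impulses)
    s = sum(a * math.exp(zeta * omega_n * t) * math.sin(omega_d * t) for t, a in impulses)
    return math.exp(-zeta * omega_n * t_last) * math.hypot(c, s)

# Assumed mode: 10 rad/s natural frequency with 10 % damping.
impulses = zv_shaper(omega_n=10.0, zeta=0.1)
print(impulses)
print(residual_vibration(impulses, 10.0, 0.1))
```

Convolving a reference command with these impulses lengthens it by the second impulse time, half a damped period. This is the time delay that the abstract compares across methods.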

We present a novel method for a robot to interactively learn, while executing, a joint human–robot task. We consider collaborative tasks realized by a team of a human operator and a robot helper that adapts to the human’s task execution preferences. Different human operators can have different abilities, experiences, and personal preferences so that a particular allocation of activities in the team is preferred over another. Our main goal is to have the robot learn the task and the preferences of the user to provide a more efficient and acceptable joint task execution. We cast concurrent multi-agent collaboration as a semi-Markov decision process and show how to model the team behavior and learn the expected robot behavior. We further propose an interactive learning framework and we evaluate it both in simulation and on a real robotic setup to show the system can effectively learn and adapt to human expectations.

The creation of physical behavior by computational means has been approached differently by industrial and artificial intelligence robotics. Industrial robotics, considering fast response of a robot its most important characteristic, has equipped the robot with predefined, specific behavioral trajectories resulting in fast but inflexible behavior. Artificial intelligence robotics, claiming flexibility as the paramount robot feature, has employed inferred behavior whereby the robot itself determines behavioral patterns for tasks based on the robot's general knowledge about a task domain. Response is now flexible, but the response time is commonly badly degraded. This work defines an action propensity skill which generates flexible and fast behavior. Flexibility is achieved by attaching perceptions in skills to guide behavior; fast response results from the direct activation of skills. The acquisition and generalization of skills happens under the supervision of a human teacher in an advice-taking mode, into which the robot shifts from the execution mode after recognizing that it lacks competence for a given task. This paper defines such skills, describes an implemented skilled robot system, and discusses some simulation results.

In this work, we combine model-based reinforcement learning (MBRL) and model-free reinforcement learning (MFRL) to stabilize a biped robot (NAO robot) on a rotating platform, where the angular velocity of the platform is unknown to the proposed learning algorithm and treated as an external disturbance. Nonparametric Gaussian processes normally require a large number of training data points to deal with the discontinuity of the estimated model. Although some improved methods such as probabilistic inference for learning control (PILCO) do not require an explicit global model, as the actions are obtained by directly searching the policy space, overfitting and lack of model complexity may still result in a large deviation between the prediction and the real system. Besides, none of these approaches considers the data error and measurement noise during the training process and test process, respectively. We propose a hierarchical Gaussian process (GP) model, containing two layers of independent GPs, from which a physically continuous probability transition model of the robot is obtained. Due to the physically continuous estimation, the algorithm overcomes the overfitting problem with a guaranteed model complexity, and the amount of training data is also reduced. The policy for any given initial state is generated automatically by minimizing the expected cost according to the predefined cost function and the obtained probability distribution of the state. Furthermore, a novel Q(λ)-based MFRL scheme is employed to improve the policy. Simulation results show that the proposed RL algorithm is able to balance the NAO robot on a rotating platform and is capable of adapting to a platform with varying angular velocity.
