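Cyber-physical-social systems (CPSS) extend traditional cyber-physical systems (CPS) by taking social signals and social relationships into account. By drawing on the nearly unlimited human, data, and information resources of the cyber world, they overcome the limited resources and the spatiotemporal constraints of the physical world. However, the complexity of human and social behavior in CPSS widens the modeling gap between an actual system and its model, so that the system takes the form of a "Merton system". In response, the parallel intelligence (PI) framework, built around the ACP approach, offers a feasible path across this gap by combining three processes: artificial systems (A), computational experiments (C), and parallel execution (P). Specifically, ACP turns the model from a system analyzer into a data generator, making the otherwise hard-to-control "Merton system" testable, computable, and verifiable, and laying a methodological foundation for the unity of opposites between "emergence" and "convergence" in complex systems. This paper presents PI applications for CPSS in eight areas: parallel control and intelligent control; parallel robotics and parallel manufacturing; parallel management and intelligent transportation; parallel medicine and smart health; parallel ecology and parallel society; parallel economic systems and social computing; parallel military systems; and parallel cognition and parallel philosophy. Finally, future directions and technology trends of CPSS are discussed.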

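Electric vehicle (EV) traction batteries are an important component of distributed energy storage. Building on the theory of cyber-physical-social systems (CPSS), this work tightly integrates cyber factors (survey data on private-car travel), physical factors (a physical charge/discharge model of the traction battery), and social factors (actual users' responses to electricity prices or incentives). Drawing on the idea of parallel systems, it constructs, in a software-defined manner, a parallel artificial EV population that mirrors the real EV population, and studies the feasibility and effectiveness of EVs participating, as distributed storage, in the aggregated and shared use of storage. Taking support for the grid in smoothing regional load fluctuations as an example, the Monte Carlo method and the central limit theorem are used to obtain the daily charge/discharge curves of the artificial EV population under different scenarios, and the following simulation experiments are then carried out: 1) differences in the charging and discharging behavior of users with different degrees of rationality under a given pricing strategy; 2) the effect of the EV population's charging and discharging behavior on regional load variance under different pricing strategies, together with the corresponding grid costs and revenues; 3) the change in the reduction of regional load variance after the EV population is connected to the grid, under different battery parameters. The simulation results can guide grid operators in setting reasonable pricing strategies, provide a basis for identifying the key factors that affect how much EVs, as distributed storage, help reduce regional load variance, and offer technical support, at the source, for the orderly integration of EVs into the grid.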
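As a rough illustration of the Monte Carlo step mentioned in this abstract, the sketch below samples a plug-in hour and a daily energy need for each vehicle and sums the resulting hourly charging power. The distributions (plug-in time around 18:00, 5 to 20 kWh per day), the 7 kW charger rating, and the function name `simulate_fleet_load` are assumptions made for this example and are not taken from the paper; discharging and price response are left out.

```python
import random


def simulate_fleet_load(n_vehicles, charge_kw=7.0, seed=0):
    """Return a 24-element list: aggregate charging power (kW) for each hour."""
    rng = random.Random(seed)
    load = [0.0] * 24
    for _ in range(n_vehicles):
        # Assumption: plug-in hour ~ Normal(18 h, 2 h), wrapped onto a 24 h clock.
        start = int(round(rng.gauss(18.0, 2.0))) % 24
        # Assumption: daily energy need ~ Uniform(5, 20) kWh.
        energy = rng.uniform(5.0, 20.0)
        hour = start
        while energy > 1e-9:
            # Energy delivered in this 1 h slot (kWh), capped by the charger rating.
            delivered = min(charge_kw, energy)
            load[hour % 24] += delivered  # kWh over 1 h equals average kW
            energy -= delivered
            hour += 1
    return load


if __name__ == "__main__":
    curve = simulate_fleet_load(1000)
    peak_hour = max(range(24), key=curve.__getitem__)
    print("peak hour:", peak_hour, "peak load (kW):", round(curve[peak_hour], 1))
```

Adding the returned curve to a base load profile and comparing the variance before and after is one simple way to obtain a load-variance figure of the kind the experiments discuss.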

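This paper addresses the dynamic flow-shop scheduling problem (DFSP) with the objective of minimizing the makespan, and proposes an adaptive deep reinforcement learning algorithm (ADRLA) to solve it. First, the dynamic arrival of new jobs in the DFSP is modeled as a Poisson process, and a Markov decision process (MDP) is used to describe the solution process, turning the DFSP into a sequential decision problem that reinforcement learning can solve. Then, based on the characteristics of the DFSP sequencing model, a state feature vector with good discriminability and generalization is designed; on this basis, five specific actions (i.e., scheduling rules) are proposed for selecting the job to be processed next, and a reward function based on problem characteristics is constructed to evaluate the effect of each action (i.e., the reward value). This determines the three basic elements of ADRLA. Next, a double deep Q-network (DDQN) is used as the agent in ADRLA to make scheduling decisions. After being trained on a dataset derived from a small number of small-scale DFSP instances (i.e., data on the three basic elements across different problems), the agent can fairly accurately capture the nonlinear relationship between the state feature vector and the Q-value vector (composed of the Q-values of the individual actions) for DFSPs of different scales, and can therefore schedule DFSPs of various scales adaptively in real time. Finally, simulation experiments on different test problems and comparisons with other algorithms verify the effectiveness and real-time performance of the proposed ADRLA for the DFSP.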
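Two ingredients named in this abstract can be sketched in a few lines: job arrivals drawn from a Poisson process, and the standard Double DQN bootstrap target. The function names, the arrival rate, and the toy Q-values are hypothetical; the paper's state features, five scheduling rules, reward function, and network are not reproduced here.

```python
import random


def poisson_arrival_times(rate, horizon, seed=0):
    """Arrival times of a Poisson process with the given rate on [0, horizon]."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        # Inter-arrival gaps of a Poisson process are exponentially distributed.
        t += rng.expovariate(rate)
        if t > horizon:
            return times
        times.append(t)


def double_dqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Standard Double DQN target for one transition.

    The online network chooses the next action; the target network scores it.
    """
    if done:
        return reward
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best]


if __name__ == "__main__":
    print(poisson_arrival_times(rate=0.5, horizon=20.0))
    # Online net prefers action 1, so the target net's value for action 1 is used.
    print(double_dqn_target(1.0, [0.2, 0.9, 0.1], [5.0, 2.0, 7.0], gamma=0.5))
```

Separating action selection from action evaluation is what distinguishes Double DQN from plain DQN, which would take the maximum of the target network's values directly.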

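An important measure of an aircraft carrier's combat performance is the sortie generation rate of its carrier-based aircraft, and the key factor affecting that rate is the efficiency with which aircraft support operations are scheduled. Scheduling carrier-based aircraft support operations means arranging the order of the support operations each aircraft requires, and completing them efficiently, under limited time, space, and resource constraints. Existing solution strategies based on optimization methods (dynamic programming, linear programming, etc.) and heuristic methods (such as genetic algorithms and particle swarm optimization) are suitable only for support...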

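Reinforcement learning systems and their learning algorithms based on optimal reliability   Total citations: 3, self-citations: 0, citations by others: 3
This paper summarizes the main theoretical methods of reinforcement learning, proposes a description of reinforcement learning systems that distinguishes subjective from objective factors, and introduces the concept of a task domain. Because the expected-optimality criterion used in earlier reinforcement learning describes task-domain capability inadequately, the paper considers a model based on the optimal-reliability criterion for first-passage time under a target-level criterion. Combining stochastic approximation theory and temporal-difference theory respectively, it proposes J-learning, based on probability estimation, and incremental R-learning, which requires no model building.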

赵海峰  余强  曹俞旦 《计算机科学》2014,41(12):160-163
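Multi-label learning deals with the case where one sample carries several labels at once. The existing multi-label lazy learning algorithm IMLLA does not fully account for the distribution of the samples: when it builds a sample's neighbor set, the number of neighbors is fixed, which may exclude highly similar points from the neighbor set or include points of low similarity, degrading classification performance. To address this shortcoming of IMLLA, the idea of granular computing is introduced into the construction of the neighbor set, and a granular-computing-based multi-label lazy learning algorithm (GMLLA) is proposed. The method determines a sample's neighbor set through granularity control, so that the samples in the neighbor set are highly similar. Experimental results show that the algorithm outperforms IMLLA.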
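The contrast between a fixed neighbor count and a similarity-controlled neighbor set can be illustrated with a small sketch. Here the "granule" is simply a distance radius around the query point, which is an assumption made for illustration; the paper's actual granulation scheme is not described in the abstract, and all function names are hypothetical.

```python
import math


def knn_fixed(query, samples, k):
    """Indices of the k nearest samples, regardless of how far away they are."""
    order = sorted(range(len(samples)), key=lambda i: math.dist(query, samples[i]))
    return order[:k]


def neighbors_by_granule(query, samples, radius):
    """Indices of all samples within `radius` of the query (assumed granule)."""
    return [i for i, x in enumerate(samples) if math.dist(query, x) <= radius]


def vote_labels(neighbor_ids, label_sets):
    """Labels carried by at least half of the neighbors (simple illustrative vote)."""
    counts = {}
    for i in neighbor_ids:
        for label in label_sets[i]:
            counts[label] = counts.get(label, 0) + 1
    return {label for label, c in counts.items() if 2 * c >= len(neighbor_ids)}


if __name__ == "__main__":
    samples = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0)]
    labels = [{"a", "b"}, {"a"}, {"a", "c"}, {"d"}]
    query = (0.0, 0.0)
    # k = 4 drags in the distant point; the radius-based set leaves it out.
    print(knn_fixed(query, samples, 4))
    ids = neighbors_by_granule(query, samples, 0.5)
    print(ids, vote_labels(ids, labels))
```

With a fixed k the neighbor set can be too small in dense regions and too large in sparse ones, which is the weakness of IMLLA that the abstract points to.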

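Cooperation and learning in a multi-agent environment are studied. For the cooperation problem in multi-agent systems, a cooperation model, MACM, is proposed; by providing a flexible coordination mechanism, the model supports cooperation among multiple agents and learning during cooperation. The learning agents in the system use a distributed reinforcement learning algorithm. Through a mapping, the algorithm reduces the storage needed for the Q-value table and lowers the demand on system resources, while still guaranteeing convergence to the optimal solution.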

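1. Introduction. Learning is the main way humans acquire knowledge, a distinctive mark of human intelligence, and the basic route by which humans raise their level of intelligence. Building machines (agents) with human-like intelligence is the research goal of intelligent control and artificial intelligence. One way to give a machine a degree of intelligence is for people to program a knowledge base and an inference mechanism in advance, but this has obvious limitations. We want an agent to be able to learn from its environment, that is, to acquire knowledge automatically, accumulate experience, and continually update and extend its knowledge,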

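Parallel enterprise resource planning based on deep reinforcement learning   Total citations: 1, self-citations: 0, citations by others: 1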
秦蕊  曾帅  李娟娟  袁勇 《自动化学报》2017,43(9):1588-1596
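Traditional enterprise resource planning (ERP) follows a static approach to business-process design, overlooks the key role of people, and seldom involves a systematic process model, so it struggles to meet the complexity requirements of modern ERP. To realize a new paradigm for modern ERP, this paper works within the ACP framework (artificial societies, computational experiments, parallel execution), is driven by big data, and incorporates deep reinforcement learning to build an enterprise ERP system based on parallel management. It first constructs an overall ERP modeling framework based on multiple agents, then establishes a sequential game model for the entire enterprise ERP process, and finally uses a deep-reinforcement-learning neural network to search for the optimal strategy, addressing the uncertainty, diversity, and complexity faced by complex enterprise ERP.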

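The renewable generation, energy storage, and other energy resources in an active distribution network can effectively improve operational flexibility and reliability; at the same time, renewables and loads introduce a dual uncertainty into the distribution network, so real-time optimal dispatch of an active distribution network has a high decision dimension and poor modeling accuracy. To address this, the paper proposes a graph reinforcement learning method that combines graph neural networks with reinforcement learning and avoids precise modeling of the complex system. First, the real-time optimal dispatch problem is formulated as a Markov decision process, that is, as a dynamic sequential decision problem. Second, a graph representation based on physical connectivity is proposed to express the implicit correlations among state variables. Then, graph reinforcement learning is proposed to learn the optimal policy that maps the system state graph to the decision output. Finally, graph reinforcement learning is extended to distributed graph reinforcement learning. Case-study results show that graph reinforcement learning achieves better results in both optimality and efficiency.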

In a real-time database system, an application supports a mix of transactions. These include real-time transactions that must complete by a given deadline. Time-critical requirements also exist in many distributed multimedia applications. Existing concurrency control procedures introduce excessive delays because data resources are unavailable. In this study, we ignore the delays incurred by ordinary transactions in order to achieve a non-interference (near-parallel) mode of execution for the time-critical transactions. For this purpose, a data allocation model has been studied. It is a stochastic process model based on the use of two-phase locking. It highlights the possibilities available for reducing delays for time-critical transactions within a distributed real-time database system. Based on the new conceptual model, modified synchronization techniques for time-critical transactions are proposed.

As a complex and critical cyber-physical system (CPS), the hybrid electric powertrain is significant for mitigating air pollution and improving fuel economy. The energy management strategy (EMS) plays a key role in improving the energy efficiency of this CPS. This paper presents a novel parallel reinforcement learning (PRL) approach, based on a bidirectional long short-term memory (LSTM) network, to construct an EMS for a hybrid tracked vehicle (HTV). The method has two levels. The higher level first establishes a parallel system, which includes a real powertrain system and an artificial system. Then, the synthesized data from this parallel system is used to train a bidirectional LSTM network. The lower level determines the optimal EMS using the trained action state function in the model-free reinforcement learning (RL) framework. PRL is a fully data-driven and learning-enabled approach that does not depend on any prediction or predefined rules. Finally, real vehicle testing is carried out, and the relevant experimental data is collected and calibrated. Experimental results validate that the proposed EMS achieves a considerable energy efficiency improvement compared with the conventional RL approach and deep RL.

By virtue of the alternating direction method of multipliers (ADMM), the Newton-Raphson method, the ratio consensus approach, and the running sum method, two distributed iterative strategies are presented in this paper to address the economic dispatch problem (EDP) in power systems. Unlike most existing distributed ED approaches, which neglect the effects of packet drops and/or time delays, this paper takes into account both packet drops and time delays, which occur frequently in communication networks. Moreover, our algorithms consider directed and possibly unbalanced graphs, over which many distributed approaches fail to converge. Furthermore, the proposed schemes can address the EDP with local generator constraints and nonquadratic convex cost functions, not just the quadratic ones required by some existing ED approaches. Both theoretical analyses and simulation studies are provided to demonstrate the effectiveness of the proposed schemes.
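The ratio consensus idea mentioned in this abstract can be sketched in its basic form: each node keeps two running values, splits both equally among its out-neighbors and itself, and the ratio of the two converges to the network average even when the directed graph is unbalanced. This is only the textbook building block under ideal communication; the packet-drop and delay handling via running sums, and the ADMM and Newton-Raphson parts of the paper, are not shown. The function name and the example graph are hypothetical.

```python
def ratio_consensus(values, out_neighbors, iterations=200):
    """Estimate the average of `values` at every node of a directed graph.

    out_neighbors[j] lists the nodes that node j sends to (excluding itself).
    The graph is assumed to be strongly connected.
    """
    n = len(values)
    y = list(values)   # numerator states
    z = [1.0] * n      # denominator states
    for _ in range(iterations):
        new_y = [0.0] * n
        new_z = [0.0] * n
        for j in range(n):
            targets = list(out_neighbors[j]) + [j]  # always keep a share locally
            share = 1.0 / len(targets)              # column-stochastic weights
            for i in targets:
                new_y[i] += share * y[j]
                new_z[i] += share * z[j]
        y, z = new_y, new_z
    return [yi / zi for yi, zi in zip(y, z)]


if __name__ == "__main__":
    # Unbalanced digraph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
    graph = [[1, 2], [2], [0]]
    print(ratio_consensus([10.0, 20.0, 60.0], graph))  # each entry close to 30
```

Plain averaging with these weights would converge to a weighted rather than a true average on an unbalanced graph; dividing by the second state corrects for that imbalance.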

Reinforcement learning (RL) has roots in dynamic programming and is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, and the main results for discrete-time systems and continuous-time systems are surveyed in turn. Then, research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, covering event-based design, robust stabilization, and game design. Moreover, extensions of ADP for addressing control problems in complex environments, which have attracted enormous attention, are covered. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they significantly advance the ADP formulation. Finally, several typical control applications of RL and ADP are summarized, particularly in wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey of ADP and RL for advanced control applications demonstrates their remarkable potential in the artificial intelligence era and their role in promoting environmental protection and industrial intelligence.

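Development and prospects of learning control theory for nonlinear systems   Total citations: 6, self-citations: 0, citations by others: 6
The basic theoretical issues of learning control are discussed, and basic definitions related to learning control systems are given. The discussion focuses on the historical background from which learning control methods emerged and on the current state of research on learning control of nonlinear systems, and it raises some problems that call for further study.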

In this paper, both output-feedback iterative learning control (ILC) and repetitive learning control (RLC) schemes are proposed for trajectory tracking of nonlinear systems with state-dependent time-varying uncertainties. An iterative learning controller, together with a state observer and a fully saturated learning mechanism, is designed through Lyapunov-like synthesis to deal with time-varying parametric uncertainties. Output estimates, rather than the system outputs themselves, are used to form the error equation, which helps to establish convergence of the system outputs to the desired ones. This method is then extended to repetitive learning controller design. The boundedness of all signals in the closed loop is guaranteed, and asymptotic convergence of both the state estimation error and the tracking error is established for both ILC and RLC. Numerical results are presented to verify the effectiveness of the proposed methods.

徐昕  沈栋  高岩青  王凯 《自动化学报》2012,38(5):673-687
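Learning control of dynamical systems based on Markov decision processes (MDPs) has in recent years become an interdisciplinary research direction spanning machine learning, control theory, and operations research. Its main goal is data-driven multi-stage optimal control of systems whose models are complex or uncertain. This paper surveys the frontier of theory, algorithms, and applications of MDP-based learning control of dynamical systems, focusing on research progress in the theory and methods of reinforcement learning (RL) and approximate dynamic programming (ADP), including temporal-difference learning theory, value-function approximation methods for MDPs with continuous state and action spaces, direct policy search and approximate policy iteration, and adaptive critic design algorithms. Finally, applications and development trends in related research areas are analyzed and discussed.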
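Temporal-difference learning, one of the topics this survey lists, can be shown in its simplest tabular form, TD(0) policy evaluation. The sketch below is a generic textbook version and not a method from the survey; the three-state chain and the function name `td0_evaluate` are made up for the example.

```python
def td0_evaluate(episodes, n_states, alpha=0.1, gamma=0.9):
    """Tabular TD(0): estimate state values from observed transitions.

    Each episode is a list of (state, reward, next_state) tuples;
    next_state is None when the episode terminates.
    """
    v = [0.0] * n_states
    for episode in episodes:
        for state, reward, next_state in episode:
            bootstrap = 0.0 if next_state is None else v[next_state]
            td_error = reward + gamma * bootstrap - v[state]
            v[state] += alpha * td_error  # move the estimate toward the TD target
    return v


if __name__ == "__main__":
    # Deterministic chain 0 -> 1 -> 2 -> terminal, reward 1 on the last step.
    chain = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]
    values = td0_evaluate([chain] * 500, n_states=3)
    print([round(x, 4) for x in values])  # approaches [0.81, 0.9, 1.0]
```

For large or continuous state spaces the table `v` is replaced by a function approximator, which is where the value-function approximation methods mentioned in the abstract come in.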

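The secure control problem for cyber-physical systems whose control signals are maliciously tampered with is studied. First, a kernel extreme learning machine with an improved fruit fly optimization algorithm (IFOA-KELM) is proposed to reconstruct the attack signal. The reconstructed signal is then compensated for as a system disturbance, a model predictive control strategy is designed, and conditions under which the controlled system is input-to-state stable are given. In addition, from the attacker's perspective, an optimization model is built to obtain the optimal attack strategy, which is used to generate sufficient attacked data; IFOA-KELM is trained on this data. Finally, simulations on a spring-mass-damper system verify the effectiveness of the IFOA-KELM algorithm and of the proposed secure control strategy.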

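Iterative learning control for a class of linear discrete-time switched systems   Total citations: 1, self-citations: 0, citations by others: 1
The iterative learning control (ILC) problem for linear discrete-time switched systems with arbitrary switching sequences is considered. Assuming that the switched system operates repeatedly over a finite time interval, a P-type ILC algorithm can achieve perfect tracking control of such systems over the entire interval. The super-vector approach is used to derive conditions for convergence of the algorithm in the iteration domain, and its convergence is analyzed theoretically. A simulation example verifies the theoretical results.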
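A minimal sketch of P-type ILC on a scalar switched system may help make the update rule concrete. The plant x(t+1) = a·x(t) + b·u(t), y(t) = x(t), the two modes, the switching sequence, and the learning gain below are all invented for illustration, and the switching sequence is assumed to be the same in every trial. The update is the usual P-type law u_{k+1}(t) = u_k(t) + L·e_k(t+1); for this scalar case the lifted (super-vector) error map is lower triangular with diagonal entries 1 − L·b, so a gain with |1 − L·b| < 1 in every mode is chosen.

```python
def run_trial(u, modes, sigma, x0=0.0):
    """Simulate one trial; returns y(1), ..., y(T) for inputs u(0), ..., u(T-1)."""
    x, y = x0, []
    for t, ut in enumerate(u):
        a, b = modes[sigma[t]]  # active mode at time t
        x = a * x + b * ut
        y.append(x)             # output equals the state in this toy plant
    return y


def p_type_ilc(reference, modes, sigma, gain, iterations):
    """Repeat trials, correcting each input with the error one step ahead."""
    u = [0.0] * len(reference)
    max_errors = []
    for _ in range(iterations):
        y = run_trial(u, modes, sigma)
        e = [r - yt for r, yt in zip(reference, y)]     # e[t] is e_k(t+1)
        max_errors.append(max(abs(v) for v in e))
        u = [ut + gain * et for ut, et in zip(u, e)]   # P-type update
    return u, max_errors


if __name__ == "__main__":
    modes = {0: (0.5, 1.0), 1: (-0.3, 0.8)}   # hypothetical (a, b) pairs
    sigma = [0, 1, 1, 0, 1, 0, 0, 1]          # hypothetical switching sequence
    u, errs = p_type_ilc([1.0] * 8, modes, sigma, gain=0.9, iterations=60)
    print("first-trial error:", errs[0], "last-trial error:", errs[-1])
```

With gain 0.9 the diagonal entries are 0.1 and 0.28, so the tracking error shrinks from trial to trial even though the plant model is never used by the learning law itself.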

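Digital twins and parallel systems: state of the art, comparisons, and prospects   Total citations: 10, self-citations: 0, citations by others: 10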
杨林瑶  陈思远  王晓  张俊  王成红 《自动化学报》2019,45(11):2001-2031
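With the development of technologies such as the Internet of Things, big data, and artificial intelligence (AI), and in response to the need to promote deep integration of new-generation information technology with manufacturing and to achieve interaction and fusion between the physical and information worlds of manufacturing, digital twin and parallel system technologies have become hot topics in research on intelligent manufacturing and on the management and control of complex systems. This paper studies and summarizes the basic concepts, technical content, and related applications of digital twins and parallel systems, compares their similarities and differences, and analyzes their development trends, in the hope of providing a useful reference for researchers in the management and control of complex systems.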

