Reinforcement learning, as a form of autonomous learning, is driving artificial intelligence (AI) toward practical applications. Having demonstrated significant improvements over synchronous parallel learning, the asynchronous advantage actor-critic (A3C) algorithm, built on parallel computing, opens a new door for reinforcement learning. Unfortunately, the influence of such acceleration on A3C's robustness has been largely overlooked. In this paper, we perform the first robustness assessment of A3C based on parallel computing. By perceiving the policy's actions, we construct a global matrix of action probability deviation and define two novel measures, skewness and sparseness, which together form an integral robustness measure. Building on this static assessment, we then develop a dynamic robustness-assessment algorithm through situational whole-space state sampling over changing episodes. Extensive experiments with different combinations of agent number and learning rate on an A3C-based pathfinding application demonstrate that the proposed assessment effectively measures the robustness of A3C, achieving an accuracy of 83.3%.
Tong Chen, Ji-Qiang Liu, He Li, Shuo-Ru Wang, Wen-Jia Niu, En-Dong Tong, Liang Chang, Qi Alfred Chen, Gang Li
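To make the integral robustness measure concrete, the following is a minimal Python sketch of computing skewness and sparseness statistics over an action-probability deviation matrix and combining them into a single score. The abstract does not give the exact definitions, so the sample-skewness formula, Hoyer's sparseness measure, the weight alpha, and the function names are illustrative assumptions, not the paper's formulation.

import numpy as np

def skewness(D):
    # Sample skewness of all deviation entries (illustrative definition).
    x = D.ravel()
    mu, sigma = x.mean(), x.std()
    return np.mean(((x - mu) / (sigma + 1e-12)) ** 3)

def hoyer_sparseness(D):
    # Hoyer's sparseness in [0, 1]; 1 means a single nonzero entry.
    # An assumed stand-in for the paper's sparseness measure.
    x = D.ravel()
    n = x.size
    l1 = np.abs(x).sum()
    l2 = np.sqrt((x ** 2).sum()) + 1e-12
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

def robustness_score(D, alpha=0.5):
    # Illustrative combination: higher when deviations are symmetric
    # (low |skewness|) and spread out rather than concentrated.
    return alpha / (1.0 + abs(skewness(D))) \
        + (1.0 - alpha) * (1.0 - hoyer_sparseness(D))

# Usage: deviation between action distributions before/after a perturbation.
P_ref = np.random.dirichlet(np.ones(4), size=100)  # baseline policy actions
P_new = np.random.dirichlet(np.ones(4), size=100)  # perturbed policy actions
D = P_new - P_ref                                  # global deviation matrix
print(robustness_score(D))

Here each row of D is the per-state deviation of the action probability vector; the score could then be tracked across episodes for the dynamic assessment described above.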
To improve the accuracy of moving object detection against complex dynamic backgrounds, this paper proposes a group-sparsity-based moving object detection method built on low-rank and sparse decomposition theory. The proposed method decomposes the observed video into three parts: a low-rank static background, a group-sparse foreground, and a dynamic background. First, the gamma norm is used as a nearly unbiased approximation of the matrix rank function, addressing the problem that the nuclear norm over-penalizes large singular values, which prevents the resulting minimization problem from reaching an optimal solution and degrades detection performance. Second, to exploit the boundary priors of foreground objects and improve detection performance, an over-segmentation algorithm is applied to each frame to generate homogeneous regions, which define a group-sparse norm used to constrain the foreground matrix. Third, to prevent moving objects from appearing simultaneously in the sparse foreground and the dynamic background, an incoherence term is introduced to improve their separability. Finally, the resulting nonconvex objective function is solved with the alternating direction method of multipliers (ADMM). Experimental results show that, compared with mainstream moving object detection algorithms, the proposed method better suppresses dynamic backgrounds and thus significantly improves the accuracy of moving object detection in complex dynamic scenes.
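A sketch of the kind of objective this abstract describes is given below in LaTeX. The gamma-norm rank surrogate, the group-sparse foreground term, and the incoherence penalty follow the abstract; the specific dynamic-background regularizer $\Phi$, the inner-product form of the incoherence term, and the trade-off weights $\lambda_1$, $\lambda_2$, $\beta$ are assumptions for illustration and may differ from the paper's exact formulation.

\min_{B,\,F,\,E}\ \|B\|_{\gamma}
  \;+\; \lambda_{1} \sum_{g \in \mathcal{G}} \|F_{g}\|_{2}
  \;+\; \lambda_{2}\,\Phi(E)
  \;+\; \beta\,\langle |F|,\, |E| \rangle
\quad \text{s.t.}\quad O = B + F + E

Here $O$ stacks the vectorized video frames, $B$ is the low-rank static background with the gamma norm $\|\cdot\|_{\gamma}$ as a nearly unbiased rank surrogate, $F$ is the foreground constrained by a group-sparse norm over the homogeneous regions $\mathcal{G}$ produced by per-frame over-segmentation, $E$ is the dynamic background, and the last term penalizes overlap between $F$ and $E$. ADMM would then be applied after introducing auxiliary variables to split the coupled terms of this nonconvex problem.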