
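Self-supervised optical flow computation with an attention mechanism
Authors:AN Feng  DAI Jun  HAN Zhen  YAN Zhong-xing
Affiliation:1. School of Artificial Intelligence, Suzhou Industrial Park Institute of Services Outsourcing, Suzhou, Jiangsu 215123; 2. School of Economics and Management, Tongji University, Shanghai 210092
Funding:National Natural Science Foundation of China (71272048); Jiangsu Province University "Qinglan Project" Outstanding Teaching Team Program (Su Jiao Shi Han [2020] No. 10)
Abstract:Optical flow computation is a key module of many computer vision systems and is widely used in action recognition, robot localization and navigation, and other fields. End-to-end optical flow computation, however, is still limited by a shortage of data sources; optical flow data from real scenes is especially hard to obtain. Synthetic optical flow data makes up the vast majority of what is available, and because synthetic data cannot fully reflect real scenes (such as swaying leaves or pedestrian reflections), problems such as overfitting are hard to avoid. Unsupervised or self-supervised methods can train on massive amounts of video data, removing the dependence on datasets, and are an effective way to address the shortage of datasets. On this basis, a self-supervised learning network for optical flow computation is built, in which the "Teacher" and "Student" modules integrate a recent optical flow network, the sparse correlation volume (SCV) network, to reduce redundant computation; an attention model is also introduced as a node of the network to improve the dimensional attributes of image features along the channel and spatial dimensions. With SCV and the attention mechanism integrated into the self-supervised optical flow network, test results on the KITTI 2015 dataset match or exceed those of common supervised networks.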

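Keywords:optical flow computation  self-supervised learning  convolutional block attention module  spatial/channel attention  sparse correlation volume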

Self-supervised optical flow estimation with attention module
Authors:AN Feng  DAI Jun  HAN Zhen  YAN Zhong-xing
Affiliation:1. School of Artificial Intelligence, Suzhou Industrial Park Institute of Services Outsourcing, Suzhou Jiangsu 215123, China;2. School of Economics & Management, Tongji University, Shanghai 210092, China
Abstract:Optical flow estimation is a key module of many computer vision systems and is widely used in action recognition, robot positioning, and navigation. However, labeled optical flow datasets of real scenes are scarce, so synthetic datasets serve as the main source of training data, and synthetic data cannot fully represent real scenes (such as leaf movement and pedestrian reflections). Unsupervised or self-supervised methods can train on large amounts of video data and can also facilitate fine-tuning of supervised training, which makes them an effective way to address the lack of datasets. In this paper, a self-supervised learning network for optical flow estimation is constructed, in which the “Teacher” module and the “Student” module adopt the sparse correlation volume (SCV) network to reduce redundant correlation computation, and an attention model is introduced as a node of the network to enhance image features along the channel and spatial dimensions. This paper marks the first endeavor to implement a self-supervised optical flow network based on SCV. On the KITTI 2015 dataset, the test results match or outperform those of common supervised networks such as FlowNet and LightFlowNet.
Keywords:optical flow estimation  self-supervised learning  convolutional block attention module  spatial/channel attention  sparse correlation volume
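
The abstract says that the sparse correlation volume reduces redundant correlation computation but does not spell out how. As a rough illustration only, the sketch below assumes one common way of making a correlation volume sparse: for every position in the first frame, score it against every position in the second frame and keep only the k strongest matches. The function names (`sparse_correlation`, `best_displacement`), the dictionary-based feature layout, and the toy feature values are all hypothetical and are not taken from the paper.

```python
from heapq import nlargest


def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def sparse_correlation(feat1, feat2, k):
    """Keep only the k highest-scoring matches for each position in feat1.

    feat1, feat2: dict mapping (row, col) -> feature vector (list of floats).
    Returns a dict mapping (row, col) -> list of (score, (row2, col2)),
    sorted from the strongest match to the weakest.
    Assumption: correlation is a plain dot product between feature vectors.
    """
    sparse = {}
    for p1, f1 in feat1.items():
        scores = ((dot(f1, f2), p2) for p2, f2 in feat2.items())
        sparse[p1] = nlargest(k, scores)
    return sparse


def best_displacement(p1, matches):
    """Displacement (d_row, d_col) from p1 to its strongest match."""
    _, p2 = matches[0]
    return (p2[0] - p1[0], p2[1] - p1[1])


if __name__ == "__main__":
    # Toy 1x2 and 1x3 feature maps with 2-dimensional features (made-up values).
    frame1 = {(0, 0): [1.0, 0.0], (0, 1): [0.0, 1.0]}
    frame2 = {(0, 0): [0.0, 1.0], (0, 1): [1.0, 0.0], (0, 2): [0.5, 0.5]}
    volume = sparse_correlation(frame1, frame2, k=2)
    print(volume[(0, 0)])                             # [(1.0, (0, 1)), (0.5, (0, 2))]
    print(best_displacement((0, 0), volume[(0, 0)]))  # (0, 1)
```

Note that this sketch still evaluates every pairwise score before discarding the weak ones, so the saving here is in what gets stored and passed on to later stages, not in the scoring itself.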