Video object segmentation method based on dual pyramid network

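Cite this article:	JIANG Sihao, SONG Huihui, ZHANG Kaihua, TANG Runfa. Video object segmentation method based on dual pyramid network[J]. Journal of Computer Applications, 2019, 39(8): 2242-2246.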

Authors:	JIANG Sihao, SONG Huihui, ZHANG Kaihua, TANG Runfa

Affiliation:	Jiangsu Key Laboratory of Big Data Analysis Technology (Nanjing University of Information Science and Technology), Nanjing 210044 (all four authors)

Funding:	National Natural Science Foundation of China (61872189, 61876088); Natural Science Foundation of Jiangsu Province (BK20170040).

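Abstract:	To address the difficulty of segmenting a specific object in complex video scenes, a video object segmentation method based on a Dual Pyramid Network (DPN) is proposed. First, a single forward pass of a modulation network adapts the segmentation model to the appearance of the specific object. Specifically, a modulator is learned from the visual and spatial information of the given object, and it adjusts the intermediate layers of the segmentation network so that they adapt to appearance changes of that object. Next, global context information is aggregated in the last layer of the segmentation network by a context aggregation method based on different regions. Finally, a left-to-right structure with lateral connections builds high-level semantic feature maps at all scales. The proposed method is a segmentation network that can be trained end to end. Extensive experiments show that on the DAVIS2016 dataset the method achieves results competitive with state-of-the-art methods that use online fine-tuning, and that it performs better on the DAVIS2017 dataset.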
Keywords:	video object segmentation; feature pyramid; convolutional neural network; deep learning; multi-scale fusion
Received:	2019-01-02
Revised:	2019-03-14
Video object segmentation method based on dual pyramid network

JIANG Sihao, SONG Huihui, ZHANG Kaihua, TANG Runfa. Video object segmentation method based on dual pyramid network[J]. Journal of Computer Applications, 2019, 39(8): 2242-2246.

Authors:	JIANG Sihao, SONG Huihui, ZHANG Kaihua, TANG Runfa

Affiliation:	Jiangsu Key Laboratory of Big Data Analysis Technology (Nanjing University of Information Science and Technology), Nanjing, Jiangsu 210044, China

Abstract:	To address the difficulty of segmenting a specific object in complex video scenes, a video object segmentation method based on a Dual Pyramid Network (DPN) was proposed. Firstly, a single forward pass of a modulation network was used to adapt the segmentation model to the appearance of a specific object: a modulator was learned from the visual and spatial information of the target object and used to modulate the intermediate layers of the segmentation network, so that the network adapts to appearance changes of that object. Secondly, global context information was aggregated in the last layer of the segmentation network by a context aggregation method based on different regions. Finally, a left-to-right architecture with lateral connections was developed to build high-level semantic feature maps at all scales. The proposed video object segmentation method is a network that can be trained end-to-end. Extensive experimental results show that on the DAVIS2016 dataset the proposed method achieves results competitive with state-of-the-art methods that use online fine-tuning, and that it outperforms other methods on the DAVIS2017 dataset.

Keywords:	video object segmentation; feature pyramid; Convolutional Neural Network (CNN); deep learning; multi-scale fusion
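
Illustration:	The modulation and region-based context aggregation steps named in the abstract can be pictured with a small sketch. Everything here is an assumption made for illustration: the abstract does not state how the modulator acts on intermediate layers, so the sketch assumes a per-channel scale (from visual information) plus a per-pixel bias (from spatial information), and it assumes region-based aggregation means averaging over grids of several sizes. The function names `modulate`, `region_average`, and `aggregate_context` are hypothetical and do not come from the paper.

```python
# Hypothetical sketch: names, shapes, and the scale-plus-bias form are
# assumptions for illustration, not taken from the paper.

def modulate(feature, channel_scale, spatial_bias):
    """Scale each channel by a per-channel weight, then add a per-pixel bias.

    feature:       list of C channels, each an H x W list of floats
    channel_scale: list of C floats (assumed output of a visual modulator)
    spatial_bias:  H x W list of floats (assumed output of a spatial modulator)
    """
    return [
        [
            [scale * value + spatial_bias[y][x] for x, value in enumerate(row)]
            for y, row in enumerate(channel)
        ]
        for channel, scale in zip(feature, channel_scale)
    ]


def region_average(channel, bins):
    """Average-pool one H x W channel into a bins x bins grid of region means.

    Assumes H >= bins and W >= bins so that no region is empty.
    """
    height, width = len(channel), len(channel[0])
    pooled = []
    for by in range(bins):
        y0, y1 = by * height // bins, (by + 1) * height // bins
        row = []
        for bx in range(bins):
            x0, x1 = bx * width // bins, (bx + 1) * width // bins
            cells = [channel[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def aggregate_context(channel, bin_sizes=(1, 2)):
    """Collect region means at several grid sizes for one channel."""
    return {bins: region_average(channel, bins) for bins in bin_sizes}


feature = [[[1.0, 2.0], [3.0, 4.0]]]  # C=1, H=2, W=2
modulated = modulate(feature, [2.0], [[0.5, 0.5], [0.0, 0.0]])
context = aggregate_context(modulated[0])
```

In this toy input the 1 x 1 grid holds the global mean of the modulated channel, while the 2 x 2 grid keeps one value per pixel; a real network would presumably upsample and concatenate such pooled maps with the original features, which the sketch leaves out.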
This article is indexed in databases including VIP and Wanfang Data.