CNN image caption generation

Authors:	LI Yong, CHENG Honghong, LIANG Xinyan, GUO Qian, QIAN Yuhua

Affiliation:	1. Research Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China; 2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China; 3. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China

Abstract:	The image caption generation task requires generating a meaningful sentence that accurately describes the content of an image. Existing research usually uses a convolutional neural network to encode image information and a recurrent neural network to encode text information; the serial nature of the recurrent neural network results in low performance. To solve this problem, the proposed model is based entirely on convolutional neural networks: different convolutional neural networks process the data of the two modalities simultaneously. Benefiting from the parallel nature of the convolution operation, computational efficiency is significantly improved. Experiments on two public data sets also show improvements on the specified evaluation metrics, which indicates the effectiveness of the model for the image caption generation task.
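The contrast between serial recurrence and parallel convolution on the text side can be illustrated with a minimal sketch. It assumes a causal 1-D convolution over word embeddings, a common choice in convolutional language models; the abstract does not state which convolution variant the authors use, and the function name, shapes, and kernel size here are hypothetical.

```python
import numpy as np

def causal_conv1d(x, w, b):
    """Causal 1-D convolution over a token sequence (illustrative only).

    x: (T, d_in) word embeddings, w: (k, d_in, d_out) kernel, b: (d_out,) bias.
    Output row t depends only on x[t-k+1 .. t], so no position sees later words.
    """
    k = w.shape[0]
    T, d_in = x.shape
    # Left-pad with k-1 zero rows so the first outputs have a full window.
    padded = np.concatenate([np.zeros((k - 1, d_in)), x], axis=0)
    # Gather all T windows at once: shape (T, k, d_in).
    # The loop runs over the small kernel size k, not over time steps.
    windows = np.stack([padded[i:i + T] for i in range(k)], axis=1)
    # One contraction computes every time step together, unlike an RNN,
    # where step t must wait for the hidden state of step t-1.
    return np.einsum("tki,kio->to", windows, w) + b

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))      # 6 tokens, 4-dim embeddings (assumed sizes)
w = rng.normal(size=(3, 4, 5))   # kernel size 3, 5 output channels
b = np.zeros(5)
y = causal_conv1d(x, w, b)       # shape (6, 5)
```

Because every output position is produced by the same contraction, the whole caption prefix can be processed in one pass during training, which is the kind of parallelism the abstract credits for the efficiency gain.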

Keywords:	multi-modal data; image caption; long short term memory; neural networks

Source:	Journal of Xidian University (《西安电子科技大学学报》)