1.

Attention-guided image captioning with adaptive global and local feature fusion

《Journal of Visual Communication and Image Representation》2021

2.

A heuristic framework for perceptual saliency prediction

《Journal of Visual Communication and Image Representation》2020

Saliency prediction can be regarded as a spontaneous human activity. The most effective saliency model should closely approximate the response of viewers to the perceived information. In this paper, we exploit the perceptual response for saliency detection and propose a heuristic framework to predict salient regions. First, to find the perceptually meaningful salient regions, an orientation-selectivity-based local feature and a visual-acuity-based global feature are proposed to jointly predict candidate salient regions. Subsequently, to further boost the accuracy of the saliency map, we introduce a visual-error-sensitivity-based operator to activate the meaningful salient regions from a local and global perspective. In addition, an adaptive fusion method based on the free energy principle is designed to combine the sub-saliency maps from each image channel to obtain the final saliency map. Experimental results on five natural and emotional datasets demonstrate the superiority of the proposed method compared to twelve state-of-the-art algorithms.

3.

Spatiotemporal saliency for video classification

《Signal Processing: Image Communication》2009,24(7):557-571

Computer vision applications often need to process only a representative part of the visual input rather than the whole image/sequence. Considerable research has been carried out into salient region detection methods based either on models emulating human visual attention (VA) mechanisms or on computational approximations. Most of the proposed methods are bottom-up and their major goal is to filter out redundant visual information. In this paper, we propose and elaborate on a saliency detection model that treats a video sequence as a spatiotemporal volume and generates a local saliency measure for each visual unit (voxel). This computation involves an optimization process incorporating inter- and intra-feature competition at the voxel level. Perceptual decomposition of the input, spatiotemporal center-surround interactions and the integration of heterogeneous feature conspicuity values are described and an experimental framework for video classification is set up. This framework consists of a series of experiments that show the effect of saliency on classification performance and let us draw conclusions on how well the detected salient regions represent the visual input. A comparison is attempted that shows the potential of the proposed method.

4.

Analysis of visual search patterns with EMD metric in normalized anatomical space

Dempere-Marco L, Hu XP, Ellis SM, Hansell DM, Yang GZ 《IEEE transactions on medical imaging》2006,25(8):1011-1021

Eye movements provide important insight into the cognitive processes underlying the visual search tasks. For image understanding, although the visual search patterns of different observers while studying the same scene bear some common characteristics, the idiosyncrasy associated with individual observers provides both research opportunities and challenges. The aim of this paper is to study the spatial characteristics of visual search, together with the intrinsic visual features of the fixation points for comparing different visual search strategies. An analysis framework based on earth mover's distance (EMD) in normalized anatomical space is proposed, and the results are demonstrated with high resolution computed tomography (HRCT) images of the lungs. The study shows that through the effective use of both spatial and feature space representation, it is possible to untangle what appear to be uncorrelated fixation distribution patterns to reveal common visual search behaviors.
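The abstract does not give the EMD formulation it uses, so the following is only a simplified illustration of the metric itself. It assumes two fixation histograms over the same equally spaced one-dimensional bins, in which case the earth mover's distance reduces to the sum of absolute differences between the two cumulative distributions. The paper's setting (fixations in a normalized anatomical space combined with visual features) is multi-dimensional and would need a general transport solver; the function name `emd_1d` is hypothetical.

```python
from itertools import accumulate
from typing import Sequence


def emd_1d(p: Sequence[float], q: Sequence[float]) -> float:
    """Earth mover's distance between two 1D histograms on the same bins.

    Assumption: bins are equally spaced with unit ground distance, so the
    EMD equals the L1 distance between the normalized cumulative sums.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    # Normalize each histogram to unit mass, then accumulate into a CDF.
    cdf_p = accumulate(x / total_p for x in p)
    cdf_q = accumulate(x / total_q for x in q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


# Hypothetical fixation counts of two observers over three image strips.
observer_a = [4, 0, 0]
observer_b = [0, 0, 4]
distance = emd_1d(observer_a, observer_b)  # all mass moves two bins
```

Because both histograms are normalized first, observers with different numbers of fixations can still be compared.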

5.

Image retrieval method based on deep convolutional neural networks and binary hash learning

彭天强, 栗芳 《电子与信息学报》2016,38(8):2068-2075

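With the rapid growth of image data, the visual feature encoding steps used by current mainstream image retrieval methods are fixed and lack learning ability, which leads to weak image representation; moreover, the high dimensionality of the visual features severely limits retrieval performance. To address these problems, this paper proposes a method that learns binary hash codes with a deep convolutional neural network for large-scale image retrieval. The basic idea is to add a hash layer to the deep learning framework so that image features and hash functions are learned simultaneously, with the hash functions constrained to be independent and to minimize quantization error. First, the strong learning ability of the convolutional neural network is used to mine the latent relationships within the training images and to extract deep image features, improving their discriminability and representational power. Then, the image features are fed into the hash layer, and hash functions are learned so that the binary hash codes output by the hash layer minimize classification error and quantization error while satisfying the independence constraint. Finally, a given input image is passed through the hash layer of the framework to obtain its hash code, so that large-scale image data can be retrieved efficiently in a low-dimensional Hamming space. Experimental results on three commonly used datasets show that the hash codes obtained by the proposed method yield better retrieval performance than current mainstream methods.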
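The final retrieval step, ranking database images by Hamming distance between binary codes, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the hash layer outputs activations in [0, 1] that are thresholded at 0.5, and the names `binarize`, `hamming`, and `retrieve` are hypothetical. The network that produces the activations is outside the scope of the sketch.

```python
from typing import List, Sequence, Tuple


def binarize(activations: Sequence[float], threshold: float = 0.5) -> int:
    """Threshold hash-layer activations and pack the bits into an integer.

    Assumption: activations lie in [0, 1]; the 0.5 threshold is illustrative.
    """
    code = 0
    for a in activations:
        code = (code << 1) | (1 if a >= threshold else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query: int, database: List[int], k: int = 3) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs of the k codes closest to the query."""
    ranked = sorted((hamming(query, code), i) for i, code in enumerate(database))
    return [(i, d) for d, i in ranked[:k]]


# Hypothetical 4-bit codes for three database images and one query image.
database = [0b1010, 0b0101, 0b1000]
query = binarize([0.9, 0.1, 0.8, 0.2])
top = retrieve(query, database, k=2)
```

Packing codes into integers keeps the distance computation to one XOR and a bit count, which is the reason compact binary codes make large-scale search cheap.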

6.

A saliency detection method based on the RGB color space and random rectangular regions

杨波, 姚志均, 王金武 《舰船电子对抗》2013,(5):51-55

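A saliency detection method based on the RGB color space and random rectangular regions is proposed. The method uses R, G, and B as image features, randomly generates rectangular regions of different positions and sizes, computes for each rectangle the distance between every pixel's feature value and the region's feature mean, and then combines all rectangles and all features to obtain the final saliency map. Because no color space conversion is needed, computation time is greatly reduced; in addition, brightness varies fairly consistently across the three RGB channels, so feature fusion can make full use of the information in all features, which gives better detection results. Experimental results show that the method detects salient regions in images more quickly and more effectively.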
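The steps described in this abstract can be sketched roughly as below. The abstract does not specify the distance measure, the number of rectangles, or how the results are combined, so this sketch assumes an absolute difference per channel, a simple sum over rectangles and channels, and a final normalization to [0, 1]. The function name `random_rectangle_saliency` and its parameters are hypothetical.

```python
import numpy as np


def random_rectangle_saliency(image, n_rects=200, seed=0):
    """Accumulate per-pixel deviation from the mean of random rectangles.

    Assumptions (not stated in the abstract): absolute difference as the
    distance, summation as the fusion rule, max-normalized output.
    """
    rng = np.random.default_rng(seed)
    img = np.asarray(image, dtype=np.float64)
    h, w, _ = img.shape
    saliency = np.zeros((h, w), dtype=np.float64)
    for _ in range(n_rects):
        # Draw two row and two column boundaries, then order them.
        y0, y1 = sorted(int(v) for v in rng.integers(0, h + 1, size=2))
        x0, x1 = sorted(int(v) for v in rng.integers(0, w + 1, size=2))
        if y1 == y0 or x1 == x0:
            continue  # skip empty rectangles
        patch = img[y0:y1, x0:x1, :]
        mean = patch.mean(axis=(0, 1))  # per-channel mean of the rectangle
        # Sum the R, G and B deviations so every channel contributes.
        saliency[y0:y1, x0:x1] += np.abs(patch - mean).sum(axis=2)
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency


# Synthetic example: a red square on a black background.
demo = np.zeros((16, 16, 3))
demo[6:10, 6:10, 0] = 255
demo_map = random_rectangle_saliency(demo, n_rects=300, seed=0)
```

A small region that differs from its surroundings deviates strongly from the mean of most rectangles that contain it, so it accumulates a higher average score than the background.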

7.

A Unified Relevance Feedback Framework for Web Image Retrieval (total citations: 1; self-citations: 0; citations by others: 1)

《IEEE transactions on image processing》2009,18(6):1350-1357

Although relevance feedback (RF) has been extensively studied in the content-based image retrieval community, no commercial Web image search engines support RF because of scalability, efficiency, and effectiveness issues. In this paper, we propose a unified relevance feedback framework for Web image retrieval. Our framework shows advantages over traditional RF mechanisms in the following three aspects. First, during the RF process, both textual features and visual features are used in a sequential way. To seamlessly combine textual feature-based RF and visual feature-based RF, a query concept-dependent fusion strategy is automatically learned. Second, the textual feature-based RF mechanism employs an effective search result clustering (SRC) algorithm to obtain salient phrases, based on which we could construct an accurate and low-dimensional textual space for the resulting Web images. Thus, we could integrate RF into Web image retrieval in a practical way. Last, a new user interface (UI) is proposed to support implicit RF. On the one hand, unlike traditional RF UI which forces users to make explicit judgments on the results, the new UI regards the users' click-through data as implicit relevance feedback in order to relieve the users' burden. On the other hand, unlike traditional RF UI which hardily substitutes subsequent results for previous ones, a recommendation scheme is used to help the users better understand the feedback process and to mitigate the possible waiting caused by RF. Experimental results on a database consisting of nearly three million Web images show that the proposed framework is wieldy, scalable, and effective.

8.

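Image classification method based on feature elements and association rules (total citations: 3; self-citations: 0; citations by others: 3)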

李勍, 章毓晋 《电子学报》2002,30(9):1262-1265

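Image classification is an important module in search engines. This paper proposes an image classification method based on feature elements. Compared with feature vectors, feature elements can extract the visual features of an image according to human subjective perception. Unlike traditional feature-vector-based image classification methods, the proposed method does not compute distances between feature vectors in feature space; instead, it uses association rule mining to discover the relationship between an image's feature elements and the category to which the image belongs. The algorithm was implemented and compared with NFL, a feature-vector-based image classification method. The experimental results confirm the superiority of the proposed method.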

9.

The use of visual search for knowledge gathering in image decision support (total citations: 1; self-citations: 0; citations by others: 1)

Dempere-Marco L, Hu XP, MacDonald SL, Ellis SM, Hansell DM, Yang GZ 《IEEE transactions on medical imaging》2002,21(7):741-754

This paper presents a new method of knowledge gathering for decision support in image understanding based on information extracted from the dynamics of saccadic eye movements. The framework involves the construction of a generic image feature extraction library, from which the feature extractors that are most relevant to the visual assessment by domain experts are determined automatically through factor analysis. The dynamics of the visual search are analyzed by using the Markov model for providing training information to novices on how and where to look for image features. The validity of the framework has been evaluated in a clinical scenario whereby the pulmonary vascular distribution on Computed Tomography images was assessed by experienced radiologists as a potential indicator of heart failure. The performance of the system has been demonstrated by training four novices to follow the visual assessment behavior of two experienced observers. In all cases, the accuracy of the students improved from near random decision making (33%) to accuracies ranging from 50% to 68%.
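As a rough illustration of modeling visual search dynamics with a Markov model, the sketch below estimates first-order transition probabilities between labeled image regions from recorded fixation sequences. The abstract does not state the model order or the state definition, so both are assumptions here, and the function name `transition_probabilities` and the region labels are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Sequence


def transition_probabilities(
    sequences: Iterable[Sequence[Hashable]],
) -> Dict[Hashable, Dict[Hashable, float]]:
    """Estimate first-order Markov transition probabilities by counting.

    Assumption: each sequence lists the image regions fixated in order.
    """
    counts: Dict[Hashable, Counter] = defaultdict(Counter)
    for seq in sequences:
        # Count each consecutive pair (current region -> next region).
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


# Hypothetical fixation sequences over regions labeled A, B and C.
expert_scans = [["A", "B", "A", "C"], ["A", "B", "B"]]
model = transition_probabilities(expert_scans)
```

A table like `model` gives, for each region, where an experienced observer tends to look next, which is the kind of "how and where to look" guidance the abstract describes providing to novices.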

10.

Web image concept annotation with better understanding of tags and visual features

Shenghua Gao, Liang-Tien Chia, Xiangang Cheng 《Journal of Visual Communication and Image Representation》2010,21(8):806-814

This paper focuses on improving the semi-manual method for web image concept annotation. By sufficiently studying the characteristics of tag and visual feature, we propose the Grouping-Based-Precision & Recall-Aided (GBPRA) feature selection strategy for concept annotation. Specifically, for visual features, we construct a more robust middle level feature by concatenating the k-NN results for each type of visual feature. For tag, we construct a concept-tag co-occurrence matrix, based on which the probability of an image belonging to certain concept can be calculated. By understanding the tags’ quality and groupings’ semantic depth, we propose a grouping based feature selection method; by studying the tags’ distribution, we adopt Precision and Recall as a complementary indicator for feature selection. In this way, the advantages of both tags and visual features are boosted. Experimental results show our method can achieve very high Average Precision, which greatly facilitates the annotation of large-scale web image dataset.

11.

Visual-attention-driven motion detection method based on chaos analysis

马龙, 王鲁平, 李飚, 沈振康 《信号处理》2010,26(12):1825-1832

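A visual-attention-driven motion detection method based on chaos analysis (MDSA) is proposed. MDSA first extracts the salient regions of an image using a visual attention mechanism and then applies chaos analysis to those regions to detect moving targets. The algorithm proceeds as follows: first, several visually sensitive low-level image features are extracted from the scene image; next, following feature integration theory, these features are fused into a saliency map that reflects the visual saliency of each location in the scene image; chaos analysis is then applied to the salient region containing the most salient image location to detect motion; finally, the next most salient region is selected according to the proximity-preference and inhibition-of-return principles and motion detection is performed on it, until all salient regions have been visited. To reduce computation, the paper improves on the traditional salient region extraction method: the neighborhood standard deviation replaces the center-surround operator for evaluating the local saliency of each image location, and salient-point clustering replaces the scale saliency criterion for extracting salient regions. The chaos analysis first determines whether the joint histogram (JH) of each salient region exhibits chaotic characteristics, then classifies the scatter points of each chaotic JH by fractal dimension using a fixed threshold, and finally maps the classification results back to the salient regions to achieve motion segmentation. MDSA achieves good motion segmentation and noise robustness; comparative experiments and an analysis of algorithmic cost show that MDSA outperforms the mosaic-based motion detection method (MDM).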
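One of the simplifications named in this abstract, using the neighborhood standard deviation as the local saliency measure, can be sketched as follows. The window size and border handling are not given in the abstract; this sketch assumes a square window and edge replication, and the function name `local_std_saliency` is hypothetical. It covers only this one step, not the clustering or chaos analysis.

```python
import numpy as np


def local_std_saliency(gray, radius=1):
    """Standard deviation of each pixel's (2*radius+1)^2 neighborhood.

    Assumptions: square window, borders padded by edge replication.
    """
    img = np.asarray(gray, dtype=np.float64)
    padded = np.pad(img, radius, mode="edge")
    k = 2 * radius + 1
    # View of shape (H, W, k, k): one k-by-k window per original pixel.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return windows.std(axis=(-2, -1))


# Synthetic example: one bright pixel on a flat background.
frame = np.zeros((5, 5))
frame[2, 2] = 9.0
std_map = local_std_saliency(frame, radius=1)
```

Flat areas score zero, while every window that contains the bright pixel scores the same positive value, so the response marks a small neighborhood around the intensity change.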

12.

Feature selection for content-based image retrieval

Esin Guldogan, Moncef Gabbouj 《Signal, Image and Video Processing》2008,2(3):241-250

In this article, we propose a novel system for feature selection, which is one of the key problems in content-based image indexing and retrieval as well as various other research fields such as pattern classification and genomic data analysis. The proposed system aims at enhancing semantic image retrieval results, decreasing retrieval process complexity, and improving the overall system usability for end-users of multimedia search engines. Three feature selection criteria and a decision method construct the feature selection system. Two novel feature selection criteria based on inner-cluster and intercluster relations are proposed in the article. A majority voting-based method is adapted for efficient selection of features and feature combinations. The performance of the proposed criteria is assessed over a large image database and a number of features, and is compared against competing techniques from the literature. Experiments show that the proposed feature selection system improves semantic performance results in image retrieval systems. This work was supported by the Academy of Finland, Project No. 213,462 (Finnish Centre of Excellence Program 2006–2011).

13.

Image feature localization by multiple hypothesis testing of Gabor features (total citations: 2; self-citations: 0; citations by others: 2)

Jarmo Ilonen, Joni-Kristian Kamarainen, Pekka Paalanen, Miroslav Hamouz, Josef Kittler, Heikki Kälviäinen 《IEEE transactions on image processing》2008,17(3):311-325

14.

Image annotation by input-output structural grouping sparsity

Han Y, Wu F, Tian Q, Zhuang Y 《IEEE transactions on image processing》2012,21(6):3066-3079

Automatic image annotation (AIA) is very important to image retrieval and image understanding. Two key issues in AIA are explored in detail in this paper, i.e., structured visual feature selection and the implementation of hierarchical correlated structures among multiple tags to boost the performance of image annotation. This paper simultaneously introduces an input and output structural grouping sparsity into a regularized regression model for image annotation. For input high-dimensional heterogeneous features such as color, texture, and shape, different kinds (groups) of features have different intrinsic discriminative power for the recognition of certain concepts. The proposed structured feature selection by structural grouping sparsity can be used not only to select group-of-features but also to conduct within-group selection. Hierarchical correlations among output labels are well represented by a tree structure, and therefore, the proposed tree-structured grouping sparsity can be used to boost the performance of multitag image annotation. In order to efficiently solve the proposed regression model, we relax the solving process as a framework of the bilayer regression model for multilabel boosting by the selection of heterogeneous features with structural grouping sparsity (Bi-MtBGS). The first-layer regression is to select the discriminative features for each label. The aim of the second-layer regression is to refine the feature selection model learned from the first layer, which can be taken as a multilabel boosting process. Extensive experiments on public benchmark image data sets and real-world image data sets demonstrate that the proposed approach has better performance of multitag image annotation and leads to a quite interpretable model for image understanding.

15.

Clustering of hierarchical image database to reduce inter-and intra-semantic gaps in visual space for finding specific image semantics

《Journal of Visual Communication and Image Representation》2016

Empowering content-based systems to assign image semantics is an interesting concept. This work explores a semantically categorized image database and forms a hierarchical visual search space. Overlap between the visual features of images from different categories and subcategories is a possible reason behind inter-semantic and intra-semantic gaps. Usually each category/node in the image database has a single representation, but the variability and breadth of a semantic limit the usefulness of such a representation. This work explores the application of agglomerative hierarchical clustering to automatically identify groups within a semantic in the visual space. Visual signatures of the dominant clusters corresponding to a node represent its semantic. Adaptive selection of branches on this clustered data facilitates efficient semantic assignment to a query image at reduced search cost. Based on this concept, a content-based semantic retrieval system is developed and tested on hierarchical and non-hierarchical databases. Results showcase the capability of the proposed system to reduce inter- and intra-semantic gaps.

16.

Similarity-based online feature selection in content-based image retrieval (total citations: 2; self-citations: 0; citations by others: 2)

Wei Jiang, Guihua Er, Qionghai Dai, Jinwei Gu 《IEEE transactions on image processing》2006,15(3):702-712

Content-based image retrieval (CBIR) has become increasingly important in the last decade, and the gap between high-level semantic concepts and low-level visual features hinders further performance improvement. The problem of online feature selection is critical to really bridge this gap. In this paper, we investigate online feature selection in the relevance feedback learning process to improve the retrieval performance of the region-based image retrieval system. Our contributions are mainly in three areas. 1) A novel feature selection criterion is proposed, which is based on the psychological similarity between the positive and negative training sets. 2) An effective online feature selection algorithm is implemented in a boosting manner to select the most representative features for the current query concept and combine classifiers constructed over the selected features to retrieve images. 3) To apply the proposed feature selection method in region-based image retrieval systems, we propose a novel region-based representation to describe images in a uniform feature space with real-valued fuzzy features. Our system is suitable for online relevance feedback learning in CBIR by meeting the three requirements: learning with a small training set, the intrinsic asymmetry property of training samples, and the fast response requirement. Extensive experiments, including comparisons with many state-of-the-art methods, show the effectiveness of our algorithm in improving the retrieval performance and saving processing time.

17.

Multiple kernel learning with NOn-conVex group spArsity

《Journal of Visual Communication and Image Representation》2014,25(7):1616-1624

As the high-dimensional heterogeneous visual features extracted from images are intrinsically embedded in a non-linear space, some kernel methods such as SVM have been proposed to solve this problem. Since different kinds of heterogeneous features in images have different intrinsic discriminative powers for image understanding, how to enforce a grouping sparsity penalty to effectively select discriminative heterogeneous visual features is critical for image understanding. Most existing approaches use a convex penalty for feature selection, which easily leads to inconsistent selection. To guarantee a consistent selection for heterogeneous features embedded in a non-linear space, this paper proposes a new approach called MKL-NOVA (Multiple Kernel Learning with NOn-conVex group spArsity). Because MKL-NOVA applies a non-convex penalty for the selection of groups of features, it achieves consistent selection. Furthermore, considering the contextual correlation between multiple labels, sparse canonical correlation analysis is conducted to boost the image annotation performance of MKL-NOVA. We have demonstrated the superior performance of MKL-NOVA via two experiments in the paper. First, we showed that MKL-NOVA converges to the true underlying model by using a ground-truth-available generative-model simulation. Second, we compared the proposed MKL-NOVA with state-of-the-art approaches, which showed that MKL-NOVA achieved the best performance.

18.

Non-negative weight neighbor embedding face super-resolution reconstruction algorithm based on 2D-PCA feature description

曹明明, 干宗良, 崔子冠, 李然, 朱秀昌 《电子与信息学报》2015,37(4):777-783

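In neighbor-embedding-based super-resolution reconstruction of face images, both training and reconstruction take place in feature space, so feature selection has a large effect on algorithm performance. In addition, the algorithm model places no constraint on the reconstruction weights, so negative weights appear and cause overfitting, which degrades the quality of the reconstructed face image. Considering the importance of feature selection for face images and of constraining the sign of the weights, this paper proposes a non-negative weight neighbor embedding face super-resolution reconstruction algorithm based on two-dimensional principal component analysis (2D-PCA) feature description. First, face images are divided into patches, K-means clustering is used to obtain the local visual primitives of the patches, and the patches are classified with these primitives. Then, 2D-PCA is used to extract features from each class of face image patches, and high- and low-resolution sample libraries are built. Finally, a new non-negative weight solution method is used to compute the weights during reconstruction. Simulation results show that, compared with other neighbor-embedding-based face super-resolution reconstruction methods, the proposed algorithm effectively improves the stability of the weights and reduces overfitting, and its reconstructed face images have good subjective and objective quality.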

19.

Hand gesture understanding by weakly-supervised fusing shallow/deep image attributes

《Signal Processing: Image Communication》2020

Accurately recognizing human hand gestures is a useful component in many modern intelligent systems, such as identification authentication, human–computer interaction, and sign language recognition. Conventional approaches are typically based on shallow visual features and relatively simple backgrounds, which cannot readily recognize partially occluded hand gestures with sophisticated backgrounds. In this work, we propose a unified hand gesture recognition framework by optimally fusing a set of shallow/deep finger-level image attributes, based on which a weakly-supervised ranking algorithm is designed to select semantically salient regions for gesture understanding. More specifically, given a rich number of hand gesture images, we employ the well-known BING object proposal generator to extract hundreds of object patches that potentially draw human visual attention. Since the hundreds of object patches are still too many for building an effective recognition system, a weakly-supervised metric is proposed to rank them by extracting multiple shallow/deep features. Visual semantics are encoded at the region level by transferring the image-level semantic tags into various human gesture image regions through a weakly-supervised learning paradigm. Apparently, the top-ranking highly salient object patches are highly indicative of human visual perception of hand gestures. Thus we extract their ImageNet-CNN features and further concatenate them. Finally, the concatenated deep feature is fed into a multi-class SVM for classifying each hand gesture image into a particular type. Comprehensive experimental validations have demonstrated the effectiveness and robustness of our proposed hybrid-feature-based hand gesture categorization.

20.

Automatic extraction method for regions of interest in aerospace remote sensing images

赵冬, 赵光恒, 叶建设 《无线电工程》2009,39(9):10-12

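Region-of-interest extraction is an important prerequisite for analyzing aerospace remote sensing images. As image spatial resolution increases, the salient targets and the background in a scene become increasingly complex, and traditional feature extraction techniques consume large amounts of memory and computation time. An automatic region-of-interest extraction method based on an improved visual attention approach is proposed. In HSV space, the differences between target and background in color and brightness are taken as salient features; Gaussian pyramids and a center-surround difference operator are used to compute the salient feature maps of the image; the feature maps are normalized and linearly fused; and the steps for shifting the focus of attention are designed, completing the automatic extraction of regions of interest. Simulations and experiments show that the method effectively extracts regions of interest from aerospace remote sensing images automatically.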