Similar literature
20 similar documents found (search time: 31 ms)
1.
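Fine-grained image classification aims to distinguish subclasses within a common category. Because datasets typically exhibit large intra-class variation and high inter-class similarity, it is more challenging than conventional image classification. Previous part-based and attention-based methods focus on mining the discriminative regions of an image and overlook the subtle differences that separate easily confused categories. To address this, the paper proposes a fine-grained image classification method based on multi-view fusion with two branches: one mines local features from feature maps, and the other learns global features of the image. An embedding loss is also introduced and combined with the conventional multi-class cross-entropy loss to make the features more discriminative and thereby improve classification performance. Using only image-level labels, the method reaches classification accuracies of 88.3%, 94.3%, and 92.4% on the CUB-200-2011, Stanford Cars, and FGVC Aircraft benchmark datasets, respectively. The experimental results indicate that the method has some advantage over other fine-grained image classification methods.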

2.
Considering the limitations of Linear Discriminant Analysis (LDA) and Marginal Fisher Analysis (MFA), this paper proposes a novel discriminant analysis method called Local Correlation Discriminant Analysis (LCDA). The main idea behind LCDA is to use a more robust similarity measure, the correlation metric, to measure the local similarity between image data, which results in better classification performance. In addition, to further improve the discriminant power of LCDA, we extend it to the semi-supervised case, which can make use of both labeled and unlabeled data to perform discriminant analysis. Extensive experimental results on the ORL and AR face databases demonstrate that the proposed LCDA and its semi-supervised version are superior to Principal Component Analysis (PCA), LDA, CEA, and MFA.
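The abstract above does not define its correlation metric, so the following is only a minimal sketch of one common choice: the Pearson correlation between two flattened image feature vectors. The function name `correlation_similarity` and the zero-variance convention are assumptions made for illustration, not details taken from the LCDA paper.

```python
import math


def correlation_similarity(x, y):
    """Pearson correlation between two equally long feature vectors.

    Hypothetical helper: the abstract only says "correlation metric",
    so the exact definition used by LCDA may differ.
    """
    if len(x) != len(y) or not x:
        raise ValueError("vectors must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Center both vectors so the measure ignores a constant brightness offset.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    norm = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    if norm == 0.0:
        # Assumed convention: a constant vector is uncorrelated with everything.
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / norm
```

Unlike a Euclidean distance, this value does not change when one vector is uniformly brightened or scaled by a positive factor, which is one plausible reason to regard a correlation measure as more robust for image data.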

3.
Fine-grained Visual Categorization (FGVC) in computer vision aims to recognize images belonging to multiple subordinate categories of a super-category. The difficulty of FGVC lies in the close resemblance between classes and the large variation within classes. Most existing networks focus on only a few discriminative regions and ignore many subtle complementary features. We therefore propose a Progressive Erasing Network (PEN). In PEN, a Multi-Grid Erasure mechanism augments data samples and assists in capturing local discriminative features, with the overall structure of the image destroyed indirectly through pixel-wise erasure. Cross-layer feature aggregation by extracting salient class features is of great significance in FGVC. However, cross-layer feature representation based on a simple aggregation strategy is still inefficient. To this end, the proposed Consistency loss explores the cross-layer semantic affinity, which guides the Cross-Layer Incentive (CLI) block to mine more efficient feature representations of different granularity. We also integrate Cross Entropy and Complementary Entropy to take the distribution of negative classes into account for better classification performance. Our method uses end-to-end training with only classification labels. Experimental results show that our model outperforms the state of the art on three fine-grained benchmarks.

4.
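Conventional content-based image retrieval extracts physical features of an image, such as color and texture, and applies a similarity measure to query similar images from an image library. To improve retrieval accuracy, an improved method is proposed here. The method extracts the physical features of each image and uses them as input vectors to a support vector machine (SVM) to classify the images; the classification result is then used to match the query image for similarity, so that similar images are found among images of the same class. Experimental results show that the retrieval results of this method are better than those of the conventional method.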

5.
6.
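The cloud model similarity method is an important approach to analyzing the similarity of objects. To improve the accuracy of image classification, an image classification method based on cloud model similarity is proposed. The method first defines the image cloud model, then uses the backward cloud algorithm of the cloud model method to compute the numerical characteristics of the image cloud model, and finally introduces a cloud model similarity measure to evaluate the similarity between image cloud models and determine the image class. Simulation results show that the proposed method classifies images accurately and is computationally efficient.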
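To make the backward cloud step concrete, here is a minimal sketch of the commonly used backward cloud generator that needs no certainty degrees. It estimates the three numerical characteristics of a normal cloud model: expectation Ex, entropy En, and hyper-entropy He. Treating grayscale pixel intensities as the cloud drops, and clamping a negative value under the square root to zero, are assumptions made for illustration; the abstract does not say how the image cloud model is sampled.

```python
import math


def backward_cloud(samples):
    """Estimate (Ex, En, He) of a normal cloud model from sample "drops".

    Sketch of the moment-based backward cloud generator:
      Ex = sample mean
      En = sqrt(pi / 2) * mean absolute deviation from Ex
      He = sqrt(S^2 - En^2), where S^2 is the sample variance
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    ex = sum(samples) / n
    mean_abs_dev = sum(abs(x - ex) for x in samples) / n
    en = math.sqrt(math.pi / 2.0) * mean_abs_dev
    variance = sum((x - ex) ** 2 for x in samples) / (n - 1)
    # Assumption: clamp to zero when S^2 < En^2, which can happen on small samples.
    he = math.sqrt(max(variance - en ** 2, 0.0))
    return ex, en, he


# Hypothetical usage: pixel intensities of a tiny grayscale patch.
ex, en, he = backward_cloud([1, 2, 3, 4, 5])
```

Two images could then be compared through their (Ex, En, He) triples; the specific similarity measure used in the paper is not given in the abstract, so it is not sketched here.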

7.
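Hu Zhengping, Tu Xiaolei. 《信号处理》 (Journal of Signal Processing), 2011, 27(10): 1536-1542
In scene classification, the conventional "bag of words" model contains no image context information and does not consider class differences among image features. This paper proposes a scene classification method that combines multi-directional context features with a spatial pyramid model. The method first divides the image into a uniform grid and extracts scale-invariant (SIFT) features; each local image block is combined with its spatially adjacent regions in three directions to form three kinds of context features. Next, the context features of each class of training images are clustered separately to form visual words, which are concatenated into the final visual vocabulary, yielding the visual word histogram of each image. Finally, a pyramid histogram is formed with the spatial pyramid matching algorithm, and an SVM classifier performs the classification. The method integrates the similarity of image blocks in the feature domain with their contextual relationships in the spatial domain and distinguishes them by class, producing a more discriminative visual vocabulary. Experiments on common scene image databases show better classification performance than conventional methods.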

8.
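To address the large intra-class dispersion and inter-class similarity that commonly limit video classification performance, this paper proposes a video classification method based on deep metric learning. The method designs a deep network with three parts: feature learning, similarity measurement based on deep metric learning, and classification. The similarity measurement works as follows: first, the Euclidean distance between features is computed as the semantic distance between samples; second, a margin assignment function dynamically assigns a semantic margin according to the semantic distance; finally, the error is computed from the sample semantic margins and back-propagated, so that the network learns the differences in semantic distance between samples and automatically focuses on hard samples in order to learn their features fully. The network is trained with multi-task learning, learning the similarity measurement and the classification task jointly to reach an overall optimum. Experiments on UCF101 and HMDB51 show that, compared with existing methods, the proposed method effectively improves video classification accuracy.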

9.
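Flower images captured under natural conditions have complex backgrounds and exhibit large intra-class variation and high inter-class similarity, so existing mainstream methods, which rely only on convolutional modules to extract local features, struggle to achieve accurate fine-grained classification. To address this, the paper proposes a high-accuracy, lightweight flower classification method (ConvTrans-ResMLP). It combines a Transformer module with a residual MLP (multi-layer perceptron) module to extract global features from flower images, and adds convolution to the Transformer module so that the model retains the ability to extract local features. In addition, to deploy the flower classification model on edge devices, the study compresses and optimizes the model with knowledge distillation. Experimental results show that the proposed method reaches accuracies of 98.62%, 97.61%, and 98.40% on the Oxford 17, Oxford 102, and self-built Flowers 32 datasets, respectively; after knowledge distillation, the lightweight model is about 1/18 the size of the original, while accuracy drops by only about 2%. The study therefore improves the efficiency of fine-grained flower classification on edge devices and has practical value for automating flower cultivation.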

10.
Face recognition using spatially constrained earth mover's distance   Cited by: 3 (self-citations: 0, citations by others: 3)
Face recognition is a challenging problem, especially when the face images are not strictly aligned (e.g., images can be captured from different viewpoints or the faces may not be accurately cropped by a human or automatic algorithm). In this correspondence, we investigate face recognition under scenarios with potential spatial misalignments. First, we formulate an asymmetric similarity measure based on Spatially constrained Earth Mover's Distance (SEMD), for which the source image is partitioned into nonoverlapping local patches while the destination image is represented as a set of overlapping local patches at different positions. Assuming that faces are already roughly aligned according to the positions of their eyes, one patch in the source image can be matched only to one of its neighboring patches in the destination image under the spatial constraint of reasonably small misalignments. Because the similarity measure as defined by SEMD is asymmetric, we propose two schemes to combine the two similarity measures computed in both directions. Moreover, we adopt a distance-as-feature approach by treating the distances to the reference images as features in a Kernel Discriminant Analysis (KDA) framework. Experiments on three benchmark face databases, namely the CMU PIE, FERET, and FRGC databases, demonstrate the effectiveness of the proposed SEMD.

11.
The complex wavelet structural similarity (CW-SSIM) index has been recognized as a novel image similarity measure with broad potential applications because of its robustness to small geometric distortions such as translation, scaling, and rotation of images. Nevertheless, how to make the best use of it in image classification problems has not been deeply investigated. In this paper, we introduce a series of novel image classification algorithms based on CW-SSIM and use handwritten digit recognition and face recognition as examples for demonstration. Among the proposed approaches, the best compromise between accuracy and complexity is obtained by the CW-SSIM support vector machine based algorithms, which combine an unsupervised clustering method that divides the training images into clusters with representative images and a supervised learning method based on support vector machines that maximizes the classification accuracy. Our experiments show that such a conceptually simple image classification method, which does not involve any registration, intensity normalization, or sophisticated feature extraction processes, and does not rely on any modeling of the image patterns or distortion processes, achieves competitive performance with reduced computational cost.

12.
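The low spatial resolution of hyperspectral images often prevents global texture extraction from capturing accurate texture information about ground objects, while local texture extraction at a single scale is insufficient for identifying ground objects effectively. This paper therefore designs a hyperspectral image classification method based on Multi-scale Superpixel Texture Preservation and Fusion (MSuTPF). The main framework is as follows. First, 2D Gabor filters extract global texture from the hyperspectral image at multiple orientations and scales, and the texture features of each scale are fused to strengthen the representation of texture structure. Second, the texture features are fused with spectral principal-component features to form joint spectral-texture discriminative features. Third, a shape-adaptive superpixel segmentation method is applied to the joint spectral-texture features to preserve and fuse local texture information; in particular, to overcome the implicit irrelevance of pixels within a superpixel neighborhood, the paper defines a density-based nearest-neighbor similarity criterion that makes the superpixel texture more consistent. Finally, each updated joint spectral-texture feature is fed to a pixel-level classifier to obtain its class labels, and a majority-voting decision fusion mechanism produces the final classification result. Experiments on the real Indian Pines and Pavia University datasets show that, with small training samples, the classification accuracy of the method is higher than that of eight comparison methods, including the baseline classifier (SVM), a deep learning method (GFDN), and a recent spatial-spectral classification method (S3-PCA), which demonstrates the practicality and effectiveness of the proposed method.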

13.
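Liu Ying, Che Xin. 《信号处理》 (Journal of Signal Processing), 2022, 38(1): 202-210
In recent years, although deep learning has achieved competitive performance in image classification tasks, practical applications often lack large numbers of training samples, which easily leads to overfitting. Few-shot learning offers a solution. Because graph neural networks are good at representing intra-class and inter-class relationships between samples, they have been used for few-shot image classification. Existing algorithms obtain image features through several convolutional blocks and feed them into the graph network as node features. To better represent the graph nodes...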

14.
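SAR image similarity criteria underlie research topics such as target recognition and image matching, and a reasonable, reliable definition of similarity can greatly improve SAR image interpretation. Based on the characteristics of SAR image contours, this paper proposes a method for constructing a similarity confidence interval and its credibility from uncertain contours. The uncertain contours of the SAR image are first fuzzified to obtain a definition of similarity; then, by analyzing the distribution function of the fuzzy model, a similarity confidence interval is obtained at a given significance level, and a definition of credibility is given. Experimental results show that the method tolerates some error in contour localization and yields a reasonable similarity range and credibility even for moderately broken contours and multi-edge contours, in agreement with human visual perception.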

15.
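To address the low registration accuracy caused by image rotation or background change in large-scale image classification, an image classification method based on an edge-enhanced convolutional neural network is proposed. The method extracts image features with the VGG19 network model and uses cosine similarity to decide the image class; edge enhancement highlights the edge features of the image subject and reduces the effect of image rotation or background change on the classification performance of VGG19. Experiments show that the method effectively increases the similarity between the original image and rotated or background-changed images of the same subject, and that it is suitable for classifying many kinds of images.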
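The cosine-similarity decision described above can be sketched as follows. This is only an illustration under stated assumptions: the vectors stand in for features already extracted by a network such as VGG19, and the names `cosine_similarity` and `classify_by_similarity`, along with the idea of one reference vector per class, are hypothetical and not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equally long feature vectors."""
    if len(u) != len(v) or not u:
        raise ValueError("vectors must be non-empty and of equal length")
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumed convention: an all-zero vector matches nothing.
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def classify_by_similarity(feature, references):
    """Return the label whose reference vector is most similar to `feature`.

    `references` maps a class label to one reference feature vector
    (a hypothetical setup chosen to keep the example small).
    """
    return max(references, key=lambda label: cosine_similarity(feature, references[label]))


# Hypothetical 3-dimensional "features"; real VGG19 features are much longer.
refs = {"cat": [1.0, 0.1, 0.0], "car": [0.0, 0.2, 1.0]}
label = classify_by_similarity([0.9, 0.2, 0.1], refs)
```

Because cosine similarity depends only on the direction of the feature vector, it is insensitive to uniform scaling of the features, though not to rotation of the image itself, which is why the paper adds edge enhancement before feature extraction.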

16.
The bag of visual words (BOW) model is an efficient image representation technique for image categorization and annotation tasks. Building good visual vocabularies from automatically extracted image feature vectors produces discriminative visual words, which can improve the accuracy of image categorization tasks. Most approaches that use the BOW model to categorize images ignore useful information that can be obtained from image classes when building visual vocabularies. Moreover, most BOW models use intensity features extracted from local regions and disregard colour information, which is an important characteristic of any natural scene image. In this paper, we show that integrating visual vocabularies generated from each image category improves the BOW image representation and improves accuracy in natural scene image classification. We use a keypoint density-based weighting method to combine the BOW representation with image colour information on a spatial pyramid layout. In addition, we show that visual vocabularies generated from training images of one scene image dataset can plausibly represent another scene image dataset in the same domain. This helps reduce the time and effort needed to build new visual vocabularies. The proposed approach is evaluated over three well-known scene classification datasets with 6, 8, and 15 scene categories, respectively, using 10-fold cross-validation. The experimental results, using support vector machines with a histogram intersection kernel, show that the proposed approach outperforms baseline methods such as Gist features, rgbSIFT features, and different configurations of the BOW model.
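For readers unfamiliar with the histogram intersection kernel mentioned above, the sketch below computes it for two visual-word histograms and builds the Gram matrix that an SVM with a precomputed kernel would consume. The function names and the toy histograms are assumptions for illustration; the abstract does not describe the authors' implementation.

```python
def histogram_intersection(h1, h2):
    """Histogram intersection kernel: the sum of bin-wise minima."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def gram_matrix(histograms):
    """Pairwise kernel values, in the form a precomputed-kernel SVM expects."""
    return [[histogram_intersection(a, b) for b in histograms] for a in histograms]


# Hypothetical L1-normalised visual-word histograms with three bins each.
hists = [
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.5, 0.3, 0.2],
]
K = gram_matrix(hists)
```

For L1-normalised histograms the kernel value lies between 0 and 1 and equals 1 only when the two histograms coincide, so it can be read directly as an overlap score.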

17.
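To allow a redundant dictionary to represent image features adaptively, an optimized image dictionary construction algorithm is proposed. The algorithm uses the grey relational degree between atom classes within the redundant dictionary as the basis for dictionary optimization, establishes a new dictionary optimization principle, and proposes an adaptive dictionary design algorithm. The algorithm selects the redundancy factor of the dictionary adaptively according to information such as image structure and noise. Applied to image denoising, the algorithm is substantially more efficient and also improves the denoising result.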

18.
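Xu Manfei. 《电子科技》 (Electronic Science and Technology), 2013, 26(4): 25-27
The spatial-domain image quality assessment method structural similarity (SSIM) is extended to the HWD transform domain and, combined with the oblique effect of human vision and a particle swarm optimization algorithm, yields a new image quality measure. SSIM is applied directly to each HWD subband; the structural similarity of each subband is weighted by a subband correlation map to obtain a local quality score, and the local quality scores of the different orientations are then combined in a weighted sum to give the structural similarity of the whole image. Experimental results show that the measure agrees well with subjective perception and accurately reflects how the human eye perceives images.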
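As background for the measure described above, here is a minimal sketch of the standard spatial-domain SSIM formula evaluated over a single window. It does not implement the HWD decomposition, the subband weighting, or the particle swarm optimization from the paper. The default constants (K1 = 0.01, K2 = 0.03, dynamic range 255) follow the usual SSIM convention, and the population-variance form is an assumption chosen for brevity.

```python
def ssim(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized lists of pixel values.

    SSIM = ((2*mu_x*mu_y + C1) * (2*cov_xy + C2))
           / ((mu_x^2 + mu_y^2 + C1) * (var_x + var_y + C2))
    """
    if len(x) != len(y) or not x:
        raise ValueError("windows must be non-empty and of equal size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Population (divide-by-n) statistics; an assumption, other variants use n - 1.
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Small constants that stabilise the ratio when means or variances are near zero.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

In the paper's setting, a function like this would presumably be evaluated on the coefficients of each HWD subband rather than on raw pixels, with the per-subband scores then weighted and summed.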

19.
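Remote sensing scene classification suffers from large intra-class variation and high inter-class similarity, which cause some scenes to be confused. To address this, the paper proposes a highly discriminative feature representation method based on a dual attention mechanism. Because the features represented by different channels differ in importance and different local regions differ in saliency, a channel-wise attention module and a spatial attention module are designed on top of the high-level features extracted by a convolutional neural network. Using the context extraction ability of recurrent neural networks, the modules learn and output, in sequence, importance weights for the different channels and local regions, so that the model attends more to the salient features and salient regions of the image and ignores non-salient ones, improving the discriminative power of the feature representation. The proposed dual attention modules can be attached to any convolutional neural network, and the whole network can be trained end to end. Extensive comparative experiments on two public datasets, AID and NWPU45, verify the effectiveness of the method, which achieves a clear improvement in classification accuracy over existing methods.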

20.
Recently, sparse coding has become popular for image classification. However, images are often captured under different conditions, such as varied poses, scales, and camera parameters, so local features may not be discriminative enough to cope with these variations. To solve this problem, affine transformation combined with sparse coding has been proposed. Although proven effective, affine sparse coding places no constraints on the tilt and orientations of the transformed local features or on the consistency of their encoding parameters. To solve these problems, we propose a Laplacian affine sparse coding algorithm that combines the tilt and orientations of affine local features with the dependency among local features. We add tilt and orientation smoothness constraints to the objective function of sparse coding. In addition, a Laplacian regularization term is used to characterize the similarity of the encoding parameters. Experimental results on several public datasets demonstrate the effectiveness of the proposed method.
