Similar Documents
 19 similar documents found (search time: 718 ms)
1.
顾昕  张兴亮  王超  陈思媛  方正 《计算机应用》2014,(Z2):280-282,313
To improve the efficiency of image retrieval, an image retrieval algorithm combining text and content is proposed. The algorithm describes image content by constructing visual words from dense scale-invariant feature transform (DSIFT) features. A preliminary retrieval of the query image is performed using the visual semantics obtained by an automatic image annotation method based on the probabilistic latent semantic analysis (PLSA) model, and the semantically relevant images filtered from this result set are then ranked and output by content similarity. Experimental results on the Corel1000 dataset show that the algorithm achieves effective image retrieval and outperforms a purely content-based image retrieval algorithm.
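The two-stage scheme above can be sketched as a semantic prefilter followed by content ranking. This is a minimal illustration, not the paper's implementation: the tag sets stand in for PLSA-predicted annotations and the feature vectors for DSIFT visual-word histograms, and all names and values are invented.

```python
from math import sqrt

# Hypothetical data: predicted tags (as a PLSA-based annotator might emit)
# and content feature vectors (e.g. visual-word histograms) per image.
IMAGE_TAGS = {
    "img1": {"tiger", "grass"},
    "img2": {"tiger", "water"},
    "img3": {"beach", "water"},
}
IMAGE_FEATURES = {
    "img1": [0.9, 0.1, 0.0],
    "img2": [0.7, 0.3, 0.0],
    "img3": [0.0, 0.2, 0.8],
}

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def retrieve(query_tags, query_features):
    # Stage 1: keep only images sharing at least one predicted semantic tag.
    candidates = [i for i, tags in IMAGE_TAGS.items() if tags & query_tags]
    # Stage 2: rank the semantically relevant candidates by content similarity.
    return sorted(candidates,
                  key=lambda i: cosine(IMAGE_FEATURES[i], query_features),
                  reverse=True)

print(retrieve({"tiger"}, [1.0, 0.0, 0.0]))  # img1 ranks above img2
```

The prefilter keeps the candidate set small, which is where the efficiency gain over purely content-based scanning comes from.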

3.
An Image Retrieval Method Based on Sparse Canonical Correlation Analysis   (Cited by: 1; self-citations: 0, citations by others: 1)
庄凌  庄越挺  吴江琴  叶振超  吴飞 《软件学报》2012,23(5):1295-1304
A key problem in semantic image retrieval is finding the association between low-level image features and semantics. Since text is an effective means of expressing semantics, this paper proposes building a model that captures the latent semantic association between the two modalities by studying the relationship between text and images. With this model, retrieval intent can be expressed in natural language (textual sentences) to retrieve relevant images. The model is based on sparse canonical correlation analysis (sparse CCA) and is trained as follows: first, a textual semantic space is constructed using latent semantic analysis; next, the images associated with the text are represented as bags of visual words; finally, the sparse CCA algorithm finds a semantically correlated space that realizes the mapping between textual semantics and image visual words. Using sparse correlation analysis improves the interpretability of the model and stabilizes the retrieval results. Experimental results verify the effectiveness of sparse CCA and confirm the feasibility of the proposed semantic image retrieval method.
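Once sparse CCA has been trained, retrieval reduces to projecting both modalities into the learned correlated space and ranking by proximity. The sketch below assumes the projection vectors `W_TEXT` and `W_IMAGE` come from an already-trained sparse CCA model; their values, the dimensions, and the image names are invented for illustration (real models yield one vector per canonical component, not a single pair).

```python
from math import sqrt

# Hypothetical projection vectors, assumed to come from a trained sparse CCA
# model; their sparsity (many exact zeros) is what makes the mapping
# interpretable -- only a few text topics / visual words carry the correlation.
W_TEXT  = [0.8, 0.0, 0.6, 0.0]   # projects an LSA topic vector
W_IMAGE = [0.0, 0.7, 0.0, 0.7]   # projects a visual-word histogram

def project(vec, w):
    return sum(v * wi for v, wi in zip(vec, w))

def rank_images(query_topics, image_histograms):
    q = project(query_topics, W_TEXT)
    # In the 1-D correlated space, rank images by closeness of projections.
    return sorted(image_histograms,
                  key=lambda name: abs(project(image_histograms[name], W_IMAGE) - q))

images = {
    "sunset.jpg": [0.1, 0.9, 0.0, 0.8],
    "office.jpg": [0.9, 0.0, 0.1, 0.0],
}
print(rank_images([1.0, 0.0, 0.5, 0.0], images))
```

Because the projections are sparse, one can read off which topics and visual words drive each match, which is the interpretability benefit the abstract claims.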

4.
An effective approach to semantic image retrieval is to find the association between low-level image features and textual semantics. Building on kernel methods and the graph Laplacian matrix, this paper proposes a correlated-space embedding algorithm that uses latent semantic indexing of text and the visual words of image features to construct the correlation between two heterogeneous spaces, the textual semantic space and the image feature space, thereby uncovering the latent association between textual semantics and visual words and enabling semantic image retrieval. The algorithm takes preserving the consistency of the data manifold structure as a prior constraint and embeds the data points of both spaces into a single correlated space. Compared with canonical correlation analysis, this correlated embedding not only reveals the correlations between different data spaces but also preserves the original data distribution in the correlated space, improving the reliability of the algorithm. Experiments verify its effectiveness and provide a feasible method for semantic image retrieval.

5.
Analysis of Visual Word Ambiguity in the Bag-of-Words Model   (Cited by: 4; self-citations: 0, citations by others: 4)
刘扬闻  霍宏  方涛 《计算机工程》2011,37(19):204-206,209
The visual words in the traditional bag-of-words (BOW) model are obtained by unsupervised clustering of the feature vectors of image patches, without considering the semantic information or semantic properties of the words. To address this, a visual-word ambiguity analysis method based on text classification is proposed. The traditional BOW model generates an initial visual vocabulary; three text-classification measures (document frequency, the χ2 statistic, and information gain) analyze the semantic properties of the words; ambiguous words carrying little class information are removed; and a support vector machine classifier performs image classification. Experimental results show that the method achieves high classification accuracy.
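The information-gain filter described above can be computed directly from binary word occurrence and class labels. A minimal sketch with toy data (the labels and occurrence vectors are invented; the paper additionally uses document frequency and the χ2 statistic, which follow the same pattern):

```python
from math import log2

def entropy(counts):
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

def information_gain(word_presence, labels):
    """IG(w) = H(C) - [P(w)H(C|w) + P(!w)H(C|!w)] for binary word occurrence."""
    n = len(labels)
    classes = sorted(set(labels))
    ig = entropy([labels.count(c) for c in classes])
    for present in (True, False):
        subset = [l for p, l in zip(word_presence, labels) if p == present]
        if subset:
            ig -= len(subset) / n * entropy([subset.count(c) for c in classes])
    return ig

# Toy example: word A occurs only in "sky" images, word B occurs at random.
labels = ["sky", "sky", "grass", "grass"]
word_a = [True, True, False, False]   # discriminative -> high IG, keep
word_b = [True, False, True, False]   # ambiguous -> zero IG, remove

print(information_gain(word_a, labels), information_gain(word_b, labels))
```

Words whose gain falls below a chosen threshold would be pruned from the vocabulary before the SVM classification step.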

6.
This paper analyzes the problems in content-based image retrieval and uses ontology methods to build an ontology of low-level image features and ontologies for specific image classes. Image description factors are defined and corresponding image composition rules are established. Finally, retrieval is performed on low-level image features, and a multi-class support vector machine links low-level features to high-level descriptive information, enabling semantic image retrieval and reducing the impact of the "semantic gap" on content-based image retrieval. Experimental results show that the model improves the accuracy of content-based image retrieval and that semantic retrieval can be achieved after 3 to 5 rounds of feedback.

7.
Research on Color Image Retrieval Based on Multiple Semantic Features   (Cited by: 3; self-citations: 0, citations by others: 3)
Semantic content-based image retrieval has become key to bridging the "semantic gap" between low-level image features and high-level human semantics. Building on support vector regression (SVR) and exploiting image edge information and the characteristics of human vision, a new color image retrieval algorithm based on multiple semantic features is proposed. The algorithm first extracts the edge information of the original image with the Canny operator to obtain low-level texture features, and uses SVR to map these low-level features to high-level texture semantics. It then derives high-level color semantics from the dominant colors of perceptually important regions, following the perceptual characteristics of the human visual system. Finally, retrieval is performed on these high-level semantic features (texture semantics and color semantics). Experimental results show that the algorithm characterizes high-level image semantics effectively, achieves good matching and retrieval results with stable performance, and contributes significantly to narrowing the "semantic gap" between low-level visual features and high-level semantic concepts.

8.
曾梦琪  马蔚吟  李力 《计算机工程》2019,45(11):262-268
Fusing textual and visual information for image retrieval can bridge the semantic gap between low-level visual features and high-level semantics, but it is difficult to maintain retrieval efficiency while improving quality. For hybrid retrieval based on text and content, this paper combines Manhattan hashing, inverted indexes, and R-trees to design a new index structure, the CAT-tree, with a corresponding top-k retrieval algorithm, yielding a three-stage image retrieval scheme. Experimental results on benchmark image datasets show that the scheme significantly improves retrieval efficiency while maintaining accuracy.
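The inverted-index component of such a scheme is easy to sketch; the version below is a bare top-k accumulator, assuming precomputed per-image word weights, and deliberately omits the Manhattan hashing and R-tree layers that make the paper's CAT-tree structure distinctive. All identifiers and weights are invented.

```python
import heapq
from collections import defaultdict

class InvertedIndex:
    """Minimal word -> postings index with score-accumulation top-k retrieval."""

    def __init__(self):
        self.postings = defaultdict(list)   # word -> [(image_id, weight)]

    def add(self, image_id, word_weights):
        for word, w in word_weights.items():
            self.postings[word].append((image_id, w))

    def top_k(self, query_words, k):
        scores = defaultdict(float)
        for word in query_words:
            for image_id, w in self.postings.get(word, []):
                scores[image_id] += w          # accumulate per-word evidence
        return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])

idx = InvertedIndex()
idx.add("a.jpg", {"cat": 0.9, "grass": 0.2})
idx.add("b.jpg", {"cat": 0.4, "sofa": 0.6})
idx.add("c.jpg", {"dog": 0.8})
print(idx.top_k({"cat", "sofa"}, 2))
```

Only images sharing at least one query word are ever scored, which is what keeps index-based retrieval efficient at scale.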

9.
To generate a visual vocabulary that effectively represents image scene semantics and to improve scene annotation performance, a scene annotation model based on formal concept analysis (FCA) is proposed. The method first abstracts the training image set and its initial visual vocabulary into a formal context, weights each visual word by information entropy, and builds a concept lattice for each scene category. The mean weight of the visual words in each concept intent then measures the contribution of that word combination to annotating images, and category-specific visual vocabularies for annotating each scene class are extracted from the lattice according to a generation threshold. Finally, K-nearest neighbors annotates the scene semantics of test images. Experiments on the Fei-Fei Scene 13 natural-scene image dataset show that, with β=0.05 and γ=15, the method achieves better annotation accuracy than the methods of Fei-Fei and Bai.

10.
To address the "semantic gap" problem in traditional CBIR systems, an image retrieval method combining semantic and visual features is proposed. The semantic and visual feature data of an image are combined into a single index vector for content-based image retrieval. The system uses latent semantic indexing (LSI) to extract the semantic features of images and color histograms as their visual features. By combining the low-level visual features of images with their statistical semantic features in the vector space ...
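Joining two modalities into one index vector is typically done by normalizing each part and concatenating. A minimal sketch under stated assumptions: the LSI vector and color histogram are toy values, and the `alpha` modality weight is an illustration parameter, not something specified in the abstract.

```python
from math import sqrt

def l2_normalize(v):
    n = sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else v

def combined_vector(lsi_vec, color_hist, alpha=0.5):
    # Normalize each modality first so neither dominates by scale,
    # then weight and concatenate into a single index vector.
    sem = [alpha * x for x in l2_normalize(lsi_vec)]
    vis = [(1 - alpha) * x for x in l2_normalize(color_hist)]
    return sem + vis

v = combined_vector([3.0, 4.0], [1.0, 0.0, 0.0], alpha=0.6)
print(v)
```

Standard vector-space similarity (e.g. cosine) on the combined vector then scores both semantic and visual agreement at once.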

11.
In order to improve the retrieval accuracy of content-based image retrieval systems, research focus has been shifted from designing sophisticated low-level feature extraction algorithms to reducing the ‘semantic gap’ between the visual features and the richness of human semantics. This paper attempts to provide a comprehensive survey of the recent technical achievements in high-level semantic-based image retrieval. Major recent publications are included in this survey covering different aspects of the research in this area, including low-level image feature extraction, similarity measurement, and deriving high-level semantic features. We identify five major categories of the state-of-the-art techniques in narrowing down the ‘semantic gap’: (1) using object ontology to define high-level concepts; (2) using machine learning methods to associate low-level features with query concepts; (3) using relevance feedback to learn users’ intention; (4) generating semantic template to support high-level image retrieval; (5) fusing the evidences from HTML text and the visual content of images for WWW image retrieval. In addition, some other related issues such as image test bed and retrieval performance evaluation are also discussed. Finally, based on existing technology and the demand from real-world applications, a few promising future research directions are suggested.

12.
This paper addresses the automatic image annotation problem and its application to multi-modal image retrieval. The contribution of our work is three-fold. (1) We propose a probabilistic semantic model in which the visual features and the textual words are connected via a hidden layer which constitutes the semantic concepts to be discovered, to explicitly exploit the synergy among the modalities. (2) The association of visual features and textual words is determined in a Bayesian framework such that the confidence of the association can be provided. (3) Extensive evaluation on a large-scale, visually and semantically diverse image collection crawled from the Web is reported to evaluate the prototype system based on the model. In the proposed probabilistic model, a hidden concept layer which connects the visual feature and the word layer is discovered by fitting a generative model to the training images and annotation words through an Expectation-Maximization (EM) based iterative learning procedure. The evaluation of the prototype system on 17,000 images and 7736 automatically extracted annotation words from crawled Web pages for multi-modal image retrieval has indicated that the proposed semantic model and the developed Bayesian framework are superior to a state-of-the-art peer system in the literature.
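The EM-fitted hidden-concept layer can be sketched with a miniature pLSA-style model. This is a simplified stand-in for the paper's generative model, not its implementation: it factorizes a toy image-word count matrix as P(img,word) = Σz P(z)P(img|z)P(word|z), and the count matrix, concept number, and iteration budget below are arbitrary choices.

```python
import random

def plsa(counts, n_concepts, n_iter=50, seed=0):
    """Tiny EM for P(img,word) = sum_z P(z) P(img|z) P(word|z) on a count matrix."""
    rng = random.Random(seed)
    n_img, n_word = len(counts), len(counts[0])
    normalize = lambda row: [x / sum(row) for x in row]
    p_z = [1.0 / n_concepts] * n_concepts
    p_iz = [normalize([rng.random() for _ in range(n_img)]) for _ in range(n_concepts)]
    p_wz = [normalize([rng.random() for _ in range(n_word)]) for _ in range(n_concepts)]
    for _ in range(n_iter):
        nz = [0.0] * n_concepts
        ni = [[0.0] * n_img for _ in range(n_concepts)]
        nw = [[0.0] * n_word for _ in range(n_concepts)]
        for i in range(n_img):
            for w in range(n_word):
                if not counts[i][w]:
                    continue
                # E-step: posterior responsibility of each hidden concept.
                post = [p_z[z] * p_iz[z][i] * p_wz[z][w] for z in range(n_concepts)]
                s = sum(post)
                for z in range(n_concepts):
                    r = counts[i][w] * post[z] / s
                    nz[z] += r; ni[z][i] += r; nw[z][w] += r
        # M-step: re-estimate the distributions from expected counts.
        total = sum(nz)
        p_z = [x / total for x in nz]
        p_iz = [normalize(r) for r in ni]
        p_wz = [normalize(r) for r in nw]
    return p_z, p_iz, p_wz

# Toy corpus: two images use words 0-1, two use words 2-3.
counts = [[5, 5, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 4]]
p_z, p_img, p_word = plsa(counts, n_concepts=2)
print([round(x, 2) for x in p_z])
```

After fitting, P(word|z) gives the annotation vocabulary of each discovered concept, which is the role the hidden layer plays in the paper's multi-modal retrieval.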

13.
The rich semantics contained in an image cannot be fully described by a handful of low-level physical features, and information is also lost during semantic mapping, so a "semantic gap" is unavoidable. Fusing multiple features, building an affective semantic model, and analyzing the concept-parsing function of emotion are necessary to improve the precision and efficiency of intelligent information retrieval. This paper discusses the extraction and representation of image features such as color and texture, the mapping process from low-level visual features to high-level semantic features, and the affective semantic classification of images. An affective semantic model is built to enable retrieval of images based on affective semantics. Experiments on a dataset of 2500 digital images are conducted and the results analyzed; some of the results are satisfactory, and the precision of content-based image retrieval is improved.

14.
刘长红  曾胜  张斌  陈勇 《计算机应用》2022,42(10):3018-3024
The difficulty of cross-modal image-text retrieval lies in effectively learning the semantic correlation between images and text. Most existing methods learn either the global semantic correlation between image region features and text features or the local semantic correlation between objects across modalities, while ignoring the relations among objects within each modality and the association between inter-modal object relations. To address this, an image-text retrieval method based on a cross-modal tensor fusion network with semantic relation graphs (CMTFN-SRG) is proposed. First, a graph convolutional network (GCN) learns the relations among image regions, and a bidirectional gated recurrent unit (Bi-GRU) models the relations among words in the text. The learned semantic relation graphs of image regions and text words are then matched through a tensor fusion network to learn fine-grained semantic associations between the two modalities. Meanwhile, a gated recurrent unit (GRU) learns global image features, and the global features of image and text are matched to capture global inter-modal semantic correlation. The proposed method was compared with the multi-modal cross attention (MMCA) method on the Flickr30K and MS-COCO benchmark datasets. Experimental results show that it improves Recall@1 of the text-to-image retrieval task by 2.6%, 9.0%, and 4.1% on the Flickr30K test set, the MS-COCO 1K test set, and the MS-COCO 5K test set, respectively, and improves mean recall (mR) by 0.4, 1.3, and 0.1 percentage points, demonstrating that the method effectively improves the accuracy of image-text retrieval.

15.
Multimodal retrieval is a well-established approach for image retrieval. Usually, images are accompanied by a text caption along with associated documents describing the image. Textual query expansion as a form of enhancing image retrieval is a relatively less explored area. In this paper, we first study the effect of expanding the textual query on both image and associated text retrieval. Our study reveals that judicious expansion of the textual query through keyphrase extraction can lead to better results, either in terms of text retrieval or both image and text retrieval. To establish this, we use two well-known keyphrase extraction techniques based on tf-idf and KEA. While query expansion results in increased retrieval efficiency, it is imperative that the expansion be semantically justified. So, we propose a graph-based keyphrase extraction model that captures the relatedness between words in terms of both mutual information and relevance feedback. Most of the existing works have stressed bridging the semantic gap by using textual and visual features, either in combination or individually. The way these text and image features are combined determines the efficacy of any retrieval. For this purpose, we adopt Fisher-LDA to adjudge the appropriate weights for each modality. This provides us with an intelligent decision-making process favoring the feature set to be infused into the final query. Our proposed algorithm is shown to significantly outperform the previously mentioned keyphrase extraction algorithms for query expansion. A rigorous set of experiments performed on the ImageCLEF-2011 Wikipedia Retrieval task dataset validates our claim that capturing the semantic relation between words through mutual information, followed by expansion of a textual query using relevance feedback, can simultaneously enhance both text and image retrieval.
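The tf-idf variant of keyphrase-based query expansion can be sketched in a few lines. This is a toy illustration only: the corpus and query are invented, and the paper's graph-based extraction, KEA, Fisher-LDA weighting, and relevance-feedback loop are all omitted.

```python
from collections import Counter
from math import log

def tf_idf_terms(doc, corpus, k=2):
    """Top-k terms of `doc` scored by tf-idf against `corpus` (lists of tokens)."""
    n = len(corpus)
    tf = Counter(doc)
    def idf(t):
        return log(n / sum(1 for d in corpus if t in d))
    return sorted(tf, key=lambda t: tf[t] * idf(t), reverse=True)[:k]

def expand_query(query, feedback_docs, corpus, k=2):
    # Append the strongest keyphrase terms mined from top-ranked documents.
    expansion = []
    for doc in feedback_docs:
        for t in tf_idf_terms(doc, corpus, k):
            if t not in query and t not in expansion:
                expansion.append(t)
    return query + expansion

corpus = [
    ["tiger", "habitat", "forest", "tiger"],
    ["forest", "rain", "climate"],
    ["tiger", "stripes", "predator"],
]
print(expand_query(["tiger"], [corpus[0]], corpus))
```

Here "habitat" is added because it is frequent in the fed-back document yet rare in the corpus, which is exactly the "judicious" behavior the abstract argues for.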

16.
Advances in Semantic Image Retrieval   (Cited by: 57; self-citations: 0, citations by others: 0)
Semantic image retrieval has become key to bridging the "semantic gap" between the simple visual features of images and the rich semantics of user queries. This paper analyzes the state of research in semantic image retrieval from three aspects: semantic description of images, methods of image semantic extraction, and the design of semantic retrieval systems. It discusses object-oriented image content models and the representation of image semantics, and reviews representative semantic-processing approaches, including extraction using system knowledge, extraction based on user interaction, and semantic generation from external information sources. It introduces the different designs of user interfaces and semantic processing in retrieval systems. Finally, it analyzes the difficulties facing image semantic processing from four aspects (object recognition, semantic extraction rules, user retrieval models, and performance evaluation criteria for image retrieval) and offers some preliminary ideas toward solutions.

17.
In recent years, the rapid growth of multimedia content has made content-based image retrieval (CBIR) a challenging research problem. The content-based attributes of the image are associated with the position of objects and regions within the image. The addition of image content-based attributes to image retrieval enhances its performance. In the last few years, the bag-of-visual-words (BoVW) based image representation model gained attention and significantly improved the efficiency and effectiveness of CBIR. In the BoVW-based image representation model, an image is represented as an order-less histogram of visual words, ignoring spatial attributes. In this paper, we present a novel image representation based on the weighted average of triangular histograms (WATH) of visual words. The proposed approach adds the image's spatial contents to the inverted index of the BoVW model, reduces the overfitting problem for larger dictionary sizes, and narrows the semantic gap between high-level image semantics and low-level image features. The qualitative and quantitative analysis conducted on three image benchmarks demonstrates the effectiveness of the proposed approach based on WATH.
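The order-less BoVW baseline that WATH extends can be sketched as nearest-codeword voting. This is the plain hard-assignment histogram only; the codebook and descriptors are toy 2-D points (real ones are, e.g., 128-D SIFT vectors), and the triangular spatial weighting that gives WATH its name is not shown.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def bovw_histogram(descriptors, codebook):
    """Order-less visual-word histogram: each local descriptor votes for its
    nearest codeword; spatial layout is deliberately ignored (plain BoVW)."""
    hist = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda i: dist(d, codebook[i]))
        hist[nearest] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]   # toy 2-D "visual words"
descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.9), (0.0, 0.1)]
print(bovw_histogram(descriptors, codebook))  # [0.5, 0.5, 0.0]
```

Because the histogram discards where descriptors came from, two images with the same patches in different arrangements look identical, which is the weakness spatial schemes like WATH address.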

18.
When images are described with visual words based on vector quantization of low-level color, texture, and edge-related visual features of image regions, this is usually referred to as a "bag-of-visual-words (BoVW)"-based representation. Although it has proved to be effective for image representation, analogous to document representation in text retrieval, the hard image encoding approach based on one-to-one mapping of regions to visual words is not expressive enough to characterize the image contents with higher-level semantics and is prone to quantization error. Each word is considered independent of all other words in this model. However, it is found that the words are related, and their similarity of occurrence in documents can reflect the underlying semantic relations between them. To account for this, a soft image representation scheme is proposed by spreading each region's membership values through a local fuzzy membership function in a neighborhood to all the words in a codebook generated by a self-organizing map (SOM). The topology-preserving property of the SOM map is exploited to generate a local membership function. A systematic evaluation of the retrieval results of the proposed soft representation on two different image collections (natural photographic and medical) has shown significant improvement in precision at different recall levels when compared to different low-level and BoVW-based features that consider only the probability of occurrence (or presence/absence) of a word.
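The contrast between hard and soft encoding can be illustrated with a kernel-based membership function. The Gaussian kernel below is a generic stand-in for the paper's SOM-neighborhood fuzzy membership function, and the codebook, descriptor, and `sigma` bandwidth are invented for illustration.

```python
from math import exp, dist

def soft_assignment(descriptor, codebook, sigma=0.5):
    """Spread membership over ALL codewords with a Gaussian kernel instead of
    a hard one-to-one mapping, reducing sensitivity to quantization error."""
    weights = [exp(-dist(descriptor, c) ** 2 / (2 * sigma ** 2)) for c in codebook]
    total = sum(weights)
    return [w / total for w in weights]

codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
m = soft_assignment((0.2, 0.1), codebook)
print([round(x, 3) for x in m])
```

A descriptor near the boundary between two codewords contributes to both histogram bins roughly equally, instead of flipping arbitrarily between them as hard assignment does.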

19.

Copyright © Beijing Qinyun Technology Development Co., Ltd.    京ICP备09084417号-23
