Similar Documents
A total of 20 similar documents were found.
1.
Forest fire image classification based on a sparse autoencoder deep neural network
To address the difficulty of distinguishing forest fire from visually similar objects, a new forest fire image classification method based on a sparse autoencoder deep neural network is proposed. Sparse autoencoding, an unsupervised feature learning algorithm, is used to learn feature parameters from small unlabeled image patches and complete the training of the deep neural network; the learned features are then used to extract features from the full-size classification images through convolution and mean pooling; finally, softmax regression is applied to the convolved and pooled features to train the final softmax classifier. Experimental results show that, compared with a traditional BP neural network, the new method distinguishes forest fire from similar objects such as red flags and red leaves more effectively.
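A minimal PyTorch sketch of the unsupervised stage this abstract describes: a single-hidden-layer sparse autoencoder trained on unlabeled patches with a KL-divergence sparsity penalty. The patch size, hidden width, and hyperparameters are illustrative assumptions rather than the paper's values; the learned encoder weights would subsequently be convolved over full-size images, mean-pooled, and fed to a softmax classifier.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """Single-hidden-layer autoencoder; the hidden activations are the learned features."""
    def __init__(self, n_input=8 * 8 * 3, n_hidden=100):
        super().__init__()
        self.encoder = nn.Linear(n_input, n_hidden)
        self.decoder = nn.Linear(n_hidden, n_input)

    def forward(self, x):
        h = torch.sigmoid(self.encoder(x))
        return self.decoder(h), h

def kl_sparsity(h, rho=0.05):
    """KL divergence between the target sparsity rho and the mean hidden activation."""
    rho_hat = h.mean(dim=0).clamp(1e-6, 1 - 1e-6)
    return torch.sum(rho * torch.log(rho / rho_hat)
                     + (1 - rho) * torch.log((1 - rho) / (1 - rho_hat)))

def train_autoencoder(patches, epochs=50, beta=3.0, lr=1e-3):
    """patches: [N, 192] tensor of flattened, unlabeled 8x8 RGB image patches."""
    model = SparseAutoencoder()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        recon, h = model(patches)
        loss = F.mse_loss(recon, patches) + beta * kl_sparsity(h)
        opt.zero_grad(); loss.backward(); opt.step()
    return model
```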

2.
With the rapid growth of remotely sensed imagery data, there is a high demand for effective and efficient image retrieval tools to manage and exploit such data. In this letter, we present a novel content-based remote sensing image retrieval (RSIR) method based on a Triplet deep metric learning convolutional neural network (CNN). By constructing a Triplet network with a metric learning objective function, we extract representative features of the images in a semantic space in which images from the same class are close to each other while those from different classes are far apart. In such a semantic space, simple metric measures such as Euclidean distance can be used directly to compare the similarity of images and effectively retrieve images of the same class. We also investigate supervised and unsupervised learning methods for reducing the dimensionality of the learned semantic features. We present comprehensive experimental results on two public RSIR datasets and show that our method significantly outperforms the state of the art.
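A hedged sketch of the triplet metric learning idea: an embedding CNN trained with a triplet margin loss so that same-class images land close together in Euclidean space. The backbone, embedding size, and margin below are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class EmbeddingNet(nn.Module):
    """Small embedding CNN (an illustrative stand-in, not the paper's network)."""
    def __init__(self, dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, dim)

    def forward(self, x):
        z = self.fc(self.features(x).flatten(1))
        return nn.functional.normalize(z, dim=1)  # unit-length embeddings

net = EmbeddingNet()
criterion = nn.TripletMarginLoss(margin=0.2)   # pull anchor/positive together, push negative away
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)

def train_step(anchor, positive, negative):
    """anchor/positive share a class label; negative comes from a different class."""
    loss = criterion(net(anchor), net(positive), net(negative))
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```

At retrieval time, database images are ranked by Euclidean distance between their embeddings and the query embedding.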

3.
Most current image classification methods reduce image dimensionality with supervised or semi-supervised learning, which requires the images to carry label information. To handle dimensionality reduction and classification of unlabeled images, a mixed-order stacked sparse autoencoder is proposed to reduce image dimensionality in an unsupervised manner and enable classification learning. First, a serial stacked autoencoder network with three hidden layers is built; each hidden layer is trained separately, with the output of the previous hidden layer serving as the input of the next, extracting features from the image data and reducing its dimensionality. Second, the features of the first and second hidden layers of the trained stacked autoencoder are concatenated and fused into a matrix of mixed-order features. Finally, a support vector machine classifies the dimensionality-reduced image features, and accuracy is evaluated. Comparative experiments against seven baseline algorithms on four public image datasets show that the proposed method can extract features from unlabeled images, perform image classification learning, shorten classification time, and improve classification accuracy.
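A rough sketch, under assumed layer sizes, of the mixed-order idea: greedy layer-wise training of a three-layer stacked sparse autoencoder, concatenation of the first and second hidden-layer codes, and an SVM on the fused features. The function names and all hyperparameters are illustrative.

```python
import torch
import torch.nn as nn
from sklearn.svm import SVC

def train_layer(x, n_hidden, epochs=100, lr=1e-3, l1=1e-4):
    """Train one sparse autoencoder layer on x; return its hidden codes."""
    enc = nn.Linear(x.shape[1], n_hidden)
    dec = nn.Linear(n_hidden, x.shape[1])
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=lr)
    for _ in range(epochs):
        h = torch.sigmoid(enc(x))
        loss = nn.functional.mse_loss(dec(h), x) + l1 * h.abs().mean()  # L1 sparsity term
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        return torch.sigmoid(enc(x))

def mixed_order_features(x, sizes=(256, 128, 64)):
    """Greedy layer-wise training of three layers; fuse layer-1 and layer-2 codes."""
    codes, inp = [], x
    for n in sizes:
        inp = train_layer(inp, n)
        codes.append(inp)
    return torch.cat([codes[0], codes[1]], dim=1).numpy()

# Hypothetical usage on flattened images `images` ([N, D] float32) with labels `y`:
# feats = mixed_order_features(torch.tensor(images))
# clf = SVC(kernel="rbf").fit(feats, y)
```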

4.
In the content-based image retrieval context, a classic strategy consists in computing off-line a dictionary of visual features. This visual dictionary is then used to provide a new representation of the data which should ease any classification or retrieval task. This strategy, based on past research in text retrieval, is suitable for batch learning, where a large training set can be built either by using strong prior knowledge of the data semantics (as for textual data) or with an expensive off-line pre-computation. Such an approach has major drawbacks in the context of interactive retrieval, where the user iteratively builds the training set in a semi-supervised way by providing positive and negative annotations to the system in the relevance feedback loop. The training set is thus built for each retrieval session without any prior knowledge about the concepts of interest for that session. We propose a completely different approach that builds the dictionary on-line from features extracted in relevant images. We design the corresponding kernel function, which is learnt during the retrieval session. For each new label, the kernel function is updated with complexity linear in the size of the database. We propose an efficient active learning strategy for the weakly supervised retrieval method developed in this paper. Moreover, this framework allows the combination of features of different types. Experiments are carried out on standard databases and show that a small dictionary can be dynamically extracted from the features with better performance than a global one.

5.
Small target detection is a difficult problem in image processing, especially in medical images. Microaneurysms (MAs), a type of small target in fundus images, are tiny, have low local contrast, and are surrounded by considerable noise, which makes them hard to detect. Traditional detection methods rely on hand-crafted features and struggle to detect MAs accurately, while deep-learning-based detection requires complex preparatory work, involves a heavy workload, has difficulty coping with imbalanced numbers of positive and negative samples, and is prone to overfitting. The sparse autoencoder (SAE) is an unsupervised machine learning algorithm that can extract sample features effectively even when the classes are imbalanced. Therefore, an unsupervised SAE-based method for MA detection is proposed: backpropagation updates the SAE's weights and biases to extract sample features, and the extracted features are used to train a softmax classifier that finally detects MAs accurately. To validate the method, experiments were conducted on three databases: Retinopathy Online Challenge, DIARETDB1, and E-ophtha-MA. The results show that the method accurately detects MAs in fundus images with high accuracy and sensitivity: accuracies of 98.5%, 87.2%, and 92.6%, and sensitivities of 99.9%, 99.8%, and 98.7%, respectively.

6.
During the capping of oral-liquid bottles, poor capping can occur, and caps may show defects such as scratches, scuffs, surface curling, or cap breakage; to guarantee food and drug safety, these must be inspected before the products leave the factory. In research on deep-learning-based cap defect detection for oral-liquid bottles, training a conventional convolutional neural network on a cap defect dataset requires manual annotation and is inefficient. To address this problem effectively, an unsupervised deep convolutional denoising autoencoder is designed...

7.
In this paper, we propose an Interactive Object-based Image Clustering and Retrieval System (OCRS). The system incorporates two major modules: Preprocessing and Object-based Image Retrieval. In preprocessing, an unsupervised segmentation method called WavSeg is used to segment images into meaningful semantic regions (image objects). Since a huge number of image regions are involved, we propose a Genetic Algorithm-based method to cluster these image objects and thus reduce the search space for object-based image retrieval. In the learning and retrieval module, the Diverse Density algorithm is adopted to analyze the user's interest and generate the initial hypothesis, which provides a prototype for future learning and retrieval. A Relevance Feedback technique is incorporated to provide progressive guidance to the learning process. In interacting with the user, we propose to use a One-Class Support Vector Machine (SVM) to learn the user's interest and refine the returned results. Performance is evaluated on a large image database, and the effectiveness of our retrieval algorithm is demonstrated through comparative studies.
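A small sketch of the relevance-feedback refinement step using scikit-learn's One-Class SVM, which learns only from positively annotated examples. The feature dimensionality and the nu/gamma settings are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Region-level features the user marked as relevant during relevance feedback;
# the 64-dim random features stand in for real descriptors.
rng = np.random.default_rng(0)
positive_feats = rng.random((40, 64))
database_feats = rng.random((10000, 64))

# The One-Class SVM models the support of the "relevant" class only, so it can
# refine results from positive feedback without needing negative examples.
ocsvm = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(positive_feats)

# Larger decision-function values lie deeper inside the learned region,
# i.e. are more likely to match the user's interest.
scores = ocsvm.decision_function(database_feats)
top_k = np.argsort(-scores)[:20]
```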

8.
Most image retrieval systems analyze images using only low-level features such as color and texture, without considering the image content or the semantics of the objects it contains, which leads to a poor understanding of the image. To let the system understand the objects in an image and their deeper semantics more accurately, this work analyzes the strengths and weaknesses of current image annotation approaches and proposes a method that starts from low-level features and uses knowledge structured with an ontology to help the computer analyze the entity objects in an image, judge the plausible real-world relationships between objects, and then annotate the image. Experimental results show that ontology-assisted annotation greatly improves the accuracy of image recognition.

9.
Similarity retrieval of iconic image database
The perception of spatial relationships among objects in a picture is one of the important selection criteria used to discriminate and retrieve images in an iconic image database system. The data structure called the 2D string, proposed by Chang et al., is adopted to represent symbolic pictures. The 2D string preserves the objects' spatial knowledge embedded in images. Since spatial relationships are a fuzzy concept, the capability of similarity retrieval, in particular retrieval by subpicture, is essential. In this paper, a similarity measure based on the 2D string longest common subsequence is defined, and an algorithm for similarity retrieval is proposed. Similarity retrieval gives the iconic image database a distinguishing capability that a conventional database does not have.
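A toy illustration of similarity scoring with the longest common subsequence of symbol strings. Note that a real 2D string also records the spatial operators between symbols; this simplified sketch, with made-up symbol sequences, omits them.

```python
def lcs_length(a, b):
    """Dynamic-programming longest common subsequence length of two symbol sequences."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def similarity_2d(query, picture):
    """Each picture is a pair (u, v) of symbol sequences along the x and y axes.
    The score sums the LCS lengths of both projections, normalized by the query length."""
    (qu, qv), (pu, pv) = query, picture
    return (lcs_length(qu, pu) + lcs_length(qv, pv)) / (len(qu) + len(qv))

query = (["tree", "house"], ["house", "tree"])
picture = (["tree", "car", "house"], ["house", "car", "tree"])
print(similarity_2d(query, picture))   # 1.0: the query appears as a subpicture
```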

10.
Learning-based hashing methods are becoming the mainstream for approximate scalable multimedia retrieval. They consist of two main components: hash code learning for training data and hash function learning for new data points. Tremendous efforts have been devoted to designing novel methods for these two components, i.e., supervised and unsupervised methods for learning hash codes, and different models for inferring hash functions. However, there is little work integrating supervised and unsupervised hash code learning into a single framework. Moreover, the hash function learning component is usually based on hand-crafted visual features extracted from the training images. The performance of a content-based image retrieval system crucially depends on the feature representation, and such hand-crafted visual features may degrade the accuracy of the hash functions. In this paper, we propose a semi-supervised deep learning hashing (DLH) method for fast multimedia retrieval. More specifically, in the first component, we utilize both visual and label information to learn an optimal similarity graph that can more precisely encode the relationship among training data, and then generate the hash codes based on the graph. In the second stage, we apply a deep convolutional network to simultaneously learn a good multimedia representation and a set of hash functions. Extensive experiments on five popular datasets demonstrate the superiority of our DLH over both supervised and unsupervised hashing methods.
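A schematic sketch of the second-stage idea of learning binary codes with a deep network: a tanh-relaxed hashing head, sign binarization, and Hamming-distance ranking. The input dimension, bit count, and the omission of the similarity-graph objective are simplifying assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class HashHead(nn.Module):
    """Maps a backbone embedding to n_bits relaxed hash bits in (-1, 1)."""
    def __init__(self, in_dim=512, n_bits=48):
        super().__init__()
        self.fc = nn.Linear(in_dim, n_bits)

    def forward(self, x):
        return torch.tanh(self.fc(x))   # tanh relaxation used during training

def binarize(relaxed):
    """sign() converts relaxed outputs into {-1, +1} codes for the index."""
    return torch.sign(relaxed)

def hamming_distance(query_code, db_codes):
    """For {-1, +1} codes, Hamming distance = (n_bits - dot product) / 2."""
    n_bits = query_code.shape[-1]
    return (n_bits - db_codes @ query_code) / 2

head = HashHead()
with torch.no_grad():
    query_code = binarize(head(torch.randn(1, 512)))[0]
    db_codes = binarize(head(torch.randn(1000, 512)))
ranking = torch.argsort(hamming_distance(query_code, db_codes))
```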

11.
In recent years, deep learning techniques have been applied to the diagnosis of pulmonary nodules. In order to improve pulmonary nodule diagnostic performance effectively, we propose a novel pulmonary nodule diagnosis method using a dual-modal deep supervised autoencoder based on an extreme learning machine, in which discriminative features are automatically learnt from the input data. The network is fed with nodule images in pairs obtained from computed tomography and positron emission tomography, respectively. For each image pair, the high-level discriminative features of nodules in computed tomography and positron emission tomography are extracted from stacked supervised autoencoder layers. The outputs of the proposed architecture are combined using an ideal fusion method to obtain the final classification. In the experiments, 5-fold cross-validation is used to validate the proposed method on 1,600 pulmonary nodule images, and our method reaches a high classification sensitivity of 91.75% at 1.58 false positives per scan. Meanwhile, compared with other deep learning diagnosis methods, our method achieves better discriminative results and is well suited to pulmonary nodule diagnosis.

12.
The vast number of images available on the Web calls for an effective and efficient search service to help users find relevant images. The prevalent way is to provide a keyword interface for users to submit queries. However, the number of images without any tags or annotations is beyond the reach of manual annotation efforts. To overcome this, automatic image annotation techniques have emerged, which are generally a process of selecting a suitable set of tags for a given image without user intervention. However, there are three main challenges with respect to Web-scale image annotation: scalability, noise-resistance and diversity. Scalability has a twofold meaning: first, an automatic image annotation system should be scalable with respect to billions of images on the Web; second, it should be able to automatically identify several relevant tags among a huge tag set for a given image within seconds or even faster. Noise-resistance means that the system should be robust enough against typos and ambiguous terms used in tags. Diversity represents that image content may include both scenes and objects, which are further described by multiple different image features constituting different facets in annotation. In this paper, we propose a unified framework to tackle the above three challenges for automatic Web image annotation. It mainly involves two components: tag candidate retrieval and multi-facet annotation. In the former, content-based indexing and a concept-based codebook are leveraged to solve the scalability and noise-resistance issues. In the latter, a joint feature map is designed to describe different facets of tags in annotations and the relations between these facets. A tag graph is adopted to represent the tags in the entire annotation, and a structured learning technique is employed to construct a learning model on top of the tag graph based on the generated joint feature map. Millions of images from Flickr are used in our evaluation. Experimental results show that we achieve 33% performance improvements compared with single-facet approaches in terms of three metrics: precision, recall and F1 score.

13.
Text-to-image person retrieval aims to find, in a pedestrian database, the person images that match a given textual description, and has attracted wide attention from academia and industry in recent years. The task faces two challenges at once: fine-grained retrieval and the heterogeneity gap between images and text. Some methods use supervised attribute learning to extract attribute-related features and associate images with text at a fine granularity; however, attribute labels are hard to obtain, so these methods perform poorly in practice. How to extract attribute-related features without attribute annotations and build fine-grained cross-modal semantic associations has therefore become a key open problem. To address it, a text-to-image person retrieval method based on virtual attribute learning is proposed that incorporates pre-training techniques and builds fine-grained cross-modal semantic associations through unsupervised attribute learning. First, a semantics-guided attribute decoupling method is proposed based on the invariance of person attributes and cross-modal semantic consistency; it uses person identity labels as the supervision signal to guide the model in decoupling attribute-related features. Second, a feature learning module based on semantic reasoning is proposed that builds a semantic graph from the relations between attributes; the module exchanges information among attributes through the graph model to strengthen the cross-modal discriminative power of the features. Experiments comparing against existing methods on the public text-to-image person retrieval dataset CUHK-PEDES and the cross-modal retrieval dataset Flickr30k demonstrate the effectiveness of the proposed method.

14.
15.
An image representation method using vector quantization (VQ) on color and texture is proposed in this paper. The proposed method is also used to retrieve similar images from database systems. The basic idea is a transformation from the raw pixel data to a small set of image regions which are coherent in color and texture space. A scheme is provided for object-based image retrieval. The features for image retrieval are the three color features (hue, saturation, and value) from the HSV color model and five textural features (ASM, contrast, correlation, variance, and entropy) from the gray-level co-occurrence matrices. Once the features are extracted from an image, each pixel in the image is represented by an eight-dimensional feature vector. The VQ algorithm is used to rapidly cluster these feature vectors into groups. A representative feature table based on the dominant groups is obtained and used to retrieve similar images according to the object within the image. This method can retrieve similar images even in cases where objects are translated, scaled, and rotated.
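A hedged sketch of this kind of feature pipeline: HSV color statistics plus GLCM texture properties, clustered with k-means as a stand-in for the VQ codebook. For brevity the features are computed block-wise rather than per-pixel and only three of the five texture measures are used; the block size, cluster count, and sample image are illustrative choices, and scikit-image's graycomatrix/graycoprops and scikit-learn's KMeans are assumed available.

```python
import numpy as np
from skimage import color, data, util
from skimage.feature import graycomatrix, graycoprops
from sklearn.cluster import KMeans

img = util.img_as_ubyte(data.astronaut())
hsv = color.rgb2hsv(img)
gray = util.img_as_ubyte(color.rgb2gray(img))

def block_features(r, c, size=32):
    """HSV means plus three GLCM texture properties for one image block."""
    patch = gray[r:r + size, c:c + size]
    glcm = graycomatrix(patch, distances=[1], angles=[0], levels=256, normed=True)
    tex = [graycoprops(glcm, p)[0, 0] for p in ("ASM", "contrast", "correlation")]
    hue, sat, val = hsv[r:r + size, c:c + size].reshape(-1, 3).mean(axis=0)
    return [hue, sat, val, *tex]

feats = np.array([block_features(r, c)
                  for r in range(0, gray.shape[0] - 32, 32)
                  for c in range(0, gray.shape[1] - 32, 32)])

# k-means stands in for the VQ codebook; the histogram of codeword assignments
# serves as the image's representative feature table.
codebook = KMeans(n_clusters=8, n_init=10, random_state=0).fit(feats)
signature = np.bincount(codebook.labels_, minlength=8) / len(codebook.labels_)
```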

16.
The problems of efficient data storage and data retrieval are important issues in the design of image database systems. A data structure called the 2-D string, which represents symbolic pictures while preserving spatial knowledge, was proposed by Chang et al. It allows a natural way to construct iconic indexes for pictures. We proposed the 2-D B-string data structure to characterize the spatial knowledge embedded in images. It is powerful enough to describe images with partly or completely overlapping objects without the need to partition objects. When a large volume of complex images exists in the image database, the processing time for image retrieval is substantial, so it is essential to develop efficient access methods for retrieval. In this paper, access methods with different degrees of precision are proposed for retrieving desired images encoded as 2-D B-strings. The signature file, acting as a spatial filter for the image database, is based on disjoint coding and superimposed coding techniques. It provides an efficient way to retrieve images in image databases.
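A minimal illustration of a superimposed-coding signature filter: each indexed term hashes to a few bits, an image's signature is the bitwise OR of its terms' patterns, and a query is screened by checking that all of its bits are present. The signature width, the two-bit hash, and the term format are assumptions made only for this sketch.

```python
import hashlib

SIG_BITS = 64  # signature width (illustrative)

def term_bits(term):
    """Hash an indexed term (e.g. an object pair and its spatial operator) to two bits."""
    digest = hashlib.md5(term.encode()).digest()
    return (1 << (digest[0] % SIG_BITS)) | (1 << (digest[1] % SIG_BITS))

def image_signature(terms):
    """Superimposed coding: OR together the bit patterns of all terms in the image."""
    sig = 0
    for t in terms:
        sig |= term_bits(t)
    return sig

def may_contain(image_sig, query_terms):
    """Filter step: if any query bit is absent, the image cannot match (no false drops);
    surviving candidates must still be verified against their full 2-D B-strings."""
    q = image_signature(query_terms)
    return image_sig & q == q

db = {"img1": ["house<tree", "tree<car"], "img2": ["car<house"]}
signatures = {name: image_signature(terms) for name, terms in db.items()}
candidates = [name for name, sig in signatures.items() if may_contain(sig, ["house<tree"])]
```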

17.
Zhixiao Xie, Computers & Geosciences, 2004, 30(9-10): 1093-1104
This research proposes a rotation- and flip-invariant algorithm for representing spatial continuity information in high-resolution geographic images for content-based image retrieval (CBIR). Starting from the variogram concept, the new visual property representation, in the form of a numeric index vector, consists of a set of semi-variances at selected lags and directions, based on three well-justified principles: (1) capture the basic shape of the sample variogram, (2) represent the anisotropy of spatial continuity, and (3) make the representation rotation- and flip-invariant. The algorithm goes through two tests. The first test confirms that it can indeed align the image representations based on the spatial continuity information of objects within images by re-ordering the semi-variances accordingly. In the second test, the algorithm is applied to retrieve seven types of typical geographic entities from an Erie County orthophoto database. The retrieval results demonstrate the effectiveness of the new algorithm in CBIR, as assessed by retrieval precision.
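A small numpy sketch of the building block: semi-variances computed at selected lags and directions, which would then be ordered to enforce rotation and flip invariance. The lags, directions, and ordering rule shown here are illustrative, not the paper's exact choices.

```python
import numpy as np

def semivariance(img, lag, direction):
    """gamma(h) = 0.5 * mean[(z(x) - z(x + h))^2] over all pixel pairs separated
    by `lag` steps along `direction`, a (d_row, d_col) unit offset."""
    dr, dc = direction[0] * lag, direction[1] * lag

    def slices(d, n):
        # source / shifted-partner slices for an offset d along an axis of length n
        return (slice(0, n - d), slice(d, n)) if d >= 0 else (slice(-d, n), slice(0, n + d))

    (r0, r1), (c0, c1) = slices(dr, img.shape[0]), slices(dc, img.shape[1])
    diff = img[r0, c0].astype(float) - img[r1, c1].astype(float)
    return 0.5 * np.mean(diff ** 2)

# Four directions (0, 45, 90, 135 degrees) and a few lags sample the variogram shape.
directions = [(0, 1), (1, 1), (1, 0), (1, -1)]
lags = [1, 2, 4, 8]
img = np.random.rand(128, 128)
curves = np.array([[semivariance(img, h, d) for h in lags] for d in directions])

# One way to approach rotation/flip invariance: sort the directional curves
# (e.g. by mean semi-variance) before concatenating them into the index vector.
index_vector = curves[np.argsort(curves.mean(axis=1))].ravel()
```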

18.
Convolutional neural networks trained with supervised learning have proven to have strong feature learning ability in image recognition tasks. However, using supervised deep learning for image retrieval requires a large amount of labeled data, and overfitting easily occurs otherwise. To solve this problem, a novel image hashing retrieval method based on deep self-taught learning is proposed. First, a discriminative feature representation function is learned with an unsupervised autoencoder network; this lowers the learning complexity, frees the training samples from depending on semantically annotated images, and forces the algorithm to learn more robust features from large amounts of unlabeled data. Second, to speed up retrieval, the traditional approach of computing similarity with Euclidean distance is abandoned in favor of a perceptual hashing algorithm for similarity measurement. Combining the two techniques yields a better feature representation and faster retrieval at the same time. Experimental results show that the proposed method outperforms several state-of-the-art image retrieval methods.
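The abstract does not specify which perceptual hash is used; below is a hedged sketch of the common average-hash variant with Hamming-distance ranking. The file names in the usage comment are hypothetical placeholders.

```python
import numpy as np
from PIL import Image

def average_hash(path, hash_size=8):
    """Average hash: downscale to hash_size x hash_size grayscale and threshold each
    pixel at the block mean, giving a compact binary fingerprint (64 bits here)."""
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = np.asarray(img, dtype=np.float32)
    return (pixels > pixels.mean()).flatten()

def hamming(h1, h2):
    """Number of differing bits; a smaller distance means more similar images."""
    return int(np.count_nonzero(h1 != h2))

# Hypothetical usage with placeholder file names:
# query = average_hash("query.jpg")
# db = {name: average_hash(name) for name in ("a.jpg", "b.jpg")}
# ranked = sorted(db, key=lambda name: hamming(query, db[name]))
```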

19.
Opinion target extraction is mainly used in opinion mining of text and aims to discover the opinion target entities in review text. Unsupervised autoencoder-based methods can identify the latent topic information in a review corpus without manually annotated data, but the opinion targets they extract lack diversity. A hybrid model is proposed that combines a supervised sentence-level classification task with an unsupervised autoencoder. The model trains a classifier to produce opinion target categories; the autoencoder shares the LSTM-Attention structure of the classification task to encode sentence-vector representations and increase semantic relatedness. Based on the obtained categories, the sentence representations are transformed into intermediate-layer semantic vectors, capturing the correlation between opinion target categories and opinion targets and improving the encoder's encoding ability. Finally, the sentence vectors are reconstructed through decoding to obtain an opinion target matrix, and opinion targets are extracted according to the cosine similarity between the opinion target matrix and the words in the sentence. Experimental results on a multi-domain review corpus show that, compared with methods such as k-means and LocLDA, the proposed method improves the evaluation metric by 3.7% in the restaurant domain and 2.1% in the hotel domain; it effectively alleviates the lack of category diversity during training and extracts opinion targets well.

20.