Similar Articles
 20 similar articles found (search time: 15 ms)
1.
Image retrieval for food ingredients is important but tedious, uninteresting, and expensive work. Computer vision systems have made extraordinary advances in image retrieval thanks to CNNs, but applying convolutional neural networks directly to small food datasets is not feasible. In this study, a novel image retrieval approach is presented for small- and medium-scale food datasets: it augments images with image transformation techniques to enlarge the datasets, and improves the average accuracy of food recognition with state-of-the-art deep learning technologies. First, typical image transformation techniques are used to augment food images. Then transfer learning based on deep learning is applied to extract image features. Finally, a food recognition algorithm is applied to the extracted deep-feature vectors. The presented image-retrieval architecture is evaluated on a small-scale food dataset comprising forty-one categories of food ingredients with one hundred pictures per category. Extensive experimental results demonstrate the advantages of the image-augmentation architecture for small and medium datasets using deep learning. The approach, combining image augmentation, ResNet feature vectors, and SMO classification, shows its superiority for food detection on small- and medium-scale datasets in comprehensive experiments.
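The augmentation step described above can be sketched with simple geometric transforms. This is an illustrative numpy example, not the paper's exact pipeline; the transformation set, the ResNet feature extractor, and the SMO classifier are not reproduced here.

```python
import numpy as np

def augment(image: np.ndarray) -> list:
    """Generate simple geometric variants of one image:
    horizontal/vertical flips and 90-degree rotations."""
    return [
        image,
        np.fliplr(image),    # horizontal flip
        np.flipud(image),    # vertical flip
        np.rot90(image, 1),  # 90-degree rotation
        np.rot90(image, 2),  # 180-degree rotation
        np.rot90(image, 3),  # 270-degree rotation
    ]

# A toy "dataset" of 4x4 grayscale images.
dataset = [np.arange(16).reshape(4, 4) for _ in range(10)]
augmented = [v for img in dataset for v in augment(img)]
print(len(dataset), "->", len(augmented))  # 10 -> 60
```

Each source image yields six training samples, which is how a 4,100-image dataset of this kind can be enlarged before feature extraction.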

2.
Nowadays, the amount of web data is increasing rapidly, which poses a serious challenge to web monitoring. Text sentiment analysis, an important research topic in natural language processing, is a crucial task in web monitoring. The accuracy of traditional text sentiment analysis methods may degrade when dealing with massive data. Deep learning has been a hot research topic in artificial intelligence in recent years. Several research groups have studied the sentiment analysis of English texts using deep learning methods; in contrast, relatively few works have considered Chinese text sentiment analysis in this direction. In this paper, a method for Chinese text sentiment analysis based on the convolutional neural network (CNN) is proposed to improve analysis accuracy. Because the feature values of the trained CNN are nonuniformly distributed, a method for normalizing the feature values is proposed. Moreover, the dimensions of the text features are optimized through simulations. Finally, a method for updating the learning rate during CNN training is presented to achieve better performance. Experimental results on typical datasets indicate that the accuracy of the proposed method improves on that of traditional supervised machine learning methods, e.g., the support vector machine.
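The abstract does not specify the normalization scheme; a common remedy for nonuniformly distributed feature values is per-dimension min-max rescaling, sketched here as an assumption:

```python
import numpy as np

def minmax_normalize(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Rescale each feature dimension to [0, 1] so that no single
    dimension dominates the classifier because of its raw scale."""
    lo = features.min(axis=0)
    hi = features.max(axis=0)
    return (features - lo) / (hi - lo + eps)

# Two feature dimensions with very different raw scales.
feats = np.array([[1.0, 200.0], [3.0, 400.0], [2.0, 300.0]])
norm = minmax_normalize(feats)
print(norm.min(axis=0), norm.max(axis=0))
```

After rescaling, both dimensions occupy the same [0, 1] range, so downstream layers see comparably scaled inputs.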

3.
Image fusion aims to integrate complementary information from multiple modalities into a single image without distortion or loss of data. Image fusion is important in medical imaging, specifically for detecting tumors and identifying diseases. In this article, a novel fusion method based on the discrete wavelet transform (DWT) and intuitionistic fuzzy sets (IFSs), called DWT-IFS, is proposed. First, all source images are fused using the DWT with the average, maximum, and entropy fusion rules. The IFS is then applied to the fused image: images are converted into intuitionistic fuzzy images (IFIs) by selecting an optimal value for the parameter of the membership, non-membership, and hesitation degree functions using entropy. The resulting IFIs are decomposed into blocks, and the corresponding blocks of the images are fused using the intersection and union operations of the IFS. The efficiency of the proposed DWT-IFS fusion method is assessed by comparing it with existing methods, such as Averaging (AVG), Principal Component Analysis (PCA), Laplacian Pyramid Approach (LPA), Contrast Pyramid Approach (CPA), Discrete Wavelet Transform (DWT), Morphological Pyramid Approach (MPA), Redundancy Discrete Wavelet Transform (RDWT), Contourlet Transform (CONTRA), and Intuitionistic Fuzzy Set (IFS), using subjective and objective performance evaluation measures. The experimental results reveal that the proposed DWT-IFS method provides higher-quality information in terms of physical properties and contrast than the existing methods.
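As a rough illustration of wavelet-domain fusion, the sketch below uses a single-level Haar transform (not the paper's DWT configuration) and only two of the fusion rules: the low-frequency band is averaged and each high-frequency band keeps the larger-magnitude coefficient.

```python
import numpy as np

def haar2d(x):
    """Single-level 2-D Haar decomposition into (LL, LH, HL, HH)."""
    a = (x[0::2] + x[1::2]) / 2.0  # row averages
    d = (x[0::2] - x[1::2]) / 2.0  # row differences
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Exact inverse of haar2d."""
    a = np.empty((LL.shape[0], LL.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = LL + LH, LL - LH
    d[:, 0::2], d[:, 1::2] = HL + HH, HL - HH
    x = np.empty((a.shape[0] * 2, a.shape[1]))
    x[0::2], x[1::2] = a + d, a - d
    return x

def fuse(img1, img2):
    """Average the low-frequency band; keep the larger-magnitude
    coefficient in each high-frequency band."""
    b1, b2 = haar2d(img1), haar2d(img2)
    LL = (b1[0] + b2[0]) / 2.0
    highs = [np.where(np.abs(h1) >= np.abs(h2), h1, h2)
             for h1, h2 in zip(b1[1:], b2[1:])]
    return ihaar2d(LL, *highs)
```

The entropy rule and the IFS block-fusion stage of the paper are omitted; this only shows the transform-fuse-invert skeleton shared by DWT-based methods.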

4.
With the development of Deep Convolutional Neural Networks (DCNNs), the features extracted for image recognition tasks have shifted from low-level features to the high-level semantic features of DCNNs. Previous studies have shown that the deeper the network, the more abstract the features. However, the recognition ability of deep features can be limited by insufficient training samples. To address this problem, this paper derives an improved Deep Fusion Convolutional Neural Network (DF-Net) which makes full use of the differences and complementarities that arise during network learning and enhances feature expression under limited datasets. Specifically, DF-Net organizes two identical subnets to extract features from the input image in parallel, and a well-designed fusion module is introduced into the deep layers of DF-Net to fuse the subnets' features at multiple scales. Thus, more complex mappings are created, and more abundant and accurate fusion features can be extracted to improve recognition accuracy. Furthermore, a corresponding training strategy is proposed to speed up convergence and reduce the computational overhead of network training. Finally, DF-Nets based on the well-known ResNet, DenseNet, and MobileNetV2 are evaluated on CIFAR100, Stanford Dogs, and UECFOOD-100. Theoretical analysis and experimental results strongly demonstrate that DF-Net enhances the performance of DCNNs and increases the accuracy of image recognition.

5.
Edge detection is one of the core steps of image processing and computer vision. Accurate, fine image edges make subsequent object detection and semantic segmentation more effective. The Holistically-Nested Edge Detection (HED) network has been shown to be a well-performing deep learning network for edge detection. However, when the HED network is used for automatic object identification in complex scenes with overlapping edges, problems such as incomplete and non-smooth detected edges arise. To solve these problems, an image edge detection algorithm based on an improved HED and feature fusion is proposed. On the one hand, features are extracted with the improved HED network: the HED convolution layers are improved, with residual deformable convolution blocks replacing normal convolutions to extract features from edges of different sizes and shapes, and dilated convolutions replacing the original pooling layers to expand the receptive field and retain more global information, yielding comprehensive feature information. On the other hand, edges are extracted with the Otsu algorithm: an Otsu-Canny algorithm adaptively adjusts the threshold over the whole image to achieve edge detection at the optimal threshold. Finally, the edges extracted by the improved HED network and by the Otsu-Canny algorithm are fused to obtain the final edges. Experimental results on the Berkeley Segmentation Data Set (BSDS500) show that the proposed algorithm achieves an optimal dataset scale (ODS) F-measure of 0.793 and an average precision (AP) of 0.849, with a detection speed above 25 frames per second (FPS), confirming the effectiveness of the proposed method.
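The thresholding step can be illustrated on a 1-D histogram. This is a generic Otsu implementation (maximize between-class variance over all candidate thresholds), not the paper's Otsu-Canny variant:

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(levels):
        w0 += hist[t]           # pixels in class 0 (<= t)
        if w0 == 0:
            continue
        w1 = total - w0         # pixels in class 1 (> t)
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Two well-separated pixel populations: the threshold falls between them.
pixels = [10] * 50 + [12] * 30 + [200] * 40 + [220] * 20
print(otsu_threshold(pixels))
```

In the full pipeline this threshold would feed the Canny hysteresis stage instead of a fixed, hand-tuned value.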

6.

Lip reading is typically regarded as visually interpreting a speaker's lip movements during speech; it is the task of decoding text from the speaker's mouth movements. This paper proposes a lip-reading model that helps deaf people and persons with hearing problems understand a speaker by capturing a video of the speaker and feeding it into the model to obtain the corresponding subtitles. Deep learning technologies make it easier to extract a large number of different features, which can then be converted into letter probabilities to obtain accurate results. Recently proposed methods for lip reading are based on sequence-to-sequence architectures designed for neural machine translation and audio speech recognition. In this paper, however, a deep convolutional neural network model called the hybrid lip-reading network (HLR-Net) is developed for lip reading from video. The proposed model includes three stages, namely pre-processing, encoder, and decoder stages, which produce the output subtitle. Inception, gradient, and bidirectional GRU layers are used to build the encoder, and attention, fully connected, and activation-function layers are used to build the decoder, which performs connectionist temporal classification (CTC). Compared with three recent models on the GRID corpus dataset, namely the LipNet model, the lip-reading model with cascaded attention (LCANet), and the attention-CTC (A-ACA) model, the proposed HLR-Net achieves significant improvements: a CER of 4.9%, WER of 9.7%, and BLEU score of 92% for unseen speakers, and a CER of 1.4%, WER of 3.3%, and BLEU score of 99% for overlapped speakers.
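CTC decoding at inference time is commonly done greedily: take the argmax symbol per frame, merge consecutive repeats, and drop blanks. A minimal sketch follows; this is the standard CTC rule, not the HLR-Net implementation, and the alphabet and blank index are illustrative assumptions.

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Collapse the per-frame argmax sequence: merge repeats,
    then drop the CTC blank symbol."""
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    out, prev = [], blank
    for idx in best:
        if idx != prev and idx != blank:
            out.append(alphabet[idx])
        prev = idx
    return "".join(out)

# Index 0 is the blank; the frames spell h-e-l-(blank)-l-o.
alphabet = "-helo"
frames = [
    [0.1, 0.8, 0.05, 0.03, 0.02],   # h
    [0.1, 0.1, 0.7, 0.05, 0.05],    # e
    [0.1, 0.1, 0.1, 0.6, 0.1],      # l
    [0.9, 0.02, 0.03, 0.03, 0.02],  # blank separates the two l's
    [0.1, 0.1, 0.1, 0.6, 0.1],      # l
    [0.1, 0.1, 0.1, 0.1, 0.6],      # o
]
print(ctc_greedy_decode(frames, alphabet))  # hello
```

Note the role of the blank: without the separator frame, the repeated "l" frames would collapse into a single letter.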


7.
The compressive strength of concrete is a significant factor in assessing building structure health and safety. Various methods have therefore been developed to evaluate the compressive strength of concrete structures, but previous methods are costly, time-consuming, and unsafe. To address these drawbacks, this paper proposes a digital-vision-based concrete compressive strength evaluation model using a deep convolutional neural network (DCNN). The proposed model offers an alternative approach to evaluating concrete strength and contributes to improving efficiency and accuracy. The model was developed with 4,000 digital images and 61,996 images extracted from video recordings of concrete samples. The experimental results indicate a root mean square error (RMSE) of 3.56 MPa, demonstrating that the proposed model can feasibly predict concrete strength from digital images of concrete surfaces and overcome the previous limitations. This experiment provides a basis that could be extended in future research on image analysis techniques and artificial neural networks for diagnosing concrete building structures.
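For reference, the reported error metric (RMSE, in MPa) is the square root of the mean squared difference between predicted and measured strengths; the numbers below are toy values, not the paper's data.

```python
import math

def rmse(predicted, actual):
    """Root mean square error between predicted and measured values."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

pred = [30.0, 42.5, 55.0]  # predicted strengths (MPa), illustrative
true = [32.0, 40.0, 57.0]  # measured strengths (MPa), illustrative
print(round(rmse(pred, true), 3))
```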

8.
Human action recognition in complex environments is a challenging task. Recently, sparse representation has achieved excellent results on human action recognition under different conditions. The main idea of sparse representation classification is a general scheme in which the training samples of each class serve as a dictionary to express the query, and the class with the minimal reconstruction error indicates the query's class. However, learning a discriminative dictionary remains difficult. This work makes two contributions. First, a new and robust human action recognition framework is built by combining a modified sparse classification model with deep convolutional neural network (CNN) features. Second, a novel classification model is constructed that consists of a representation-constrained term and a coefficient-incoherence term. Experimental results on benchmark datasets show that the modified model obtains results competitive with other state-of-the-art models.
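The minimal-reconstruction-error rule can be sketched with plain least squares standing in for a true sparse solver; the paper's representation-constrained and coefficient-incoherence terms are omitted, so this only shows the classification scheme itself.

```python
import numpy as np

def residual_classify(dictionaries, query):
    """Assign the query to the class whose training samples
    reconstruct it with the smallest least-squares residual."""
    residuals = []
    for D in dictionaries:  # D: (feature_dim, n_samples) for one class
        coef, *_ = np.linalg.lstsq(D, query, rcond=None)
        residuals.append(np.linalg.norm(query - D @ coef))
    return int(np.argmin(residuals))

rng = np.random.default_rng(0)
# Two classes living in different 2-D subspaces of R^6.
basis0, basis1 = rng.normal(size=(6, 2)), rng.normal(size=(6, 2))
D0 = basis0 @ rng.normal(size=(2, 5))  # 5 training samples of class 0
D1 = basis1 @ rng.normal(size=(2, 5))  # 5 training samples of class 1
query = basis0 @ np.array([1.0, -2.0])  # lies in the class-0 subspace
print(residual_classify([D0, D1], query))
```

Because the query lies in the span of the class-0 samples, its class-0 residual is near zero while the class-1 residual is not.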

9.
To improve object detection accuracy, a deep-learning-based object detection method using weighted fusion of feature maps is proposed. First, the idea of sampling the shallow feature maps of a convolutional neural network and fusing them with the deepest feature map by weighting is introduced. Second, based on this idea and the specific structure of the convolutional neural network, a corresponding weighted feature-map fusion scheme is formulated, yielding a new feature map. The new feature map is then fed into an improved RPN to obtain region proposals. Finally, the new feature map and the region proposals are passed to the subsequent network layers to complete object detection. Experimental results show that the proposed method achieves higher detection accuracy and better detection performance.
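The weighted-fusion idea, resampling a shallow map to the deepest map's spatial size and then taking a weighted sum, can be sketched as follows; the pooling factor and weights are illustrative, not the paper's scheme.

```python
import numpy as np

def downsample(fmap, factor):
    """Average-pool a feature map by an integer factor so a shallow,
    high-resolution map matches the deepest map's spatial size."""
    h, w = fmap.shape
    return fmap.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def weighted_fuse(maps, weights):
    """Element-wise weighted sum of same-sized feature maps."""
    out = np.zeros_like(maps[0], dtype=float)
    for m, w in zip(maps, weights):
        out += w * m
    return out

shallow = np.ones((8, 8))    # high-resolution shallow map
deep = np.full((4, 4), 3.0)  # deepest map
fused = weighted_fuse([downsample(shallow, 2), deep], [0.3, 0.7])
print(fused[0, 0])  # 0.3*1 + 0.7*3 = 2.4
```

The fused map has the deepest map's resolution but mixes in the shallow map's fine-grained responses.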

10.
To address the difficulty of manual feature selection, the cumbersome procedure, and the low classification accuracy of traditional power quality disturbance classification methods, a disturbance classification method based on particle swarm optimization (PSO) and a convolutional neural network (CNN) is proposed. First, a reshape function converts the one-dimensional time series of each power quality disturbance signal into a two-dimensional matrix with equal numbers of rows and columns, and these matrices are partitioned into a training set and a test set. Second, a CNN-based disturbance classification model is constructed. Third, the PSO algorithm optimizes the parameters of the classification model, and the optimized model is trained on the training set. Finally, the trained model is evaluated on the test set, and the classification result for each type of disturbance is obtained from the output labels. Simulation results show that the model extracts features from power quality disturbance data automatically and classifies disturbance signals more accurately than other disturbance classification models.
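A minimal PSO loop of the kind used here to tune the CNN's parameters might look like the following; for a self-contained demo it minimizes a toy sphere function rather than a CNN's validation error, and all hyperparameters are illustrative.

```python
import random

def pso(objective, dim, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0, seed=42):
    """Minimal particle swarm optimizer: each particle is pulled toward
    its own best position and the swarm's global best position."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

best, val = pso(lambda x: sum(v * v for v in x), dim=3)
print(val)  # close to 0
```

In the paper's setting the objective would instead evaluate a candidate CNN configuration on held-out disturbance data, which is far costlier per call.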

11.
For motor fault diagnosis, a novel one-dimensional convolutional neural network architecture (1D-CNN) is designed, and a motor fault diagnosis method based on acoustic signals and the 1D-CNN is proposed. To verify the effectiveness of the 1D-CNN algorithm for motor fault recognition, a set of faulty air-conditioner motors was taken as the experimental subject: a motor fault diagnosis platform was built, acoustic signals were collected from air-conditioner motors in four states, a motor-fault acoustic signal dataset was constructed, and the 1D-CNN algorithm was applied…

12.
An image fusion algorithm based on the Directionlet transform
To improve image fusion quality, an image fusion algorithm based on the Directionlet transform is proposed. First, each registered source image is decomposed by coset decomposition with a given generating matrix, yielding a sub-image for each coset. Subtracting pairs of sub-images then gives the high- and low-frequency components of the source images, with singular features such as edges and textures contained in the high-frequency components. The low-frequency coefficients are fused by direct averaging, while for the high-frequency coefficients those with stronger edge information in each subregion are selected. Finally, the fused image is obtained by inverting the Directionlet coset decomposition. Multi-focus image fusion experiments show that, in subjective visual terms, the algorithm fuses edges and other image features noticeably better, preserving the respective details of the left- and right-focused images; in objective terms, compared by entropy, average gradient, standard deviation, and mutual information, the method also outperforms wavelet-based and other fusion methods.

13.
The exponential increase in data over the past few years, particularly in images, has led to more complex content as visual representation became the new norm. E-commerce and similar platforms maintain large image catalogues of their products. Searching for and retrieving similar images in image databases is still a challenge, even though many image retrieval techniques have been proposed over the past decade. Most of these techniques work well when querying general image databases, but they often fail on domain-specific image databases, especially datasets with low intra-class variance. This paper proposes a domain-specific image similarity search engine based on a fused deep learning network. The network comprises an improved object localization module, a classification module that narrows down the search options, and finally a feature extraction and similarity calculation module. It features an offline stage for indexing the dataset and an online stage for querying. The dataset used to evaluate the proposed network is a custom domain-specific dataset of cosmetics packaging gathered from various online platforms. The proposed method addresses the intra-class variance problem with more precise object localization and by re-ranking the top results based on object contours. Finally, quantitative and qualitative experimental results are presented, showing improved image similarity search performance.
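The online querying stage of such an engine typically reduces to nearest-neighbor search over the indexed feature vectors. A cosine-similarity ranking sketch follows (the contour-based re-ranking step and the actual feature extractor are not reproduced; the vectors are toy values):

```python
import numpy as np

def top_k_similar(index_feats, query_feat, k=3):
    """Rank indexed feature vectors by cosine similarity to the query."""
    index = index_feats / np.linalg.norm(index_feats, axis=1, keepdims=True)
    q = query_feat / np.linalg.norm(query_feat)
    scores = index @ q
    order = np.argsort(-scores)[:k]
    return list(order), scores[order]

# Four indexed items in a 2-D feature space; item 0 is nearly parallel
# to the query, item 3 points the opposite way.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
ids, scores = top_k_similar(feats, np.array([1.0, 0.05]), k=2)
print(ids)
```

In the offline stage the `index_feats` matrix would be precomputed once for the whole catalogue, so each query costs a single matrix-vector product.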

14.
15.
Fast object retrieval based on image texture features
Building on a discussion of the co-occurrence matrix, a method is proposed that obtains the texture features of an object image through image segmentation and thereby achieves fast image retrieval. Experiments show that the method retrieves object images with high reliability and has good practical value.
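The co-occurrence matrix discussed above is straightforward to compute. Below is a pure-Python sketch with a few classic Haralick-style statistics; the exact feature set and displacement used by the paper are not specified, so these are illustrative choices.

```python
def glcm(image, levels, dx=1, dy=0):
    """Gray-level co-occurrence matrix for one displacement (dx, dy),
    normalized so its entries sum to 1."""
    h, w = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[image[y][x]][image[y2][x2]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]

def glcm_features(P):
    """Classic co-occurrence statistics used as texture descriptors."""
    n = len(P)
    energy = sum(p * p for row in P for p in row)
    contrast = sum((i - j) ** 2 * P[i][j] for i in range(n) for j in range(n))
    homogeneity = sum(P[i][j] / (1 + abs(i - j)) for i in range(n) for j in range(n))
    return energy, contrast, homogeneity

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
P = glcm(img, levels=4)
print(glcm_features(P))
```

Such feature triples, computed per segmented region, give the compact descriptors that a fast retrieval index can compare.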

16.
Objective: To combine deep learning with social networks and affective computing, explore new methods and techniques for studying the sentiment of social network users with deep neural networks, and explore applications of the resulting model in user demand analysis and recommendation. Methods: Massive social network data are automatically filtered and mined; a non-prior sentiment prediction method with long-term memory is studied; the network's massive user data and interpersonal relationships are modeled; LSTM models are created for the associated time series and, together with their interrelations, merged into a unified large-scale deep recurrent network. This includes: attention-based processing of heterogeneous social network data; long-term memory modeling based on deep LSTMs, covering subnetwork selection, deep LSTM design, and large-scale network architecture design for social networks; and a recommendation algorithm based on the social network sentiment model and reinforcement learning. Results: The accuracy of the analysis is improved, dependence on prior assumptions is reduced, the workload and bias of hand-crafted sentiment models are lessened, and generality across different network data is enhanced. Conclusion: The results promote the combination of deep learning and affective computing, advance research on the analysis and prediction of network user behavior, and can be applied to personalized recommendation, targeted advertising, and other areas, with broad academic significance and application prospects.

17.
A multi-source data fusion algorithm proposed by researchers in the Department of Electrical Engineering at the University of Michigan is introduced; a series of techniques for fusing SAR images with visible-light images and their main steps are discussed; the criteria and methods for evaluating the fused images are briefly summarized; and target detection and recognition are introduced.

18.
Multi-source information can be obtained by fusing infrared images and visible-light images, whose information is complementary. However, existing methods of producing fused images suffer from drawbacks such as blurred edges, low contrast, and loss of detail. Based on convolutional sparse representation and an improved pulse-coupled neural network, this paper proposes an image fusion algorithm that decomposes the source images into high-frequency and low-frequency subbands with the non-subsampled shearlet transform (NSST). The low-frequency subbands are fused by convolutional sparse representation (CSR), and the high-frequency subbands by an improved pulse-coupled neural network (IPCNN) algorithm, which effectively overcomes the difficulty of setting the parameters of the traditional PCNN algorithm and improves the performance of sparse representation with detail injection. The results reveal that the proposed method outperforms existing mainstream fusion algorithms in both visual effect and objective indicators.

19.
Artificial scent screening systems (known as electronic noses, E-noses) have been researched extensively. A portable, automatic, accurate, real-time E-nose requires both robust cross-reactive sensing and fingerprint pattern recognition. Few E-noses have been commercialized because they suffer from either sensing or pattern-recognition issues. Here, cross-reactive colorimetric barcode combinatorics and deep convolutional neural networks (DCNNs) are combined into a system for monitoring meat freshness that concurrently provides scent fingerprints and fingerprint recognition. The barcodes, comprising 20 different types of porous nanocomposites of chitosan, dye, and cellulose acetate, form scent fingerprints that are identifiable by a DCNN. A fully supervised DCNN trained on 3,475 labeled barcode images predicts meat freshness with an overall accuracy of 98.5%. Incorporating the DCNN into a smartphone application yields a simple platform for rapid barcode scanning and identification of food freshness in real time. The system is fast, accurate, and non-destructive, enabling consumers and all stakeholders in the food supply chain to monitor food freshness.

20.
With the development of deep learning and Convolutional Neural Networks (CNNs), the accuracy of automatic food recognition based on visual data has significantly improved. Some studies have shown that the deeper the model, the higher the accuracy. However, very deep neural networks are affected by the overfitting problem and consume huge computing resources. In this paper, a new classification scheme is proposed for automatic food-ingredient recognition based on deep learning. An up-to-date combinational convolutional neural network (CBNet) is constructed with a subnet merging technique. First, two different neural networks are used to learn the features of interest. Then, a well-designed feature fusion component aggregates the features from the subnetworks, extracting richer and more precise features for image classification. To learn more complementary features, corresponding fusion strategies are also proposed, including auxiliary classifiers and hyperparameter settings. Finally, CBNet based on the well-known VGGNet, ResNet, and DenseNet is evaluated on a dataset containing 41 major categories of food ingredients with 100 images per category. Theoretical analysis and experimental results demonstrate that CBNet achieves promising accuracy for multi-class classification and improves the performance of convolutional neural networks.
