Similar Documents
 10 similar documents found (search time: 265 ms)
1.
Audio example recognition and retrieval based on incremental-learning support vector machines   (Cited by 5: 0 self-citations, 5 by others)
The main task of audio example recognition and retrieval is to construct a good classification learner. During its construction, selecting the best training examples from a training library that contains redundant samples, so as to save the learner's training time, is a key challenge, especially when recognizing audio examples from large-sample training libraries. Since support vectors are the key examples in a support vector machine, an incremental-learning SVM training algorithm is proposed. In this algorithm, the training samples are split into training sub-libraries and trained batch by batch; after each batch, only the support vectors are kept and the non-support vectors are discarded. Experiments comparing against ordinary and decremental SVMs show that the algorithm achieves good recognition and retrieval accuracy while significantly reducing training time.
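The batch-wise scheme described above (carry only support vectors between batches) can be sketched roughly as follows; the toy data, RBF kernel, and hyperparameters are illustrative assumptions, not taken from the paper.

```python
# Sketch of incremental SVM training: train in batches, keeping only
# the support vectors between batches. Toy data and settings are assumed.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 8)), rng.normal(3, 1, (200, 8))])
y = np.array([0] * 200 + [1] * 200)
perm = rng.permutation(len(X))          # shuffle so every batch mixes classes
X, y = X[perm], y[perm]

def incremental_svm(X, y, batch_size=100):
    """Train batch by batch; prune non-support vectors after each fit."""
    keep_X = np.empty((0, X.shape[1]))
    keep_y = np.empty((0,), dtype=y.dtype)
    clf = None
    for start in range(0, len(X), batch_size):
        train_X = np.vstack([keep_X, X[start:start + batch_size]])
        train_y = np.concatenate([keep_y, y[start:start + batch_size]])
        clf = SVC(kernel="rbf", C=1.0).fit(train_X, train_y)
        # Only support vectors survive into the next batch's training set.
        keep_X, keep_y = train_X[clf.support_], train_y[clf.support_]
    return clf, len(keep_y)

clf, n_kept = incremental_svm(X, y)
```

Because each fit sees only the retained support vectors plus one new batch, the per-batch training set stays far smaller than the full library, which is the source of the reported speed-up.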

2.
To address the large intra-class variation of mural images, a grouping strategy is proposed that partitions the sample space into subspaces; all training samples in a subspace train that subspace's classifier model, and at test time the classification model is chosen according to the subspace into which the test sample falls. When training the classifier in each subspace, to overcome the strong background noise of mural images, each mural image sample is treated as a collection of multiple instances, and the classifier is trained by multiple-instance learning. During training, a latent variable is introduced to label each instance; its presence makes the classifier's optimization problem non-convex, so it cannot be solved directly by gradient descent. Instead, a Latent SVM is trained iteratively as the classifier for each subspace. Experiments show that the proposed classification model largely mitigates the effects of intra-class variation and background noise on mural image classification.
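A minimal sketch of the iterative Latent SVM training used per subspace, in the multiple-instance spirit described above: the latent variable selects, for each image (bag), the instance the current model scores highest, and model fitting and latent selection alternate. The toy bags, features, and initialization are assumptions for illustration.

```python
# Iterative Latent SVM sketch: alternate (1) fit SVM on currently chosen
# instances, (2) re-pick each bag's highest-scoring instance as its latent
# representative. Toy data stands in for mural image instances.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)

def make_bag(positive):
    """An 'image' as a bag of 4 instances; positive bags hide one
    informative instance among background-noise instances."""
    bag = rng.normal(0, 1, (4, 5))
    if positive:
        bag[rng.integers(4)] += np.array([3.0, 3.0, 0, 0, 0])
    return bag

bags = [make_bag(i % 2 == 0) for i in range(40)]
labels = np.array([1 if i % 2 == 0 else -1 for i in range(40)])

def train_latent_svm(bags, labels, iters=5):
    chosen = np.vstack([bag.mean(axis=0) for bag in bags])  # crude init
    clf = None
    for _ in range(iters):
        clf = LinearSVC(C=1.0, dual=False).fit(chosen, labels)
        # Latent step: re-select the highest-scoring instance per bag.
        chosen = np.vstack([bag[np.argmax(clf.decision_function(bag))]
                            for bag in bags])
    return clf

clf = train_latent_svm(bags, labels)
# A bag is positive if its best instance scores above the decision boundary.
bag_pred = [1 if clf.decision_function(b).max() > 0 else -1 for b in bags]
```

The non-convexity mentioned in the abstract shows up here as the dependence of the training set itself on the current model, which is why fitting and latent selection must alternate rather than being solved in one pass.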

3.
Pedestrian detection is a fundamental problem in video surveillance and has achieved great progress in recent years. However, training a generic detector that performs well in a great variety of scenes has proved very difficult. On the other hand, exhaustive manual labeling for each specific scene to achieve high detection accuracy is not acceptable, especially for video surveillance applications. To reduce the manual labeling effort without sacrificing detection accuracy, we propose a transfer learning framework based on sparse coding for pedestrian detection. In our method, a generic detector is used to obtain the initial target samples, and then several filters select a small subset (called target templates) from the initial target samples whose labels and confidence values we are highly certain of. The relevance between source samples and target templates, and between target samples and target templates, is estimated by sparse coding and later used to compute weights for the source and target samples. By applying these sparse-coding-based weights to all samples during re-training, we can not only exclude outliers in the source samples but also tackle the drift problem in the target samples, and thus obtain a good scene-specific pedestrian detector. Our experiments on two public datasets show that the trained scene-specific pedestrian detector performs well and is comparable with a detector trained on a large number of samples manually labeled from the target scene.
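The sparse-coding relevance weighting can be sketched as below: each sample is coded over the target templates with an L1 penalty, and the reconstruction residual is mapped to a weight (small residual, i.e. well explained by the templates, means high relevance). The dictionary, penalty, and exponential weight mapping are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of sparse-coding-based sample weighting: code a sample over the
# target-template dictionary, then weight it by reconstruction residual.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
templates = rng.normal(0, 1, (10, 16))             # target templates as atoms
inlier = templates[3] + rng.normal(0, 0.05, 16)    # resembles a template
outlier = rng.normal(5, 1, 16)                     # unrelated sample

def sparse_weight(x, D, alpha=0.05):
    """Sparse-code x over dictionary D; map the residual to a (0, 1] weight."""
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000).fit(D.T, x)
    residual = np.linalg.norm(x - D.T @ coder.coef_)
    return float(np.exp(-residual))
```

Samples far from every template (outliers in the source set, drifted target samples) receive weights near zero, so they contribute little during re-training, which matches the role the weights play in the abstract.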

4.
Objective: Transmission line fittings are numerous and serve many purposes, and they are closely related to the safety of conductors and towers. Assessing their operating state and diagnosing faults requires precise localization and recognition of fitting targets; however, as the data collected by UAV inspection grows, manually annotating all of it becomes increasingly impractical. To make effective use of unlabeled data, this paper proposes a transmission line fitting detection model based on a self-supervised E-Swin Transformer (efficient shifted windows Transformer), which exploits unlabeled data to improve detection accuracy. Method: First, to reduce the cost of self-attention and improve computational efficiency, the Swin Transformer's self-attention computation is optimized, yielding an efficient backbone network, E-Swin. Second, to exploit unlabeled fitting data for stronger feature extraction, a lightweight self-supervised method is designed for E-Swin and used for pre-training. Finally, to improve localization accuracy, a detection head with an additional branch is adopted and combined with the pre-trained backbone to build the detection model, which is fine-tuned with a small amount of labeled data to produce the final detection results. Results: Experiments on a transmission line fitting dataset show that the model achieves a mean average precision (AP50) of 88.6% across target classes, roughly 10% higher than traditional detection models. Conclusion: By improving the backbone's self-attention computation and adopting self-supervised learning, the model extracts features efficiently and makes effective use of unlabeled data; the resulting fitting detection model offers a new approach to the data-utilization problem in transmission line fitting detection.
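For scale, the complexity gap that window-based attention closes can be illustrated with the standard Swin Transformer cost formulas (global attention: 4hwC² + 2(hw)²C; windowed attention with window size M: 4hwC² + 2M²hwC). E-Swin's further optimization of this computation is not reproduced here; this is only the baseline arithmetic motivating it.

```python
# FLOP estimates for self-attention on an h*w feature map with C channels,
# using the standard Swin Transformer complexity formulas.
def global_attn_flops(h, w, C):
    """Global multi-head self-attention: quadratic in the number of tokens."""
    return 4 * h * w * C**2 + 2 * (h * w)**2 * C

def window_attn_flops(h, w, C, M=7):
    """Window-based self-attention (M x M windows): linear in token count."""
    return 4 * h * w * C**2 + 2 * M**2 * h * w * C

# Typical stage-1 Swin settings: 56x56 tokens, 96 channels, 7x7 windows.
ratio = global_attn_flops(56, 56, 96) / window_attn_flops(56, 56, 96)
```

On a 56×56 map the windowed form is already over an order of magnitude cheaper, which is why further trimming this term is a natural target for an "efficient" variant.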

5.
This paper presents a fast training strategy for Viola–Jones (VJ) type object-detection systems. The VJ object-detection system, popular for its high accuracy at real-time testing speeds, has the drawback that it is slow to train; a face detector, for example, can take days to train. In content-based image retrieval (CBIR), where search must be performed instantaneously, VJ's long training time is not affordable, so the method is rarely used for such applications. This paper proposes two modifications to the training algorithm of VJ-type object-detection systems which reduce the training time to the order of seconds. Firstly, Laplacian clutter (non-object) models are used to train the weak classifiers, eliminating the need to read and evaluate thousands of clutter images. Secondly, the training procedure is simplified by removing the time-consuming AdaBoost-based feature selection. An object detector trained with 500 images takes approximately 2 s to train on a conventional 3 GHz machine. Our results show that the accuracy of a detector built with the proposed approach is inferior to that of VJ for difficult object classes such as frontal faces. However, for objects with a lesser degree of intra-class variation, such as hearts, state-of-the-art accuracy can be obtained. Importantly, for CBIR applications, the fast testing speed of the VJ-type object detector is maintained.
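The role of the analytic clutter model can be sketched as follows: if clutter feature responses are modeled by a Laplacian distribution, a weak classifier's threshold can be set in closed form from the model's tail instead of scanning thousands of clutter images. The distributions and target false-positive rate below are invented for illustration and are not VJ's or the paper's actual features.

```python
# Sketch: set a weak-classifier threshold from a Laplacian clutter model.
# For X ~ Laplace(loc, scale) and t > loc, P(X > t) = 0.5 * exp(-(t-loc)/scale),
# so the threshold for a target false-positive rate has a closed form.
import numpy as np

def laplacian_threshold(loc, scale, fpr):
    """Threshold t with P(clutter response > t) = fpr, derived analytically."""
    return loc + scale * np.log(1.0 / (2.0 * fpr))

rng = np.random.default_rng(4)
clutter = rng.laplace(0.0, 1.0, 5000)     # simulated clutter responses
objects = rng.normal(6.0, 0.5, 200)       # simulated object responses

# No clutter images are scanned: the threshold comes from the model alone.
t = laplacian_threshold(0.0, 1.0, fpr=0.01)
```

Because the clutter side of each weak classifier is handled analytically, training cost depends only on the (small) positive set, which is the source of the seconds-scale training time claimed above.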

6.
Concept detection is targeted at automatically labeling video content with semantic concepts appearing in it, like objects, locations, or activities. While concept detectors have become key components in many research prototypes for content-based video retrieval, their practical use is limited by the need for large-scale annotated training sets. To overcome this problem, we propose to train concept detectors on material downloaded from web-based video sharing portals like YouTube, such that training is based on tags given by users during upload, no manual annotation is required, and concept detection can scale up to thousands of concepts. On the downside, web video as training material is a complex domain, and the tags associated with it are weak and unreliable. Consequently, performance loss is to be expected when replacing high-quality state-of-the-art training sets with web video content. This paper presents a concept detection prototype named TubeTagger that utilizes YouTube content for autonomous training. In quantitative experiments, we compare the performance when training on web video and on standard datasets from the literature. It is demonstrated that concept detection in web video is feasible, and that – when testing on YouTube videos – the YouTube-based detector outperforms the ones trained on standard training sets. By applying the YouTube-based prototype to datasets from the literature, we further demonstrate that: (1) If training annotations on the target domain are available, the resulting detectors significantly outperform the YouTube-based tagger. (2) If no annotations are available, the YouTube-based detector achieves performance comparable to the ones trained on standard datasets (a moderate relative performance loss of 11.4% is measured) while offering the advantage of fully automatic, scalable learning. (3) By enriching conventional training sets with online video material, performance improvements of 11.7% can be achieved when generalizing to domains unseen in training.

7.
8.
Objective: Deep learning-based aircraft engine damage detection is a new problem in computer vision. Current object detection methods do not account for its particular characteristics, and applying them directly to engine damage detection yields poor results that cannot meet practical requirements. To improve detection accuracy, a cascaded detector-classifier method for engine damage detection is proposed: Cascade-YOLO (cascade-you only look once). Method: First, with damaged regions as positive examples and normal regions as negatives, a damage detection network is trained to initialize the parameters of the feature extraction network. Second, with the feature extraction network fixed, multiple detection heads each detect a different type of engine damage independently, raising the detection recall for each single class. Finally, for damage whose confidence falls within a certain range, a multi-class discriminator is trained to correct the damage class output by the detection heads. Based on the detection results, a semantic segmentation branch can accurately segment the damaged regions. Results: A borescope image dataset of 1,305 images covering 9 damage types was constructed, and 6 state-of-the-art object detection methods were quantitatively compared on it. Compared with the single-stage detector YOLO v5, the proposed method improves mean average precision (mAP), precision, and recall by 2.49%, 12.59%, and 12.46%, respectively. Conclusion: By training a dedicated detection head for each defect class, the proposed cascaded detector-classifier model fully accounts for the distribution differences among defects, improving both recall and detection precision. The model is also easy to extend to new classes and can be quickly applied to segmentation tasks, meeting practical application needs.
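The confidence-band correction step can be sketched as below: detections whose confidence falls in an uncertain band are re-labeled by the multi-class discriminator, while confident detections keep the detector's label. The band limits and the stub discriminator are assumptions for illustration; the paper's trained networks stand behind both roles.

```python
# Sketch of the Cascade-YOLO correction step: defer uncertain detections
# to a multi-class discriminator, keep confident detector labels as-is.
def cascade_predict(det_label, det_conf, discriminator, crop,
                    low=0.3, high=0.7):
    """Return the final damage class for one detection."""
    if low <= det_conf <= high:
        # Uncertain band: let the discriminator override the detector.
        return discriminator(crop)
    return det_label

# Stub standing in for the trained multi-class discriminator network.
discriminator = lambda crop: "dent"
```

Only a small fraction of detections fall inside the band, so the extra classifier pass adds little cost while correcting exactly the cases where the detection heads are least reliable.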

9.

Using remote sensing techniques to detect trees at the individual level is crucial for forest management, and finding the treetop is an important first step. However, due to the large variation in tree size and shape, traditional unsupervised treetop detectors must be carefully designed with heuristic knowledge, making efficient and versatile treetop detection challenging. Deep convolutional neural networks (CNNs) have shown powerful capabilities for classifying and segmenting images, but the volume of labelled data required for training impedes their application. Considering the strengths and limitations of the unsupervised and deep learning methods, we propose a framework that uses pseudo labels automatically generated by unsupervised treetop detectors to train the CNNs, saving manual labelling effort. In this study, we use a digital surface model (DSM) derived from multi-view satellite imagery and a multispectral orthophoto as research data, and train fully convolutional networks (FCN) with pseudo labels generated separately by two unsupervised treetop detectors: the top-hat by reconstruction (THR) operation and a local maxima filter with a fixed window (FFW). The experiments show that the FCN detectors trained on pseudo labels achieve much better detection accuracy than the unsupervised detectors (6.5% better for THR and 11.1% for FFW), especially in densely forested areas (more than 20% improvement). In addition, comparative experiments using manually labelled samples show the proposed treetop detection framework has the potential to significantly reduce the need for training samples while keeping comparable performance.
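The FFW pseudo-label generator can be sketched with a fixed-window maximum filter over the DSM: a pixel becomes a treetop pseudo label when it equals its window maximum and clears a height threshold. The window size, threshold, and synthetic DSM below are illustrative assumptions, not the study's parameters.

```python
# Sketch of FFW (fixed-window local maxima) treetop pseudo-labeling on a DSM.
import numpy as np
from scipy.ndimage import maximum_filter

rng = np.random.default_rng(3)
dsm = rng.normal(5.0, 0.2, (50, 50))      # synthetic canopy height surface
for r, c in [(10, 10), (30, 40)]:
    dsm[r, c] = 20.0                      # two dominant treetops

def ffw_pseudo_labels(dsm, window=9, min_height=10.0):
    """Pixels equal to their window maximum and above a height threshold
    become treetop pseudo labels (row, col coordinates)."""
    local_max = maximum_filter(dsm, size=window)
    return np.argwhere((dsm == local_max) & (dsm >= min_height))
```

Labels produced this way are imperfect (the point of the paper's comparison), but they cost nothing to generate, which is what makes them usable as CNN training targets at scale.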

10.
Wang Xing-Gang, Wang Jia-Si, Tang Peng, Liu Wen-Yu. Journal of Computer Science and Technology, 2019, 34(6): 1269-1278

Learning an effective object detector with little supervision is an essential but challenging problem in computer vision applications. In this paper, we consider the problem of learning a deep convolutional neural network (CNN) based object detector using weakly-supervised and semi-supervised information in the framework of the fast region-based CNN (Fast R-CNN). The goal is an object detector as accurate as the fully-supervised Fast R-CNN but requiring less image annotation effort. To solve this problem, we use weakly-supervised training images (i.e., only image-level annotations are given) together with a small proportion of fully-supervised training images (i.e., bounding-box annotations are given), that is, a weakly- and semi-supervised (WASS) object detection setting. The proposed solution, termed WASS R-CNN, has two main components: a weakly-supervised R-CNN is trained first, and the semi-supervised data are then used to fine-tune the weakly-supervised detector. We perform object detection experiments on the PASCAL VOC 2007 dataset. The proposed WASS R-CNN achieves more than 85% of a fully-supervised Fast R-CNN's performance (measured by mean average precision) with only 10% of the fully-supervised annotations together with weak supervision for all training images. The results show that the proposed learning framework can significantly reduce the labeling effort needed to obtain reliable object detectors.

