Similar Documents
20 similar documents found (search time: 93 ms)
1.
2.
We define similar video content as video sequences with almost identical content but possibly compressed at different qualities, reformatted to different sizes and frame-rates, undergone minor editing in either spatial or temporal domain, or summarized into keyframe sequences. Building a search engine to identify such similar content in the World-Wide Web requires: 1) robust video similarity measurements; 2) fast similarity search techniques on large databases; and 3) intuitive organization of search results. In a previous paper, we proposed a randomized technique called the video signature (ViSig) method for video similarity measurement. In this paper, we focus on the remaining two issues by proposing a feature extraction scheme for fast similarity search, and a clustering algorithm for identification of similar clusters. Similar to many other content-based methods, the ViSig method uses high-dimensional feature vectors to represent video. To warrant a fast response time for similarity searches on high dimensional vectors, we propose a novel nonlinear feature extraction scheme on arbitrary metric spaces that combines the triangle inequality with the classical Principal Component Analysis (PCA). We show experimentally that the proposed technique outperforms PCA, Fastmap, Triangle-Inequality Pruning, and Haar wavelet on signature data. To further improve retrieval performance, and provide better organization of similarity search results, we introduce a new graph-theoretical clustering algorithm on large databases of signatures. This algorithm treats all signatures as an abstract threshold graph, where the distance threshold is determined based on local data statistics. Similar clusters are then identified as highly connected regions in the graph. By measuring the retrieval performance against a ground-truth set, we show that our proposed algorithm outperforms simple thresholding, single-link and complete-link hierarchical clustering techniques.
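The threshold-graph clustering step described in this abstract can be pictured with a small sketch. This is not the authors' implementation: it simplifies "highly connected regions" to connected components, and the local-statistics threshold (mean distance to the k nearest neighbours) and all names are illustrative assumptions.

```python
import numpy as np

def threshold_graph_clusters(signatures, k=5):
    """Illustrative threshold-graph clustering of signature vectors.

    An edge links two signatures when their distance falls below both
    endpoints' local thresholds (mean distance to their k nearest
    neighbours); connected components of the graph are returned as clusters.
    """
    X = np.asarray(signatures, dtype=float)
    n = len(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    local_thr = np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)     # local data statistic

    parent = list(range(n))                                     # union-find forest
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if d[i, j] < min(local_thr[i], local_thr[j]):
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())
```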

3.
Content-based copy retrieval (CBCR) aims at retrieving from a database all the modified or previous versions of a given candidate object. In this paper, we present a copy-retrieval scheme based on local features that can deal with very large databases in terms of both quality and speed. We first propose a new approximate similarity search technique in which the probabilistic selection of feature-space regions is based not on the distribution of the database but on the distribution of the feature distortion. Since our CBCR framework is based on local features, the approximation can be strong and drastically reduce the amount of data to explore. Furthermore, we show how the discrimination of the global retrieval can be enhanced during its post-processing step by considering only the geometrically consistent matches. This framework is applied to robust video copy retrieval, and extensive experiments are presented to study the interaction between the approximate search and retrieval efficiency. The largest database used contains more than one billion local features, corresponding to 30,000 hours of video.
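The geometric-consistency post-processing can be illustrated with a sketch that keeps only local-feature matches agreeing with the dominant displacement found by voting. The translation-only model, bin size, and function name are illustrative assumptions, not the paper's exact registration step.

```python
import numpy as np

def filter_consistent_matches(query_pts, ref_pts, bin_size=8.0, tol=1.5):
    """Keep matches consistent with the dominant query-to-reference translation.

    query_pts, ref_pts: (n, 2) arrays of matched keypoint coordinates.
    A coarse 2-D histogram vote selects the dominant displacement; matches
    whose displacement falls within `tol` vote cells of it are retained.
    """
    q = np.asarray(query_pts, dtype=float)
    r = np.asarray(ref_pts, dtype=float)
    disp = r - q                                  # per-match displacement
    bins = np.round(disp / bin_size).astype(int)  # quantise to vote cells
    cells, counts = np.unique(bins, axis=0, return_counts=True)
    dominant = cells[np.argmax(counts)]           # most-voted displacement
    keep = np.all(np.abs(bins - dominant) <= tol, axis=1)
    return np.flatnonzero(keep)                   # indices of consistent matches
```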

4.
Video copy detection algorithm based on multi-feature matching (cited 3 times: 0 self-citations, 3 by others)
Existing video copy detection algorithms match video content with a single feature and therefore struggle to handle the many different forms of copy transformation. To address this, a video copy detection algorithm based on multiple visual features is proposed. The algorithm adopts a cascaded detection-and-filtering framework: global features of the video frames are first extracted to detect duplicate segments with only slight visual changes, and more precise local features are then used to detect copies that have undergone various complex transformations. To achieve fast detection on large-scale databases, a kd-tree index structure is used for nearest-neighbor feature retrieval. Experimental results on a standard benchmark dataset show that the algorithm is robust to a variety of copy transformations and achieves high detection efficiency.
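The kd-tree nearest-neighbour lookup mentioned above is a standard step; a minimal sketch using SciPy's cKDTree follows. The descriptor dimensionality and the random data are placeholders, not the paper's features.

```python
import numpy as np
from scipy.spatial import cKDTree

# Reference frame descriptors (rows) are indexed once, then queried per query frame.
rng = np.random.default_rng(0)
ref_descriptors = rng.random((10000, 64))        # placeholder database features
query_descriptor = rng.random(64)                # placeholder query feature

tree = cKDTree(ref_descriptors)                  # build the kd-tree index
dist, idx = tree.query(query_descriptor, k=5)    # 5 nearest reference frames
print(idx, dist)
```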

5.
Video processing techniques for content-based retrieval (cited 32 times: 1 self-citation, 31 by others)
Starting from an analysis of the structure and characteristics of video data, this paper summarizes the general steps of content-based video processing, namely video segmentation, keyframe selection, static and dynamic feature extraction, and video clustering. It then reviews recent methods for each processing stage and analyzes the strengths and weaknesses of the various methods and techniques. Finally, it identifies several open problems in content-based video retrieval that deserve further study.
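One of the steps summarized here, shot segmentation, is commonly approximated by thresholding histogram differences between consecutive frames. The sketch below is a generic illustration of that step (the bin count and cut threshold are assumptions), not a method from any surveyed paper.

```python
import numpy as np

def detect_shot_boundaries(frames, bins=16, threshold=0.4):
    """Flag frame indices where the grey-level histogram changes sharply.

    frames: iterable of 2-D uint8 arrays. Returns indices i where the L1
    distance between the normalised histograms of frame i-1 and frame i
    exceeds `threshold` (an illustrative cut threshold).
    """
    boundaries, prev_hist = [], None
    for i, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
        hist = hist / hist.sum()                      # normalise to a distribution
        if prev_hist is not None and np.abs(hist - prev_hist).sum() > threshold:
            boundaries.append(i)                      # likely shot cut
        prev_hist = hist
    return boundaries
```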

6.
7.
A video copy detection scheme based on robust hashing is proposed. Feature points are classified so that stable points persisting in both the spatial and temporal domains are selected, and local features are constructed by differentiating over their neighborhoods. The multi-dimensional feature data are Hilbert-encoded, and the significant bits are taken as the detection hash code. To accurately locate suspicious content in the target video, a hash matching scheme is proposed that uses sequence similarity as the matching criterion, improving matching precision. Experimental results show that the scheme achieves good detection performance and is suitable for video copy detection.
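The sequence-similarity matching used for localization can be sketched as a sliding-window comparison of binary hash sequences; the mean per-frame Hamming similarity used here is an illustrative choice, not necessarily the paper's exact score.

```python
import numpy as np

def locate_copy(query_hashes, target_hashes):
    """Slide the query hash sequence over the target and return the best offset.

    Both arguments are (n, bits) binary arrays, one hash per frame/segment.
    Sequence similarity = mean per-frame Hamming similarity over the window;
    returns (best_offset, best_similarity).
    """
    q = np.asarray(query_hashes, dtype=np.uint8)
    t = np.asarray(target_hashes, dtype=np.uint8)
    n = len(q)
    best_offset, best_sim = -1, -1.0
    for offset in range(len(t) - n + 1):
        window = t[offset:offset + n]
        sim = 1.0 - np.mean(q != window)      # 1 - normalised Hamming distance
        if sim > best_sim:
            best_offset, best_sim = offset, sim
    return best_offset, best_sim
```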

8.
Content-based video copy detection is an active research topic in multimedia. Because copy transformations are diverse and often combined, a single feature rarely yields good detection results. A multi-feature fusion method is proposed to improve video copy detection. In addition to traditional local and global visual features, the non-orthogonal binary subspace (NBS) method is used to represent video content, and normalized cross-correlation (NCC) is applied on top of it to improve the similarity computation between copied video content. Several further measures are then adopted to refine the copy decisions. Experimental results show that the scheme is highly robust to a variety of copy transformations and achieves good detection accuracy.
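Normalized cross-correlation (NCC), mentioned above for similarity computation, has a standard formulation; a minimal sketch follows for context.

```python
import numpy as np

def ncc(a, b):
    """Normalised cross-correlation between two equally shaped arrays.

    Returns a value in [-1, 1]; 1 means the two patterns are identical up to
    an affine intensity change.
    """
    a = np.asarray(a, dtype=float).ravel() 
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()                              # remove mean intensity
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0
```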

9.
Research on content-based video copy detection (cited 1 time: 1 self-citation, 0 by others)
刘红, 文朝晖, 王晔. 《计算机工程》 (Computer Engineering), 2010, 36(7): 227-229
A graph-based video copy detection method is proposed. The method converts the frame-matching results between video sequences into a match-result graph, turning video copy detection into the problem of finding the longest path in that graph. Experimental results show that the graph-based sequence matching algorithm localizes copies with high accuracy, compensates for the limited descriptive power of low-level image features, saves detection time, and can locate in one pass the multiple copied segments that may exist between two video sequences.
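The longest-path formulation can be illustrated with a small dynamic-programming sketch: frame matches form a DAG whose edges require both time indices to increase, and the longest chain marks the best-aligned copied segment. The names and the unit edge weights are illustrative assumptions.

```python
def longest_match_path(matches):
    """Longest chain of frame matches increasing in both videos.

    matches: list of (query_index, ref_index) pairs. Treating each pair as a
    node and adding an edge when both indices strictly increase yields a DAG;
    dynamic programming over matches sorted by query index finds the longest
    path, i.e. the most temporally consistent copied segment.
    """
    matches = sorted(matches)
    n = len(matches)
    length = [1] * n          # longest path ending at each node
    prev = [-1] * n
    for j in range(n):
        for i in range(j):
            if matches[i][0] < matches[j][0] and matches[i][1] < matches[j][1]:
                if length[i] + 1 > length[j]:
                    length[j], prev[j] = length[i] + 1, i
    # Recover the best path by backtracking from the longest chain's end.
    j = max(range(n), key=length.__getitem__) if n else -1
    path = []
    while j != -1:
        path.append(matches[j])
        j = prev[j]
    return path[::-1]
```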

10.
Pattern Recognition, 2014, 47(2): 568-577
Face recognition is one of the most extensively studied topics in image analysis because of its wide range of possible applications such as in surveillance, access control, content-based video search, human–computer interaction, electronic advertisement and more. Face identification is a one-to-n matching problem where a captured face is compared to n samples in a database. In this work we propose two new methods for face identification. The first one combines entropy-like weighted Gabor features with the local normalization of Gabor features. The second fuses the entropy-like weighted Gabor features at the score level with the local binary pattern (LBP) applied to the magnitude (LGBP) and phase (LGXP) components of the Gabor features. We used the FERET, AR, and FRGC 2.0 databases to test and compare our results with those previously published. Results on these databases show significant improvement relative to previously published results, reaching the best performance on the FERET and AR databases. Our methods also showed significant robustness to slight pose variations. We tested the proposed methods assuming noisy eye detection to check their robustness to inexact face alignment. Results show that the proposed methods are robust to errors of up to 3 pixels in eye detection.
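The score-level fusion used in the second method can be pictured as a weighted sum of per-matcher scores after z-score normalization; the sketch below is a generic illustration with assumed equal weights, not the paper's tuned fusion rule.

```python
import numpy as np

def fuse_scores(score_lists, weights=None):
    """Weighted sum of z-score-normalised matcher scores.

    score_lists: sequence of 1-D arrays, one per matcher (e.g. weighted Gabor,
    LGBP, LGXP), each holding one similarity score per gallery identity.
    Returns the fused scores; argmax gives the identified subject.
    """
    scores = [np.asarray(s, dtype=float) for s in score_lists]
    if weights is None:
        weights = np.ones(len(scores)) / len(scores)   # equal weights (assumption)
    fused = np.zeros_like(scores[0])
    for w, s in zip(weights, scores):
        std = s.std()
        fused += w * ((s - s.mean()) / std if std > 0 else s * 0.0)
    return fused
```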

11.
We propose a video copy detection scheme that employs a transform-domain global video fingerprinting method. Video fingerprinting is performed by subspace learning based on nonnegative matrix factorization (NMF). It is shown that the binary video fingerprints extracted from the basis and gain matrices of the NMF representation enable us to efficiently represent the spatial and temporal content of a video segment, respectively. An extensive performance evaluation has been carried out on the query and reference dataset of the CBCD task of TRECVID 2011. Our results are compared with the average and the best performance reported for the task. NDCR and F1 rates are also reported in comparison to the performance achieved by the global methods designed by the TRECVID 2011 participants. Results demonstrate that the proposed method achieves higher correct detection rates with good localization capability for text/logo insertion, strong re-encoding, frame dropping, noise addition, gamma change, or mixtures of these; however, there is still room for improvement in detecting copies with picture-in-picture transformations. It is also concluded that the introduced binary fingerprinting scheme is superior to existing transform-based methods in terms of compactness.
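A hedged sketch of the fingerprinting idea using scikit-learn's NMF: a (pixels x frames) segment matrix is factorised, and the basis matrix W (spatial content) and gain matrix H (temporal content) are binarised against their medians. The component count, the median thresholding, and the names are illustrative assumptions, not the paper's exact design.

```python
import numpy as np
from sklearn.decomposition import NMF

def nmf_fingerprint(segment, n_components=8):
    """Binary spatial/temporal fingerprints of a video segment via NMF.

    segment: (n_pixels, n_frames) non-negative matrix, each column a
    vectorised (e.g. downsampled grey-level) frame.
    Returns (spatial_bits, temporal_bits) obtained by thresholding the NMF
    basis matrix W and gain matrix H at their per-component medians.
    """
    model = NMF(n_components=n_components, init="nndsvda",
                max_iter=400, random_state=0)
    W = model.fit_transform(segment)       # basis: spatial patterns
    H = model.components_                  # gains: temporal activations
    spatial_bits = (W > np.median(W, axis=0, keepdims=True)).astype(np.uint8)
    temporal_bits = (H > np.median(H, axis=1, keepdims=True)).astype(np.uint8)
    return spatial_bits, temporal_bits
```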

12.
13.
Evaluation of key frame-based retrieval techniques for video (cited 1 time: 0 self-citations, 1 by others)
We investigate the application of a variety of content-based image retrieval techniques to the problem of video retrieval. We generate large numbers of features for each of the key frames selected by a highly effective shot boundary detection algorithm to facilitate a query by example type search. The retrieval performance of two learning methods, boosting and k-nearest neighbours, is compared against a vector space model. We carry out a novel and extensive evaluation to demonstrate and compare the usefulness of these algorithms for video retrieval tasks using a carefully created test collection of over 6000 still images, where performance is measured against relevance judgements based on human image annotations. Three types of experiment are carried out: classification tasks, category searches (both related to automated annotation and summarisation of video material) and real world searches (for navigation and entry point finding). We also show graphical results of real video search tasks using the algorithms, which have not previously been applied to video material in this way.
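A minimal query-by-example sketch in the vector-space-model spirit, ranking keyframe feature vectors by cosine similarity to a query vector; the features and learning methods evaluated in the paper are not reproduced here.

```python
import numpy as np

def rank_keyframes(query_vec, keyframe_matrix, top_k=10):
    """Rank keyframes by cosine similarity to a query-by-example vector.

    keyframe_matrix: (n_keyframes, n_features) array of keyframe features.
    Returns the indices of the top_k most similar keyframes.
    """
    q = np.asarray(query_vec, dtype=float)
    X = np.asarray(keyframe_matrix, dtype=float)
    sims = X @ q / (np.linalg.norm(X, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims)[:top_k]
```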

14.
马苗, 王伯龙, 吴琦, 武杰, 郭敏. 《软件学报》 (Journal of Software), 2019, 30(4): 867-883
As an interdisciplinary research topic spanning computer vision, multimedia, artificial intelligence, and natural language processing, visual scene description aims to automatically generate one or more sentences describing the visual scene presented in an image or video. The richness of visual scene content and the diversity of natural language expression make it a challenging task. This paper surveys existing visual scene description methods and their evaluation. It first discusses the definition of visual scene description, its research tasks, and the classification of methods, and briefly analyzes its relationship to related techniques such as multimodal retrieval, cross-modal learning, scene classification, and visual relationship detection. It then discusses the main methods, models, and research progress by category, and summarizes the growing number of benchmark datasets. Next, it reviews the main metrics for objectively evaluating visual scene description results and the open problems and challenges the technology faces, and finally discusses future application prospects.

15.
Mining semantically related video content based on a vector space model (cited 1 time: 0 self-citations, 1 by others)
Mining and analyzing the semantically related content hidden in massive video databases is a difficult problem for video summarization methods. This paper proposes a vector-space-model-based method for mining semantically related video content: news video is preprocessed and converted into a dataset of vectors, a topic keyframe extraction algorithm mines the clustered video content, keyframes carrying information unique to a scene are retained while redundant content is removed, and the retained topic keyframes are arranged in their original temporal order to generate the video summary. Experimental results show that news video summaries generated with this algorithm achieve good compression ratios and content coverage.
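The topic-keyframe extraction idea can be sketched as clustering frame vectors and keeping the frame nearest each cluster centre, in the original temporal order. The use of k-means and the cluster count are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans

def topic_keyframes(frame_vectors, n_topics=5):
    """Pick one representative keyframe per cluster and keep temporal order.

    frame_vectors: (n_frames, n_features) array, one vector per frame in
    temporal order. Frames are clustered, the frame nearest each cluster
    centre is kept, and the survivors are returned in their original order,
    approximating a redundancy-free summary.
    """
    X = np.asarray(frame_vectors, dtype=float)
    km = KMeans(n_clusters=n_topics, n_init=10, random_state=0).fit(X)
    picks = []
    for c in range(n_topics):
        members = np.flatnonzero(km.labels_ == c)
        if members.size:
            centre = km.cluster_centers_[c]
            nearest = members[np.argmin(np.linalg.norm(X[members] - centre, axis=1))]
            picks.append(nearest)
    return sorted(picks)        # original temporal order
```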

16.
Relevance ranking in georeferenced video search (cited 1 time: 0 self-citations, 1 by others)
The rapid adoption and deployment of ubiquitous video cameras has led to the collection of voluminous amounts of media data. However, indexing and searching of large video databases remain a very challenging task. Recently, some recorded video data are automatically annotated with meta-data collected from various sensors such as Global Positioning System (GPS) and compass devices. In our earlier work, we proposed the notion of a viewable scene model derived from the fusion of location and direction sensor information with a video stream. Such georeferenced media streams are useful in many applications and, very importantly, they can effectively be searched via their meta-data on a large scale. Consequently, search by geo-properties complements traditional content-based retrieval methods. The result of a georeferenced video query will in general consist of a number of video segments that satisfy the query conditions, but with more or less relevance. For example, a building of interest may appear in a video segment, but may only be visible in a corner. Therefore, an essential and integral part of a video query is the ranking of the result set according to the relevance of each clip. An effective result ranking is even more important for video than it is for text search, since the browsing of results can only be achieved by viewing each clip, which is very time consuming. In this study, we investigate and present three ranking algorithms that use spatial and temporal properties of georeferenced videos to effectively rank search results. To allow our techniques to scale to large video databases, we further introduce a histogram-based approach that allows fast online computations. An experimental evaluation demonstrates the utility of the proposed methods.
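The histogram-based ranking idea can be pictured as scoring each segment by the fraction of query-region grid cells its viewable scenes cover; the grid representation and the overlap score below are illustrative assumptions, not the paper's exact ranking functions.

```python
import numpy as np

def rank_segments(segment_grids, query_cells):
    """Rank video segments by spatial overlap with a query region.

    segment_grids: list of 2-D boolean arrays, each marking the grid cells
    covered by a segment's viewable scenes (a coarse coverage histogram).
    query_cells: 2-D boolean array marking the query region on the same grid.
    Relevance = fraction of query cells the segment covers; returns segment
    indices sorted from most to least relevant.
    """
    query = np.asarray(query_cells, dtype=bool)
    total = max(query.sum(), 1)
    scores = [np.logical_and(np.asarray(g, dtype=bool), query).sum() / total
              for g in segment_grids]
    return list(np.argsort(scores)[::-1])
```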

17.
In the News     
This paper discusses how some artificial intelligence (AI) researchers and search experts are using AI methods to try to improve the accuracy of video search results. One example is a University of Oxford project in which researchers use statistical machine learning, specifically computer vision methods for face detection and facial feature localization, to provide automatic annotation of video with information about all the content of the video. Another example is the video search engine from Blinkx that objectively analyzes video content using speech recognition and matches the spoken words to context gleaned from a massive database. Finally, researchers at Dartmouth University are working on a technology that shows whether images or video clips have been doctored. This technique uses support vector machines to differentiate computer-generated images from photographic images. The paper goes on to discuss computer Go programs. Go is an ancient Asian board game which has become a challenge for AI researchers around the world. Go is resistant to Deep Blue's brute-force search of the game tree; the number of possible moves is too large. This inspires researchers to develop hybrid methods combining different methods and algorithms.

18.
This paper describes an algorithm for the reduction of computational complexity in phonetic search KeyWord Spotting (KWS). This reduction is particularly important when searching for keywords within very large speech databases and aiming for rapid response time. The suggested algorithm consists of an anchor-based phoneme search that reduces the search space by generating hypotheses only around phonemes recognized with high reliability. Three databases have been used for the evaluation: IBM Voicemail I and Voicemail II, consisting of long spontaneous utterances, and the Wall Street Journal portion of the MACROPHONE database, consisting of read-speech utterances. The results indicated a significant reduction of nearly 90% in the computational complexity of the search while improving the false alarm rate, with only a small decrease in the detection rate in both databases. Search-space reduction, as well as performance gain or loss, can be controlled according to user preferences via the suggested algorithm's parameters and thresholds.
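The anchor-based idea can be pictured with a small sketch that opens hypothesis windows only around reliably recognised phones belonging to the keyword. The lattice format, confidence threshold, and window size are illustrative assumptions, not the paper's exact parameters.

```python
def anchor_hypotheses(phoneme_lattice, keyword_phones, min_confidence=0.9, window=10):
    """Generate keyword hypothesis regions only around reliable anchor phones.

    phoneme_lattice: list of (time_index, phone, confidence) tuples from a
    phoneme recogniser. Frames whose phone belongs to the keyword and whose
    confidence exceeds `min_confidence` become anchors; a window of
    +/- `window` time indices is opened around each anchor, shrinking the
    search space compared with scanning every position.
    """
    keyword_set = set(keyword_phones)
    hypotheses = []
    for t, phone, conf in phoneme_lattice:
        if phone in keyword_set and conf >= min_confidence:
            hypotheses.append((max(0, t - window), t + window))
    # Merge overlapping windows so each region is scored only once.
    hypotheses.sort()
    merged = []
    for start, end in hypotheses:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```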

19.
20.
For the video copy detection problem, a video hashing method based on Laplacian Eigenmaps (LE) is proposed. The method uses video tomography and uniformly distributed random vectors to perform shot segmentation and keyframe extraction, takes high-order cumulants as the video's features in a high-dimensional space, and applies LE for dimensionality reduction to obtain the video's trajectory in a three-dimensional space; the norms of the points in this 3D space are then used to construct the video hash for copy detection. Experimental results show that the method has good robustness and discriminability.
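A hedged sketch of the LE step using scikit-learn's SpectralEmbedding (a standard Laplacian Eigenmaps implementation): per-keyframe features are embedded into 3-D and each point's norm is quantised into a hash symbol. The quantisation rule, the neighbour-graph affinity, and the names are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

def le_video_hash(keyframe_features, n_levels=16):
    """3-D Laplacian Eigenmaps embedding of keyframe features, hashed by norm.

    keyframe_features: (n_keyframes, n_features) array (e.g. high-order
    cumulant features). The embedding gives the video's trajectory in 3-D;
    each point's Euclidean norm is quantised to one of `n_levels` levels to
    form the hash sequence.
    """
    X = np.asarray(keyframe_features, dtype=float)
    embedding = SpectralEmbedding(n_components=3,
                                  affinity="nearest_neighbors",
                                  random_state=0).fit_transform(X)
    norms = np.linalg.norm(embedding, axis=1)
    span = norms.max() - norms.min() + 1e-12
    return np.floor((norms - norms.min()) / span * (n_levels - 1)).astype(int)
```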
