Similar Documents
18 similar documents found.
1.
To address the problem of clustering shots into higher-level scenes, a semantics-based scene segmentation algorithm is proposed. The algorithm first segments the video into shots and extracts a keyframe for each shot. It then computes the color histogram and the MPEG-7 edge histogram of each keyframe to form keyframe features. Using the color and texture features of shot keyframes, seven SVM-based classifiers corresponding to different semantic concepts are trained with support vector machines (SVM); these classifiers then label the shot keyframes of the video to be segmented, yielding each keyframe's semantics. A semantic concept vector is formed from the concepts a keyframe contains, and scenes are finally obtained by clustering shot keyframes on these semantic concept vectors. In addition, to extract scene keyframes, a shot selection function is constructed, and scene keyframes are selected according to its value. Experimental results show that, compared with Hanjalic's method, the proposed scene segmentation algorithm improves precision and recall by 34.7% and 9.1%, respectively.
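As a minimal sketch of this pipeline (not the authors' exact implementation: an HSV color histogram stands in for the color plus MPEG-7 edge-histogram features, and all training data below is a placeholder), one SVM per semantic concept yields a keyframe's semantic concept vector, which is then clustered into scenes:

```python
import cv2
import numpy as np
from sklearn.svm import SVC
from sklearn.cluster import KMeans

def color_histogram(frame, bins=(8, 8, 8)):
    """HSV color histogram of one keyframe, flattened and normalized."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, bins, [0, 180, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

# Placeholder training data: features of labeled keyframes for 7 concepts.
rng = np.random.default_rng(0)
X_train = rng.random((70, 512))                      # hypothetical features
y_train = (np.arange(70) % 2)[:, None].repeat(7, 1)  # hypothetical 0/1 labels

concept_svms = [SVC(probability=True).fit(X_train, y_train[:, c])
                for c in range(7)]

def semantic_vector(feat):
    """Probabilities of the 7 semantic concepts for one keyframe feature."""
    return np.array([svm.predict_proba([feat])[0, 1] for svm in concept_svms])

# Scenes = clusters of shots whose keyframes have similar concept vectors.
keyframe_feats = rng.random((30, 512))               # hypothetical keyframes
sem = np.array([semantic_vector(f) for f in keyframe_feats])
scene_labels = KMeans(n_clusters=5, n_init=10).fit_predict(sem)
```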

2.
A video scene segmentation method is proposed that finds frequently occurring shot sets in a video sequence based on global scene features and then locates scene boundaries precisely using local semantic features. First, high-precision shot segmentation is performed on the video and representative shot keyframes are selected. Global scene features and local features are then extracted from each shot keyframe, and each keyframe is semantically annotated with visual words obtained by clustering the local features. Next, inter-shot correlations based on the global scene features are computed and, guided by the concept and characteristics of a video scene, locally frequent sets of highly correlated shots are sought in the keyframe sequence to roughly locate scene positions. Finally, the semantic annotations of the shot keyframes are used to locate scene boundaries precisely. Experiments show that the method detects and locates most video scenes accurately and effectively.

3.
Video semantic concept detection is the prerequisite for bridging the semantic gap and realizing semantics-based video retrieval. This paper proposes a video semantic concept detection method based on evidence theory. First, block color moments, wavelet texture features, and edge direction histogram features are extracted from shot keyframes. Support vector machines (SVM) are then trained on the three feature sets separately to build three classifier models. Next, the generalization error of each SVM model is analyzed, and a discount-coefficient method corrects the classification outputs of the different SVM models. Finally, an evidence-combination formula fuses the corrected outputs, and the fused result is taken as the final concept detection result. Experimental results show that the new method improves concept detection accuracy and outperforms the traditional linear classifier fusion method.
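The evidence-fusion step can be illustrated with a small sketch under simple assumptions: each feature-specific SVM outputs a probability that a concept is present, which is discounted by a per-model coefficient and then combined with Dempster's rule over the frame {present, absent}. All numeric values below are illustrative, not taken from the paper:

```python
import numpy as np

def discount(p_present, alpha):
    """Discounted mass function; alpha is the trust in the model (0..1).
    Returns masses (m_present, m_absent, m_uncertain)."""
    return alpha * p_present, alpha * (1.0 - p_present), 1.0 - alpha

def dempster(m1, m2):
    """Combine two mass functions over {present, absent, uncertain}."""
    p1, a1, u1 = m1
    p2, a2, u2 = m2
    k = 1.0 - (p1 * a2 + a1 * p2)          # 1 - conflict
    p = (p1 * p2 + p1 * u2 + u1 * p2) / k
    a = (a1 * a2 + a1 * u2 + u1 * a2) / k
    return p, a, 1.0 - p - a

# Hypothetical SVM outputs for one concept from the three feature types.
svm_probs = [0.8, 0.6, 0.7]   # color moments, wavelet texture, edge hist.
alphas = [0.9, 0.7, 0.8]      # discount coefficients from validation error

fused = discount(svm_probs[0], alphas[0])
for p, a in zip(svm_probs[1:], alphas[1:]):
    fused = dempster(fused, discount(p, a))

print("fused belief that the concept is present:", fused[0])
```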

4.
In content-based video retrieval multimedia systems, shots must be segmented and keyframes extracted, and static images are needed to represent video content and to analyze its characteristics. Adjacent frames in a video sequence are generally similar and continuous, which is the common theoretical basis of both shot segmentation and keyframe extraction. The keyframe extraction system constructed in this paper extracts keyframes directly without prior shot segmentation; it requires only I-frame information and its frequency-domain DC components, achieving a minimal degree of decoding. For keyframe determination, after analyzing the characteristics and development directions of current shot segmentation techniques, a mass-point equivalence method and a macroblock-dissimilarity-based method are proposed.

5.
Research on an Improved Video Keyframe Extraction Algorithm (cited by 2: 0 self-citations, 2 by others)
Shot segmentation and keyframe extraction are core problems in content-based video retrieval. An improved keyframe extraction algorithm is proposed as a foundation for video retrieval; its shot segmentation stage combines an improved histogram method with a pixel-based method. First, the video is segmented into shots according to content using an improved histogram method that combines histogram intersection with non-uniform block weighting. Then a pixel-based frame-difference method re-examines the detected shots to refine the detection results. Finally, the image entropy of every frame within each shot is computed in the HSV color space to determine the keyframe sequence. Experimental results show that the keyframes obtained by the improved algorithm are compact in structure and evenly distributed.
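A minimal sketch of the entropy-based keyframe step, assuming the hue channel of HSV is histogrammed and the maximum-entropy frame within a shot is taken as its keyframe (the shot boundaries themselves are assumed given):

```python
import cv2
import numpy as np

def frame_entropy(frame, bins=256):
    """Shannon entropy of the hue channel's histogram (OpenCV hue: 0..179)."""
    hue = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)[:, :, 0]
    hist, _ = np.histogram(hue, bins=bins, range=(0, 180))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def keyframe_of_shot(frames):
    """Return the index of the maximum-entropy frame within one shot."""
    return int(np.argmax([frame_entropy(f) for f in frames]))
```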

6.
A Keyframe Extraction Method Combining Mutual Information and Fuzzy Clustering (cited by 1: 0 self-citations, 1 by others)
A keyframe is an image frame that characterizes a shot and usually reflects its main content; keyframe extraction is therefore fundamental to video analysis and content-based video retrieval. This paper proposes a keyframe extraction method that combines mutual information and fuzzy clustering. On one hand, shot detection over video segments via a mutual-information measure preserves the video's temporal order and dynamic information; on the other hand, fuzzy clustering lets the keyframes within a shot reflect the shot's main content well. Finally, a keyframe extraction system for MPEG-4 video is built. Experiments show that the keyframes it extracts represent video content well and facilitate video analysis and retrieval.
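The inter-frame mutual information used for shot detection can be sketched as follows, computing MI from the joint gray-level histogram of two consecutive frames; the boundary threshold below is illustrative, not the paper's value:

```python
import cv2
import numpy as np

def mutual_information(f1, f2, bins=64):
    """Mutual information between gray-level distributions of two frames."""
    g1 = cv2.cvtColor(f1, cv2.COLOR_BGR2GRAY).ravel()
    g2 = cv2.cvtColor(f2, cv2.COLOR_BGR2GRAY).ravel()
    joint, _, _ = np.histogram2d(g1, g2, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)     # marginal of frame 1
    py = pxy.sum(axis=0, keepdims=True)     # marginal of frame 2
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz]))

def shot_boundaries(frames, threshold=0.5):
    """Indices i where MI(frame[i-1], frame[i]) drops below the threshold."""
    return [i for i in range(1, len(frames))
            if mutual_information(frames[i - 1], frames[i]) < threshold]
```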

7.
Keyframe Extraction for Content-Based Video Retrieval (cited by 3: 0 self-citations, 3 by others)
Keyframe extraction is an important technique in content-based video retrieval. Building on previous work, this paper proposes a keyframe extraction method based on the mutual information between video frames: keyframes are selected from the variation of the mutual information between the features of two consecutive frames. The method is compared with keyframe extraction via video clustering. Experimental results show that the extracted keyframes represent shot content well and that extraction is faster than with the clustering-based method.

8.
A Method for Extracting Keyframes from MPEG Compressed Video Streams (cited by 15: 0 self-citations, 15 by others)
In content-based video retrieval multimedia systems, shot segmentation and keyframe extraction are required, and static images are needed to represent video content and to analyze its characteristics. Based on the common theoretical foundation of shot segmentation and keyframe extraction, namely that adjacent frames in a video sequence are generally similar and continuous, a keyframe extraction system is constructed that extracts keyframes directly without prior shot segmentation, requiring only I-frame information and its frequency-domain DC components, and thus achieving a minimal degree of decoding. For keyframe determination, after analyzing the characteristics and development directions of current shot segmentation techniques, a mass-point equivalence method and a macroblock-dissimilarity-based method are proposed.

9.
钟忺, 杨光, 卢炎生. 《计算机科学》 (Computer Science), 2016, 43(6): 289-293
With the development of multimedia technology, multimedia information in work and daily life is increasingly abundant, and retrieving useful information quickly and effectively from massive video collections has become an increasingly pressing problem. To address it, a keyframe extraction method based on dual-threshold sliding-window sub-shot segmentation and complete connected graphs is proposed. The method uses a dual-threshold shot segmentation algorithm, setting a dual-threshold sliding window to detect both abrupt and gradual shot boundaries and thereby partition the video into shots. A sliding-window sub-shot segmentation algorithm then places a sliding window over the frame sequence and further divides each shot into sub-shots using frame differences within the window. Finally, a keyframe extraction algorithm based on the sub-shot segmentation extracts keyframes by processing a complete connected graph whose vertices are frames and whose edge weights are frame differences. Experimental results show that, compared with other methods, the proposed method achieves higher average precision with a lower average number of keyframes, extracting video keyframes well.
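A minimal sketch of the dual-threshold (twin-comparison) idea, assuming a precomputed sequence of frame-difference values: a difference above the high threshold marks an abrupt cut, while an accumulated run of medium differences marks a gradual transition. The thresholds are illustrative:

```python
def twin_threshold_cuts(diffs, t_low=0.2, t_high=0.6):
    """diffs[i] = distance between frame i and frame i+1.
    Returns indices of detected shot boundaries."""
    cuts, acc, start = [], 0.0, None
    for i, d in enumerate(diffs):
        if d >= t_high:                  # abrupt cut
            cuts.append(i + 1)
            acc, start = 0.0, None
        elif d >= t_low:                 # candidate gradual transition
            if start is None:
                start, acc = i, 0.0
            acc += d
            if acc >= t_high:            # accumulated change large enough
                cuts.append(i + 1)
                acc, start = 0.0, None
        else:
            acc, start = 0.0, None       # change too small; reset

    return cuts

diffs = [0.05, 0.1, 0.7, 0.05, 0.3, 0.25, 0.3, 0.02]   # toy frame differences
print(twin_threshold_cuts(diffs))                      # -> [3, 7]
```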

10.
To narrow the semantic gap, the application of support vector machines (SVM) to semantic video retrieval is explored. Multivariate support vector regression (SVR) is used for automatic semantic annotation, and SVM classification is applied in the relevance feedback of keyframe-level semantic retrieval, improving the traditional SVM feedback method. On one hand, sample sets are memorized and accumulated, negative-example selection is optimized, and the numbers of positive and negative samples are balanced so that the training set grows dynamically in a balanced state; on the other hand, the SVM model of each satisfactory query is saved so that its feedback information can be reused, establishing a long-term memory mechanism for SVM feedback. Experimental results show that, compared with content-based SVM feedback retrieval, the improved SVM-feedback semantic keyframe retrieval achieves higher accuracy and retrieval efficiency.

11.
A number of researchers have been building high-level semantic concept detectors, such as outdoors, face, and building, to help with semantic video retrieval. Our goal is to examine how many concepts would be needed and how they should be selected and used. Simulating the performance of video retrieval under different assumptions about concept detection accuracy, we find that good retrieval can be achieved even when detection accuracy is low, provided sufficiently many concepts are combined. We also derive suggestions regarding the types of concepts that would be most helpful for a large concept lexicon. Since our user study finds that people cannot predict which concepts will help their query, we also suggest ways to find the best concepts to use. Ultimately, this paper concludes that "concept-based" video retrieval with fewer than 5000 concepts, detected with a minimal accuracy of 10% mean average precision, is likely to provide high-accuracy results in broadcast news retrieval.
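A toy simulation in the spirit of this study (all numbers are illustrative, not the paper's experimental setup): many weak, noisy concept detectors are naively fused, and retrieval quality is read off as precision at rank k:

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_concepts, k = 1000, 200, 50
relevant = rng.random(n_items) < 0.05           # ground-truth relevance

# Each concept score correlates only weakly with relevance (noisy detector):
# relevant items get a small +0.2 shift on top of uniform noise.
scores = relevant[:, None] * 0.2 + rng.random((n_items, n_concepts))

combined = scores.mean(axis=1)                  # naive fusion of detectors
top_k = np.argsort(combined)[::-1][:k]
print("precision@k:", relevant[top_k].mean())   # high despite weak detectors
```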

12.
Traditional video retrieval mostly relies on keywords and struggles to achieve precision and recall that satisfy users. An ontology-based video retrieval technique is therefore proposed. With the help of a domain ontology, its basic concepts are used as keywords to fetch sample image groups online through an Internet image search engine; SIFT features are extracted to build an image feature dictionary, image feature histograms are computed and their similarities measured, and the video is thereby annotated automatically to initialize the video retrieval database. Meanwhile, the domain ontology is used to semantically expand the keywords extracted from the user's query input, and the retrieval results for the expanded concept set are returned to the user, realizing ontology-based video retrieval. Finally, the algorithm is implemented and analyzed with examples, demonstrating its feasibility and effectiveness.
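The visual-dictionary step can be sketched under stated assumptions: OpenCV's SIFT provides local descriptors, k-means over the descriptors of the downloaded sample images forms the dictionary, and each image becomes a word-frequency histogram compared by cosine similarity. The image paths are hypothetical:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

sift = cv2.SIFT_create()

def descriptors(path):
    """128-d SIFT descriptors of one image (empty array if none found)."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        return np.empty((0, 128), np.float32)
    _, desc = sift.detectAndCompute(img, None)
    return desc if desc is not None else np.empty((0, 128), np.float32)

sample_paths = ["car_001.jpg", "car_002.jpg"]       # hypothetical samples
all_desc = np.vstack([descriptors(p) for p in sample_paths])
dictionary = KMeans(n_clusters=100, n_init=4).fit(all_desc)

def bow_histogram(path):
    """Word-frequency histogram of one image against the dictionary."""
    words = dictionary.predict(descriptors(path))
    hist = np.bincount(words, minlength=100).astype(float)
    return hist / (hist.sum() or 1.0)

def cosine(h1, h2):
    """Cosine similarity between two bag-of-words histograms."""
    return float(h1 @ h2 / (np.linalg.norm(h1) * np.linalg.norm(h2) + 1e-12))
```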

13.
Automatic video annotation aims to bridge the semantic gap and facilitate concept-based video retrieval by detecting high-level concepts from video data. Recently, utilizing context information has emerged as an important direction in this domain. In this paper, we present a novel video annotation refinement approach that utilizes extrinsic semantic context extracted from video subtitles and intrinsic context among candidate annotation concepts. The extrinsic semantic context is formed by identifying a set of key terms from video subtitles. The semantic similarity between those key terms and the candidate annotation concepts is then exploited to refine initial annotation results, whereas most existing approaches use textual information only heuristically. Similarity measurements including Google distance and WordNet distance have been investigated for this refinement purpose, which differs from approaches that derive semantic relationships among concepts from given training datasets. Visualness is also utilized to discriminate individual terms for further refinement. In addition, the Random Walk with Restarts (RWR) technique is employed to perform final refinement of the annotation results by exploring the inter-relationships among annotation concepts. Comprehensive experiments on the TRECVID 2005 dataset demonstrate the effectiveness of the proposed annotation approach and investigate the impact of various factors.
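A minimal sketch of Random Walk with Restarts over a concept graph, where W is a hypothetical concept-affinity matrix and the restart vector carries the initial annotation scores to be refined; c is the restart probability:

```python
import numpy as np

def rwr(W, restart, c=0.3, tol=1e-6, max_iter=100):
    """Steady-state scores of a random walk with restart probability c."""
    P = W / W.sum(axis=0, keepdims=True)     # column-normalized transitions
    r = restart / restart.sum()
    x = r.copy()
    for _ in range(max_iter):                # power iteration
        x_new = (1 - c) * P @ x + c * r
        if np.abs(x_new - x).sum() < tol:
            break
        x = x_new
    return x

# Hypothetical 4-concept example: initial detector scores refined by RWR.
W = np.array([[0, 3, 1, 0],
              [3, 0, 2, 1],
              [1, 2, 0, 2],
              [0, 1, 2, 0]], dtype=float)    # toy concept affinities
scores = np.array([0.7, 0.1, 0.5, 0.2])      # toy initial annotation scores
print(rwr(W, scores))
```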

14.
By introducing concept detection results into the retrieval process, concept-based video retrieval (CBVR) has been successfully used for semantic content-based video retrieval applications. However, how to select and fuse the appropriate concepts for a specific query remains an important but difficult issue. In this paper, we propose a novel and effective concept selection method, named graph-based multi-space semantic correlation propagation (GMSSCP), to explore the relationship between the user query and concepts for video retrieval. Compared with traditional methods, GMSSCP uses a manifold-ranking algorithm to collectively explore the multi-layered relationships between the query and concepts, and its expansion result is more robust to noise. In parallel, GMSSCP has a query-adapting property, which strengthens concept correlation propagation and selection with strong pertinence to the query cues. Furthermore, it can dynamically update the unified propagation graph by flexibly introducing multi-modal query cues as additional nodes, and it is effective not only for automatic retrieval but also for the interactive case. Encouraging experimental results on TRECVID datasets demonstrate the effectiveness of GMSSCP over state-of-the-art concept selection methods. Moreover, we also apply it to the interactive retrieval system VideoMap, gaining excellent performance and user experience.

15.
Describing visual content in videos by semantic concepts is an effective and realistic approach for video applications such as annotation, indexing, retrieval and ranking. In these applications, video data needs to be labelled with some known set of labels or concepts. Assigning semantic concepts manually is not feasible for the large volume of ever-growing video data, so automatic semantic concept detection in videos is an active research area. Recently, deep Convolutional Neural Networks (CNNs) have shown remarkable performance in computer vision tasks. In this paper, we present a novel approach for automatic semantic video concept detection using a deep CNN and a foreground-driven concept co-occurrence matrix (FDCCM), which keeps foreground-to-background concept co-occurrence values and is built by exploiting concept co-occurrence relationships in the pre-labelled TRECVID video dataset and in a collection of random images extracted from Google Images. To deal with the dataset imbalance problem, we extend this approach by fusing two asymmetrically trained deep CNNs and use the FDCCM to further improve concept detection. The performance of the proposed approach is compared with state-of-the-art approaches for video concept detection on the widely used TRECVID dataset and is found to be superior to existing approaches.
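One way to picture the co-occurrence refinement, under the simplifying assumption that refined scores are the averaged two-network CNN scores propagated through a row-normalized co-occurrence matrix (the FDCCM values below are toy data, not from the paper):

```python
import numpy as np

cnn_a = np.array([0.9, 0.1, 0.4])     # toy scores from the first network
cnn_b = np.array([0.7, 0.2, 0.5])     # toy scores from the second network
fdccm = np.array([[1.0, 0.6, 0.1],    # toy foreground/background
                  [0.6, 1.0, 0.3],    # concept co-occurrence matrix
                  [0.1, 0.3, 1.0]])

fused = (cnn_a + cnn_b) / 2.0                    # fuse the two CNNs
C = fdccm / fdccm.sum(axis=1, keepdims=True)     # row-normalize
refined = 0.5 * fused + 0.5 * C @ fused          # co-occurrence smoothing
print(refined)
```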

16.
Automatic semantic concept detection in video is important for effective content-based video retrieval and mining and has gained great attention recently. In this paper, we propose a general post-filtering framework that enhances the robustness and accuracy of semantic concept detection using association and temporal analysis for concept knowledge discovery. The co-occurrence of several semantic concepts can imply the presence of other concepts; we use association mining techniques to discover such inter-concept association relationships from annotations. With the discovered concept association rules, we propose a strategy that combines associated concept classifiers to improve detection accuracy. In addition, because video is often visually smooth and semantically coherent, detection results from temporally adjacent shots can inform the detection of the current shot. We propose temporal filter designs for inter-shot temporal dependency mining to further improve detection accuracy. Experiments on the TRECVID 2005 dataset show that our post-filtering framework is both efficient and effective in improving the accuracy of semantic concept detection in video. Furthermore, it is easy to integrate our framework with existing classifiers to boost their performance.
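The temporal-filtering idea can be sketched as a weighted moving average over the detection scores of adjacent shots; the weights below are illustrative, not the paper's learned filter:

```python
import numpy as np

def temporal_filter(scores, weights=(0.2, 0.6, 0.2)):
    """Weighted average over (previous, current, next) shot scores."""
    padded = np.pad(scores, 1, mode="edge")   # repeat ends at the boundaries
    w = np.array(weights)
    return np.array([padded[i:i + 3] @ w for i in range(len(scores))])

shot_scores = np.array([0.9, 0.2, 0.8, 0.7, 0.1])   # toy detector output
print(temporal_filter(shot_scores))
```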

17.
Multimedia content has been growing quickly, and video retrieval is regarded as one of the most prominent problems in multimedia research. To retrieve a desired video, users express their needs in terms of queries, which can concern object, motion, texture, color, audio, and so on. Low-level representations of video differ from the higher-level concepts a user associates with video, so queries based on semantics are more realistic and tangible for the end user. Comprehending the semantics of a query has opened new insight into video retrieval and bridging the semantic gap. The problem, however, is that video must be manually annotated to support queries expressed in terms of semantic concepts; annotating the semantic concepts that appear in video shots is a challenging and time-consuming task, and it is not possible to provide annotation for every concept in the real world. In this study, an integrated semantics-based approach to similarity computation is proposed to enhance retrieval effectiveness in concept-based video retrieval. The proposed method integrates knowledge-based and corpus-based semantic word similarity measures in order to retrieve video shots for concepts whose annotations are not available to the system. The TRECVID 2005 dataset is used for evaluation, and the results of the proposed method are compared against the individual knowledge-based and corpus-based semantic word similarity measures utilized in previous studies in the same domain. The superiority of the integrated similarity method is shown and evaluated in terms of Mean Average Precision (MAP).
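A sketch of integrating a knowledge-based and a corpus-based measure, assuming NLTK's WordNet path similarity for the former and taking the corpus-based score as a caller-supplied function; the combination weight beta is illustrative (requires nltk and its wordnet data):

```python
from nltk.corpus import wordnet as wn

def knowledge_sim(w1, w2):
    """Best WordNet path similarity over all synset pairs of the two words."""
    pairs = [(s1, s2) for s1 in wn.synsets(w1) for s2 in wn.synsets(w2)]
    sims = [s1.path_similarity(s2) for s1, s2 in pairs]
    sims = [s for s in sims if s is not None]   # cross-POS pairs give None
    return max(sims, default=0.0)

def integrated_sim(w1, w2, corpus_sim, beta=0.5):
    """Convex combination of knowledge- and corpus-based similarity."""
    return beta * knowledge_sim(w1, w2) + (1 - beta) * corpus_sim(w1, w2)

# Usage with a trivial stand-in for the corpus-based measure:
print(integrated_sim("car", "truck", lambda a, b: 0.4))
```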

18.
Concept detection is an important problem for efficient indexing and retrieval in large video archives. In this work, the KavTan System, which performs high-level semantic classification in one of the largest TV archives in Turkey, is presented. In this system, concept detection is performed by generalized visual and audio concept detection modules that are supported by video text detection, audio keyword spotting and specialized audio-visual semantic detection components. The performance of the presented framework was assessed objectively over a wide range of semantic concepts (5 high-level, 14 visual, 9 audio, 2 supplementary) using a significant amount of precisely labeled ground-truth data. The KavTan System achieves successful high-level concept detection performance in unconstrained TV broadcast by efficiently utilizing multimodal information that is systematically extracted from both the spatial and the temporal extent of multimedia data.
