1.

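吴培 周激流 《智能计算机与应用》2024,(1):70-75+84

As the volume of news video keeps growing, users need the news stories that interest them to be segmented accurately. This paper proposes a news video story segmentation algorithm based on multimodal similarity fusion. First, the video is split into an audio stream and a video stream, and candidate story-unit boundaries are obtained by selecting video cut points. Second, silent intervals are taken as candidate audio cut points, and anchor-shot frames and topic-caption frames as candidate video cut points. The candidate cut points yield basic news story units, whose content is analyzed for semantic similarity so that units are either merged or kept separate, giving the final news stories. Finally, face recognition, YOLOv5-based topic-caption detection, and similarity-based merging or separation of the basic units make the story boundaries more accurate. On 《新闻联播》 videos, the algorithm reaches a recall of 97.17% and a precision of 98.19%, which supports applications such as news video navigation and retrieval.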
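The merge-or-separate step and the reported recall and precision can be illustrated with a minimal sketch. It assumes each basic story unit is already available as a list of tokens (for example from captions or speech transcripts) and uses bag-of-words cosine similarity with a hypothetical threshold; the abstract does not say which semantic similarity measure or threshold the authors use, and the function names here are illustrative only.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def merge_units(units, threshold=0.5):
    """Merge each unit into the preceding story when their word overlap is high.

    units: list of token lists, one per basic story unit, in temporal order.
    threshold: assumed cut-off, not a value taken from the paper.
    Returns a list of stories, each a list of unit indices.
    """
    stories = []
    for i, tokens in enumerate(units):
        bag = Counter(tokens)
        if stories and cosine(stories[-1]["bag"], bag) >= threshold:
            stories[-1]["ids"].append(i)
            stories[-1]["bag"] += bag
        else:
            stories.append({"ids": [i], "bag": bag})
    return [s["ids"] for s in stories]

def precision_recall(detected, truth):
    """Precision and recall of detected boundaries (exact match) against ground truth."""
    hits = len(set(detected) & set(truth))
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall
```

Here recall is the share of true story boundaries that were found, and precision is the share of detected boundaries that are correct.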

2.

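A Method for Extracting Spatiotemporal Salient Units from Video Based on Fuzzy Information Granulation  Total citations: 1, self-citations: 0, other citations: 1

郎丛妍 须德 李兵 《电子学报》2007,35(10):2023-2028

This paper proposes a method for extracting spatiotemporal salient units from video based on fuzzy information granulation, providing an effective content representation for high-level applications such as video analysis and retrieval. First, a class-related feature granulation method is proposed. The resulting fuzzy granular features simplify the classification relationships and partly address the subjectivity of perception, so a simple classifier can effectively extract spatial regions of high visual saliency (called salient regions). Second, temporal-consistency analysis of the salient regions extracts sets of temporally continuous salient regions from the video sequence, which are defined as spatiotemporal salient units. The extracted units can serve as a fairly general semantic-level content representation. Experimental results verify the effectiveness of the method in both the temporal and the spatial domain.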

3.

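Research on Image Retrieval Combining Semantic and Color Features  Total citations: 2, self-citations: 2, other citations: 0

袁薇 高淼 《微电子学与计算机》2006,23(1):36-39,44

For image retrieval in multimedia search engine systems, this paper proposes using high-level semantic features and low-level color features of images together as a combined retrieval index, fusing an image's textual and visual information. It presents the architecture of an image retrieval system that combines semantic and color features, aiming to bridge the gap between low-level multimedia features and high-level semantics. Related algorithms are proposed on this basis so that image retrieval meets users' needs with better efficiency and precision.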

4.

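Video Retrieval Based on Motion Features

程照辉 毋立芳 刘健 《信号处理》2011,27(5):765-770

Motion features are an important kind of information for describing video content and are what distinguishes video from image data, so a comprehensive description of video content cannot do without them. This paper proposes a video retrieval scheme based on spatiotemporal motion features. The raw trajectory of a billiard ball is obtained from the discrete ball positions and then corrected to yield the complete trajectory. The sequence of trajectory points serves as the motion feature, and a step-by-step video segment retrieval method based on in-order matching of trajectory point sequences is proposed: first, the trajectory is divided into segments at collision points; next, the trajectory segments of the query video are resampled; finally, a trajectory model matching method performs the retrieval. Experimental results show that the scheme achieves a degree of semantic retrieval and retrieves snooker video content whose trajectories agree well with human subjective understanding.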
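The three retrieval steps (split at collision points, resample, match) can be sketched as follows. This is an illustration under assumptions: trajectories are lists of (x, y) points, resampling is uniform in arc length, and the mean point-to-point distance stands in for the paper's trajectory model matching, which the abstract does not specify. All function names are hypothetical.

```python
import math

def split_at(points, collision_idx):
    """Split a trajectory into segments at the given collision-point indices."""
    cuts = [0] + sorted(collision_idx) + [len(points) - 1]
    # Each collision point ends one segment and starts the next.
    return [points[a:b + 1] for a, b in zip(cuts, cuts[1:]) if b > a]

def resample(points, n):
    """Resample a polyline to n points evenly spaced along its arc length."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two points and n >= 2")
    # Cumulative arc length at each input point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out, j = [], 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the input segment that contains the target length.
        while j < len(points) - 2 and cum[j + 1] < target:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = (target - cum[j]) / seg if seg else 0.0
        (x0, y0), (x1, y1) = points[j], points[j + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def segment_distance(a, b, n=16):
    """Mean distance between two segments after resampling both to n points."""
    ra, rb = resample(a, n), resample(b, n)
    return sum(math.dist(p, q) for p, q in zip(ra, rb)) / n
```

Resampling gives every segment the same number of points, so segments tracked at different frame rates or speeds can be compared point by point in order.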

5.

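A Fast Retrieval System for H.264 Surveillance Video

李爱兴 于峰崎 《电视技术》2011,35(21):112-116

This paper introduces a retrieval system for surveillance video compressed in H.264 format. The system is object- and semantics-based and consists mainly of two modules, video analysis and video retrieval. The video analysis module decodes the H.264 compressed-domain video and performs video object extraction, segmentation, feature extraction, and object matching, storing the resulting video objects and feature information in an object feature database. In the video retrieval module, users query the feature database directly by entering semantic information or an example image, which avoids reprocessing the video. Experiments show that the proposed system retrieves video quickly and effectively.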

6.

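Key Technologies of Content-Based Video Semantic Analysis

张良 周长胜 《电子科技》2011,24(10):111-114

This paper analyzes the differences between video data and text data and the problems video data poses for video analysis and retrieval. Starting from current research hotspots in video content analysis, it analyzes and compares techniques for video semantic libraries, low-level video features relevant to video analysis, video object partitioning and recognition, and video information description and coding. It also proposes a framework and workflow for video semantic analysis.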

7.

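A Survey of Key Technologies for Semantic-Based Video Retrieval

孔英会 刘淑荣 张少明 范启跃 《电子科技》2012,25(8):150-153

With the emergence of large volumes of video, video content retrieval has become an important research direction in multimedia applications. Most existing video retrieval techniques rely on low-level features, which differ considerably from high-level semantic concepts and severely limit the practicality of video content retrieval systems. Because of this semantic gap between low-level features and high-level semantic concepts, extracting the semantic concepts of human thinking from video content is becoming the most challenging research topic in video content retrieval. This paper introduces the background of semantic video retrieval and the latest research at home and abroad, analyzes the strengths and weaknesses of existing methods, and surveys the current key technologies.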

8.

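Construction of a Video Metadata Database Based on Semantic Association

凌坚 蔡国炎 练益群 《电视技术》2011,35(18):67-69

Introducing semantic information into video metadata better supports semantic-based intelligent retrieval. Using the semantic information of keywords in video metadata, a semantic association network between concepts is built, and a method for constructing a video metadata database with semantic associations is proposed. Starting from word semantics, the method establishes associations between keywords and concepts and gives a relational-database-based storage method for the semantic association information, thereby supplying semantic information to search engines.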

9.

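A Genetic-Algorithm-Based Multi-Feature Selection Method for Traffic Video Events

孙佳瑶 詹永照 毛启容 王敏超 《微电子学与计算机》2013,(7)

To improve the accuracy and speed of traffic video event detection, a genetic-algorithm-based multi-feature selection method for traffic video is proposed. The method first extracts multiple kinds of features from traffic video to capture as much information about the various video events as possible, then fuses these features into a high-dimensional, redundant feature vector that characterizes an event. A genetic algorithm then optimizes and filters the features, and finally an SVM classifier is trained to obtain a low-dimensional optimal feature subset, which is applied to traffic event detection. Experimental results show that the method reduces the dimensionality of the extracted features while effectively improving the accuracy and speed of traffic video event detection.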
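A genetic search over feature subsets can be sketched roughly as below. This is a generic textbook-style GA (binary masks, elitist selection, one-point crossover, bit-flip mutation), not the authors' configuration. The `fitness` callback is a hypothetical stand-in for whatever score the SVM stage would supply, such as validation accuracy on the selected features, and all parameter values are assumptions.

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a binary feature mask that maximizes fitness(mask).

    n_features must be at least 2 so that a crossover point exists.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        elite = scored[: pop_size // 2]          # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each bit is flipped with probability p_mut.
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)
```

Because the better half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. A fitness function that penalizes the number of selected features would push the search toward the low-dimensional subsets the abstract describes.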

10.

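Research on Image Retrieval Based on Visual Perception  Total citations: 2, self-citations: 0, other citations: 2

张菁 沈兰荪 《电子学报》2008,36(3):494-499

A prominent problem in content-based image retrieval is the large gap between low-level image features and high-level semantics. Relevance feedback and region-of-interest detection are subjective and time-consuming when used to narrow this semantic gap. In response, the paper proposes that visual information is a new feature that objectively reflects high-level image semantics, and that retrieval based on visual information can effectively reduce the semantic gap. Building on a summary of research progress and implementation methods in visual perception, it outlines research directions for visual-perception-based image retrieval in four areas: region-of-interest detection, image segmentation, relevance feedback, and personalized retrieval.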

11.

Concept-oriented indexing of video databases: toward semantic sensitive retrieval and browsing  Total citations: 2, self-citations: 0, other citations: 2

Jianping Fan Hangzai Luo Elmagarmid A.K. 《IEEE transactions on image processing》2004,13(7):974-992

Digital video now plays an important role in medical education, health care, telemedicine and other medical applications. Several content-based video retrieval (CBVR) systems have been proposed in the past, but they still suffer from the following challenging problems: semantic gap, semantic video concept modeling, semantic video classification, and concept-oriented video database indexing and access. In this paper, we propose a novel framework to make some advances toward the final goal of solving these problems. Specifically, the framework includes: 1) a semantic-sensitive video content representation framework by using principal video shots to enhance the quality of features; 2) semantic video concept interpretation by using a flexible mixture model to bridge the semantic gap; 3) a novel semantic video-classifier training framework by integrating feature selection, parameter estimation, and model selection seamlessly in a single algorithm; and 4) a concept-oriented video database organization technique through a certain domain-dependent concept hierarchy to enable semantic-sensitive video retrieval and browsing.

12.

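A Video Retrieval Algorithm Based on Multi-Feature Fusion

侯严明 李菲菲 陈虬 《电子科技》2019,32(5):44-49

As video and other multimedia data grow exponentially, efficient and fast video retrieval algorithms are attracting increasing attention. Traditional image features such as color histograms and the scale-invariant feature transform (SIFT) do not perform well on retrieval speed and detection accuracy in video copy detection, so this paper proposes a multi-feature-fusion video retrieval method. The method applies a sliding-window temporal alignment algorithm to the spatiotemporal features of consecutive frame pairs, which narrows the search range and increases retrieval speed. It extracts gray-level sequence features, color correlogram features, and SIFT local features from keyframes, then combines the strengths of global and local features to improve detection accuracy. Experimental results show that the method achieves good video retrieval accuracy.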
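The idea of sliding-window temporal alignment can be shown with a small sketch. It assumes each frame has already been reduced to a single number (for instance a mean gray level or a frame-to-frame difference), which is a simplification of the spatiotemporal features the abstract mentions. The function name and the mean absolute difference are illustrative choices, not the authors' formulation.

```python
def best_alignment(query, reference):
    """Slide the query over the reference; return (offset, mean distance) of the best match.

    query, reference: sequences of per-frame scalar features (assumed representation).
    """
    m = len(query)
    if m == 0 or m > len(reference):
        raise ValueError("query must be non-empty and no longer than the reference")
    best = None
    for off in range(len(reference) - m + 1):
        window = reference[off:off + m]
        d = sum(abs(q - r) for q, r in zip(query, window)) / m
        if best is None or d < best[1]:
            best = (off, d)
    return best
```

Only the window with the smallest distance (or the few best windows) then needs the more expensive keyframe comparison, which is how an alignment step of this kind narrows the search range.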

13.

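An Adaptive Keyframe Extraction Algorithm Based on Dynamic Programming  Total citations: 2, self-citations: 2, other citations: 0

盛骁杰 杨小康 《电视技术》2009,33(4)

This paper proposes a new keyframe extraction algorithm for content-based video retrieval systems, modeling keyframe extraction as a global optimization problem that can be solved with dynamic programming. A binary frame-difference matrix is first built to represent the similarity measure between frames in a low-dimensional feature space; a dynamic programming algorithm then partitions the frame-difference matrix to extract the keyframes. The algorithm has low computational complexity, adapts to the video content, and preserves the temporal order of the keyframes. The number of keyframes can easily be adjusted as needed.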
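One way to read "partitioning the frame-difference matrix with dynamic programming" is sketched below. The cost function is an assumption, since the abstract does not give the objective: frames are split into k contiguous segments, and each segment is charged the total dissimilarity between its best representative frame and all of its frames. The segment costs are computed naively here, so this sketch is suited to short clips only and does not reflect the low complexity the paper claims.

```python
def extract_keyframes(dist, k):
    """Split frames 0..n-1 into k contiguous segments; return one keyframe per segment.

    dist[i][j] is a dissimilarity between frames i and j (0/1 for a binary matrix).
    The cost of a segment is the smallest total dissimilarity from one of its
    frames (the keyframe) to every frame in the segment (assumed objective).
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of frames")
    # seg[i][j] = (cost, keyframe) for the segment covering frames i..j inclusive.
    seg = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            seg[i][j] = min((sum(dist[r][t] for t in range(i, j + 1)), r)
                            for r in range(i, j + 1))
    inf = float("inf")
    # best[m][j] = minimal cost of covering frames 0..j with m segments.
    best = [[inf] * n for _ in range(k + 1)]
    back = [[-1] * n for _ in range(k + 1)]
    for j in range(n):
        best[1][j] = seg[0][j][0]
    for m in range(2, k + 1):
        for j in range(m - 1, n):
            for i in range(m - 1, j + 1):      # i = first frame of the last segment
                cost = best[m - 1][i - 1] + seg[i][j][0]
                if cost < best[m][j]:
                    best[m][j], back[m][j] = cost, i
    # Walk back through the stored split points to recover the keyframes.
    keyframes, j = [], n - 1
    for m in range(k, 0, -1):
        i = back[m][j] if m > 1 else 0
        keyframes.append(seg[i][j][1])
        j = i - 1
    return keyframes[::-1]
```

The keyframes come out in temporal order because segments are contiguous, and changing `k` adjusts how many keyframes are returned, matching the two properties the abstract highlights.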

14.

Similarity-based online feature selection in content-based image retrieval.  Total citations: 2, self-citations: 0, other citations: 2

Wei Jiang Guihua Er Qionghai Dai Jinwei Gu 《IEEE transactions on image processing》2006,15(3):702-712

Content-based image retrieval (CBIR) has become increasingly important in the last decade, and the gap between high-level semantic concepts and low-level visual features hinders further performance improvement. The problem of online feature selection is critical to really bridge this gap. In this paper, we investigate online feature selection in the relevance feedback learning process to improve the retrieval performance of the region-based image retrieval system. Our contributions are mainly in three areas. 1) A novel feature selection criterion is proposed, which is based on the psychological similarity between the positive and negative training sets. 2) An effective online feature selection algorithm is implemented in a boosting manner to select the most representative features for the current query concept and combine classifiers constructed over the selected features to retrieve images. 3) To apply the proposed feature selection method in region-based image retrieval systems, we propose a novel region-based representation to describe images in a uniform feature space with real-valued fuzzy features. Our system is suitable for online relevance feedback learning in CBIR by meeting the three requirements: learning with a small training set, the intrinsic asymmetry property of training samples, and the fast response requirement. Extensive experiments, including comparisons with many state-of-the-art methods, show the effectiveness of our algorithm in improving the retrieval performance and saving the processing time.

15.

A Human-Centered Multiple Instance Learning Framework for Semantic Video Retrieval

《IEEE transactions on systems, man and cybernetics. Part C, Applications and reviews》2009,39(2):228-233

This paper proposes a human-centered interactive framework for automatically mining and retrieving semantic events in videos. After preprocessing, the object trajectories and event models are fed into the core components of the framework for learning and retrieval. As trajectories are spatiotemporal in nature, the learning component is designed to analyze time series data. The human feedback to the retrieval results provides progressive guidance for the retrieval component in the framework. The retrieval results are in the form of video sequences instead of contained trajectories for user convenience. Thus, the trajectories are not directly labeled by the feedback as required by the training algorithm. A mapping between semantic video retrieval and multiple instance learning (MIL) is established in order to solve this problem. The effectiveness of the algorithm is demonstrated by experiments on real-life transportation surveillance videos.
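The mapping onto multiple instance learning can be pictured with a short sketch: each video sequence is a bag, and the trajectories it contains are its instances. The code uses the standard MIL assumption (a bag is positive if at least one instance is positive), which may differ from the exact formulation in the paper. The function names and the `score_fn` callback are hypothetical, and bags are assumed to be non-empty.

```python
def bag_score(instance_scores):
    """Standard MIL assumption: a bag is as relevant as its most relevant instance."""
    return max(instance_scores)

def rank_videos(bags, score_fn):
    """Rank video sequences (bags) by the best score among their trajectories."""
    scored = {vid: bag_score([score_fn(t) for t in trajs])
              for vid, trajs in bags.items()}
    return sorted(scored, key=scored.get, reverse=True)

def instance_labels(bags, feedback):
    """Propagate bag-level feedback to trajectories where it is unambiguous.

    feedback maps a video id to True (relevant) or False (not relevant).
    """
    labels = {}
    for vid, trajs in bags.items():
        for idx in range(len(trajs)):
            # A rejected video makes every trajectory negative; an accepted one
            # only says that at least one trajectory is positive (None = unknown).
            labels[(vid, idx)] = False if feedback.get(vid) is False else None
    return labels
```

The `None` labels are the reason an ordinary supervised learner cannot be trained on the feedback directly: user feedback on a sequence does not say which trajectory inside it was the relevant one.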

16.

Research on image feature extraction and retrieval algorithms based on convolutional neural network

《Journal of Visual Communication and Image Representation》2020

With the rapid development of mobile Internet and digital technology, people are more and more keen to share pictures on social networks, and the number of online pictures has exploded. How to retrieve similar images from large-scale image collections has always been a hot issue in the field of image retrieval, and the selection of image features largely affects the performance of image retrieval. A convolutional neural network (CNN), which contains more hidden layers, has a more complex network structure and a stronger ability of feature learning and expression compared with traditional feature extraction methods. By analyzing the disadvantage that global CNN features cannot effectively describe local details when they act on image retrieval tasks, a strategy of aggregating low-level CNN feature maps to generate local features is proposed. The high-level features of the CNN model pay more attention to semantic information, but the low-level features pay more attention to local details. Using the increasingly abstract characteristics of the CNN model from low to high layers, this paper presents a probabilistic semantic retrieval algorithm, proposes a probabilistic semantic hash retrieval method based on CNN, and designs a new end-to-end supervised learning framework, which can simultaneously learn semantic features and hash features to achieve fast image retrieval. Using the convolutional network, the error rate is reduced to 14.41% on this test set. On three open image libraries, namely Oxford, Holidays and ImageNet, the performance of traditional SIFT-based retrieval algorithms and other CNN-based image retrieval algorithms is compared and analyzed. The experimental results show that the proposed algorithm is superior to the other algorithms compared in terms of comprehensive retrieval effect and retrieval time.

17.

Semantic-based surveillance video retrieval.

Weiming Hu Dan Xie Zhouyu Fu Wenrong Zeng Steve Maybank 《IEEE transactions on image processing》2007,16(4):1168-1181

18.

A semantic framework for video genre classification and event analysis

Junyong You Guizhong Liu Andrew Perkis 《Signal Processing: Image Communication》2010,25(4):287-302

Semantic video analysis is a key issue in digital video applications, including video retrieval, annotation, and management. Most existing work on semantic video analysis is mainly focused on event detection for specific video genres, while the genre classification is treated as another independent issue. In this paper, we present a semantic framework for weakly supervised video genre classification and event analysis jointly by using probabilistic models for MPEG video streams. Several computable semantic features that can accurately reflect the event attributes are derived. Based on an intensive analysis on the connection between video genres and the contextual relationship among events, as well as the statistical characteristics of dominant event, a hidden Markov model (HMM) and naive Bayesian classifier (NBC) based analysis algorithm is proposed for video genre classification. Another Gaussian mixture model (GMM) is built to detect the contained events using the same semantic features, whilst an event adjustment strategy is proposed according to an analysis on the GMM structure and pre-definition of video events. Subsequently, a special event is recognized based on the detected events by another HMM. The simulative experiments on video genre classification and event analysis using a large number of video data sets demonstrate the promising performance of the proposed framework for semantic video analysis.

19.

Multiple instance deep learning for weakly-supervised visual object tracking

《Signal Processing: Image Communication》2020

Intelligently tracking objects with varied shapes, color, lighting conditions, and backgrounds is extremely useful in many HCI applications, such as human body motion capture, hand gesture recognition, and virtual reality (VR) games. However, accurately tracking different objects under uncontrolled environments is a tough challenge due to the possibly dynamic object parts, varied lighting conditions, and sophisticated backgrounds. In this work, we propose a novel semantically-aware object tracking framework, wherein the key is a weakly-supervised learning paradigm that optimally transfers the video-level semantic tags into various regions. More specifically, given a set of training video clips, each of which is associated with multiple video-level semantic tags, we first propose a weakly-supervised learning algorithm to transfer the semantic tags into various video regions. The key is a MIL (Zhong et al., 2020) [1]-based manifold embedding algorithm that maps the entire video regions into a semantic space, wherein the video-level semantic tags are well encoded. Afterward, for each video region, we use the semantic feature combined with the appearance feature as its representation. We designed a multi-view learning algorithm to optimally fuse the above two types of features. Based on the fused feature, we learn a probabilistic Gaussian mixture model to predict the target probability of each candidate window, where the window with the maximal probability is output as the tracking result. Comprehensive comparative results on a challenging pedestrian tracking task as well as the human hand gesture recognition have demonstrated the effectiveness of our method. Moreover, visualized tracking results have shown that non-rigid objects with moderate occlusions can be well localized by our method.

20.

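A Video Retrieval Algorithm Based on 3D Convolution and Hashing

陈汗青 李菲菲 陈虬 《电子科技》2022,35(4):35-39

Video retrieval differs most from other multimedia retrieval in the large amount of information a video carries, which makes similarity computation between videos expensive. In addition, video feature extraction often ignores the temporal correlation between frames, which leads to insufficient feature extraction and lower retrieval accuracy. To address this, the paper proposes a video retrieval method based on 3D convolution and hashing. The method builds an end-to-end framework that uses a 3D convolutional neural network to extract features from representative frames of a video, maps the video features into a low-dimensional Hamming space, and computes similarity there. Experiments on two video datasets show that the proposed method considerably improves accuracy over current state-of-the-art video retrieval algorithms.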
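Why similarity in Hamming space is cheap can be seen in a small sketch. It assumes the network has already produced a real-valued feature vector per video and binarizes it by sign, which is a common stand-in; in the paper the mapping to Hamming space is learned end to end, and the abstract gives no details of it. Function names and the example codes are illustrative.

```python
def to_code(features):
    """Binarize a real-valued feature vector by sign into an integer bit code."""
    code = 0
    for x in features:
        code = (code << 1) | (1 if x > 0 else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")

def search(query, database, top_k=3):
    """Return the ids of the top_k database codes closest to the query code.

    database maps a video id to its precomputed integer code.
    """
    q = to_code(query)
    ranked = sorted(database, key=lambda vid: hamming(q, database[vid]))
    return ranked[:top_k]
```

Each comparison is one XOR and a bit count on a short code, rather than a distance over a long floating-point vector, which is what makes large-scale similarity search affordable.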