Similar literature
 20 similar documents found; search took 31 ms
1.
The massive volume of web videos creates a pressing demand for grasping the major events efficiently. However, the distinct characteristics of web videos, such as the limited number of features, the noisy text information, and the unavoidable errors in near-duplicate keyframe (NDK) detection, make web video event mining a challenging task. In this paper, we propose a novel four-stage framework to improve the performance of web video event mining. Data preprocessing is the first stage. Multiple Correspondence Analysis (MCA) is then applied to explore the correlation between terms and classes, with the aim of bridging the gap between NDKs and high-level semantic concepts. Next, co-occurrence information is used to detect the similarity between NDKs and classes using the NDK-within-video information. Finally, both are integrated for web video event mining through negative NDK pruning and positive NDK enhancement. Moreover, both NDKs and terms with relatively low frequencies are treated as useful information in our experiments. Experimental results on large-scale web videos from YouTube demonstrate that the proposed framework outperforms several existing mining methods and obtains good results for web video event mining.

2.
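In recent years, as the volume of video data has kept growing, near-duplicate videos have proliferated and video data quality has become an increasingly prominent problem. Near-duplicate video cleaning can help improve the data quality of a video collection. However, little research currently addresses near-duplicate video cleaning; existing work focuses mainly on near-duplicate video retrieval and related areas. Although existing methods can identify near-duplicate videos effectively, they struggle to clean near-duplicate video data automatically while preserving data integrity, and thus to improve video data quality. To address this problem, a near-duplicate video cleaning method that combines the VGG-16 deep network with FD-means (feature distance-means) clustering is proposed. The method uses the MOG2 model and a median filtering algorithm to perform background segmentation and foreground denoising on the videos; extracts deep spatial features of the videos with the VGG-16 deep network model; and builds a new FD-means clustering algorithm model that updates the cluster centers using the near-duplicate video clusters produced at each iteration and finally deletes the near-duplicate videos in each cluster other than the center. Experimental results show that the method can effectively solve the near-duplicate video data cleaning problem and improve video data quality.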
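The final cleaning step of entry 2 (keep one representative per near-duplicate cluster, delete the rest) can be illustrated with a small sketch. This is not the FD-means algorithm itself, whose details the abstract does not give: the greedy grouping rule, the `threshold` parameter, and the function names are assumptions made for illustration, and the feature vectors stand in for whatever VGG-16 would produce.

```python
from math import dist


def _centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def clean_near_duplicates(features, threshold):
    """Return the ids of the videos to keep.

    features: {video_id: feature_vector}
    threshold: assumed maximum distance to a cluster center for a video
               to count as a near-duplicate of that cluster (hypothetical).
    """
    clusters = []  # each cluster is a list of video ids
    for vid, vec in features.items():
        for cluster in clusters:
            center = _centroid([features[v] for v in cluster])
            if dist(vec, center) <= threshold:
                cluster.append(vid)  # joining the cluster shifts its center
                break
        else:
            clusters.append([vid])  # no cluster is close enough: start a new one

    kept = []
    for cluster in clusters:
        center = _centroid([features[v] for v in cluster])
        # Keep only the member nearest to the center; the others are dropped.
        kept.append(min(cluster, key=lambda v: dist(features[v], center)))
    return kept
```

Keeping the member nearest to the center means each cluster still contributes one video, which is one simple way to remove duplicates without losing the underlying content entirely.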

3.
YouTube (owned by Google Inc.) is arguably among the most popular social media platforms, used by millions across the globe. It provides an ever-growing, unique and rich source of content which presents new opportunities and challenges for information discovery and analysis. It is pertinent to explore and understand a topic via YouTube content to discover interesting information about public opinions and sentiments. This paper presents an integrated framework to facilitate the acquisition, storage, management, processing, and visualization of relevant content with the objective of assisting in such analysis. It not only collects a significant portion of the content relevant to a given topic in a short time but also offers tools for visual exploratory analysis of (i) temporal evolution, (ii) the vocabulary network, (iii) authors' relative popularity and influence, (iv) categories, and (v) user communities and influencers. The utility and effectiveness of the framework are demonstrated through content analysis of a famous YouTube entertainment topic, "Gangnam Style".

4.
With the exponential growth of social media, there exist huge numbers of near-duplicate web videos, ranging from simple formatting changes to complex mixtures of different editing effects. In addition to the abundant video content, the social Web provides rich sets of context information associated with web videos, such as thumbnail images, time duration and so on. At the same time, the popularity of Web 2.0 demands timely responses to user queries. To balance speed and accuracy, in this paper we combine the contextual information from time duration, number of views, and thumbnail images with the content analysis derived from color and local points to achieve real-time near-duplicate elimination. The results of 24 popular queries retrieved from YouTube show that the proposed approach integrating content and context can achieve real-time novelty re-ranking of web videos with extremely high efficiency, where the majority of duplicates can be rapidly detected and removed from the top rankings. The speedup of the proposed approach can reach 164 times over the effective hierarchical method proposed in , with just a slight loss of performance.

5.
This paper presents a method for tracking topic evolution via salient keyword matching that takes semantic broadness into account, for Web video discovery. The proposed method enables users to understand the evolution of topics over time in order to discover Web videos in which they are interested. A framework that enables extraction and tracking of the hierarchical structure, which contains Web video groups with various degrees of semantic broadness, is newly derived as follows. Based on network analysis using multimodal features, i.e., features of video contents and metadata, our method extracts the hierarchical structure and the salient keywords that represent the contents of each Web video group. Moreover, salient keyword matching, which is newly developed by considering the salient keyword distribution, the semantic broadness of each Web video group and the initial topic relevance, is applied to each hierarchical structure obtained at different time stamps. Unlike methods in previous works, by considering the semantic broadness as well as the salient keyword distribution, our method can overcome the problem that the desired semantic broadness of topics differs from user to user. Also, the initial topic relevance enables correction of drift away from the initial topic chosen at the start of tracking. Consequently, it becomes feasible to track the evolution of topics over time to find Web videos in which the users are interested. Experimental results for real-world datasets containing YouTube videos verify the effectiveness of the proposed method.

6.
Emerging Internet services and applications attract a growing number of users to diverse video-related activities, such as video searching, video downloading, video sharing and so on. These everyday operations lead to an explosive growth of online video volume and inevitably give rise to massive amounts of near-duplicate content. Near-duplicate video retrieval (NDVR) has always been a hot topic. The primary purpose of this paper is to present a comprehensive survey and an updated review of the advances in large-scale NDVR to supply guidance for researchers. Specifically, we summarize and compare the definitions of near-duplicate videos (NDVs) in the literature, analyze the relationship between NDVR and its related research topics theoretically, describe its generic framework in detail, and investigate the existing state-of-the-art NDVR systems. Finally, we present the development trends and research directions of this topic.

7.
Web video categorization is a fundamental task for web video search. In this paper, we explore web video categorization from a new perspective, by integrating model-based and data-driven approaches to boost performance. The boost comes from two aspects. One is the performance improvement for text classifiers through query expansion from related videos and user videos. The model-based classifiers are built on the text features extracted from titles and tags. Related videos and user videos act as external resources that compensate for the limited and noisy text features. Query expansion is adopted to reinforce the classification performance of text features through related videos and user videos. The other improvement is derived from the integration of model-based classification and data-driven majority voting from related videos and user videos. From the data-driven viewpoint, related videos and user videos are treated as sources for majority voting from the perspective of video relevance and user interest, respectively. Semantic meaning from text, video relevance from related videos, and user interest induced from user videos are combined to robustly determine the video category. Their combination further improves the performance of web video categorization. Experiments on YouTube videos demonstrate the significant improvement of the proposed approach compared to traditional text-based classifiers.

8.
9.
10.
Given the explosive growth of near-duplicate videos, video archaeology is needed to investigate the history of the manipulations applied to these videos. Once derived videos are determined according to the manipulations, a video migration map can be constructed from the pair-wise relationships in a set of near-duplicate videos. In this paper, we propose an improved video archaeology (I-VA) system by extending our previous work (Shen et al. 2010). The extensions include more comprehensive video manipulation detectors and improved techniques for these detectors. Specifically, the detectors cover two categories of manipulations, i.e., semantic-based manipulations and non-semantic-based manipulations. Moreover, the improved detection algorithms are more stable. The key to I-VA is the construction of a video migration map, which represents the history of how near-duplicate videos have been manipulated. There are various applications based on the proposed I-VA system, such as better understanding of the meaning and context conveyed by the manipulated videos, improving current video search engines through better presentation based on the migration map, and a better indexing scheme based on annotation propagation. The system is tested on a collection of 12,790 videos and 3,481 duplicates. The experimental results show that I-VA can discover the manipulation relations among the near-duplicate videos effectively.

11.
The sharing and re-sharing of videos on social sites, blogs, e-mail, and other means has given rise to the phenomenon of viral videos: videos that become popular through internet sharing. In this paper we seek to better understand viral videos on YouTube by analyzing sharing and its relationship to video popularity using millions of YouTube videos. The socialness of a video is quantified by classifying the referrer sources for video views as social (e.g. an emailed link, Facebook referral) or non-social (e.g. a link from related videos). We find that viewership patterns of highly social videos are very different from those of less social videos. For example, the highly social videos rise to, and fall from, their peak popularity more quickly than less social videos. We also find that not all highly social videos become popular, and not all popular videos are highly social. By using our insights on viral videos we are able to develop a method for ranking blogs and websites on their ability to spread viral videos.
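The socialness measure in entry 11 can be illustrated as a share of views. The abstract only says that referrer sources are classified as social or non-social; computing the fraction of views from social referrers is an assumption here, and the referrer labels and the `socialness` function name are hypothetical.

```python
# Hypothetical referrer labels; a real view log would use its own categories.
SOCIAL_REFERRERS = {"email", "facebook", "twitter"}


def socialness(views_by_referrer):
    """Fraction of a video's views that arrived through social referrers.

    views_by_referrer: {referrer_label: view_count}
    Returns 0.0 for a video with no recorded views.
    """
    total = sum(views_by_referrer.values())
    if total == 0:
        return 0.0
    social = sum(
        count
        for referrer, count in views_by_referrer.items()
        if referrer in SOCIAL_REFERRERS
    )
    return social / total
```

For example, `socialness({"email": 30, "facebook": 20, "related_videos": 50})` gives 0.5, since half of the views came from referrers labeled social.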

12.
State-of-the-art near-duplicate video clip (NDVC) detection for novelty re-ranking uses non-semantic low-level features (color/texture) to detect and eliminate "content-based NDVC" and so increases content-level novelty in the top results. However, humans may also perceive a video as a near-duplicate from a semantic perspective. In this paper, we propose a concept-based near-duplicate video clip (CBNDVC) detection technique for novelty re-ranking. We identify "semantic NDVC" by making use of semantic features (events/concepts) and re-rank the top results to increase content novelty as well as semantic novelty. Videos are represented as a multivariate time series of confidence values of relevant concepts, and CBNDVC clusters are then discovered by conceptual clustering. The results obtained show higher precision and recall from the user's perspective.

13.

As one of the key technologies in content-based near-duplicate detection and video retrieval, video sequence matching can be used to judge whether two videos contain duplicate or near-duplicate segments. Despite considerable research effort in recent years, performing sequence matching precisely and efficiently among videos (which may be subject to complex audio-visual transformations) from a large-scale database remains a challenging task. To address this problem, this paper proposes a multiscale video sequence matching (MS-VSM) method, which gradually detects and locates the similar segments between videos from coarse to fine scales. At the coarse scale, it uses the Maximum Weight Matching (MWM) algorithm to rapidly select several candidate reference videos from the database for a given query. Then, for each candidate video, its segment most similar to the given query is obtained at the middle scale by the Constrained Longest Ascending Matching Subsequence (CLAMS) algorithm and is used to judge whether that candidate contains a near-duplicate segment. If so, the precise locations of the near-duplicate segments in both query and reference videos are determined at the fine scale by using bi-directional scanning to check the matching similarity at the segments' boundaries. As such, the MS-VSM method can achieve excellent near-duplicate detection accuracy and localization precision with very high processing efficiency. Extensive experiments show that it clearly outperforms several state-of-the-art methods on several benchmarks.

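The middle-scale step of entry 13 looks for a longest ascending run of frame matches. The sketch below shows only the unconstrained core of that idea: a longest strictly ascending subsequence over matched (query frame, reference frame) index pairs. The constraints that give CLAMS its name are not described in the abstract and are not modeled here; the function name and input format are assumptions.

```python
def longest_ascending_matches(matches):
    """Longest chain of matches that ascends in both videos.

    matches: list of (query_frame_index, reference_frame_index) pairs.
    Returns the chain as a list of pairs in ascending order.
    Simple O(n^2) dynamic programming, adequate for a sketch.
    """
    pairs = sorted(set(matches))
    if not pairs:
        return []

    best_len = [1] * len(pairs)   # best chain length ending at each pair
    prev = [-1] * len(pairs)      # back-pointer to the previous pair in the chain
    for i, (qi, ri) in enumerate(pairs):
        for j in range(i):
            qj, rj = pairs[j]
            # Both indices must strictly increase to preserve temporal order.
            if qj < qi and rj < ri and best_len[j] + 1 > best_len[i]:
                best_len[i] = best_len[j] + 1
                prev[i] = j

    # Walk back from the end of the longest chain.
    i = max(range(len(pairs)), key=best_len.__getitem__)
    chain = []
    while i != -1:
        chain.append(pairs[i])
        i = prev[i]
    return chain[::-1]
```

With matches `[(0, 10), (1, 11), (2, 5), (3, 12), (4, 13)]`, the pair `(2, 5)` breaks temporal order and is left out, so the chain returned is `[(0, 10), (1, 11), (3, 12), (4, 13)]`.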

14.
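To address the large number of duplicate or near-duplicate videos on the web, a fast and accurate web video duplicate detection method based on shot-level comparison and locality-sensitive hashing (LSH) is proposed. Video similarity is judged by the ratio of the number of matched shots between two videos to the total number of shots in the query video. In addition, LSH, a well-known approximate nearest-neighbor search technique, is applied at the shot level to find similar shots quickly and thus speed up detection. Shots serve as the retrieval unit: the shots of all videos in the database are pooled to build a new dataset, each shot of the seed (query) video is issued as a query, an LSH-based approximate nearest-neighbor search retrieves all shots that match the query shot, and the returned results are finally fused to obtain the set of duplicate or near-duplicate videos of the query video. Experiments on the CC_WEB_VIDEO dataset, which contains 12,790 videos, show that the method achieves better detection performance than existing methods.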
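The shot-level scoring in entry 14 can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: it uses random-hyperplane LSH with a single hash table, verifies bucket candidates with cosine similarity, and assumes non-zero shot feature vectors. The names `make_hasher` and `duplicate_scores` and the parameters `n_bits`, `seed` and `min_sim` are hypothetical.

```python
import random
from collections import defaultdict
from math import sqrt


def _cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def make_hasher(dim, n_bits=8, seed=0):
    """Random-hyperplane LSH: one sign bit per random hyperplane."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

    def signature(vec):
        return tuple(
            sum(p * x for p, x in zip(plane, vec)) >= 0 for plane in planes
        )

    return signature


def duplicate_scores(query_shots, database, min_sim=0.95):
    """Score each database video by the share of query shots it matches.

    query_shots: list of shot feature vectors for the query video
    database: {video_id: [shot_vector, ...]}
    Returns {video_id: matched_query_shots / total_query_shots} for videos
    with at least one matched shot.
    """
    signature = make_hasher(len(query_shots[0]))

    # Pool the shots of all database videos into one hash table.
    buckets = defaultdict(list)
    for vid, shots in database.items():
        for shot in shots:
            buckets[signature(shot)].append((vid, shot))

    matched = defaultdict(set)  # video_id -> indices of matched query shots
    for qi, q in enumerate(query_shots):
        for vid, shot in buckets.get(signature(q), []):
            if _cosine(q, shot) >= min_sim:  # verify the LSH candidate
                matched[vid].add(qi)

    return {vid: len(idx) / len(query_shots) for vid, idx in matched.items()}
```

A single hash table can miss similar shots that fall on opposite sides of a hyperplane; practical LSH setups typically use several tables to reduce such misses, at the cost of more candidates to verify.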

15.
Concept detection aims to automatically label video content with the semantic concepts appearing in it, such as objects, locations, or activities. While concept detectors have become key components in many research prototypes for content-based video retrieval, their practical use is limited by the need for large-scale annotated training sets. To overcome this problem, we propose to train concept detectors on material downloaded from web-based video sharing portals like YouTube, such that training is based on tags given by users during upload, no manual annotation is required, and concept detection can scale up to thousands of concepts. On the downside, web video as training material is a complex domain, and the tags associated with it are weak and unreliable. Consequently, performance loss is to be expected when replacing high-quality state-of-the-art training sets with web video content. This paper presents a concept detection prototype named TubeTagger that utilizes YouTube content for autonomous training. In quantitative experiments, we compare the performance when training on web video and on standard datasets from the literature. It is demonstrated that concept detection in web video is feasible, and that, when testing on YouTube videos, the YouTube-based detector outperforms the ones trained on standard training sets. By applying the YouTube-based prototype to datasets from the literature, we further demonstrate that: (1) If training annotations on the target domain are available, the resulting detectors significantly outperform the YouTube-based tagger. (2) If no annotations are available, the YouTube-based detector achieves performance comparable to the ones trained on standard datasets (a moderate relative performance loss of 11.4% is measured) while offering the advantage of fully automatic, scalable learning. (3) By enriching conventional training sets with online video material, performance improvements of 11.7% can be achieved when generalizing to domains unseen in training.

16.
In this paper, an automatic image–text alignment algorithm is developed to achieve more effective indexing and retrieval of large-scale web images by aligning web images with their most relevant auxiliary text terms or phrases. First, a large number of cross-media web pages (which contain web images and their auxiliary texts) are crawled and segmented into a set of image–text pairs (informative web images and their associated text terms or phrases). Second, near-duplicate image clustering is used to group large-scale web images into a set of clusters of near-duplicate images according to their visual similarities. The near-duplicate web images in the same cluster share similar semantics and are simultaneously associated with the same or a similar set of auxiliary text terms or phrases, which co-occur frequently in the relevant text blocks. Performing near-duplicate image clustering can therefore significantly reduce the uncertainty about the relatedness between the semantics of web images and their auxiliary text terms or phrases. Finally, a random walk is performed over a phrase correlation network to achieve more precise image–text alignment by refining the relevance scores between the web images and their auxiliary text terms or phrases. Our evaluation experiments have achieved very positive results on large-scale cross-media web pages.

17.
18.
Nowadays, social videos pervade the web. Social web videos are characterized by the rich accompanying contextual information, which describes the content of the videos and thus greatly facilitates video search and browsing. Generally, contextual data such as tags are provided at the whole-video level, without any temporal indication of when they actually appear in the video, let alone spatial annotation of object-related tags in the video frames. However, many tags describe only parts of the video content. Therefore, tag localization, the process of assigning tags to the underlying relevant video segments, frames, or even regions within frames, is attracting increasing research interest, and a benchmark dataset for the fair evaluation of tag localization algorithms is highly desirable. In this paper, we describe and release a dataset called DUT-WEBV, which contains about 4,000 videos collected from the YouTube portal by issuing 50 concepts as queries. These concepts cover a wide range of semantic aspects including scenes like "mountain", events like "flood", objects like "cows", sites like "gas station", and activities like "handshaking", posing great challenges for the tag (i.e., concept) localization task. For each video of a tag, we carefully annotate the time durations during which the tag appears in the video and, for object-related tags, also label the spatial location of the object with a mask in the frames. Besides the video itself, contextual information such as thumbnail images, titles, and YouTube categories is also provided. Together with this benchmark dataset, we present a baseline for tag localization using a multiple instance learning approach. Finally, we discuss some open research issues for tag localization in web videos.

19.
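An ideal video database organization should store the feature vectors of semantically related and feature-similar videos adjacently. Targeting the characteristics of large-scale video databases, the video database is partitioned by hierarchical clustering on low-level visual features under semantic supervision. When a cluster contains videos of only one semantic category, an index entry is created for that cluster, and the raw feature data of each cluster is stored contiguously on disk. The probabilistic relationship between low-level and high-level features is estimated statistically to construct a Bayes classifier. At query time, for the user's query example, the most likely candidate clusters are determined first, and similar video clips are then searched within those candidate clusters. Experimental results show that the method improves both retrieval speed and the semantic sensitivity of retrieval.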

20.
With the pervasiveness of online social media and the rapid growth of web data, a large amount of multimedia data is available online. However, how to organize it to improve users' experience and support government supervision remains a problem yet to be seriously investigated. Topic detection and tracking, which has been a hot research topic for decades, can cluster web videos into different topics according to their semantic content. However, how to discover and track topics online from web videos and images has not been fully discussed. In this paper, we formulate topic detection and tracking as an online tracking, detection and learning problem. First, by learning from historical data, including labeled data and plenty of unlabeled data, using a semi-supervised multi-class multi-feature method, we obtain a topic tracker which can also discover novel topics in new stream data. Second, when new data arrives, an online updating method is applied to adapt the topic tracker to the evolution of the stream data. We conduct experiments on a public dataset to evaluate the performance of the proposed method, and the results demonstrate its effectiveness for topic detection and tracking.
