Similar literature
20 similar documents found (search took 21 ms)
1.
With the widespread use of smartphones, a large number of user-generated videos are produced every day. The embedded sensors, e.g., GPS and the digital compass, make it possible to access videos based on their geo-properties. In our previous work, we created a framework for integrated, sensor-rich video acquisition (with one instantiation implemented in the form of smartphone applications) which associates a continuous stream of location and viewing direction information with the collected videos, hence allowing them to be expressed and manipulated as spatio-temporal objects. These sensor meta-data are considerably smaller in size than the visual content and are helpful in effectively and efficiently searching for geo-tagged videos in large-scale repositories. In this study, we propose a novel three-level grid-based index structure and introduce a number of related query types, including typical spatial queries and ones based on bounded radius and viewing direction restriction. These two criteria are important in many video applications and we demonstrate their importance with a real-world dataset. Moreover, experimental results on a large-scale synthetic dataset show that our approach can provide a significant speed improvement of at least 30%, considering a mix of queries, compared to a multi-dimensional R-tree implementation.
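The query types named in this abstract (bounded radius plus a viewing direction restriction) can be illustrated with a much simpler structure than the one the authors propose. The following is a minimal sketch only, assuming a single-level uniform grid over planar coordinates rather than the three-level index from the paper; the class name `GridIndex`, its methods and the parameter names are hypothetical.

```python
import math
from collections import defaultdict


class GridIndex:
    """Single-level uniform grid (an assumed simplification, not the paper's
    three-level structure). Coordinates are planar, e.g. metres."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        # (ix, iy) -> list of (video_id, x, y, heading in degrees)
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, video_id, x, y, heading_deg):
        self.cells[self._cell(x, y)].append((video_id, x, y, heading_deg % 360.0))

    def query(self, qx, qy, radius, direction_deg=None, tolerance_deg=180.0):
        """Ids of entries within `radius` of (qx, qy); if `direction_deg` is
        given, the camera heading must also lie within `tolerance_deg` of it."""
        ix0, iy0 = self._cell(qx - radius, qy - radius)
        ix1, iy1 = self._cell(qx + radius, qy + radius)
        hits = []
        # Only cells overlapping the query circle's bounding box are visited.
        for ix in range(ix0, ix1 + 1):
            for iy in range(iy0, iy1 + 1):
                for vid, x, y, heading in self.cells.get((ix, iy), ()):
                    if math.hypot(x - qx, y - qy) > radius:
                        continue
                    if direction_deg is not None:
                        # Smallest absolute angle between the two headings.
                        diff = abs((heading - direction_deg + 180.0) % 360.0 - 180.0)
                        if diff > tolerance_deg:
                            continue
                    hits.append(vid)
        return sorted(hits)


index = GridIndex(cell_size=10.0)
index.insert("a", 5.0, 5.0, 90.0)
index.insert("b", 12.0, 5.0, 270.0)
index.insert("c", 100.0, 100.0, 90.0)
nearby = index.query(0.0, 0.0, radius=15.0)
facing_east = index.query(0.0, 0.0, radius=15.0, direction_deg=80.0, tolerance_deg=20.0)
```

The grid prunes by location first and applies the direction test only to candidates that survive the radius check, which is the general idea behind filtering on sensor meta-data before touching any visual content.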

2.
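Addressing the characteristics of interactive multimedia learning systems, this paper proposes a natural-language-based method for content-based video retrieval. Users can interact with the system in natural language and thereby find the video clips they want quickly and conveniently. The method integrates natural language processing, named entity extraction, frame-based indexing and information retrieval techniques, so that the system can process natural language questions posed by users, build a concise question template from each question, and match the question template against the templates already built in the system to describe videos. This reduces the complexity of the video retrieval problem and improves the usability of the system.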

3.
Backward demodulation is a simplification technique used in saturation-based theorem proving with superposition and ordered paramodulation. It requires instance retrieval, i.e., search for instances of some term in a typically large set of terms. Path indexing is a family of indexing techniques that can be used to solve this problem efficiently. We propose a number of powerful optimisations to standard path indexing. We also describe a novel framework that combines path indexing with relational joins. The main advantage of the proposed scheme is flexibility, which we illustrate by sketching how to adapt the scheme to instance retrieval modulo commutativity and backward subsumption on multi-literal clauses.
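To make the instance retrieval problem concrete, here is a minimal sketch of what the operation computes, written as a naive linear scan with one-sided matching. It is not path indexing and not the authors' scheme. The term encoding is an assumption made for illustration: variables are strings starting with an uppercase letter, constants are other strings, and compound terms are tuples `(functor, arg1, ...)`.

```python
def is_var(t):
    """Assumed convention: a variable is a string starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()


def match(pattern, term, subst=None):
    """Return a substitution s such that applying s to `pattern` yields `term`,
    or None if no such substitution exists. Variables in `term` are treated
    as constants (one-sided matching, not unification)."""
    subst = {} if subst is None else subst
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        extended = dict(subst)
        extended[pattern] = term
        return extended
    if isinstance(pattern, tuple):
        if (not isinstance(term, tuple) or len(pattern) != len(term)
                or pattern[0] != term[0]):
            return None
        for p, t in zip(pattern[1:], term[1:]):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None


def instances(query, terms):
    """Naive instance retrieval: scan every stored term."""
    return [t for t in terms if match(query, t) is not None]


stored = [("f", "a", "a"), ("f", "a", "b"), ("g", "a"), ("f", ("g", "c"), ("g", "c"))]
found = instances(("f", "X", "X"), stored)
```

An index replaces the linear scan in `instances` with a lookup that visits only candidate terms; the matching test itself stays the same.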

4.
5.
Video similarity matching has broad applications such as copyright detection, news tracking and commercial monitoring. Among these applications, one typical task is to detect the local similarity between two videos without knowledge of the positions and lengths of each matched subclip pair. However, most studies so far on video detection investigate the global similarity between two short clips using a pre-defined distance function. Although there are a few works on video subsequence detection, all these proposals fail to provide an effective query processing mechanism. In this paper, we first generalize the problem of video similarity matching. Then, a novel solution called consistent keyframe matching (CKM) is proposed to solve the problem of subsequence matching based on video segmentation. CKM is designed with two goals: (1) good scalability in terms of the query sequence length and the size of the video database and (2) fast video subsequence matching in terms of processing time. Good scalability is achieved by employing a batch query paradigm, where keyframes sharing the same query space are summarized and ordered. As such, the redundancy of data access is eliminated, leading to much faster video query processing. Fast subsequence matching is achieved by comparing the keyframes of different video sequences. Specifically, a keyframe matching graph is first constructed and then divided into matched candidate subgraphs. We have evaluated our proposed approach over a very large real video database. Extensive experiments demonstrate the effectiveness and efficiency of our approach.

6.
Robin Burke 《Knowledge》1996,9(8):491-499
Selecting an instructive story from a video case base is an information retrieval problem, but standard indexing and retrieval techniques [1] were not developed with such applications in mind. The classical model assumes a passive retrieval system queried by interested and well-informed users. In educational situations, students cannot be expected to form appropriate queries or to identify their own ignorance. Systems that teach must, therefore, be active retrievers that formulate their own retrieval cues and reason about the appropriateness of intervention.

The Story Producer for InteractivE Learning (SPIEL) is an active retrieval system for recalling stories to tell to students who are learning social skills in a simulated environment [2, 3]. SPIEL is a component of the Guided Social Simulation (GuSS) architecture [4] used to build YELLO, a program that teaches account executives the fine points of selling Yellow Pages advertising. SPIEL uses structured, conceptual indices derived from research in case-based reasoning [5, 6]. SPIEL's manually-created indices are detailed representations of what stories are about, and they are needed to make precise assessments of stories' relevance.

SPIEL's opportunistic retrieval architecture operates in two phases. During the storage phase, the system uses its educational knowledge encapsulated in a library of “storytelling strategies” to determine, for each story, what an opportunity to tell that story would look like. During the retrieval phase, the system tries to recognize those opportunities while the student interacts with the simulation. This design is similar to “opportunistic memory” architectures proposed for opportunistic planning [7, 8].


7.
8.
Bag-of-visual-words (BoW) has recently become a popular representation to describe video and image content. Most existing approaches, nevertheless, neglect inter-word relatedness and measure similarity by bin-to-bin comparison of visual words in histograms. In this paper, we explore the linguistic and ontological aspects of visual words for video analysis. Two approaches, soft-weighting and constraint-based earth mover’s distance (CEMD), are proposed to model different aspects of visual word linguistics and proximity. In soft-weighting, visual words are weighted such that the linguistic meaning of words is taken into account for bin-to-bin histogram comparison. In CEMD, a cross-bin matching algorithm is formulated such that the ground distance measure considers the linguistic similarity of words. In particular, a BoW ontology which hierarchically specifies the hyponym relationship of words is constructed to assist the reasoning. We demonstrate soft-weighting and CEMD on two tasks: video semantic indexing and near-duplicate keyframe retrieval. Experimental results indicate that soft-weighting is superior to other popular weighting schemes such as term frequency (TF) weighting on a large-scale video database. In addition, CEMD shows excellent performance compared to cosine similarity in near-duplicate retrieval.

9.
Semantic filtering and retrieval of multimedia content is crucial for efficient use of multimedia data repositories. Video query by semantic keywords is one of the most difficult problems in multimedia data retrieval. The difficulty lies in the mapping between low-level video representation and high-level semantics. We therefore formulate the multimedia content access problem as a multimedia pattern recognition problem. We propose a probabilistic framework for semantic video indexing, which can support filtering and retrieval and facilitate efficient content-based access. To map low-level features to high-level semantics we propose probabilistic multimedia objects (multijects). Examples of multijects in movies include explosion, mountain, beach, outdoor, music, etc. Semantic concepts in videos interact, and to model this interaction explicitly we propose a network of multijects (multinet). Using probabilistic models for six site multijects (rocks, sky, snow, water-body, forestry/greenery and outdoor) and using a Bayesian belief network as the multinet, we demonstrate the application of this framework to semantic indexing. We demonstrate how detection performance can be significantly improved using the multinet to take interconceptual relationships into account. We also show how the multinet can fuse heterogeneous features to support detection based on inference and reasoning.

10.
As a powerful and expressive nontextual medium that can capture and present information, instructional videos are extensively used in e-learning (Web-based distance learning). Since each video may cover many subjects, it is critical for an e-learning environment to have content-based video searching capabilities to meet diverse individual learning needs. In this paper, we present an interactive multimedia-based e-learning environment that enables users to interact with it to obtain knowledge in the form of logically segmented video clips. We propose a natural language approach to content-based video indexing and retrieval to identify appropriate video clips that can address users' needs. The method integrates natural language processing, named entity extraction, frame-based indexing, and information retrieval techniques to explore knowledge-on-demand in a video-based interactive e-learning environment. A preliminary evaluation shows that the precision and recall of this approach are better than those of the traditional keyword-based approach.

11.
Content-based indexing of multimedia databases (cited by: 1; self-citations: 0; cited by others: 1)
Content-based retrieval of multimedia databases calls for content-based indexing techniques. Different from conventional databases, where data items are represented by a set of attributes of elementary data types, multimedia objects in multimedia databases are represented by a collection of features; similarity of object contents depends on context and frame of reference; and features of objects are characterized by multimodal feature measures. These lead to great challenges for content-based indexing. On the other hand, there are special requirements on content-based indexing: to support visual browsing, similarity retrieval, and fuzzy retrieval, nodes of the index should represent certain meaningful categories. That is to say, certain semantics must be added when performing indexing. ContIndex, the context-based indexing technique presented in this paper, is proposed to meet these challenges and special requirements. The indexing tree is formally defined by adapting a classification-tree concept. Horizontal links among nodes in the same level enhance the flexibility of the index. A special neural-network model, called Learning based on Experiences and Perspectives (LEP), has been developed to create node categories by fusing multimodal feature measures. It brings into the index the capability of self-organizing nodes with respect to certain context and frames of reference. An icon image is generated for each intermediate node to facilitate visual browsing. Algorithms have been developed to support multimedia object archival and retrieval using ContIndex.

12.
Storing and querying high-dimensional data are important problems in designing an information retrieval system. Two crucial issues, time and space efficiency, must be considered when evaluating the performance of such a system. The KDB-tree and its variants have been reported to perform well as index structures for retrieving multidimensional data. However, they all suffer from low storage utilization caused by imperfect “splitting policies.” Unnecessary splits increase the size of the index structure and degrade the performance of the system. In this paper, a new data insertion algorithm with a better splitting policy is proposed, which packs as many data entries as possible into the leaf nodes. Our new index scheme can increase the storage utilization up to nearly 100% and reduce the index size. As a result, both time and space efficiency are significantly improved. Analytical and experimental results show that our indexing method outperforms the traditional KDB-tree and its variants.

13.
Widely used in data-driven computer animation, motion capture data exhibits its complexity both spatially and temporally. The indexing and retrieval of motion data is a hard task that is not totally solved. In this paper, we present an efficient motion data indexing and retrieval method based on self-organizing map and Smith–Waterman string similarity metric. Existing motion clips are first used to train a self-organizing map and then indexed by the nodes of the map to get the motion strings. The Smith–Waterman algorithm, a local similarity measure method for string comparison, is used in clustering the motion strings. Then the motion motif of each cluster is extracted for the retrieval of example-based query. As an unsupervised learning approach, our method can cluster motion clips automatically without needing to know their motion types. Experiment results on a dataset of various kinds of motion show that the proposed method not only clusters the motion data accurately but also retrieves appropriate motion data efficiently.
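The Smith–Waterman local similarity score mentioned in this abstract can be sketched in a few lines. This is the textbook dynamic program with a linear gap penalty; the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration, not the parameters used by the authors, and the inputs stand in for motion strings (sequences of self-organizing map node labels).

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences `a` and `b`.

    Works on any indexable sequences (strings, or lists of node ids).
    Scoring parameters are illustrative assumptions."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere, which is what
            # makes the measure local rather than global.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best


shared_motif = smith_waterman("XXABCYY", "ZZABCZZ")
identical = smith_waterman("ABCD", "ABCD")
unrelated = smith_waterman("AAA", "BBB")
```

Because the score depends only on the best matching region, two clips that share a short common movement score well even when the rest of the clips differ, which suits clustering clips of unknown type.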

14.
Video in digital format is now commonplace and widespread both in professional use and in domestic consumer products, from camcorders to mobile phones. Video content is growing in volume, and while we can capture, compress, store, transmit and display video with great facility, editing videos and manipulating them based on their content is still a non-trivial activity. In this paper, we give a brief review of the state of the art of video analysis, indexing and retrieval, and we point to research directions which we think are promising and could make searching and browsing of video archives based on video content as easy as searching and browsing (text) web pages. We conclude the paper with a list of grand challenges for researchers working in the area.

15.
The amount of captured video is growing with the increased number of video cameras, especially the millions of surveillance cameras that operate 24 hours a day. Since video browsing and retrieval is time consuming, most captured video is never watched or examined. Video synopsis is an effective tool for browsing and indexing such video. It provides a short video representation, while preserving the essential activities of the original video. The activity in the video is condensed into a shorter period by simultaneously showing multiple activities, even when they originally occurred at different times. The synopsis video is also an index into the original video, pointing to the original time of each activity. Video synopsis can be applied to create a synopsis of endless video streams, as generated by webcams and by surveillance cameras. It can address queries like "Show in one minute the synopsis of this camera broadcast during the past day". This process includes two major phases: (i) an online conversion of the endless video stream into a database of objects and activities (rather than frames); (ii) a response phase, generating the video synopsis as a response to the user's query.

16.
We present several algorithms suitable for analysis of broadcast video. First, we show how wavelet analysis of frames of video can be used to detect transitions between shots in a video stream, thereby dividing the stream into segments. Next we describe how each segment can be inserted into a video database using an indexing scheme that involves a wavelet-based “signature.” Finally, we show that during a subsequent broadcast of a similar or identical video clip, the segment can be found in the database by quickly searching for the relevant signature. The method is robust against noise and typical variations in the video stream, even global changes in brightness that can fool histogram-based techniques. In the paper, we compare experimentally our shot transition mechanism to a color histogram implementation, and also evaluate the effectiveness of our database-searching scheme. Our algorithms are very efficient and run in real time on a desktop computer. We describe how this technology could be employed to construct a “smart VCR” that was capable of alerting the viewer to the beginning of a specific program or identifying

17.
This article addresses the problem of indexing and retrieving first-order predicate calculus terms in the context of automated deduction programs. The four retrieval operations of concern are to find variants, generalizations, instances, and terms that unify with a given term. Discrimination-tree indexing is reviewed, and several variations are presented. The path-indexing method is also reviewed. Experiments were conducted on large sets of terms to determine how the properties of the terms affect the performance of the two indexing methods. Results of the experiments are presented. This work was supported by the Applied Mathematical Sciences subprogram of the Office of Energy Research, U.S. Department of Energy, under Contract W-31-109-Eng-38.

18.
The perceptual video hash function defines a feature vector that characterizes a video depending on its perceptual contents. This function must be robust to the content preserving manipulations and sensitive to the content changing manipulations. In the literature, the subspace projection techniques such as the reduced rank PARAllel FACtor analysis (PARAFAC), have been successfully applied to extract perceptual hash for the videos. We propose a robust perceptual video hash function based on Tucker decomposition, a multi-linear subspace projection method. We also propose a method to find the optimum number of components in the factor matrices of the Tucker decomposition. The Receiver Operating Characteristics (ROC) curves are used to evaluate the performance of the proposed algorithm compared to the other state-of-the-art projection techniques. The proposed algorithm shows superior performance for most of the image processing attacks. An application for indexing and retrieval of near-identical videos is developed using the proposed algorithm and the performance is evaluated using average recall/precision curves. The experimental results show that the proposed algorithm is suitable for indexing and retrieval of near-identical videos.

19.
Authenticated indexing for outsourced spatial databases (cited by: 1; self-citations: 0; cited by others: 1)
In spatial database outsourcing, a data owner delegates its data management tasks to a location-based service (LBS), which indexes the data with an authenticated data structure (ADS). The LBS receives queries (ranges, nearest neighbors) originating from several clients/subscribers. Each query initiates the computation of a verification object (VO) based on the ADS. The VO is returned to the client that can verify the result correctness using the public key of the owner. Our first contribution is the MR-tree, a space-efficient ADS that supports fast query processing and verification. Our second contribution is the MR*-tree, a modified version of the MR-tree, which significantly reduces the VO size through a novel embedding technique. Finally, whereas most ADSs must be constructed and maintained by the owner, we outsource the MR- and MR*-tree construction and maintenance to the LBS, thus relieving the owner from this computationally intensive task.
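The general idea of a verification object can be illustrated with a plain binary Merkle hash tree. This is a simplified sketch and not the MR-tree or MR*-tree: it has no spatial bounding rectangles and does not prove result completeness, and it assumes the client already holds a trusted root digest (in the scheme described above, the root would be signed by the owner). The function names `build_levels`, `prove` and `verify` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Hash levels from leaves (index 0) up to the single root (last level).
    Odd-sized levels are padded by duplicating their last hash."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(b"\x00" + leaf) for leaf in leaves]  # 0x00 prefix marks leaf hashes
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(b"\x01" + level[i] + level[i + 1])  # 0x01 marks inner nodes
                 for i in range(0, len(level), 2)]


def prove(levels, index):
    """Sibling hashes on the path from leaf `index` to the root (the 'VO' here)."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a returned record and its proof."""
    digest = h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        pair = sibling + digest if sibling_is_left else digest + sibling
        digest = h(b"\x01" + pair)
    return digest == root


records = [b"poi-1", b"poi-2", b"poi-3"]
levels = build_levels(records)
root = levels[-1][0]
ok = verify(records[2], prove(levels, 2), root)
forged = verify(b"poi-x", prove(levels, 2), root)
```

The proof grows with the height of the tree rather than with the size of the dataset, which is why the size of the verification object is a natural quantity to optimise.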

20.
Automatic parsing and indexing of news video (cited by: 9; self-citations: 0; cited by others: 9)
Automatic construction of content-based indices for video source material requires general semantic interpretation of both images and their accompanying sounds; but such a broadly-based semantic analysis is beyond the capabilities of the current technologies of machine vision and audio signal analysis. However, if one can assume a limited and well-demarcated body of domain knowledge for describing the content of a body of video, then it becomes easier to interpret a video source in terms of that domain knowledge. This paper presents our work on using domain knowledge to parse news video programs and to index them on the basis of their visual content. Models based on both the spatial structure of image frames and the temporal structure of the entire program have been developed for news videos, along with algorithms that apply these models by locating and identifying instances of their elements. Experimental results are also discussed in detail to evaluate both the models and the algorithms that use them. Finally, proposals for future work are summarized.
