Similar Documents
20 similar documents found (search time: 31 ms)
1.
2.
Navigation and monitoring of large and crowded virtual environments are challenging tasks and require intuitive camera control techniques to assist users. In this paper, we present a novel automatic camera control technique built on a scene analysis framework based on information theory. The framework contains a probabilistic model of the scene, used to build entropy and expectancy maps. These maps are used to find interest points, which represent either characteristic behaviors of the crowd or novel events occurring in the scene. Once an interest point is chosen, the camera is updated accordingly to display it. We tested our model in a crowd simulation environment, where it performed successfully. Our method can be integrated into existing camera control modules in computer games, crowd simulations and movie pre-visualization applications.
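The entropy map at the heart of the framework above can be sketched as follows. This is a minimal illustration under assumed data structures (a grid of per-cell probability distributions over behavior labels), not the authors' implementation:

```python
import math

def entropy_map(prob_grid):
    """Shannon entropy (in bits) per grid cell, given a per-cell
    probability distribution over observed crowd behaviors."""
    emap = []
    for row in prob_grid:
        erow = []
        for dist in row:
            # H = -sum p * log2(p) over behaviors with non-zero probability
            h = -sum(p * math.log2(p) for p in dist.values() if p > 0)
            erow.append(h)
        emap.append(erow)
    return emap

# A cell with uniform behavior over four options is maximally "novel"
# (2 bits); a deterministic cell carries no surprise (0 bits).
grid = [[{"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}, {"a": 1.0}]]
emap = entropy_map(grid)
```

Interest points would then be cells with locally extreme entropy (novel events) or expectancy (characteristic behaviors), toward which the camera is steered.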

3.
Due to recent advances in the film industry, the production of movies has grown exponentially, leading to a discoverability challenge: given the overwhelming number of choices, deciding which film to watch has become a tedious task for audiences. Movie summarization (MS) can help, as it presents the central theme of a movie in a compact format and makes browsing more efficient for the audience. In this paper, we present an automatic MS framework, coined 'QuickLook', which identifies the leading characters and fuses multiple cues extracted from a movie. First, the movie is preprocessed and divided into scenes, followed by shot segmentation. Second, the leading characters in each segmented scene are determined. Next, four visual cues capturing the film's scenic beauty, memorability, informativeness and emotional resonance are extracted from shots containing the leading characters. These features are then fused using different weights; shots whose fusion score exceeds a threshold are selected for the final summary. The proposed MS framework is assessed by comparison with the official trailers of ten Hollywood movies, providing a novel baseline for fair comparison in the MS literature. The framework is shown to outperform other state-of-the-art MS methods in terms of enjoyability and informativeness.
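The weighted fusion and thresholding step can be sketched as below. The cue names, weights and threshold are hypothetical placeholders, not values from the paper:

```python
def fusion_score(cues, weights):
    """Weighted sum of per-shot cue scores (each assumed to be in [0, 1])."""
    return sum(weights[k] * cues[k] for k in cues)

def select_shots(shots, weights, threshold=0.5):
    """Keep shots whose fused score exceeds the threshold."""
    return [name for name, cues in shots
            if fusion_score(cues, weights) > threshold]

# Hypothetical weights for the four cues named in the abstract.
weights = {"beauty": 0.3, "memorability": 0.2,
           "informativeness": 0.2, "emotion": 0.3}
shots = [
    ("shot_01", {"beauty": 0.9, "memorability": 0.8,
                 "informativeness": 0.7, "emotion": 0.9}),
    ("shot_02", {"beauty": 0.2, "memorability": 0.1,
                 "informativeness": 0.3, "emotion": 0.2}),
]
summary = select_shots(shots, weights)
```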

4.
Automatic 3D animation generation techniques are becoming increasingly popular in areas of computer graphics such as video games and animated movies. They help even non-professionals automate the filmmaking process with little or no intervention from animators and computer graphics programmers. Based on specified cinematographic principles and filming rules, they plan the sequence of virtual cameras that best renders a 3D scene. In this paper, we present an approach to automatic movie generation that uses linear temporal logic to express these filming and cinematography rules. We consider the filming of a 3D scene as a sequence of shots satisfying given filming rules, which convey constraints on the desirable configuration (position, orientation, and zoom) of the virtual cameras. The selection of camera configurations at different points in time constitutes a camera plan, which is computed using a temporal-logic-based planning system (TLPlan) to obtain a 3D movie. The camera planner is used within an automated planning application that generates 3D demonstrations of tasks involving a teleoperated robot arm on the International Space Station (ISS). A typical task demonstration involves moving the robot arm from one configuration to another. The main challenge is to automatically plan the configurations of the virtual cameras to film the arm in a manner that conveys the best awareness of the robot trajectory to the user. The robot trajectory is generated using a path planner; the camera planner is then invoked to find a sequence of virtual camera configurations to film the trajectory.

5.
In this work, we propose a method that integrates depth and fisheye cameras to obtain a wide, scaled 3D scene reconstruction in a single shot. The motivation for this integration is to overcome the narrow field of view of consumer RGB-D cameras and the lack of depth and scale information in fisheye cameras. The hybrid camera system we use is easy to build and calibrate, and consumer devices with a similar configuration are already on the market. With this system, the portion of the scene in the shared field of view provides color and depth simultaneously. In the rest of the color image, we estimate depth by recovering the structural information of the scene. Our method finds and ranks corners in the scene by combining line extraction in the color image with the depth information. These corners are used to generate plausible layout hypotheses, which have real-world scale thanks to the use of depth. The wide-angle camera captures more of the environment (e.g. the ceiling), which helps to overcome severe occlusions. After an automatic evaluation of the hypotheses, we obtain a scaled 3D model that expands the original depth information with the wide scene reconstruction. Our experiments with real images from both home-made and commercial systems show that the method achieves a high success rate in different scenarios and that the hybrid camera system outperforms a single-color-camera setup while additionally providing scale in a single shot.

6.
The development of video applications for digital multimedia has highlighted the need for indexing tools that enable access to meaningful segments of video. The high cost of manual indexing creates demand for automatic algorithms able to extract such indices with little intervention. In this paper we present new editing-model-based algorithms that automatically extract low-level features in a movie: camera shots and camera motion. Rules of film making are used to derive higher-level elements, such as shot-reverse-shot sequences. The algorithms have been tested on 20 hours of movies, and a comparison with techniques in the literature is provided.

7.
Based on the axis-of-action (180-degree) rule of film shooting, this paper gives a simple definition of a movie scene and, based on that definition, proposes a movie scene detection algorithm. The algorithm first detects shots using an improved pixel-matching twice-differencing method, and then clusters the shots according to a custom shot-similarity criterion to obtain scene boundaries. Experiments show that the algorithm detects movie scene boundaries effectively.
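A bare-bones version of the shot detection step (thresholding consecutive-frame pixel differences; the paper's improved twice-differencing and the clustering step are omitted) might look like:

```python
def frame_diff(f1, f2):
    """Mean absolute pixel difference between two grayscale frames."""
    return sum(abs(a - b) for a, b in zip(f1, f2)) / len(f1)

def detect_shot_boundaries(frames, threshold):
    """Declare a shot boundary wherever consecutive frames differ
    by more than the threshold."""
    return [i for i in range(1, len(frames))
            if frame_diff(frames[i - 1], frames[i]) > threshold]

# Two flat-colored shots with a hard cut at frame index 3.
frames = [[10] * 16] * 3 + [[200] * 16] * 3
cuts = detect_shot_boundaries(frames, threshold=30.0)
```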

8.
This work constitutes the first attempt to extract an important narrative structure, the 3-act storytelling paradigm, from film. Widely prevalent in the domain of film, it forms the foundation and framework within which a film functions as an effective storytelling tool, and its extraction is a vital step in automatic content management for film data. Identifying act boundaries allows film to be structured at a level far higher than existing segmentation frameworks, which include shot detection and scene identification, and provides a basis for inferences about the semantic content of dramatic events in film. A novel act-boundary likelihood function for Acts 1 and 2 is derived using a Bayesian formulation guided by film grammar and tested under many configurations; results are reported for experiments involving 25 full-length movies. The resulting function proves a useful tool in both automatic and semi-interactive settings for semantic analysis of film, with potential application to analogues occurring in many other domains, including news, training video and sitcoms.

9.
A multiple-baseline stereo
A stereo matching method that uses multiple stereo pairs with various baselines generated by a lateral displacement of a camera to obtain precise distance estimates without suffering from ambiguity is presented. Matching is performed simply by computing the sum of squared-difference (SSD) values. The SSD functions for individual stereo pairs are represented with respect to the inverse distance and are then added to produce the sum of SSDs. This resulting function is called the SSSD-in-inverse-distance. It is shown that the SSSD-in-inverse-distance function exhibits a unique and clear minimum at the correct matching position, even when the underlying intensity patterns of the scene include ambiguities or repetitive patterns. The authors first define a stereo algorithm based on the SSSD-in-inverse-distance and present a mathematical analysis to show how the algorithm can remove ambiguity and increase precision. Experimental results with real stereo images are presented to demonstrate the effectiveness of the algorithm.
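The SSSD-in-inverse-distance idea can be illustrated on a synthetic 1-D example: a repetitive intensity pattern makes individual baselines ambiguous, but summing the SSD curves of several baselines over a common inverse-distance axis yields a unique minimum. A minimal sketch (the signal, baselines and true inverse distance are made up for illustration):

```python
def ssd(ref, img, shift):
    """Sum of squared differences between ref and img shifted by `shift` pixels."""
    n = len(ref)
    return sum((ref[x] - img[(x + shift) % n]) ** 2 for x in range(n))

def sssd_in_inverse_distance(ref, images, baselines, idist_candidates):
    """Sum per-baseline SSD curves over a common inverse-distance axis.
    The disparity for baseline b at inverse distance i is b * i pixels."""
    return [sum(ssd(ref, img, round(b * i)) for img, b in zip(images, baselines))
            for i in idist_candidates]

# Repetitive pattern (period 8): e.g. baseline 2 alone cannot distinguish
# inverse distances 2 and 6, but the summed curve has a unique zero at
# the true inverse distance.
n, true_idist = 64, 2
ref = [1.0 if (x % 8) < 4 else 0.0 for x in range(n)]
baselines = [1, 2, 3]
images = [[ref[(x - b * true_idist) % n] for x in range(n)] for b in baselines]
curve = sssd_in_inverse_distance(ref, images, baselines, range(8))
best = min(range(8), key=lambda i: curve[i])
```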

10.
Application of the MPEG-7 Standard to Key-Frame Extraction from Animation Shots
李伟  王树梅  王玲  刘军  张军 《微计算机信息》2006,22(33):294-296
In film animation production, a key frame is often used to represent an animation shot. Fast, effective key-frame extraction reduces the load on a multimedia database and, more importantly, improves the accuracy of animation shot retrieval. The dominant color descriptor of the MPEG-7 standard is used to describe the color features of the frames composing a shot; by computing the color-feature distance between frames, key frames can be extracted effectively and retrieval accuracy improved.
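The frame-distance idea can be sketched as a greedy key-frame selector. The 3-component color vectors and the threshold below stand in for MPEG-7 dominant color descriptors and are purely illustrative:

```python
def color_distance(a, b):
    """Euclidean distance between two color-feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def extract_key_frames(frames, threshold):
    """Greedy selection: a frame becomes a key frame when its color
    distance to the last key frame exceeds the threshold."""
    keys = [0]
    for i in range(1, len(frames)):
        if color_distance(frames[i], frames[keys[-1]]) > threshold:
            keys.append(i)
    return keys

# Three visually distinct runs of frames yield three key frames.
frames = [(10, 10, 10)] * 5 + [(200, 50, 50)] * 5 + [(50, 50, 200)] * 5
keys = extract_key_frames(frames, threshold=50.0)
```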

11.
Shot clustering techniques for story browsing
Automatic video segmentation is the first and necessary step in organizing a long video file into smaller units. The smallest basic unit is a shot. Related shots are typically grouped into a higher-level unit called a scene, and each scene is part of a story. Browsing these scenes unfolds the entire story of a film, enabling users to locate their desired video segments quickly and efficiently. Existing scene definitions are rather broad, making it difficult to compare the performance of existing techniques and to develop better ones. This paper introduces a stricter scene definition for narrative films and presents ShotWeave, a novel technique for clustering related shots into scenes using the stricter definition. The crux of ShotWeave is its feature extraction and comparison. Visual features are extracted from selected regions of representative frames of shots; these regions capture the essential information needed to maintain viewers' train of thought across shot breaks. The feature comparison is based on common continuity-editing techniques used in film making. Experiments were performed on full-length films with a wide range of camera motions and a complex composition of shots. The results show that ShotWeave outperforms two recent techniques based on global visual features in terms of segmentation accuracy and running time.

12.
Automatic 3D Reconstruction of Distant Scenes from Wide-Baseline Images
When images are captured far from the scene, the disparity between corresponding image points is small; if an unsuitable camera model is chosen, points that do not lie on the same plane may be reconstructed onto a single plane. To address this problem, an automatic 3D reconstruction method for distant scenes based on wide-baseline images is proposed. The method analyzes the imaging process of the camera model and derives a perspective camera model suited to distant scenes; by adjusting the weight coefficients of the constraint equations, it improves the robustness of the self-calibration of the affine camera model; and it combines block-wise bundle adjustment with a factorization method to fuse multi-view reconstruction results. Reconstruction experiments on outdoor distant scenes show that the proposed method achieves good visual results.

13.
Scene extraction is the first step toward semantic understanding of a video; it also provides improved browsing and retrieval facilities to users of video databases. This paper presents an effective approach to movie scene extraction based on the analysis of background images. Our approach exploits the fact that shots belonging to one particular scene often have similar backgrounds. Although part of each video frame is covered by foreground objects, the background can still be reconstructed by a mosaic technique. The proposed scene extraction algorithm consists of two main components: a shot similarity measure and a shot grouping process. In our approach, several low-level visual features are integrated to compute the similarity between two shots, while the rules of film making guide the shot grouping process. Experimental results show that our approach is promising and outperforms some existing techniques.
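The two components (a shot similarity measure and a grouping rule) can be sketched as follows. The 2-component background descriptors, threshold and window are illustrative stand-ins for the integrated low-level features:

```python
def similarity(a, b):
    """Toy similarity in [0, 1] between two background feature vectors."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def group_shots(shots, threshold=0.8, window=3):
    """A shot joins the current scene if it is similar enough to any of
    the last `window` shots in it; otherwise a new scene starts."""
    scenes = [[0]]
    for i in range(1, len(shots)):
        if any(similarity(shots[i], shots[j]) >= threshold
               for j in scenes[-1][-window:]):
            scenes[-1].append(i)
        else:
            scenes.append([i])
    return scenes

# Two scenes with clearly different background descriptors.
shots = [(0.9, 0.1), (0.85, 0.15), (0.9, 0.12),   # scene A
         (0.1, 0.9), (0.15, 0.85)]                # scene B
scenes = group_shots(shots)
```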

14.
In this paper, we focus on the 'reverse editing' problem in movie analysis, i.e., the extraction of film takes: the original camera shots that a film editor trims and arranges to produce a finished scene. The ability to disassemble final scenes and shots into takes is essential for nonlinear browsing, content annotation and the extraction of higher-order cinematic constructs from film. A two-part framework for take extraction is proposed. The first part filters out action-driven scenes, for which take extraction is not useful. The second part extracts film takes using agglomerative hierarchical clustering with different similarity metrics and group distances; we demonstrate our findings on 10 movies.
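Take extraction via agglomerative hierarchical clustering can be sketched with single-linkage merging and a stop distance. The 1-D shot feature and distance function are illustrative, not the paper's metrics:

```python
def agglomerative_takes(shots, dist, stop_dist):
    """Single-linkage agglomerative clustering of shot indices into takes:
    repeatedly merge the two closest clusters until the closest pair
    is farther apart than `stop_dist`."""
    clusters = [[i] for i in range(len(shots))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest members
                d = min(dist(shots[i], shots[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > stop_dist:
            break
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters

# Shots from the same take have close (here: 1-D) appearance features.
shots = [0.0, 0.1, 0.05, 5.0, 5.1]
takes = agglomerative_takes(shots, lambda x, y: abs(x - y), stop_dist=1.0)
```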

15.
This paper describes a fully automatic content-based approach for browsing and retrieval of MPEG-2 compressed video. The first step of the approach is the detection of shot boundaries based on motion vectors available from the compressed video stream. The next step involves the construction of a scene tree from the shots obtained earlier. The scene tree is shown to capture some semantic information as well as to provide a construct for hierarchical browsing of compressed videos. Finally, we build a new model for video similarity based on global as well as local motion associated with each node in the scene tree. To this end, we propose new approaches to camera motion and object motion estimation. The experimental results demonstrate that the integration of the above techniques results in an efficient framework for browsing and searching large video databases.

16.
With the emergence of new media, interactive film projects have mainly struggled to resolve the contradiction between dramatic structures and interaction. Dramatic film relies on identification with the main character: the viewer is continually carried along by the narrative and therefore lost in illusion. In this context, when interaction is brought into play, the drama starts to lose its power.

In this article, a new interactive film model based on Brechtian film theory is proposed. This model presents a new way of spatiotemporal construction in which different audiovisual combinations can be viewed successively, so that the viewer can actively construct his or her own story. The theoretical framework of the Brechtian interactive film model is supported by an interactive film application named Academia. The main feature of the model is that, while interaction is kept very simple, the continuity of the narrative is preserved and the film still requires an intellectual level of interpretation.

17.
Recognizing scene information in images or videos, such as locating objects and answering "Where am I?", has attracted much attention in the computer vision research field. Many existing scene recognition methods focus on static images and cannot achieve satisfactory results on videos, which contain more complex scene features than images. In this paper, we propose a robust movie scene recognition approach based on panoramic frames and representative feature patches. More specifically, the movie is first efficiently segmented into video shots and scenes. Second, we introduce a novel key-frame extraction method using panoramic frames, and a local feature extraction process is applied to obtain the representative feature patches (RFPs) in each video shot. Third, a Latent Dirichlet Allocation (LDA) based recognition model is trained to recognize the scene within each individual video scene clip. The correlations between video clips are considered to enhance the recognition performance. When the proposed approach is applied to scene recognition in realistic movies, the experimental results show that it achieves satisfactory performance.

18.
Shot Partitioning Based Recognition of TV Commercials
Digital video applications exploit the intrinsic structure of video sequences. To obtain and represent this structure for video annotation and indexing tasks, the main initial step is automatic shot partitioning. This paper analyzes the problem of automatic TV commercial recognition, and a new algorithm for scene break detection is introduced. The structure of each commercial is represented by the set of its key-frames, which are automatically extracted from the video stream. The particular characteristics of commercials cause commonly used shot boundary detection techniques to perform worse than in other video content domains. These techniques are based on individual image features or visual cues, which show significant performance gaps when applied to complex video content domains like commercials. We present a new scene break detection algorithm based on the combined analysis of edge and color features. Local motion estimation is applied to each edge in a frame, and the continuity of the color around the edges is then checked in the following frame. By separately considering both sides of each edge, we rely on the continuous presence of the objects and/or the background of the scene during each shot. Experimental results show that this approach outperforms single-feature algorithms in terms of precision and recall.

19.
While various commercial-strength editing tools are available today for still images, object-based manipulation of real-world video footage is still a challenging problem. In this system paper, we present a framework for interactive video editing. Our focus is on footage from a single, conventional video camera. By relying on spatio-temporal editing techniques operating on the video cube, we do not need to recover 3D scene geometry. Our framework is capable of removing and inserting objects, editing object motion, non-rigid object deformation and keyframe interpolation, as well as emulating camera motion. We demonstrate how movie shots of moderate complexity can be convincingly modified during post-processing.

20.
Narrative scene editing is carried out by directors as an eidetic technique. Continuity editing is a style of editing used in film making to make films as realistic as possible for the audience. While non-continuity editing (e.g., flashbacks, jump cuts and montages) is also a critical factor reflecting a director's character, most scenes demand continuity editing to maximize the audience's narrative immersion. In this paper, we present a continuity-editing algorithm that determines the size of a shot (field of view) by evaluating the psychical distance between the characters and viewers via the measurement of electrodermal activity (the galvanic skin response). This continuity-editing algorithm is expected to yield more audience-friendly videos by reflecting the level of identification between the actors and the audience.


Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23
