Similar Literature
20 similar documents found (search time: 15 ms)
1.
In the current Internet environment, a great deal of multimedia information is accessed through on-line computer systems. Among the various types of multimedia, video sequences have the most valuable and meaningful influence on human emotions, yet the emotions one person experiences when watching a given video can differ from those of another, depending on each viewer's mental state. In this research, we propose a new real-time emotion retrieval scheme for video based on image sequence features. These features consist of color information, key frame extraction, video sound, and optical flow, and each feature is weighted when combined for emotion retrieval. The experimental results show that the new real-time emotion retrieval approach compares favorably with previous studies. The proposed scheme can be applied to many multimedia fields: movies, computer games, video conferencing, and so on.
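The abstract does not give the weighting formula; a weighted combination of per-feature emotion scores could look like the minimal NumPy sketch below. The feature names, weights, and the `score_emotion` helper are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def score_emotion(feature_scores: dict, weights: dict) -> float:
    """Combine per-feature emotion scores with weights (hypothetical scheme).

    feature_scores: similarity of each feature (color, key frames, sound,
    optical flow) to a target emotion profile, each in [0, 1].
    weights: relative importance of each feature; normalized internally.
    """
    w = np.array([weights[k] for k in feature_scores])
    s = np.array(list(feature_scores.values()))
    return float(np.dot(w, s) / w.sum())

# Example: how strongly a clip matches, say, "excitement".
scores = {"color": 0.7, "keyframes": 0.6, "sound": 0.9, "optical_flow": 0.8}
weights = {"color": 1.0, "keyframes": 0.5, "sound": 2.0, "optical_flow": 1.5}
print(score_emotion(scores, weights))
```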

2.
Face detection is widely used in computer vision and pattern recognition. Combining skin-color detection with the mosaic-image method, this paper proposes an algorithm for fast face detection in video streams. The method first obtains candidate face regions from skin-color information and geometric rules of the face, and then accurately locates faces within the candidate regions using an improved mosaic-image method. Experiments show that the method detects faces in video streams quickly and accurately.
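As a rough illustration of the skin-color stage only (not the paper's exact thresholds or its mosaic-image step), a YCrCb skin mask with simple geometric filtering of candidate regions might look like this OpenCV sketch; the threshold values and aspect-ratio rule are common heuristics assumed here.

```python
import cv2
import numpy as np

def skin_candidates(frame_bgr):
    """Return bounding boxes of skin-colored regions that roughly fit a face."""
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    # Commonly used Cr/Cb skin range (assumed, not from the paper).
    mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w * h > 400 and 0.6 < w / float(h) < 1.4:  # crude face-like geometry
            boxes.append((x, y, w, h))
    return boxes
```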

3.
This work details a new method for loop-closure detection based on using multiple orthogonal projections to generate a global signature for each image of a video sequence. The new multi-projection function permits the detection of images corresponding to the same scene even when taken from different points of view. The signature generation process preserves enough information for robust loop-closure detection, although it transforms each image into a simple and compact representation. Thanks to these characteristics, real-time operation is possible, even for long sequences with thousands of images. In addition, the method has proved to work in very different scenarios without the need to change parameters or to perform an offline training stage, which makes it largely independent of the environment and camera configuration. Results of an extensive set of experiments on several datasets, both indoors and outdoors and including underwater scenarios, are presented. Furthermore, an implementation named HALOC is available at a public repository as a C++ library under the BSD license.

4.
We consider the problem of reconstructing the 3D coordinates of a moving point seen from a monocular moving camera, i.e., reconstructing moving objects from line-of-sight measurements only. The task is feasible only when some constraints are placed on the shape of the trajectory of the moving point. We coin the family of such tasks "trajectory triangulation." We investigate the solutions for points moving along straight-line and conic-section trajectories. We show that if the point is moving along a straight line, then the parameters of the line (and, hence, the 3D position of the point at each time instant) can be uniquely recovered, by linear methods, from at least five views. For a conic-shaped trajectory, we show that nine views are generally sufficient for a unique reconstruction of the moving point, and fewer views suffice when the conic is of a known type (such as a circle in 3D Euclidean space, for which seven views are sufficient). The paradigm of trajectory triangulation pushes the envelope of processing dynamic scenes forward: static scenes become a particular case of the more general task of reconstructing scenes rich with moving objects (where an object may be a single point).
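As a hedged sketch of why a linear solution exists in the straight-line case (the notation is ours, not taken from the paper): let the unknown trajectory line have Plücker coordinates $L = (\boldsymbol{\ell}, \boldsymbol{\ell}')$, and let the image point observed at time $t$ back-project to a ray with Plücker coordinates $(\mathbf{r}_t, \mathbf{r}'_t)$. The moving point lies on both lines only if they intersect, which yields one linear constraint on $L$ per view:

$$\mathbf{r}_t \cdot \boldsymbol{\ell}' + \mathbf{r}'_t \cdot \boldsymbol{\ell} = 0 .$$

Since $L$ is a six-vector defined only up to scale, five independent constraints of this form, i.e., five views, suffice to recover it linearly; the quadratic Plücker constraint on $L$ can then be enforced or used as a consistency check.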

5.

Generating dynamic 2D image-based facial expressions is a challenging task in facial animation. Much research has focused on performance-driven facial animation from given videos or images of a target face, while animating a single face image driven by emotion labels is a less explored problem. In this work, we treat the task of animating a single face image from emotion labels as a conditional video prediction problem, and propose a novel framework combining factored conditional restricted Boltzmann machines (FCRBM) and a reconstruction contractive auto-encoder (RCAE). A modified RCAE with an associated efficient training strategy is used to extract low-dimensional features and reconstruct face images. The FCRBM serves as the animator, predicting the facial expression sequence in the feature space given discrete emotion labels and a frontal neutral face image as input. Quantitative and qualitative evaluations on two facial expression databases, together with comparisons to the state of the art, show the effectiveness of the proposed framework for animating a frontal neutral face image from given emotion labels.


6.
Pattern Analysis and Applications - In the last decades, iris features have been widely used in biometric systems. Because iris features are virtually unique for each person, their usage is highly...

7.
Scene change detection techniques for video database systems
Scene change detection (SCD) is one of several fundamental problems in the design of a video database management system (VDBMS). It is the first step towards the automatic segmentation, annotation, and indexing of video data. SCD is also used in other aspects of VDBMS, e.g., hierarchical representation and efficient browsing of the video data. In this paper, we provide a taxonomy that classifies existing SCD algorithms into three categories: full-video-image-based, compressed-video-based, and model-based algorithms. The capabilities and limitations of the SCD algorithms are discussed in detail. The paper also proposes a set of criteria for measuring and comparing the performance of various SCD algorithms. We conclude by discussing some important research directions.
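The survey covers full-video-image-based methods among others; a minimal example of the simplest member of that family, histogram differencing between consecutive frames with a fixed threshold (the histogram size and threshold below are our own illustrative parameters, not from the paper), is sketched here.

```python
import cv2

def detect_cuts(video_path, threshold=0.5):
    """Flag frame indices where the gray-level histogram changes abruptly."""
    cap = cv2.VideoCapture(video_path)
    prev_hist, cuts, idx = None, [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [64], [0, 256])
        cv2.normalize(hist, hist)
        if prev_hist is not None:
            # Bhattacharyya distance: 0 = identical, 1 = very different.
            d = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_BHATTACHARYYA)
            if d > threshold:
                cuts.append(idx)
        prev_hist, idx = hist, idx + 1
    cap.release()
    return cuts
```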

8.
Multimedia Tools and Applications - The ubiquitous utilization of video applications in recent years has made research on video quality of experience paramount. Lack of sufficient bandwidth deters...

9.

As one of the key technologies in content-based near-duplicate detection and video retrieval, video sequence matching is used to determine whether two videos contain duplicate or near-duplicate segments. Despite considerable research in recent years, precisely and efficiently matching sequences among videos (which may be subject to complex audio-visual transformations) in a large-scale database remains a challenging task. To address this problem, this paper proposes a multiscale video sequence matching (MS-VSM) method, which gradually detects and locates the similar segments between videos from coarse to fine scales. At the coarse scale, it uses the Maximum Weight Matching (MWM) algorithm to rapidly select several candidate reference videos from the database for a given query. For each candidate video, the segment most similar to the query is then obtained at the middle scale by the Constrained Longest Ascending Matching Subsequence (CLAMS) algorithm and used to judge whether that candidate contains a near-duplicate. If so, the precise locations of the near-duplicate segments in both the query and the reference video are determined at the fine scale by bi-directional scanning of the matching similarity at the segment boundaries. As such, the MS-VSM method achieves excellent near-duplicate detection accuracy and localization precision with very high processing efficiency. Extensive experiments show that it markedly outperforms several state-of-the-art methods on several benchmarks.
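The middle-scale step relies on a longest-ascending-subsequence style alignment of matched frames; the sketch below shows a generic (unconstrained) longest strictly ascending subsequence over matched reference-frame indices, as a stand-in for the paper's CLAMS algorithm, whose exact constraints are not reproduced here.

```python
import bisect

def longest_ascending_subsequence(ref_indices):
    """Length of the longest strictly increasing run of matched frame indices.

    ref_indices: for each query frame (in order), the index of its best-matching
    reference frame. A long ascending run suggests a temporally consistent
    (near-duplicate) segment.
    """
    tails = []  # tails[k] = smallest possible tail of an ascending run of length k+1
    for r in ref_indices:
        pos = bisect.bisect_left(tails, r)
        if pos == len(tails):
            tails.append(r)
        else:
            tails[pos] = r
    return len(tails)

print(longest_ascending_subsequence([3, 10, 4, 5, 6, 90, 7, 8]))  # -> 6
```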


10.
Multimedia content is composed of several streams that carry information in audio, video, or textual channels. Classifying and clustering multimedia content requires extracting and combining information from these streams. The streams constituting a multimedia item naturally differ in scale, dynamics, and temporal patterns, and these differences make combining the information sources with classic combination techniques difficult. We propose an asynchronous feature-level fusion approach that creates a unified hybrid feature space out of the individual signal measurements. The target space can be used for clustering or classification of the multimedia content. As a representative application, we used the proposed approach to recognize basic affective states from speech prosody and facial expressions. Experimental results on two audiovisual emotion databases with 42 and 12 subjects show that the performance of the proposed system is significantly higher than that of the unimodal face-based and speech-based systems, as well as synchronous feature-level and decision-level fusion approaches.
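Without the paper's exact construction, one simple way to build a unified feature space from asynchronously sampled streams is to resample each stream onto a common timeline and concatenate, as in the hypothetical NumPy sketch below; the sampling rates and feature dimensions are made up for illustration.

```python
import numpy as np

def fuse_streams(audio_feat, audio_t, video_feat, video_t, target_t):
    """Resample two feature streams onto a shared timeline and concatenate.

    audio_feat: (Na, Da) prosody features sampled at times audio_t (Na,)
    video_feat: (Nv, Dv) facial-expression features sampled at times video_t (Nv,)
    target_t:   (T,) common timeline for the fused representation
    Returns a (T, Da + Dv) hybrid feature matrix.
    """
    def resample(feat, t):
        return np.stack([np.interp(target_t, t, feat[:, d])
                         for d in range(feat.shape[1])], axis=1)
    return np.hstack([resample(audio_feat, audio_t), resample(video_feat, video_t)])

# Hypothetical usage: 100 Hz prosody features, 25 Hz facial features, fused at 25 Hz.
audio = np.random.randn(100, 4); video = np.random.randn(25, 8)
fused = fuse_streams(audio, np.linspace(0, 1, 100), video, np.linspace(0, 1, 25),
                     np.linspace(0, 1, 25))
print(fused.shape)  # (25, 12)
```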

11.
Eyeglass frames are one of the factors that hinder accurate extraction of facial image features, so a method for eyeglass detection and frame removal is proposed. The method consists of three parts: eyeglass detection, frame localization, and restoration of the occluded image. Edge features are extracted from the estimated eye region and glasses are detected with a neural-network-based method; binarization and mathematical morphology are used to locate the eyeglass frame; and the image regions occluded by the frame are restored by interpolation. Experimental results show that, compared with the traditional PCA-based method, the face images after glasses removal look more natural. The results also show that the method helps improve face recognition performance.
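The abstract's interpolation-based repair step is not spelled out; the OpenCV sketch below substitutes a generic inpainting call for it, given a binary mask of the located frame pixels. The mask dilation and the use of cv2.inpaint are our assumptions, not the authors' exact method.

```python
import cv2
import numpy as np

def remove_frame(face_gray, frame_mask):
    """Fill in pixels occluded by the eyeglass frame.

    face_gray:  8-bit grayscale face image.
    frame_mask: 8-bit binary mask (255 where the eyeglass frame was located,
                e.g. by binarization + morphology as in the abstract).
    """
    # Slightly dilate the mask so frame edges are fully covered.
    mask = cv2.dilate(frame_mask, np.ones((3, 3), np.uint8), iterations=1)
    # Telea inpainting stands in here for the paper's interpolation step.
    return cv2.inpaint(face_gray, mask, 3, cv2.INPAINT_TELEA)
```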

12.
To detect faces with large pose variation and heavy occlusion in image sequences quickly and stably, a new automatic detection-tracking-detection (DTD) scheme, the MS-KCF face detection model, is proposed by combining the fast and accurate object detector MobileNet-SSD (MS) with the fast Kernelized Correlation Filter (KCF) tracker. First, the MS model detects faces quickly and accurately and updates the tracking model; next, the detected face coordinates are fed into the KCF tracker for stable tracking, which also speeds up overall detection; finally, to prevent tracking loss, the detection model is invoked again after tracking for several frames to re-detect the face. Experiments show a recall of 93.60% on the FDDB face detection benchmark, and recalls of 93.11%, 92.18%, and 82.97% on the Easy, Medium, and Hard subsets of the WIDER FACE benchmark, at an average speed of 193 frames/s. The results indicate that the MS-KCF model is both stable and fast, and effectively detects heavily occluded faces and faces with large pose changes in image sequences.
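A skeleton of such a detect-track-detect loop might look like the following; `detect_face` is a placeholder for the MobileNet-SSD detector (not implemented here), the re-detection interval is an assumed parameter, and depending on the OpenCV build the KCF tracker is created via `cv2.TrackerKCF_create` or `cv2.legacy.TrackerKCF_create`.

```python
import cv2

def detect_face(frame):
    """Placeholder for the MobileNet-SSD face detector; returns (x, y, w, h) or None."""
    raise NotImplementedError

def dtd_loop(video_path, redetect_every=30):
    cap = cv2.VideoCapture(video_path)
    tracker, frame_idx = None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if tracker is None or frame_idx % redetect_every == 0:
            box = detect_face(frame)                  # detection phase
            if box is not None:
                tracker = cv2.TrackerKCF_create()     # or cv2.legacy.TrackerKCF_create()
                tracker.init(frame, box)
        else:
            ok, box = tracker.update(frame)           # tracking phase
            if not ok:
                tracker = None                        # lost: force re-detection
        frame_idx += 1
    cap.release()
```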

13.
Detecting moving objects in video sequences is an important part of object recognition, labeling, and tracking, and background subtraction is one of the most widely used algorithms for this task. To address problems such as illumination change, noise, and local motion that degrade detection, a moving-object detection algorithm for video sequences based on background subtraction is proposed. The algorithm combines background subtraction with inter-frame differencing to judge the motion state of each pixel in the current frame, replacing or updating static and moving pixels accordingly; the maximum between-class variance (Otsu) method is applied to the difference image to extract targets, and mathematical morphology operations remove noise and redundant information from the result. Experimental results show that the algorithm detects moving objects in video sequences with good visual quality and high accuracy, and overcomes defects caused by local motion and noise.
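A condensed sketch of that pipeline with OpenCV follows; the background update rule, the AND-combination of the two masks, and the kernel size are common defaults we assume, not the paper's exact settings.

```python
import cv2
import numpy as np

def moving_object_mask(background, prev_gray, cur_gray, alpha=0.05):
    """Combine background subtraction and frame differencing, then clean up.

    background, prev_gray, cur_gray: 8-bit grayscale frames.
    Returns (foreground_mask, updated_background).
    """
    diff_bg = cv2.absdiff(cur_gray, background)          # background subtraction
    diff_fr = cv2.absdiff(cur_gray, prev_gray)           # inter-frame difference
    # Otsu automatically picks the threshold on each difference image.
    _, m_bg = cv2.threshold(diff_bg, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, m_fr = cv2.threshold(diff_fr, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    mask = cv2.bitwise_and(m_bg, m_fr)
    # Morphology removes isolated noise and fills small holes in the target.
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Update the background only where the pixel is judged static.
    bg = np.where(mask == 0,
                  (1 - alpha) * background + alpha * cur_gray,
                  background).astype(np.uint8)
    return mask, bg
```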

14.
Example-based learning for view-based human face detection
We present an example-based learning approach for locating vertical frontal views of human faces in complex scenes. The technique models the distribution of human face patterns by means of a few view-based “face” and “nonface” model clusters. At each image location, a difference feature vector is computed between the local image pattern and the distribution-based model. A trained classifier determines, based on the difference feature vector measurements, whether or not a human face exists at the current image location. We show empirically that the distance metric we adopt for computing difference feature vectors, and the “nonface” clusters we include in our distribution-based model, are both critical for the success of our system
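A loose sketch of the general idea, distances from an image window to a set of "face" and "nonface" prototypes fed to a classifier, is given below; the random prototypes, the plain Euclidean distance, and the logistic-regression classifier are simplifications we assume, not the two-part distance metric or classifier of the original system.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def difference_features(windows, face_protos, nonface_protos):
    """Distance of each flattened window to every prototype (hypothetical metric)."""
    protos = np.vstack([face_protos, nonface_protos])          # (K, D)
    # Euclidean distances, shape (N, K): one difference feature per prototype.
    return np.linalg.norm(windows[:, None, :] - protos[None, :, :], axis=2)

# Hypothetical training data: 19x19 windows flattened to 361-dim vectors.
rng = np.random.default_rng(0)
X = rng.random((200, 361)); y = rng.integers(0, 2, 200)
face_p, nonface_p = rng.random((6, 361)), rng.random((6, 361))
clf = LogisticRegression(max_iter=1000).fit(difference_features(X, face_p, nonface_p), y)
```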

15.
陈云平 (Chen Yunping), 《计算机时代》 (Computer Era), 2012, (5): 37-38, 40
Automatic face detection and facial feature localization are implemented using digital image pattern recognition techniques. The paper surveys the color models used in digital image processing, the principles of skin-color modeling, and their application in face recognition, analyzes the difficulties encountered in the face recognition process, and discusses future directions for face recognition technology.

16.
A face detection and real-time tracking algorithm for a class of video sequences
A new algorithm for fast face detection and real-time tracking is proposed, which detects and tracks faces in video sequences quickly and accurately. The algorithm distinguishes face detection in the initial and target-lost states from target tracking in the continuous state. It first predicts the center position between the eyes to obtain a predicted face location, then performs template matching on the image around that location to quickly detect the exact face position. The detected face is then used to update the face template, and subsequent frames are tracked rapidly over small displacements and small rotation angles, starting from the detected position, rotation, and scale. Experiments on a large number of videos captured in a variety of environments show that the algorithm tracks faces in video sequences quickly, with high accuracy and robustness.
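The template-matching step around the predicted location could look like this OpenCV fragment; normalized cross-correlation and the fixed search-window margin are assumptions on our part, not the paper's exact choices.

```python
import cv2

def match_face(frame_gray, template, predicted_xy, margin=40):
    """Search for the face template near a predicted position.

    Returns the top-left corner of the best match in full-frame coordinates
    and the matching score.
    """
    px, py = predicted_xy
    th, tw = template.shape[:2]
    h, w = frame_gray.shape[:2]
    # Restrict the search to a window around the predicted position.
    x0, y0 = max(px - margin, 0), max(py - margin, 0)
    x1, y1 = min(px + tw + margin, w), min(py + th + margin, h)
    roi = frame_gray[y0:y1, x0:x1]
    res = cv2.matchTemplate(roi, template, cv2.TM_CCOEFF_NORMED)
    _, score, _, loc = cv2.minMaxLoc(res)
    return (x0 + loc[0], y0 + loc[1]), score
```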

17.
This paper proposes an automatic method based on the deterministic simulated annealing (DSA) approach for solving the image change detection problem between two images, one of which is the reference image. Each pixel in the reference image is considered a node with a state value in a network of nodes; this state determines the magnitude of the change. The DSA optimization approach tries to reach the most stable network configuration by minimizing an energy function. The DSA scheme allows the mapping of inter-pixel contextual dependencies, which has been used favorably in some existing image change detection strategies. The main contribution of DSA is precisely its ability to avoid local minima during the optimization process thanks to the annealing scheme. Local minima have been observed with some optimization strategies, such as Hopfield neural networks, in images with a large amount of change (greater than 20%). DSA performs better than other optimization strategies for images with a large amount of change and obtains similar results for images where the changes are small. Hence, the DSA approach appears to be a general method for image change detection, independent of the amount of change. Its performance is compared against some recent image change detection methods.
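As a rough, generic illustration of a deterministic-annealing style update (not the authors' exact energy function or cooling schedule), each pixel's change state can be relaxed with a mean-field rule whose temperature decreases over iterations; all parameter values below are assumed for illustration.

```python
import numpy as np

def dsa_change_map(diff, beta=2.0, t0=1.0, t_min=0.05, cooling=0.9, iters=50):
    """Deterministic-annealing style relaxation of a per-pixel change state.

    diff: normalized absolute difference between the two images, in [0, 1].
    Returns a soft change map in [0, 1]; threshold at 0.5 for a binary result.
    """
    s = diff.copy()                       # initial state = data evidence
    t = t0
    for _ in range(iters):
        # Local field: data term pushes toward the observed difference,
        # smoothness term pushes toward the 4-neighbour average.
        # (np.roll wraps at the image borders; acceptable for a sketch.)
        neigh = (np.roll(s, 1, 0) + np.roll(s, -1, 0) +
                 np.roll(s, 1, 1) + np.roll(s, -1, 1)) / 4.0
        field = (diff - 0.5) + beta * (neigh - 0.5)
        s = 1.0 / (1.0 + np.exp(-field / t))   # deterministic (mean-field) update
        t = max(t * cooling, t_min)            # annealing schedule
    return s
```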

18.
We propose a framework to reconstruct the 3D pose of a human for animation from a sequence of single-view video frames. The framework starts with background estimation, and the performer's silhouette is extracted from each frame using image subtraction. The body silhouettes are then automatically labeled using a model-based approach. Finally, the 3D pose is constructed from the labeled human silhouette by assuming orthographic projection. The proposed approach does not require camera calibration. It assumes that the input video has a static background, that there are no significant perspective effects, and that the performer is in an upright position. The approach requires minimal user interaction.

19.
A new face localization and eye-state analysis algorithm suitable for real-time detection of visual fatigue while driving is proposed. Frame differencing quickly locates moving regions in the video images, and skin-color segmentation in the YCbCr color space is then applied to locate the face. Gray-level integral projection on the face region, combined with the Hough transform, detects the eyelids. Analysis of the detected eyelids yields the open/closed state of the eyes, and together with blink analysis an EOD value is obtained to judge whether the driver is fatigued. Experimental results show that the method locates faces quickly against complex backgrounds and measures the EOD value when the eyes are open, meeting the real-time requirements of visual fatigue detection.
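Gray-level integral projection, as used here for eyelid localization, simply sums intensities along rows and columns; a minimal NumPy version is shown below, where the valley-picking heuristic is an assumption for illustration.

```python
import numpy as np

def integral_projections(face_gray):
    """Horizontal and vertical gray-level integral projections of a face region."""
    face = face_gray.astype(np.float64)
    horizontal = face.sum(axis=1)   # one value per row: dips near eyes/eyelids
    vertical = face.sum(axis=0)     # one value per column
    return horizontal, vertical

def candidate_eye_rows(face_gray, top_fraction=0.6):
    """Rows in the upper part of the face whose projection is unusually dark."""
    horizontal, _ = integral_projections(face_gray)
    upper = horizontal[: int(len(horizontal) * top_fraction)]
    return np.where(upper < upper.mean() - upper.std())[0]
```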

20.
This paper proposes an algorithm for face detection in video sequences. The algorithm first uses edge detection and contour extraction to filter out a large number of non-face windows, and then applies a Haar-feature-based detector to the filtered results for a second-stage detection. Experimental results show that the system can detect faces in real time and can be applied to video surveillance.
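The Haar-feature stage can be reproduced with OpenCV's stock frontal-face cascade applied only to the windows that survive the edge/contour prefilter; the prefilter itself is omitted here, and the cascade parameters below are common defaults rather than the paper's settings.

```python
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def haar_verify(frame_gray, candidate_boxes):
    """Re-check candidate windows from the edge/contour prefilter with a Haar cascade."""
    faces = []
    for (x, y, w, h) in candidate_boxes:
        roi = frame_gray[y:y + h, x:x + w]
        hits = cascade.detectMultiScale(roi, scaleFactor=1.1, minNeighbors=4)
        for (fx, fy, fw, fh) in hits:
            faces.append((x + fx, y + fy, fw, fh))
    return faces
```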
