Similar literature
 Found 20 similar articles (search time: 0 ms)
1.
In many scenarios a dynamic scene is filmed by multiple video cameras located at different viewing positions. Visualizing such multi-view data on a single display raises an immediate question: which cameras capture better views of the scene? Typically (e.g., in TV broadcasts), a human producer manually selects the best view. In this paper we wish to automate this process by evaluating the quality of the view captured by each camera. We regard human actions as three-dimensional shapes induced by their silhouettes in the space-time volume. The quality of a view is then evaluated based on features of the space-time shape, which correspond to limb visibility. Building on these features, two view quality approaches are proposed. One is generic, while the other can be trained to fit any preferred action recognition method. Our experiments show that the proposed view selection provides intuitive results which match common conventions. We further show that it improves action recognition results.

2.
By overlaying timeline-synchronized user comments on videos, Danmaku commenting creates a unique co-viewing experience of online videos. This study aims to understand the reasons for watching or not watching Danmaku videos. From a review of the literature and a pilot study, an initial pool of motivations for and hindrances to Danmaku video viewing was gathered. A survey of 248 participants was then conducted to identify the underlying factor structures of these motivations and hindrances. Their influences on users’ attitudes and behaviors with Danmaku videos were also examined. The results showed that people viewed Danmaku videos to obtain information, entertainment, and social connectedness. Introverted young men with high openness to new experience are more likely to view Danmaku videos. Infrequent viewers refused to watch Danmaku videos mainly because of the visual clutter that resulted from Danmaku comments.

3.
This paper discusses the role of geometry in achieving automation of the overall finite element analysis process. Emphasis is placed on the geometry requirements for two of the key technologies within this process: fully automatic mesh generation and adaptive analysis. A geometric framework that permits the implementation of automated finite element procedures is presented. This includes high-level geometry-based problem specification and control, powerful data structures, and the geometric functionality that is necessary to support automation. An open architecture system, called TAGUS, which incorporates these notions and permits manipulation of geometry, topology, and attribute data from within an applications program, is also presented. In addition, the paper contrasts the geometry requirements of problems with static domains with the special considerations that must be given to dynamically changing domains. Finally, a view of an integrated system architecture for analysis automation is presented.

4.
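A single-view-based multi-pose face recognition algorithm   Total citations: 14, self-citations: 0, citations by others: 14
Multi-view-based multi-pose face recognition methods have the drawback that multiple views of each face must be captured in advance. To address this, a single-view-based multi-pose face recognition technique is proposed. First, multi-pose face images are generated from the single view by warping, using least-squares fitting of bivariate higher-order polynomial functions. Multi-pose face recognition is then performed on the single view together with the generated multi-pose images. Experimental results show that the recognition accuracy of the proposed algorithm is far higher than that of classical algorithms.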

5.
International Journal of Computer Vision - Recent methods for video action recognition have reached outstanding performance on existing benchmarks. However, they tend to leverage context such as...

6.
View Invariance for Human Action Recognition   Total citations: 4, self-citations: 0, citations by others: 4
This paper presents an approach for viewpoint invariant human action recognition, an area that has received scant attention so far, relative to the overall body of work in human action recognition. It has been established previously that there exist no invariants for 3D to 2D projection. However, there exists a wealth of techniques in 2D invariance that can be used to advantage in 3D to 2D projection. We exploit these techniques and model actions in terms of view-invariant canonical body poses and trajectories in 2D invariance space, leading to a simple and effective way to represent and recognize human actions from a general viewpoint. We first evaluate the approach theoretically and show why a straightforward application of the 2D invariance idea will not work. We describe strategies designed to overcome inherent problems in the straightforward approach and outline the recognition algorithm. We then present results on 2D projections of publicly available human motion capture data as well as on manually segmented real image sequences. In addition to robustness to viewpoint change, the approach is robust enough to handle different people, minor variabilities in a given action, and the speed of action (and hence, frame-rate) while encoding sufficient distinction among actions. This work was done when the author was a graduate student in the Department of Computer Science and was partially supported by the NSF Grant ECS-02-5475. The author is currently with Siemens Corporate Research, Princeton, NJ. Dr. Chellappa is with the Department of Electrical and Computer Engineering.

7.
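Localizing an object in a single image requires estimating the object's position and pose in 3D space. In a static scene this is equivalent to camera calibration, and it has broad practical value. For the problem of localizing irregular objects that have a 3D model but lack texture information, a localization method based on contour matching is proposed. First, the object's contour is extracted from the input image using an image-segmentation-based method. The image contour can then be matched against the contour of the 3D model rendered under given position and pose parameters, so that the matching error can be expressed as a function of the position and pose parameters. Because this function cannot be expressed or solved analytically, its derivatives and objective values must be computed by discrete sampling. The optimal position and pose parameters can then be obtained with the Levenberg-Marquardt (LM) method. Experimental results show that the method converges quickly and is highly accurate and robust.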
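
The optimization step described in this abstract (an error function with no analytic form, derivatives obtained by discrete sampling, minimization by Levenberg-Marquardt) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two-parameter exponential model stands in for the contour-matching error, and the function names, step size `eps`, and damping schedule are hypothetical rather than taken from the paper.

```python
import math

def residuals(params, xs, ys):
    """Toy residuals for y = a * exp(b * x); a stand-in for a contour-matching error."""
    a, b = params
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]

def numeric_jacobian(f, params, eps=1e-6):
    """Forward-difference Jacobian: derivatives come from sampled function values only."""
    base = f(params)
    cols = []
    for j in range(len(params)):
        shifted = list(params)
        shifted[j] += eps
        cols.append([(s - b0) / eps for s, b0 in zip(f(shifted), base)])
    # Transpose so that each row corresponds to one residual.
    return [list(row) for row in zip(*cols)], base

def levenberg_marquardt(f, params, iters=100, lam=1e-3):
    """Two-parameter LM with a simple accept/reject damping schedule (illustrative)."""
    params = list(params)
    cost = sum(r * r for r in f(params))
    for _ in range(iters):
        if cost < 1e-20 or lam > 1e12:
            break  # converged, or damping has grown so large that steps are negligible
        J, r = numeric_jacobian(f, params)
        a11 = sum(row[0] * row[0] for row in J)
        a12 = sum(row[0] * row[1] for row in J)
        a22 = sum(row[1] * row[1] for row in J)
        g1 = sum(row[0] * ri for row, ri in zip(J, r))
        g2 = sum(row[1] * ri for row, ri in zip(J, r))
        # Solve (J^T J + lam * diag(J^T J)) * delta = -J^T r for the 2x2 case.
        m11, m22 = a11 * (1 + lam), a22 * (1 + lam)
        det = m11 * m22 - a12 * a12
        if abs(det) < 1e-300:
            break
        d1 = (-g1 * m22 + g2 * a12) / det
        d2 = (g1 * a12 - g2 * m11) / det
        trial = [params[0] + d1, params[1] + d2]
        trial_cost = sum(t * t for t in f(trial))
        if trial_cost < cost:
            params, cost = trial, trial_cost
            lam = max(lam / 10, 1e-12)  # good step: trust the Gauss-Newton direction more
        else:
            lam *= 10  # bad step: fall back toward gradient descent
    return params, cost

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
fit, final_cost = levenberg_marquardt(lambda p: residuals(p, xs, ys), [1.5, 0.3])
```

In a real pose-estimation setting the parameter vector would hold the position and pose parameters and each residual evaluation would require rendering the 3D model, so the number of sampled evaluations per Jacobian would dominate the running time.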

8.
We consider the least-squares (L2) minimization problems in multiple view geometry for triangulation, homography, camera resectioning and structure-and-motion with known rotation, or known plane. Although optimal algorithms have been given for these problems under an L-infinity cost function, finding optimal least-squares solutions to these problems is difficult, since the cost functions are not convex, and in the worst case may have multiple minima. Iterative methods can be used to find a good solution, but this may be a local minimum. This paper provides a method for verifying whether a local-minimum solution is globally optimal, by means of a simple and rapid test involving the Hessian of the cost function. The basic idea is that by showing that the cost function is convex in a restricted but large enough neighbourhood, a sufficient condition for global optimality is obtained. The method is tested on numerous problem instances of real data sets. In the vast majority of cases we are able to verify that the solutions are optimal, in particular, for small to medium-scale problems.
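
To make the role of the Hessian concrete, here is a minimal sketch that estimates the Hessian of a two-variable cost by central differences and checks positive definiteness on a grid of sample points. This is an illustrative assumption, not the paper's test: sampling a grid cannot prove convexity over a region, whereas the abstract describes a sufficient condition that does. The function names, step size `h`, and grid resolution are hypothetical.

```python
def hessian_2d(f, x, y, h=1e-4):
    """Central-difference estimate of the Hessian entries (fxx, fxy, fyy) at (x, y)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return fxx, fxy, fyy

def is_positive_definite(fxx, fxy, fyy):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return fxx > 0 and fxx * fyy - fxy * fxy > 0

def convex_on_samples(f, cx, cy, radius, n=10):
    """Check positive definiteness on an (n+1) x (n+1) grid around (cx, cy).

    Passing this check is only evidence of local convexity, not a proof.
    """
    for i in range(n + 1):
        for j in range(n + 1):
            x = cx - radius + 2 * radius * i / n
            y = cy - radius + 2 * radius * j / n
            if not is_positive_definite(*hessian_2d(f, x, y)):
                return False
    return True

def bowl(x, y):
    return x * x + 3 * y * y + x * y  # Hessian [[2, 1], [1, 6]] everywhere

def double_well(x, y):
    return x ** 4 - x * x + y * y  # fxx = 12x^2 - 2 is negative near x = 0

bowl_ok = convex_on_samples(bowl, 0.0, 0.0, 1.0)
well_ok = convex_on_samples(double_well, 0.0, 0.0, 1.0)
```

The double-well cost shows why a local minimum alone is not enough: each of its two minima has a positive definite Hessian, yet the function is not convex on a neighbourhood that contains both.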

9.
In this paper, we provide a principled explanation of how knowledge of global 3-D structural invariants, typically captured by a group action on a symmetric structure, can dramatically facilitate the task of reconstructing a 3-D scene from one or more images. More importantly, since every symmetric structure admits a canonical coordinate frame with respect to which the group action can be naturally represented, the canonical pose between the viewer and this canonical frame can be recovered too, which explains why symmetric objects (e.g., buildings) provide us with overwhelming clues to their orientation and position. We give the necessary and sufficient conditions in terms of the symmetry (group) admitted by a structure under which this pose can be uniquely determined. We also characterize, when such conditions are not satisfied, to what extent this pose can be recovered. We show how algorithms from conventional multiple-view geometry, after being properly modified and extended, can be directly applied to perform such recovery, from all hidden images of one image of the symmetric structure. We also apply our results to a wide range of applications in computer vision and image processing such as camera self-calibration, image segmentation and global orientation, large baseline feature matching, image rendering and photo editing, as well as visual illusions (caused by symmetry if incorrectly assumed).

10.
We propose a method for converting a single image of a transparent object into a multi-view photo that enables users to observe the object from multiple new angles, without inputting any 3D shape. The complex light paths formed by refraction and reflection make it challenging to compute the lighting effects of transparent objects from a new angle. We construct an encoder–decoder network for normal reconstruction and texture extraction, which enables synthesizing novel views of a transparent object from a set of new views and new environment maps using only one RGB image. By simultaneously considering the optical transmission and perspective variation, our network learns the characteristics of optical transmission and the change of perspective as guidance for the conversion from RGB colours to surface normals. A texture extraction subnetwork is proposed to alleviate the contour loss phenomenon during normal map generation. We test our method using 3D objects inside and outside our training data, including real 3D objects that exist in our lab, and completely new environment maps that we take using our phones. The results show that our method performs better on view synthesis of transparent objects in complex scenes using only a single-view image.

11.
Periodicity has been recognized as an important cue for tasks like activity recognition and gait analysis. However, most existing techniques analyze periodic motions only in image coordinates, making them very dependent on the viewing angle. In this paper we show that it is possible to reconstruct a periodic trajectory in 3D given only its appearance in image coordinates from a single camera view. We draw a strong analogy between this problem and that of reconstructing an object from multiple views, which allows us to rely on well-known theoretical results from the multi-view geometry domain and obtain significant guarantees regarding the solvability of the estimation problem. We present two different formulations of the problem, along with techniques for performing the reconstruction in both cases, and an algorithm for estimating the period of motion from its image-coordinate trajectory. Experimental results demonstrate the feasibility of the proposed techniques.

12.
Guo Chuan, Zuo Xinxin, Wang Sen, Liu Xinshuang, Zou Shihao, Gong Minglun, Cheng Li. International Journal of Computer Vision, 2022, 130(2): 285-315

We aim to tackle the interesting yet challenging problem of generating videos of diverse and natural human motions from prescribed action categories. The key issue lies in the ability to synthesize multiple distinct motion sequences that are realistic in their visual appearances. This is achieved in this paper by a two-step process that maintains internal 3D pose and shape representations: action2motion and motion2video. Action2motion stochastically generates plausible 3D pose sequences of a prescribed action category, which are processed and rendered by motion2video to form 2D videos. Specifically, Lie algebraic theory is used to represent natural human motions following the physical law of human kinematics; a temporal variational auto-encoder is developed that encourages diversity of output motions. Moreover, given an additional input image of a clothed human character, an entire pipeline is proposed to extract his/her detailed 3D shape, and to render in videos the plausible motions from different views. This is realized by improving existing methods to extract 3D human shapes and textures from single 2D images, followed by rigging, animating, and rendering to form 2D videos of human motions. It also necessitates the curation and reannotation of 3D human motion datasets for training purposes. Thorough empirical experiments, including an ablation study and qualitative and quantitative evaluations, demonstrate the applicability of our approach and its competitiveness in addressing related tasks, where components of our approach compare favorably with the state of the art.

13.
Human weight estimation is useful in a variety of potential applications, e.g., targeted advertisement, entertainment scenarios and forensic science. However, estimating weight only from color cues is particularly challenging since these cues are quite sensitive to lighting and imaging conditions. In this article, we propose a novel weight estimator based on a single RGB-D image, which utilizes the visual color cues and depth information. Our main contributions are three-fold. First, we construct the W8-RGBD dataset including RGB-D images of different people with ground truth weight. Second, the novel sideview shape feature and the feature fusion model are proposed to facilitate weight estimation. Additionally, we consider gender as another important factor for human weight estimation. Third, we conduct comprehensive experiments using various regression models and feature fusion models on the new weight dataset, and encouraging results are obtained based on the proposed features and models.

14.
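Research on geometric modeling and motion control methods for virtual humans   Total citations: 1, self-citations: 0, citations by others: 1
Several commonly used methods for building geometric models of virtual humans are described, and their advantages and disadvantages are compared and analyzed. Control methods for several typical motions, such as walking and running, are also surveyed and compared.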

15.
3-D Head Model Retrieval Using a Single Face View Query   Total citations: 1, self-citations: 0, citations by others: 1
In this paper, a novel 3D head model retrieval approach is proposed, in which only a single 2D face view query is required. The proposed approach will be important for multimedia application areas such as virtual world construction and game design, in which 3D virtual characters with a given set of facial features can be rapidly constructed based on 2D view queries, instead of having to generate each model anew. To achieve this objective, we construct an adaptive mapping through which each 2D view feature vector is associated with its corresponding 3D model feature vector. Given this estimated 3D model feature vector, similarity matching can then be performed in the 3D model feature space. To avoid the explicit specification of the complex relationship between the 2D and 3D feature spaces, a neural network approach is adopted in which the required mapping is implicitly specified through a set of training examples. In addition, for efficient feature representation, principal component analysis (PCA) is adopted to achieve dimensionality reduction for facilitating both the mapping construction and the similarity matching process. Since the linear nature of the original PCA formulation may not be adequate to capture the complex characteristics of 3D models, we also consider the adoption of its nonlinear counterpart, i.e., the so-called kernel PCA approach, in this work. Experimental results show that the proposed approach is capable of successfully retrieving the set of 3D models which are similar in appearance to a given 2D face view.

16.
Robotic manipulation of objects in clutter remains a challenging problem to date. The challenge is posed by various levels of complexity involved in interaction among objects. Understanding these semantic interactions among different objects is important for manipulation in complex settings. It can play a significant role in extending the scope of manipulation to cluttered environments involving generic objects, and both direct and indirect physical contact. In our work, we aim at learning semantic interaction among objects of generic shapes and sizes lying in clutter involving physical contact. We infer three types of support relationships: “support from below”, “support from side”, and “containment”. Subsequently, the learned semantic interaction or support relationship is used to derive a sequence or order in which the objects surrounding the object of interest should be removed without causing damage to the environment. The generated sequence is called the support order. We also extend understanding of semantic interaction from a single view to multiple views and predict support order in multiple views. Using multiple views addresses cases that a single view cannot handle, such as occlusion or missing support relationships. We have created two RGBD datasets for our experiments on support order prediction in single view and multiple views respectively. The datasets contain RGB images, point clouds and depth maps of various everyday objects present in clutter with physical contact and overlap. We captured many different cluttered settings involving different kinds of object-object interaction, successfully learned support relationships, and performed support order prediction in these settings.

17.
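Human segmentation and tracking for crowded scenes in video surveillance   Total citations: 1, self-citations: 0, citations by others: 1
In video surveillance, mutual occlusion within crowds makes human segmentation and tracking very difficult. To address this problem, a human segmentation method that combines a human body model with human edge curves is proposed. Because segmentation may leave human feature values substantially incomplete or distorted, a back-propagation (BP) neural network, which is highly robust, is adopted as the tracking model. To improve the BP network's capacity for autonomous learning, a hierarchical Dirichlet process is used to determine whether a new class of human feature data has appeared, which in turn guides the BP network's learning. Simulation experiments confirm that the proposed occlusion handling method effectively resolves partial occlusion of human bodies and, compared with other methods, is simple and offers good real-time performance. In addition, combining the hierarchical Dirichlet process with the BP network improves the tracking system's capacity for autonomous learning.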

18.
The future Internet is expected to connect billions of people, things and services, with the potential to deliver a new set of applications by deriving new insights from the data generated by these diverse data sources. This highly interconnected global network brings new types of challenges in analysing and making sense of data. This is why machine learning is expected to be a crucial technology in the future, in making sense of data, in improving business and decision making, and in doing so, providing the potential to solve a wide range of problems in health care, telecommunications, urban computing, and others. Machine learning algorithms can learn how to perform certain tasks by generalizing from examples. This is a totally different paradigm from traditional programming approaches, which are based on writing programs that process data to produce an output. However, choosing a suitable machine learning algorithm for a particular application requires a substantial amount of time and effort that is hard to undertake even with excellent research papers and textbooks. In order to reduce the time and effort, this paper introduces the TCDC (train, compare, decide, and change) approach, which can be thought of as a ‘Machine Learning as a Service’ approach, to help machine learning researchers and practitioners choose the optimum machine learning model for achieving the best trade-off between accuracy and interpretability, computational complexity, and ease of implementation. The paper includes the results of testing and evaluating the recommenders based on the TCDC approach (in comparison with the traditional default approach) applied to 12 open-source datasets drawn from diverse domains including health care, agriculture, aerodynamics and others. Our results indicate that the proposed approach selects the best model in terms of predictive accuracy in 62.5% of the regression tests performed and 75% of the classification tests.

19.
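A new use of the single-chip microcomputer as a PLC   Total citations: 1, self-citations: 0, citations by others: 1
A new method of using a single-chip microcomputer as a programmable logic controller (PLC) is briefly described, and the hardware architecture and software design approach of the microcomputer-based PLC are introduced.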

20.
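Research on a visible-hand reconstruction algorithm based on a monocular system   Total citations: 1, self-citations: 0, citations by others: 1
First, a system architecture consisting of a single camera and a single planar mirror is established. The basic theory and methods for 3D reconstruction under this architecture are then studied, and four key problems are examined in detail: (1) extraction of the hand's edges; (2) acquisition of correspondences; (3) the basic method of 3D reconstruction; and (4) the calibration algorithm. By revealing the relationship among the projection of a spatial object point onto the image plane, the projection of that point's mirror-symmetric point onto the same image plane, the mirror surface, and the object point itself, a new 3D reconstruction method is obtained. The method is convenient for both theoretical analysis and program design, and it keeps the calibration procedure simple while preserving the accuracy of the 3D reconstruction.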
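
The geometric relationship this abstract relies on, between an object point, its mirror-symmetric point, and the mirror plane, can be sketched as a reflection across a plane. This is a minimal illustration under assumed conventions, not the paper's algorithm: the plane is written as n·x + d = 0, and the function name `reflect_point` is hypothetical.

```python
def reflect_point(p, n, d):
    """Reflect 3D point p across the plane n . x + d = 0 (n need not be unit length)."""
    n_sq = sum(c * c for c in n)
    if n_sq == 0:
        raise ValueError("plane normal must be non-zero")
    # Signed distance from p to the plane, scaled by 1 / |n|.
    dist = (sum(pc * nc for pc, nc in zip(p, n)) + d) / n_sq
    return tuple(pc - 2 * dist * nc for pc, nc in zip(p, n))

# A point in front of a mirror lying in the plane z = 1 (n = (0, 0, 1), d = -1).
point = (0.0, 0.0, 3.0)
mirrored = reflect_point(point, (0.0, 0.0, 1.0), -1.0)
round_trip = reflect_point(mirrored, (0.0, 0.0, 1.0), -1.0)
```

Because the camera sees both the point and its reflection, the mirror effectively supplies a second, virtual viewpoint. Once the mirror plane is calibrated, the pair of image projections can be treated much like a stereo correspondence.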
