Similar literature
 20 similar articles found
1.
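To address the misclassification problem common in random-forest-based 3D human pose estimation, an improved algorithm based on adaptive fused feature extraction and a misclassification handling mechanism is proposed. The algorithm uses an adaptive fused feature extraction method to extract depth fusion features adaptively; these features express both image distance information and body part size information, which strengthens their representational power. To handle misclassified body parts, a misclassification handling mechanism is proposed, based on how misclassified points cluster within a recognized part and on an iterative integration scheme; it improves the part recognition results. Finally, an improved principal direction analysis (PDA) algorithm that can further process misclassified points is proposed; it adaptively computes the principal direction vector of each part to estimate the 3D human pose. The results show that the algorithm effectively removes misclassified part points and significantly improves 3D human pose estimation.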

2.
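Su Le, Chai Jinxiang, Xia Shihong. Journal of Software (软件学报), 2016, 27(S2): 172-183
A method based on local pose priors is proposed for capturing 3D human motion from depth images online and in real time. The key idea is to automatically extract semantically meaningful virtual sparse 3D markers from the captured depth image, quickly retrieve the K nearest pose neighbors from a pre-built heterogeneous 3D human pose database to construct a local pose prior model, and solve for the maximum a posteriori estimate by iterative optimization, reconstructing the 3D human pose sequence online in real time. Experimental results show that the method tracks and reconstructs stable, accurate 3D human motion pose sequences in real time and, after only an automatic calibration of subject-specific body parameters, can track performers whose body sizes differ considerably; the frame rate is about 25 fps. The method can therefore be applied to 3D game and film production, human-computer interaction control, and similar fields.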

3.
We present a novel approach to tracking a full human body mesh with a single depth camera, e.g. Microsoft Kinect, using a template body model. The proposed observation-oriented tracking mainly aims to fit the body mesh silhouette to the 2D user boundary in the video stream by deforming the body. It is fast enough to be integrated into real-time or interactive applications, which is impossible with traditional approaches based on iterative optimization. Our method consists of two main stages: user-specific body shape estimation and on-line body tracking. We first develop a novel method to fit a 3D morphable human model to the actual body shape of the user in front of the depth camera. To this end, we design a strategy that makes use of two constraints: point clouds from depth images, and the correspondence between the foreground user mask contour and the boundary of the projected body model. On-line tracking then proceeds in successive steps. At each frame, the joint angles of the template skeleton are optimized towards the captured Kinect skeleton. Then, the aforementioned contour correspondence is used to adjust the projected body model vertices towards the contour points of the foreground user mask, using a Laplacian deformation technique. Experimental results show that our method achieves fast, high-quality tracking. We also show that the proposed method benefits three applications: virtual try-on, full human body scanning, and applications in manufacturing systems.

4.
This paper presents a novel method for reconstructing a 3D human body pose from stereo image sequences based on a top-down learning method. Because it is inefficient to build a statistical model using all training data, the training data is hierarchically divided into several clusters to reduce the complexity of the learning problem. In the learning stage, the human body model database is hierarchically constructed by classifying the training data into several sub-clusters with silhouette images. The data of each cluster at the bottom level is represented by a linear combination of examples. In the reconstruction stage, the proposed method hierarchically searches for the cluster with the best matching silhouette image using a silhouette history image (SHI). Then, the 3D human body pose is reconstructed from a depth image using a linear combination of examples method. Because depth information is used to reconstruct the 3D human body pose, poses that look similar in silhouette images can be estimated as different 3D human body poses. The experimental results demonstrate that the proposed method is efficient and effective for reconstructing 3D human body poses.

5.
Networked 3D virtual environments allow multiple users to interact over the Internet by means of avatars and to get some feeling of virtual telepresence. However, avatar control may be tedious. Motion capture systems based on 3D sensors have reached the consumer market, but webcams remain more widespread and cheaper. This work aims at animating a user's avatar by real-time motion capture using a personal computer and a plain webcam. In a classical model-based approach, we register a 3D articulated upper-body model onto video sequences and propose a number of heuristics to accelerate particle filtering while robustly tracking user motion. Describing the body pose using 3D wrist positions rather than joint angles allows efficient handling of depth ambiguities for probabilistic tracking. We demonstrate experimentally the robustness of our 3D body tracking by real-time monocular vision, even in the case of partial occlusions and motion in the depth direction.

6.
In this paper, we present a novel approach for recovering a 3-D pose from a single human body depth silhouette using nonrigid point set registration and body part tracking. In our method, a human body depth silhouette is represented as a set of 3-D points and matched to another set of 3-D points using point correspondences. To recognize and maintain body part labels, we initialize the first set of points to the corresponding human body parts, resulting in a body part-labeled map. Then, we transform the points to a sequential set of points based on point correspondences determined by nonrigid point set registration. After point registration, we use the information from tracked body part labels and registered points to create a human skeleton model. A 3-D human pose is recovered by mapping joint information from the skeleton model to a 3-D synthetic human model. Quantitative and qualitative evaluation results on synthetic and real data show that complex human poses can be recovered more reliably and with lower errors than with other conventional techniques for 3-D pose recovery.

7.
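Estimating 3D human pose with a depth sensor is an important problem in computer vision, with valuable applications in human-computer interaction, virtual reality, animation design, and other fields. The mainstream methods for this problem are bottom-up; they typically use classification, regression, or retrieval techniques to estimate 3D limb pose directly from depth data and have been widely used in human-computer interaction. However, such methods depend on large-scale pose databases, and their results are not precise enough. This paper proposes a 3D pose estimation method that combines personalized human body modeling with depth data: a 3D virtual human model is first built for the moving subject; the point correspondences between this personalized model and the depth data are then used to construct an objective function for pose optimization; and iteratively optimizing the objective function yields a 3D pose consistent with the depth data. Unlike traditional methods, the proposed method requires no pose database. Experiments show that its results are more accurate.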

8.
Simultaneous tracking and action recognition for single actor human actions (total citations: 1; self-citations: 0; citations by others: 1)
This paper presents an approach to simultaneously tracking the pose and recognizing human actions in a video. This is achieved by combining a Dynamic Bayesian Action Network (DBAN) with 2D body part models. The existing DBAN implementation relies on fairly weak observation features, which affects the recognition accuracy. In this work, we use a 2D body part model for accurate pose alignment, which in turn improves both pose estimation and action recognition accuracy. To compensate for the additional time required for alignment, we use an action entropy-based scheme to determine the minimum number of states to be maintained in each frame while avoiding sample impoverishment. In addition, we present an approach to automating the keypose selection task for learning 3D action models from a few annotations. We demonstrate our approach on a hand gesture dataset with 500 action sequences, and we show that our algorithm achieves a 6% improvement in accuracy over DBAN.

9.
3D human pose estimation in motion is an active research direction in computer vision. However, algorithm performance is affected by the complexity of 3D spatial information, self-occlusion of the human body, mapping uncertainty, and other problems. In this paper, we propose a 3D human joint localization method based on a multi-stage regression deep network and a 2D-to-3D point mapping algorithm. First, taking a single RGB image as input, we introduce heatmaps and multi-stage regression to progressively refine the coordinates of the human joint points. We then feed the 2D joint points into the mapping network to obtain the coordinates of the 3D human body joint points, completing the 3D human pose estimation task. The MPJPE of the algorithm on the Human3.6M dataset is 40.7. Evaluation on the dataset shows that our method has clear advantages.
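
The MPJPE figure quoted above is the mean per-joint position error: the Euclidean distance between each predicted 3D joint and its ground-truth position, averaged over joints. The sketch below is only an illustration of that metric, not the authors' evaluation code; the function name `mpjpe` and the toy coordinates are assumptions, and the abstract does not state the unit of the reported value.

```python
import math

def mpjpe(predicted, ground_truth):
    """Mean per-joint position error between two lists of (x, y, z) joints."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("joint lists must be non-empty and of equal length")
    # Sum the Euclidean distance of every predicted joint to its ground truth.
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)

# Toy joints, made up for illustration (not data from the paper).
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error = mpjpe(pred, gt)  # joint errors are 0 and 5, so the mean is 2.5
```

Benchmark protocols often align the root joint or apply a rigid alignment before computing this error; that step is omitted here.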

10.
In this paper, we present a method for human full-body pose estimation from depth data that can be obtained using Time of Flight (ToF) cameras or the Kinect device. Our approach consists of robustly detecting anatomical landmarks in the 3D data and fitting a skeleton body model using constrained inverse kinematics. Instead of relying on appearance-based features for interest point detection, which can vary strongly with illumination and pose changes, we build upon a graph-based representation of the depth data that allows us to measure geodesic distances between body parts. As these distances do not change with body movement, we are able to localize anatomical landmarks independently of pose. To differentiate body parts that occlude each other, we employ motion information obtained from the optical flow between subsequent intensity images. We provide a qualitative and quantitative evaluation of our pose tracking method on ToF and Kinect sequences containing movements of varying complexity.
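
To illustrate why geodesic distances on a graph of depth points behave differently from straight-line distances, here is a minimal sketch. It assumes a simple neighborhood graph (points closer than a hypothetical `max_edge` threshold are connected) and uses Dijkstra's algorithm; the abstract does not specify how the authors build their graph, so the construction, the function name, and the toy points are all assumptions.

```python
import heapq
import math

def geodesic_distances(points, source, max_edge):
    """Shortest-path distances from points[source] over a neighborhood graph.

    Two points are joined by an edge when their Euclidean distance is at
    most max_edge (brute-force neighbor search, fine for a small sketch).
    """
    dist = [math.inf] * len(points)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, q in enumerate(points):
            if v == u:
                continue
            w = math.dist(points[u], q)
            if w <= max_edge and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# Toy "bent limb": an L-shaped chain of surface points (made-up data).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
          (2.0, 1.0, 0.0), (2.0, 2.0, 0.0)]
distances = geodesic_distances(points, source=0, max_edge=1.1)
```

In this toy chain the geodesic distance from the first to the last point follows the bend (length 4), while the straight-line distance is about 2.83; the path length along the surface stays the same when the chain is bent differently, which is the property the abstract relies on.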

11.
Detecting objects, estimating their pose, and recovering their 3D shape are critical problems in many vision and robotics applications. This paper addresses these needs using a two-stage approach. In the first stage, we propose a new method called Depth-Encoded Hough Voting (DEHV). DEHV jointly detects objects, infers their categories, estimates their pose, and infers/decodes the objects' depth maps from either a single image (when no depth maps are available in testing) or a single image augmented with a depth map (when one is available in testing). Inspired by the Hough voting scheme introduced in [1], DEHV incorporates depth information into the process of learning distributions of image features (patches) representing an object category. DEHV takes advantage of the interplay between the scale of each object patch in the image and its distance (depth) from the corresponding physical patch attached to the 3D object. Once the depth map is given, a full reconstruction is achieved in a second (3D modelling) stage, where modified or state-of-the-art 3D shape and texture completion techniques are used to recover the complete 3D model. Extensive quantitative and qualitative experimental analysis on existing datasets [2], [3], [4] and a newly proposed 3D table-top object category dataset shows that our DEHV scheme obtains competitive detection and pose estimation results. Finally, the quality of 3D modelling in terms of both shape completion and texture completion is evaluated on a 3D modelling dataset containing both indoor and outdoor object categories. We demonstrate that our overall algorithm can obtain convincing 3D shape reconstruction from a single uncalibrated image.

12.
In this work, we consider the problem of estimating the 3D position of multiple humans in a scene as well as their body shape and articulation from a single RGB video recorded with a static camera. In contrast to expensive marker-based or multi-view systems, our lightweight setup is ideal for private users as it enables affordable 3D motion capture that is easy to install and does not require expert knowledge. To deal with this challenging setting, we leverage recent advances in computer vision using large-scale pre-trained models for a variety of modalities, including 2D body joints, joint angles, normalized disparity maps, and human segmentation masks. Thus, we introduce the first non-linear optimization-based approach that jointly solves for the 3D position of each human, their articulated pose, their individual shapes, and the scale of the scene. In particular, we estimate the scene depth and person scale from normalized disparity predictions using the 2D body joints and joint angles. Given the per-frame scene depth, we reconstruct a point cloud of the static scene in 3D space. Finally, given the per-frame 3D estimates of the humans and the scene point cloud, we perform a space-time coherent optimization over the video to ensure temporal, spatial, and physical plausibility. We evaluate our method on established multi-person 3D human pose benchmarks, where we consistently outperform previous methods, and we qualitatively demonstrate that our method is robust to in-the-wild conditions, including challenging scenes with people of different sizes. Code: https://github.com/dluvizon/scene-aware-3d-multi-human

13.
Scanning 3D full human bodies using Kinects (total citations: 3; self-citations: 0; citations by others: 3)
Depth cameras such as the Microsoft Kinect are much cheaper than conventional 3D scanning devices and can thus be acquired easily by everyday users. However, the depth data captured by a Kinect beyond a certain distance is of extremely low quality. In this paper, we present a novel scanning system for capturing 3D full human body models using multiple Kinects. To avoid interference, we use two Kinects to capture the upper and lower parts of a human body respectively, without an overlapping region. A third Kinect captures the middle part of the human body from the opposite direction. We propose a practical approach for registering the various body parts from different views under non-rigid deformation. First, a rough mesh template is constructed and used to deform successive frames pairwise. Second, global alignment is performed to distribute errors in the deformation space, which solves the loop closure problem efficiently. Misalignment caused by complex occlusion can also be handled reasonably by our global alignment algorithm. The experimental results show the efficiency and applicability of our system. Our system obtains impressive results in a few minutes with low-priced devices and is thus practically useful for generating personalized avatars for everyday users. Our system has been used for 3D human animation and virtual try-on, and can further facilitate a range of home-oriented virtual reality (VR) applications.

14.
We propose a framework to reconstruct the 3D pose of a human for animation from a sequence of single-view video frames. The framework starts with background estimation, and the performer's silhouette is extracted from each frame using image subtraction. Then the body silhouettes are automatically labeled using a model-based approach. Finally, the 3D pose is constructed from the labeled human silhouette by assuming orthographic projection. The proposed approach does not require camera calibration. It assumes that the input video has a static background and no significant perspective effects, and that the performer is in an upright position. The proposed approach requires minimal user interaction.

15.
Recent studies have demonstrated that high-level semantics in data can be captured using sparse representation. In this paper, we propose an approach to human body pose estimation in static images based on sparse representation. Given a visual input, the objective is to estimate the 3D human body pose using feature space information and geometrical information of the pose space. On the assumption that each data point and its neighbors are likely to reside on a locally linear patch of the underlying manifold, our method learns the sparse representation of the new input using both feature and pose space information and then estimates the corresponding 3D pose by a linear combination of the bases of the pose dictionary. Two strategies for dictionary construction are presented: (i) constructing the dictionary by randomly selecting the frames of a sequence and (ii) selecting specific frames of a sequence as dictionary atoms. We analyze the effect of each strategy on the accuracy of pose estimation. Extensive experiments on datasets of various human activities show that our proposed method outperforms state-of-the-art methods.

16.
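Research on human skeleton action recognition based on spatio-temporally weighted pose motion features (total citations: 1; self-citations: 0; citations by others: 1)
The wide use of human action recognition in computer vision has kept it an active research topic for the past several decades. In recent years, the spread of depth sensors and the introduction of real-time skeleton estimation algorithms based on depth images have drawn growing attention to action recognition from skeleton sequences. Most existing work characterizes an action sequence by extracting spatial information about the different joints within a frame and temporal information about the joints across frames, but does not consider that different joints and poses contribute differently to determining the action class. This paper therefore proposes an action recognition method based on spatio-temporally weighted pose motion features: a bilinear classifier is applied iteratively to compute the weights of joints and static poses with respect to each action class, identifying the most informative joints and poses. To improve temporal analysis of the action features, dynamic time warping and the Fourier temporal pyramid are introduced for temporal modeling, and a support vector machine performs the final action classification. Experiments on several datasets show that the method is highly competitive with other methods and in some cases achieves better recognition results.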
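
Dynamic time warping, one of the temporal modeling tools named in this abstract, aligns two sequences that run at different speeds. The sketch below is the textbook dynamic-programming formulation on scalar sequences, with a hypothetical function name; the paper applies the idea to skeleton feature sequences, and its exact cost function and constraints are not given in the abstract.

```python
def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat seq_b[j-1]
                                    cost[i][j - 1],      # repeat seq_a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# The same motion performed with one held frame warps onto itself at zero cost.
same_motion = dtw_distance([1, 2, 3], [1, 2, 2, 3])
different_motion = dtw_distance([0, 0], [1, 1])
```

For multi-dimensional pose features, the absolute difference would be replaced by a vector distance between frames.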

17.
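A new method is proposed for measuring the pose of a spatial axisymmetric target with binocular vision. A linear method is used to obtain the target's axis in each image plane; intersecting the two planes formed by each image-plane axis and its respective optical center gives the direction vector of the target's axis, from which the 3D pose is measured. This avoids the feature matching or gray-level matching of target feature points between the left and right images that traditional pose measurement requires. Simulation results show that the pose angle measurement error is less than 0.5°, and that the method is fast and stable enough to meet real-time processing requirements.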
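
The geometric core of this abstract is that two planes intersect in a line whose direction is the cross product of the plane normals. The sketch below assumes both normals are already expressed in a common coordinate frame (each normal could be obtained, for example, as the cross product of two viewing rays through points on the image-plane axis line); the function names are hypothetical and this is not the authors' implementation.

```python
import math

def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def axis_direction(normal_left, normal_right):
    """Unit direction of the line where the two back-projected planes meet."""
    d = cross(normal_left, normal_right)
    norm = math.sqrt(sum(c * c for c in d))
    if norm < 1e-12:
        raise ValueError("planes are parallel; axis direction is undefined")
    return tuple(c / norm for c in d)

# Toy check: the planes x = 0 and y = 0 intersect along the z axis.
direction = axis_direction((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

The result is defined only up to sign, so a real system needs an extra cue (for instance, which end of the target is the nose) to choose between the two opposite directions.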

18.
Easy editing of a clothed 3D human avatar is central to many practical applications. However, it is easy to produce implausible, unnatural-looking results, since subtle reshaping or pose alteration of avatars requires global consistency and agreement with human anatomy. Here, we present a parametric editing system for a clothed human body, based on a revised SCAPE model. We show that the parameters of the model can be estimated directly from a clothed avatar, and that the model can be used as a basis for realistic, real-time editing of the clothed avatar mesh via a novel 3D body-aware warping scheme. The avatar can be easily controlled by a few semantically meaningful parameters: 12 biometric attributes controlling body shape and 17 bones controlling pose. Our experiments demonstrate that our system can interactively produce visually pleasing results.

19.
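Because single-modality features have limited power to discriminate action classes, an action classification method is proposed based on fusing multimodal visual features from RGB-D video and exemplar-based multiple kernel extreme learning (Exemplars-MKL-ELM). First, robust dense motion pose features are extracted using skeleton surface fitting and dense trajectories; sparsified histogram of oriented principal components features capture 3D human geometry from the normal planes of the dense point cloud; and 3D histogram of gradients features are extracted with appearance texture embedded in the spatio-temporal neighborhoods of body joints. Next, a radius-margin-constrained multiple kernel extreme learning machine fuses the multimodal visual features, and a contrast data method mines a representative exemplar set for each action class. Finally, combining the fused visual features with the resulting exemplar sets for each sample, the Exemplars-MKL-ELM model and a greedy prediction strategy classify actions hierarchically. Experiments show that the method performs well in both classification accuracy and computational efficiency.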

20.
Hand pose estimation benefits many human-computer interaction applications. The hand has many degrees of freedom (DoF) at its joints, and hand poses are highly variable, so hand pose estimation remains a challenging problem. Since the joints in the hand skeleton topology model are strictly related to one another, we propose a hierarchical topology based approach to estimate 3D hand poses. First, we determine palm positions and palm orientations by detecting fingertips and calculating their directions in depth images. This forms the global topology of hand poses. We then define the connection relationships of finger joints as the local topology of the hand model. Based on the hierarchical topology, we extract angle features to describe hand poses and adopt the regression forest algorithm to estimate the 3D coordinates of hand joints. We further use a freedom forest algorithm to refine ambiguous poses in the estimation, addressing the error accumulation problem. The hierarchical topology based approach keeps estimated hand poses within a reasonable topology and improves estimation accuracy. We evaluate our approach on two public databases, and experiments illustrate its efficiency. Compared with state-of-the-art approaches, our approach improves estimation accuracy.
