Similar Literature
20 similar documents found (search took 15 ms)
1.
2.
This paper presents a real-time video understanding system that automatically recognises activities occurring in environments observed through video surveillance cameras. Our approach consists of three main stages: Scene Tracking, Coherence Maintenance, and Scene Understanding. The main challenges are to provide a tracking process robust enough to recognise events outdoors and under real application conditions, to allow the monitoring of a large scene through a camera network, and to automatically recognise complex events involving several actors interacting with each other. This approach has been validated for Airport Activity Monitoring in the framework of the European project AVITRACK.

3.
This paper describes a probabilistic integrated object recognition and tracking framework called PIORT, together with two specific methods derived from it, which are evaluated experimentally on several test video sequences. The first step in the proposed framework is a static recognition module that provides class probabilities for each pixel of the image from a set of local features. These probabilities are updated dynamically and supplied to a tracking decision module capable of handling full and partial occlusions. The two specific methods presented use RGB color features and differ in the classifier implemented: one is a Bayesian method based on maximum likelihood and the other is based on a neural network. The experimental results show that, on the one hand, the neural-network-based approach performs similarly to, and sometimes better than, the Bayesian approach when both are integrated within the tracking framework. On the other hand, the PIORT methods achieve better results than other published tracking methods on video sequences taken with a moving camera and including full and partial occlusions of the tracked object.
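A minimal sketch of one way such a dynamic update could work, assuming a simple recursive Bayes rule over per-pixel class probability maps; the function name and update rule are illustrative, not the authors' PIORT implementation:

```python
import numpy as np

def update_pixel_class_probs(prior, likelihood, eps=1e-8):
    """Recursive Bayesian update of per-pixel class probabilities.

    prior      : (H, W, C) class probabilities carried over from the previous frame
    likelihood : (H, W, C) per-pixel class likelihoods from the static recognition module
    Returns the renormalised posterior of the same shape.
    (Illustrative sketch only; the actual PIORT update rule may differ.)
    """
    posterior = prior * likelihood
    posterior /= posterior.sum(axis=2, keepdims=True) + eps
    return posterior

# Toy usage: 2 classes (object vs background) on a 4x4 image
H, W, C = 4, 4, 2
prior = np.full((H, W, C), 0.5)                            # uninformative prior
likelihood = np.random.dirichlet(np.ones(C), size=(H, W))  # fake classifier output
posterior = update_pixel_class_probs(prior, likelihood)
```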

4.
The study of human activity is applicable to a large number of science and technology fields, such as surveillance, biomechanics and sports applications. This article presents BB6-HM, a block-based human model for real-time monitoring of a large number of visual events and states related to human activity analysis, which can be used as components of a library for describing more complex activities in important areas such as surveillance, for example luggage at airports, clients' behaviour in banks and patients in hospitals. BB6-HM is inspired by the proportionality rules commonly used in the visual arts: the human silhouette is divided into six rectangles of equal height. The major advantage of this proposal is that the analysis of the human body can easily be broken down into regions, from which information about activities is obtained. The computational load is very low, so a very fast implementation is possible. Finally, the model has been applied to build classifiers for the detection of primitive events and visual attributes using heuristic rules and machine learning techniques.
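A minimal sketch of the six-block division idea, assuming a binary silhouette mask and a simple per-block occupancy feature; the actual BB6-HM features and events are richer:

```python
import numpy as np

def bb6_blocks(silhouette):
    """Split the bounding box of a binary silhouette into six equal-height
    horizontal blocks and return the foreground occupancy of each block.
    Illustrative sketch of the block-based idea, not the original BB6-HM features."""
    ys, xs = np.nonzero(silhouette)
    if len(ys) == 0:
        return np.zeros(6)
    top, bottom = ys.min(), ys.max() + 1
    left, right = xs.min(), xs.max() + 1
    box = silhouette[top:bottom, left:right]
    edges = np.linspace(0, box.shape[0], 7).astype(int)   # 6 horizontal bands
    return np.array([box[edges[i]:edges[i + 1]].mean() for i in range(6)])

# Toy usage: a crude standing figure
mask = np.zeros((60, 20), dtype=np.uint8)
mask[5:55, 8:12] = 1
print(bb6_blocks(mask))
```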

5.
Object tracking is an active research area nowadays due to its importance in human–computer interfaces, teleconferencing and video surveillance. However, reliable tracking of objects in the presence of occlusions, pose and illumination changes is still a challenging topic. In this paper, we introduce a novel tracking approach that fuses two cues, namely colour and spatio-temporal motion energy, within a particle-filter-based framework. We compute a measure of coherent motion over two image frames, which reveals the spatio-temporal dynamics of the target. At the same time, the importance of the colour and motion energy cues is determined in a reliability evaluation stage. This determination helps maintain the performance of the tracking system against abrupt appearance changes. Experimental results demonstrate that the proposed method outperforms other state-of-the-art techniques on the test datasets used.
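A minimal sketch of how the two cues might be fused per particle with reliability weights, assuming a weighted geometric mean; the paper's exact fusion and reliability-evaluation rules may differ:

```python
import numpy as np

def fuse_cues(colour_lik, motion_lik, r_colour, r_motion):
    """Combine per-particle colour and motion-energy likelihoods using
    reliability weights (a common fusion scheme; illustrative only).

    colour_lik, motion_lik : (N,) likelihoods for N particles
    r_colour, r_motion     : scalar cue reliabilities in [0, 1]
    Returns normalised particle weights."""
    r = np.array([r_colour, r_motion], dtype=float)
    r /= r.sum()
    weights = colour_lik ** r[0] * motion_lik ** r[1]   # weighted geometric mean
    return weights / weights.sum()

# Toy usage with 5 particles
w = fuse_cues(np.array([0.9, 0.2, 0.5, 0.1, 0.7]),
              np.array([0.8, 0.3, 0.4, 0.2, 0.6]),
              r_colour=0.7, r_motion=0.3)
print(w)
```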

6.
Formalizing computational models for everyday human activities remains an open challenge. Many previous approaches towards this end assume prior knowledge about the structure of activities, from which explicitly defined models are learned in a completely supervised manner. For a majority of everyday environments, however, the structure of the in situ activities is generally not known a priori. In this paper we investigate knowledge representations and manipulation techniques that facilitate learning of human activities in a minimally supervised manner. The key contribution of this work is the idea that the global structural information of human activities can be encoded using a subset of their local event subsequences, and that this encoding is sufficient for activity-class discovery and classification. In particular, we investigate modeling activity sequences in terms of their constituent subsequences, which we call event n-grams. Exploiting this representation, we propose a computational framework to automatically discover the various activity classes taking place in an environment. We model these activity classes as maximally similar activity cliques in a completely connected graph of activities, and describe how to discover them efficiently. Moreover, we propose methods for finding characterizations of these discovered classes from a holistic as well as a by-parts perspective. Using such characterizations, we present a method to classify a new activity into one of the discovered activity classes, and to automatically detect whether it is anomalous with respect to the general characteristics of its membership class. Our results show the efficacy of our approach in a variety of everyday environments.
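A minimal sketch of the event n-gram representation and a simple histogram similarity between two activity sequences; the clique-based class discovery built on top of this is not shown:

```python
from collections import Counter

def event_ngrams(events, n=3):
    """Histogram of the overlapping n-grams of an event sequence
    (a sketch of the representation described above)."""
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))

def ngram_similarity(a, b, n=3):
    """Cosine similarity between two activities via their n-gram histograms."""
    ha, hb = event_ngrams(a, n), event_ngrams(b, n)
    dot = sum(ha[g] * hb[g] for g in set(ha) & set(hb))
    norm = (sum(v * v for v in ha.values()) ** 0.5) * (sum(v * v for v in hb.values()) ** 0.5)
    return dot / norm if norm else 0.0

# Toy usage: two activity sequences over symbolic events
print(ngram_similarity(list("ABCABD"), list("ABCABE"), n=2))
```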

7.
8.
This paper presents a novel method that leverages reasoning capabilities in a computer vision system dedicated to human action recognition. The proposed methodology is decomposed into two stages. First, a machine-learning-based algorithm – known as bag of words – gives a first estimate of the action class from video sequences by performing an image feature analysis. Those results are afterwards passed to a common-sense reasoning system, which analyses, selects and corrects the initial estimation yielded by the machine learning algorithm. This second stage resorts to the knowledge implicit in the rationality that motivates human behaviour. Experiments are performed in realistic conditions, where poor recognition rates obtained by the machine learning techniques alone are significantly improved by the second stage, in which common-sense knowledge and reasoning capabilities are leveraged. This demonstrates the value of integrating common-sense capabilities into a computer vision pipeline.
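A toy illustration of the second stage, assuming the common-sense knowledge is reduced to an action–object compatibility table used to rescore the bag-of-words estimates; the paper's reasoning system is far richer than this sketch:

```python
def rerank_with_rules(bow_scores, scene_objects, compatibility):
    """Toy second stage: down-weight action classes that are incompatible with
    the objects observed in the scene, then pick the best remaining class.

    bow_scores    : dict action -> score from the bag-of-words classifier
    scene_objects : set of detected object labels
    compatibility : dict action -> set of objects that make the action plausible
    (Illustrative only; not the paper's common-sense reasoning system.)"""
    adjusted = {}
    for action, score in bow_scores.items():
        required = compatibility.get(action, set())
        plausible = not required or bool(required & scene_objects)
        adjusted[action] = score if plausible else score * 0.1
    return max(adjusted, key=adjusted.get), adjusted

# Toy usage
scores = {"drinking": 0.4, "phoning": 0.45, "reading": 0.15}
best, adj = rerank_with_rules(scores, {"cup"},
                              {"drinking": {"cup", "glass"}, "phoning": {"phone"}})
print(best)  # 'drinking', once 'phoning' is penalised for the missing phone
```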

9.
In this paper, we propose a gait recognition algorithm that fuses motion and static spatio-temporal templates of silhouette image sequences: the motion silhouette contour templates (MSCTs) and the static silhouette templates (SSTs). MSCTs and SSTs capture the motion and static characteristics of gait, respectively, and are computed directly from the silhouette sequence. The performance of the proposed algorithm is evaluated experimentally on the SOTON data set and the USF data set, and compared with other published work on these two data sets. Experimental results show that the proposed templates are effective for human identification in indoor and outdoor environments. The proposed algorithm achieves a recognition rate of around 85% on the SOTON data set and around 80% on the intrinsic difference group (probes A–C) of the USF data set.
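A rough sketch of the two template types under a simplified reading: the static template as a pixel-wise mean of aligned silhouettes, and a contour-based motion template accumulated with temporal decay; the paper's exact MSCT/SST definitions may differ:

```python
import numpy as np

def static_silhouette_template(silhouettes):
    """Pixel-wise mean of aligned binary silhouettes (an SST-like static template)."""
    return np.mean(silhouettes, axis=0)

def motion_contour_template(silhouettes, decay=0.9):
    """Accumulate silhouette contours over time with exponential decay,
    a rough MSCT-like motion template (simplified from the paper's definition)."""
    template = np.zeros_like(silhouettes[0], dtype=float)
    for sil in silhouettes:
        # crude contour: foreground pixels with at least one background 4-neighbour
        padded = np.pad(sil, 1)
        neigh_sum = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                     padded[1:-1, :-2] + padded[1:-1, 2:])
        contour = (sil == 1) & (neigh_sum < 4)
        template = decay * template
        template[contour] = 1.0
    return template

# Toy usage on a short random "gait" sequence of 64x44 silhouettes
seq = (np.random.rand(10, 64, 44) > 0.7).astype(np.uint8)
sst = static_silhouette_template(seq)
msct_like = motion_contour_template(seq)
```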

10.
11.
Accidents and even fatalities tend to arise when people enter hazardous work areas during the construction of projects in urban areas. A limited amount of research has been devoted to developing vision-based proximity warning systems that can automatically determine when people enter a hazardous area. Such systems, however, are unable to identify specific hazards and the status of a piece of plant (e.g., an excavator) in real time. In this paper, we address this limitation and develop a real-time smart video surveillance system that can detect people and the status of plant (i.e., moving or stationary) in a hazardous area. The application of this approach is demonstrated during the construction of a mega-project, the Wuhan Rail Transit System in China. We show that our combination of computer vision and deep learning can accurately recognize people in a hazardous work area in real time during the construction of transport projects. The developed system can provide instant feedback concerning unsafe behavior and thus enables appropriate actions to be put in place to prevent its recurrence.
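A minimal sketch of the alerting logic, assuming person boxes come from a separate deep detector and plant status is inferred from centroid displacement between frames; function names and thresholds are illustrative, not those of the system described in the paper:

```python
def boxes_overlap(a, b):
    """Axis-aligned overlap test for (x1, y1, x2, y2) boxes."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def hazard_alert(person_boxes, hazard_zone, plant_positions, move_thresh=5.0):
    """Raise an alert if any detected person overlaps the hazard zone while the
    plant (e.g. excavator) is moving. Plant status is inferred here from the
    centroid displacement between the last two frames. Illustrative sketch only."""
    dx = plant_positions[-1][0] - plant_positions[-2][0]
    dy = plant_positions[-1][1] - plant_positions[-2][1]
    plant_moving = (dx * dx + dy * dy) ** 0.5 > move_thresh
    person_in_zone = any(boxes_overlap(p, hazard_zone) for p in person_boxes)
    return person_in_zone and plant_moving

# Toy usage
alert = hazard_alert(person_boxes=[(120, 80, 160, 200)],
                     hazard_zone=(100, 50, 300, 250),
                     plant_positions=[(200, 150), (215, 150)])
print(alert)  # True: person inside the zone and plant displaced by 15 px
```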

12.
The manual signs in sign languages are generated and interpreted using three basic building blocks: handshape, motion, and place of articulation. When combined, these three components (together with palm orientation) uniquely determine the meaning of a manual sign. This means that pattern recognition techniques that employ only a subset of these components are inappropriate for interpreting the sign or for building automatic recognizers of the language. In this paper, we define an algorithm to model these three basic components from a single video sequence of two-dimensional pictures of a sign. The recognition results for these three components are then combined to determine the class of the signs in the videos. Experiments are performed on a database of (isolated) American Sign Language (ASL) signs. The results demonstrate that, using semi-automatic detection, all three components can be reliably recovered from two-dimensional video sequences, allowing for an accurate representation and recognition of the signs.
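A minimal sketch of the combination step, assuming independent per-component probabilities multiplied over a small lexicon of (handshape, motion, place) triples; the lexicon and combination rule here are illustrative, not the paper's method:

```python
def classify_sign(handshape_probs, motion_probs, place_probs, lexicon):
    """Combine the three component classifiers into a sign decision.
    Each *_probs maps a component label to its probability for the current video;
    `lexicon` maps a sign to its (handshape, motion, place) triple.
    A naive product-of-components rule, shown only to illustrate the combination."""
    scores = {}
    for sign, (h, m, p) in lexicon.items():
        scores[sign] = (handshape_probs.get(h, 1e-6) *
                        motion_probs.get(m, 1e-6) *
                        place_probs.get(p, 1e-6))
    total = sum(scores.values())
    return {s: v / total for s, v in scores.items()}

# Toy usage with a two-sign lexicon (hypothetical entries)
lexicon = {"FATHER": ("5", "tap", "forehead"),
           "MOTHER": ("5", "tap", "chin")}
post = classify_sign({"5": 0.9}, {"tap": 0.8}, {"forehead": 0.7, "chin": 0.3}, lexicon)
print(max(post, key=post.get))  # FATHER
```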

13.
In this study a new approach is presented for the recognition of everyday human actions with a fixed camera. The originality of the presented method lies in characterizing sequences by a temporal succession of semi-global features extracted from “space-time micro-volumes”. The advantage of this approach lies in the use of robust features (estimated over several frames) combined with the ability to handle actions of variable duration and to easily segment the sequences with algorithms that are specific to time-varying data. Each action is characterized by a temporal sequence that constitutes the input of a Hidden Markov Model system for recognition. Results on 1,614 sequences performed by several persons validate the proposed approach.
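A minimal sketch of the recognition step, assuming the semi-global features have been quantised into discrete symbols and each action is scored with the HMM forward algorithm; the model parameters below are toy values, not those of the paper:

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Log-likelihood of an observation sequence under a discrete HMM
    (forward algorithm with per-step scaling).

    obs : sequence of observation symbol indices
    pi  : (S,) initial state probabilities
    A   : (S, S) transition matrix, A[i, j] = P(state j | state i)
    B   : (S, K) emission matrix, B[i, k] = P(symbol k | state i)"""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

# Toy usage: pick the action model that best explains a symbol sequence
pi = np.array([0.6, 0.4])
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B_walk = np.array([[0.8, 0.2], [0.3, 0.7]])
B_sit  = np.array([[0.2, 0.8], [0.6, 0.4]])
obs = [0, 0, 1, 0]
models = {"walk": B_walk, "sit": B_sit}
print(max(models, key=lambda name: forward_log_likelihood(obs, pi, A, models[name])))
```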

14.
A survey on vision-based human action recognition
Vision-based human action recognition is the process of labeling image sequences with action labels. Robust solutions to this problem have applications in domains such as visual surveillance, video retrieval and human–computer interaction. The task is challenging due to variations in motion performance, recording settings and inter-personal differences. In this survey, we explicitly address these challenges. We provide a detailed overview of current advances in the field. Image representations and the subsequent classification process are discussed separately to focus on the novelties of recent research. Moreover, we discuss limitations of the state of the art and outline promising directions of research.

15.
In this paper a novel framework is presented for the development of computer vision applications that exploit the sensors available in mobile devices. The framework is organized as a client–server application that combines mobile devices, network technologies and computer vision algorithms with the aim of performing object recognition on photos captured by a phone camera. The client module on the mobile device manages the image acquisition and query formulation tasks, while the recognition module on the server searches an existing database and sends relevant information back to the client. To show the effectiveness of the proposed solution, the implementation of two plug-ins for specific problems is described: landmark recognition and fashion shopping. Experiments on four different landmark datasets and one self-collected dataset of fashion accessories show that the system is efficient and robust in the presence of objects with different characteristics.

16.
The head trajectory is an interesting source of information for behavior recognition and can be very useful for video surveillance applications, especially fall detection. Consequently, much work has been done to track the head in the 2D image plane using a single camera or in 3D using multiple cameras; tracking the head in real time with a single camera would be particularly useful for fall detection. In this article, an original method is therefore proposed to extract the 3D head trajectory of a person in a room using only one calibrated camera. The head is represented as a 3D ellipsoid, which is tracked with a hierarchical particle filter based on color histograms and shape information. Experiments demonstrate that this method can run in quasi-real time, providing reasonable 3D errors for a monocular system. Results on fall detection using the head's 3D vertical velocity or height, obtained from the 3D trajectory, are also presented.
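A minimal sketch of a fall-detection criterion of the kind mentioned above, thresholding the head's vertical velocity and final height; the threshold values are illustrative, not those reported in the paper:

```python
import numpy as np

def detect_fall(head_heights, timestamps, vel_thresh=-1.3, height_thresh=0.5):
    """Flag a fall when the head's vertical velocity drops below a (negative)
    threshold and the head ends up close to the floor.

    head_heights : sequence of head heights above the floor (metres)
    timestamps   : matching times (seconds)
    (Illustrative thresholds; the paper's criterion and values may differ.)"""
    h = np.asarray(head_heights, dtype=float)
    t = np.asarray(timestamps, dtype=float)
    v = np.gradient(h, t)              # vertical velocity (m/s)
    fast_drop = v.min() < vel_thresh
    ends_low = h[-1] < height_thresh
    return fast_drop and ends_low

# Toy usage: head drops from 1.7 m to 0.3 m in about one second
heights = [1.7, 1.6, 1.2, 0.6, 0.3, 0.3]
times   = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
print(detect_fall(heights, times))  # True
```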

17.
A novel, computationally efficient and robust scheme for multiple initial point prediction is proposed in this paper. A combination of spatial and temporal predictors is used for initial motion vector prediction, determination of the magnitude and direction of motion, and search pattern selection. Initially, three predictors from the spatio-temporally neighboring blocks are selected. If all of these predictors point to the same quadrant, a simple search pattern based on the direction and magnitude of the predicted motion vector is selected. If the predictors belong to different quadrants, however, the search starts from multiple initial points to get a clearer idea of the location of the minimum; in this case multiple rood search patterns are selected. We also define a local minimum elimination criterion to avoid becoming trapped in a local minimum. The predictive search center is closer to the global minimum, which weakens the dependence on the monotonic error surface assumption and its impact on the motion field; a further advantage is that moving the search closer to the global minimum increases computation speed. Additional speed-up is obtained by applying a zero-motion threshold to blocks with no motion. Image quality measured in terms of PSNR also shows good results.
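A minimal sketch of the same-quadrant test on the three spatio-temporal predictors, which decides between a single predicted start point and multiple rood-pattern start points; illustrative logic, not the authors' exact scheme:

```python
def quadrant(mv):
    """Quadrant index of a motion vector (dx, dy); zero is treated as positive."""
    dx, dy = mv
    return (0 if dx >= 0 else 1) + (0 if dy >= 0 else 2)

def choose_search_start(spatial_left, spatial_top, temporal_same_block):
    """Decide between a single predicted start point and multiple initial points.
    If the three spatio-temporal predictors agree on the quadrant, the search
    starts from their component-wise median; otherwise each predictor seeds its
    own rood pattern. A sketch of the decision logic described above."""
    predictors = [spatial_left, spatial_top, temporal_same_block]
    if len({quadrant(p) for p in predictors}) == 1:
        median = (sorted(p[0] for p in predictors)[1],
                  sorted(p[1] for p in predictors)[1])
        return [median]                 # one directed search pattern
    return predictors                   # multiple rood-pattern start points

# Toy usage
print(choose_search_start((3, 1), (4, 2), (2, 1)))    # agree -> [(3, 1)]
print(choose_search_start((3, 1), (-2, 4), (1, -3)))  # disagree -> three starts
```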

18.
This paper describes an application of computer vision techniques to road surveillance. It reports on a project undertaken in collaboration with the Research and Innovation group at the Ordnance Survey. The project aims to produce a system that detects and tracks vehicles in real traffic scenes in order to generate meaningful parameters for use in traffic management. The system has been implemented using two different approaches: a feature-based approach that detects and groups corner features in a scene into potential vehicle objects, and an appearance-based approach that trains a cascade of classifiers to learn the appearance of vehicles as an arrangement of a set of pre-defined simple Haar features. Potential vehicles detected are then tracked through an image sequence using a Kalman filter motion tracker. Experimental results of the algorithms are presented in this paper.
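A minimal sketch of a Kalman filter motion tracker for a detected vehicle centroid, assuming a constant-velocity state model with illustrative noise levels:

```python
import numpy as np

class ConstantVelocityKalman:
    """Minimal 2D constant-velocity Kalman filter for tracking a vehicle centroid.
    State is [x, y, vx, vy]; measurements are [x, y]. Noise levels are illustrative."""

    def __init__(self, x, y, dt=1.0, q=1.0, r=5.0):
        self.x = np.array([x, y, 0.0, 0.0])
        self.P = np.eye(4) * 100.0
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * q
        self.R = np.eye(2) * r

    def predict(self):
        # propagate state and covariance one frame forward
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, zx, zy):
        # correct the prediction with a measured centroid
        z = np.array([zx, zy])
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

# Toy usage: track a vehicle moving roughly 10 px/frame to the right
kf = ConstantVelocityKalman(100, 200)
for frame, (mx, my) in enumerate([(110, 201), (121, 199), (130, 202)]):
    kf.predict()
    print(frame, kf.update(mx, my))
```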

19.
A fast motion estimation algorithm for real-time applications
A fast motion estimation search algorithm suitable for real-time applications is proposed. The algorithm searches the candidate motion vectors in a predefined order, uses the small diamond search pattern as often as possible, and applies an efficient early-termination criterion. Experimental results show that, while keeping image quality essentially unchanged, the proposed algorithm searches about twice as fast as the fast motion estimation algorithm in the MPEG-4 standard. It therefore offers clear advantages in both search speed and search quality, making it better suited to real-time applications.
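A minimal sketch of a small-diamond-pattern block search with a zero-motion early-termination threshold, in the spirit of the abstract; a generic SDSP search under assumed parameters, not the paper's candidate ordering or termination criteria:

```python
import numpy as np

def sad(cur, ref, bx, by, dx, dy, bs=16):
    """Sum of absolute differences between the current block and a displaced
    reference block; returns infinity if the candidate falls outside the frame."""
    h, w = ref.shape
    rx, ry = bx + dx, by + dy
    if rx < 0 or ry < 0 or rx + bs > w or ry + bs > h:
        return np.inf
    a = cur[by:by + bs, bx:bx + bs].astype(int)
    b = ref[ry:ry + bs, rx:rx + bs].astype(int)
    return np.abs(a - b).sum()

def small_diamond_search(cur, ref, bx, by, bs=16, zero_thresh=512, max_steps=16):
    """Block motion estimation with the small diamond pattern and a zero-motion
    early-termination threshold (illustrative sketch)."""
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, bs)
    if best_cost < zero_thresh:            # zero-motion early termination
        return best_mv, best_cost
    for _ in range(max_steps):
        cx, cy = best_mv
        candidates = [(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)]
        cost, mv = min((sad(cur, ref, bx, by, dx, dy, bs), (dx, dy))
                       for dx, dy in candidates)
        if cost >= best_cost:              # centre is the minimum: stop
            break
        best_cost, best_mv = cost, mv
    return best_mv, best_cost

# Toy usage: a smooth synthetic frame shifted right by 3 pixels
xx, yy = np.meshgrid(np.arange(64), np.arange(64))
ref = (2 * xx + yy).astype(np.uint8)
cur = np.roll(ref, 3, axis=1)
print(small_diamond_search(cur, ref, bx=16, by=16))   # motion vector (-3, 0)
```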

20.
We propose a novel pervasive system to recognise human daily activities from a wearable device. The system is designed in the form of reading glasses, named ‘Smart Glasses’, integrating a 3-axis accelerometer and a first-person-view camera. Our aim is to classify the subject's activities of daily living (ADLs) based on vision and head-motion data. This ego-activity recognition system not only allows caretakers to keep track of a specific person (such as a disabled patient or an elderly person), but also has the potential to remind or warn people with cognitive impairments of hazardous situations. We present the following contributions: a feature extraction method for accelerometer and video data; a classification algorithm integrating both locomotive activities (body motions) and stationary activities (without or with only small motions); and a novel multi-scale dynamic graphical model for structured classification over time. We collect, train and validate our system on two large datasets: 20 h of elderly ADL data and 40 h of patient ADL data, containing 12 and 14 different activities respectively. The results show that our method improves system performance (F-measure) over conventional classification approaches by an average of 20%–40%, up to 84.45%, with an overall accuracy of 90.04% for the elderly. Furthermore, we validate our method on 30 patients with different disabilities, achieving an overall accuracy of up to 77.07%.
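A minimal sketch of the accelerometer side of the feature extraction, assuming simple sliding-window statistics; the paper also fuses first-person video and a multi-scale dynamic graphical model, which are not shown here:

```python
import numpy as np

def accel_window_features(accel, fs=50, win_s=2.0, overlap=0.5):
    """Sliding-window statistical features from 3-axis accelerometer data.

    accel : (N, 3) array of x/y/z acceleration samples
    fs    : sampling rate (Hz)
    Returns an (n_windows, 7) matrix: per-axis mean and std plus mean magnitude.
    A generic feature set for ADL classification, not the paper's exact features."""
    win = int(win_s * fs)
    step = int(win * (1 - overlap))
    feats = []
    for start in range(0, len(accel) - win + 1, step):
        w = accel[start:start + win]
        mag = np.linalg.norm(w, axis=1)
        feats.append(np.concatenate([w.mean(axis=0), w.std(axis=0), [mag.mean()]]))
    return np.array(feats)

# Toy usage: 10 s of simulated 50 Hz data (gravity on z plus noise)
rng = np.random.default_rng(0)
accel = rng.normal(0, 0.2, (500, 3)) + np.array([0, 0, 9.8])
X = accel_window_features(accel)
print(X.shape)   # (9, 7)
```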


