Similar Documents
20 similar documents found (search time: 22 ms)
1.
Abnormal visual event detection is an important subject in Smart City surveillance, where much of the data can be processed locally in an edge computing environment, making real-time performance and detection effectiveness critical. In this paper, we propose an unsupervised approach to abnormal event detection in surveillance video of crowded scenes in urban public places, combining multi-instance visual feature selection with an autoregressive integrated moving average (ARIMA) model. Each video clip is modeled as a visual feature bag containing several sub-video clips, each regarded as an instance. The time-transform characteristics of the optical-flow features within each sub-video clip form a visual feature instance, and time-series modeling is carried out over the instances of all sub-video clips in a surveillance clip. Abnormal events in each clip are then detected with a multi-instance fusion method. The approach is verified on publicly available urban surveillance video datasets and compared with state-of-the-art alternatives. Experimental results demonstrate that the proposed method achieves better abnormal event detection performance for crowded scenes in urban public places in an edge environment.
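As a rough illustration of the idea (not the authors' implementation), the sketch below scores each sub-video clip by the one-step ARIMA forecast error of its mean optical-flow magnitude series and fuses instance scores over the bag with a max rule; the ARIMA order, threshold, and fusion rule are all assumptions.

```python
# Minimal sketch, assuming each instance is a 1-D series of mean optical-flow
# magnitudes per frame; order, threshold, and max-fusion are illustrative.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def instance_score(series, order=(1, 1, 1)):
    """Anomaly score of one instance: error of a one-step ARIMA forecast."""
    fit = ARIMA(series[:-1], order=order).fit()
    predicted = fit.forecast(steps=1)[0]
    return abs(series[-1] - predicted)        # large error -> unusual dynamics

def bag_is_abnormal(instances, threshold=2.5):
    """Multi-instance fusion: flag the clip if any instance looks abnormal."""
    scores = [instance_score(np.asarray(s, dtype=float)) for s in instances]
    return max(scores) > threshold, scores

# Toy usage: three normal sub-clips plus one with a sudden motion burst.
rng = np.random.default_rng(0)
bag = [rng.normal(1.0, 0.1, 30) for _ in range(3)]
bag.append(np.concatenate([rng.normal(1.0, 0.1, 29), [5.0]]))
print(bag_is_abnormal(bag))                   # -> (True, [...])
```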

2.
3.
Hidden Markov models (HMMs) adapt well, learn automatically, and perform well at predicting stochastic time-series data. Scenes are a basic feature of soccer video: scene transitions reflect how soccer video is shot and edited, and thereby express its semantics. This paper proposes a video semantic analysis framework based on scene analysis and HMMs for recognizing semantic events in soccer video. To overcome the large errors of previous scene analysis based on dominant color and other low-level features, scene analysis based on a visual attention model is further proposed. Experimental results show that the event recognition method based on scene analysis and HMMs recognizes free-kick events in soccer video well.
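A minimal sketch of the HMM ingredient, assuming events are classified from discrete scene-label sequences; the labels, state count, and training data below are placeholders, and hmmlearn stands in for whatever toolkit the authors used.

```python
# Train one discrete HMM per event class and classify by log-likelihood.
import numpy as np
from hmmlearn.hmm import CategoricalHMM  # hmmlearn >= 0.2.7

def train_event_hmm(sequences, n_states=3):
    X = np.concatenate(sequences).reshape(-1, 1)
    lengths = [len(s) for s in sequences]
    hmm = CategoricalHMM(n_components=n_states, n_iter=50, random_state=0)
    hmm.fit(X, lengths)
    return hmm

def classify(seq, models):
    seq = np.asarray(seq).reshape(-1, 1)
    return max(models, key=lambda name: models[name].score(seq))

# Toy scene labels: 0 long shot, 1 medium shot, 2 close-up, 3 replay.
free_kick = [[0, 2, 2, 3, 0], [0, 2, 3, 3, 1]]
normal    = [[0, 0, 1, 2, 3, 0], [0, 1, 3, 2, 0]]
models = {"free_kick": train_event_hmm(free_kick),
          "normal": train_event_hmm(normal)}
print(classify([0, 2, 2, 3, 3], models))      # expected: free_kick
```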

4.
We propose a novel unsupervised learning framework to model activities and interactions in crowded and complicated scenes. Hierarchical Bayesian models are used to connect three elements in visual surveillance: low-level visual features, simple "atomic" activities, and interactions. Atomic activities are modeled as distributions over low-level visual features, and multi-agent interactions are modeled as distributions over atomic activities. These models are learnt in an unsupervised way. Given a long video sequence, moving pixels are clustered into different atomic activities and short video clips are clustered into different interactions. In this paper, we propose three hierarchical Bayesian models: the Latent Dirichlet Allocation (LDA) mixture model, the Hierarchical Dirichlet Process (HDP) mixture model, and the Dual Hierarchical Dirichlet Processes (Dual-HDP) model. They advance existing language models such as LDA [1] and HDP [2]. Our data sets are challenging video sequences from crowded traffic scenes and train station scenes with many kinds of activities co-occurring. Without tracking or human labeling effort, our framework completes many challenging visual surveillance tasks of broad interest, such as: (1) discovering typical atomic activities and interactions; (2) segmenting long video sequences into different interactions; (3) segmenting motions into different activities; (4) detecting abnormality; and (5) supporting high-level queries on activities and interactions.
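To make the bag-of-words analogy concrete, here is a minimal sketch, assuming moving-pixel features have already been quantised into discrete "motion words" so that each short clip becomes a word-count vector; it uses scikit-learn's parametric LDA rather than the paper's HDP/Dual-HDP variants, and all counts are synthetic.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
n_clips, vocab = 60, 50                     # 50 quantised motion words
# Two synthetic "atomic activities", each using half the vocabulary.
act_a = rng.multinomial(200, np.r_[np.full(25, 0.04), np.zeros(25)], n_clips // 2)
act_b = rng.multinomial(200, np.r_[np.zeros(25), np.full(25, 0.04)], n_clips // 2)
X = np.vstack([act_a, act_b])               # clip-by-word count matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
print(lda.transform(X)[:3].round(2))        # per-clip mixtures over activities
```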

5.
Online, real-time detection of scene-cut points is a fundamental task in animation video analysis. Traditional pixel- and threshold-based detection methods require storing the entire animation video, their results are strongly affected by object motion and noise, and fixed thresholds are ill-suited to complex scene transitions. This paper proposes an animation scene-cut detection method based on online Bayesian decision making. The method first partitions each animation frame into blocks and extracts HSV color features, then stores the similarities of consecutive frames in a fixed-length buffer queue, and finally decides whether a scene cut has occurred by a dynamic Bayesian decision. Comparative experiments on several kinds of animation video show that the new method detects animation scene cuts online and more robustly.
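The pipeline shape can be sketched as below, with the block-wise detail omitted and a simple z-score test standing in for the paper's Bayesian decision rule; the queue length and sensitivity k are assumptions.

```python
from collections import deque
import cv2
import numpy as np

def hsv_hist(frame, bins=(8, 8, 4)):
    """HSV colour histogram of one BGR frame, normalised and flattened."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    h = cv2.calcHist([hsv], [0, 1, 2], None, bins, [0, 180, 0, 256, 0, 256])
    return cv2.normalize(h, h).flatten()

def detect_cuts(frames, queue_len=25, k=3.0):
    cuts, sims = [], deque(maxlen=queue_len)   # fixed-length similarity buffer
    prev = hsv_hist(frames[0])
    for i, frame in enumerate(frames[1:], start=1):
        cur = hsv_hist(frame)
        s = cv2.compareHist(prev, cur, cv2.HISTCMP_CORREL)
        # Declare a cut when similarity drops far below the buffer's norm.
        if len(sims) == queue_len and s < np.mean(sims) - k * np.std(sims):
            cuts.append(i)
        sims.append(s)
        prev = cur
    return cuts

# Toy usage: 30 dark frames then 5 bright ones -> a cut at frame 30.
frames = [np.full((64, 64, 3), c, np.uint8) for c in [20] * 30 + [220] * 5]
print(detect_cuts(frames))
```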

6.
This study addresses the problem of choosing the most suitable probabilistic model selection criterion for unsupervised learning of visual context of a dynamic scene using mixture models. A rectified Bayesian Information Criterion (BICr) and a Completed Likelihood Akaike's Information Criterion (CL-AIC) are formulated to estimate the optimal model order (complexity) for a given visual scene. Both criteria are designed to overcome poor model selection by existing popular criteria when the data sample size varies from small to large and the true mixture distribution kernel functions differ from the assumed ones. Extensive experiments on learning visual context for dynamic scene modelling are carried out to demonstrate the effectiveness of BICr and CL-AIC, compared to that of existing popular model selection criteria including BIC, AIC and Integrated Completed Likelihood (ICL). Our study suggests that for learning visual context using a mixture model, BICr is the most appropriate criterion given sparse data, while CL-AIC should be chosen given moderate or large data sample sizes.
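For context, here is a minimal sketch of the baseline criteria (standard BIC/AIC over mixture orders, via scikit-learn); the rectified and completed-likelihood variants proposed in the paper are not implemented here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def select_order(X, max_k=8, criterion="bic"):
    """Return the mixture order minimising BIC or AIC, plus all scores."""
    scores = {}
    for k in range(1, max_k + 1):
        gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
        scores[k] = gmm.bic(X) if criterion == "bic" else gmm.aic(X)
    return min(scores, key=scores.get), scores

# Toy usage: two well-separated clusters, so both criteria should pick k = 2.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])
print(select_order(X, criterion="bic")[0])    # -> 2
```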

7.
A key problem in video content analysis using dynamic graphical models is to learn a suitable model structure given observed visual data. We propose a completed likelihood AIC (CL-AIC) scoring function for solving the problem. CL-AIC differs from existing scoring functions in that it aims to optimise explicitly both the explanation and prediction capabilities of a model simultaneously. CL-AIC is derived as a general scoring function suitable for both static and dynamic graphical models with hidden variables. In particular, we formulate CL-AIC for determining the number of hidden states for a hidden Markov model (HMM) and the topology of a dynamically multi-linked HMM (DML-HMM). The effectiveness of CL-AIC on learning the optimal structure of a dynamic graphical model, especially given sparse and noisy visual data, is shown through comparative experiments against existing scoring functions including Bayesian information criterion (BIC), Akaike's information criterion (AIC), integrated completed likelihood (ICL), and variational Bayesian (VB). We demonstrate that CL-AIC is superior to the other scoring functions in building dynamic graphical models for solving two challenging problems in video content analysis: (1) content based surveillance video segmentation and (2) discovering causal/temporal relationships among visual events for group activity modelling.

8.
This paper explores the manipulation of time in video editing, enabling control over the chronological time of events. These time manipulations include slowing down (or postponing) some dynamic events while speeding up (or advancing) others. When a video camera scans a scene, aligning all the events to a single time interval will result in a panoramic movie. Time manipulations are obtained by first constructing an aligned space-time volume from the input video, and then sweeping a continuous 2D slice (time front) through that volume, generating a new sequence of images. For dynamic scenes, aligning the input video frames poses an important challenge. We propose to align dynamic scenes using a new notion of "dynamics constancy", which is more appropriate for this task than the traditional assumption of "brightness constancy". Another challenge is to avoid visual seams inside moving objects and other visual artifacts resulting from sweeping the space-time volumes with time fronts of arbitrary geometry. To avoid such artifacts, we formulate the problem of finding optimal time front geometry as one of finding a minimal cut in a 4D graph, and solve it using max-flow methods.
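A loose toy of the min-cut formulation (one spatial dimension instead of a 4D space-time graph, solved with networkx's max-flow machinery): edge capacities grow with local motion, so the minimum cut, playing the role of the optimal time-front seam, passes through static regions rather than moving objects. Everything here is an illustrative assumption, not the paper's construction.

```python
import networkx as nx

G = nx.DiGraph()
motion = [0.1, 0.1, 5.0, 5.0, 0.2, 0.1]       # high values = moving object
for i, m in enumerate(motion):
    # Cutting an edge costs more where motion is high.
    G.add_edge("s" if i == 0 else i - 1, i, capacity=1.0 + m)
G.add_edge(len(motion) - 1, "t", capacity=10.0)

cut_value, (source_side, sink_side) = nx.minimum_cut(G, "s", "t")
print(cut_value)                               # -> 1.1: the cut avoids motion
```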

9.
10.
Society is rapidly accepting the use of video cameras in many new and varied locations, but effective methods to utilize and manage the massive resulting amounts of visual data are only slowly developing. This paper presents a framework for live video analysis in which the behaviors of surveillance subjects are described using a vocabulary learned from recurrent motion patterns, for real-time characterization and prediction of future activities, as well as the detection of abnormalities. The repetitive nature of object trajectories is utilized to automatically build activity models in a 3-stage hierarchical learning process. Interesting nodes are learned through Gaussian mixture modeling, connecting routes formed through trajectory clustering, and spatio-temporal dynamics of activities probabilistically encoded using hidden Markov models. Activity models are adapted to small temporal variations in an online fashion using maximum likelihood regression and new behaviors are discovered from a periodic retraining for long-term monitoring. Extensive evaluation on various data sets, typically missing from other work, demonstrates the efficacy and generality of the proposed framework for surveillance-based activity analysis.

11.
We propose a novel framework for automatic discovering and learning of behavioural context for video-based complex behaviour recognition and anomaly detection. Our work differs from most previous efforts on learning visual context in that our model learns multi-scale spatio-temporal rather than static context. Specifically three types of behavioural context are investigated: behaviour spatial context, behaviour correlation context, and behaviour temporal context. To that end, the proposed framework consists of an activity-based semantic scene segmentation model for learning behaviour spatial context, and a cascaded probabilistic topic model for learning both behaviour correlation context and behaviour temporal context at multiple scales. These behaviour context models are deployed for recognising non-exaggerated multi-object interactive and co-existence behaviours in public spaces. In particular, we develop a method for detecting subtle behavioural anomalies against the learned context. The effectiveness of the proposed approach is validated by extensive experiments carried out using data captured from complex and crowded outdoor scenes.

12.
In this paper, a novel probabilistic topic model is proposed for mining activities from complex video surveillance scenes. In order to handle the temporal nature of the video data, we devise a dynamical causal topic model (DCTM) that can detect the latent topics and causal interactions between them. The model is based on the assumption that all temporal relationships between latent topics at neighboring time steps follow a noisy-OR distribution, whose parameter is estimated by a data-driven approach based on a nonparametric Granger causality statistic. Furthermore, for convergence analysis during model learning, the Kullback-Leibler divergence between the prior and posterior distributions is calculated. Finally, using the causality matrix learned by the DCTM, the total causal influence of each topic is measured. We evaluate the proposed model through experiments on several challenging datasets and demonstrate that it can identify high-influence activities in crowded scenes.
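The noisy-OR assumption can be written out directly; here is a minimal sketch with illustrative influence weights (the paper estimates these from data via the nonparametric Granger causality statistic).

```python
import numpy as np

def noisy_or(parents_active, weights, leak=0.01):
    """P(topic on) = 1 - (1 - leak) * prod_i (1 - w_i)^{x_i}."""
    x = np.asarray(parents_active, dtype=float)
    w = np.asarray(weights, dtype=float)
    return 1.0 - (1.0 - leak) * np.prod((1.0 - w) ** x)

# A topic at time t whose parents (topics at t-1) 0 and 2 are active:
print(noisy_or([1, 0, 1], [0.6, 0.3, 0.2]))   # -> 0.6832
```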

13.
Occlusion is one of the most challenging scene attributes in visual object tracking, and studying effective occlusion-resistant model learning schemes is important for building long-term robust trackers for complex scenes. This survey analyzes the root causes by which occlusion degrades tracking performance and, taking state-of-the-art trackers with good occlusion resistance as its subjects, systematically examines the anti-occlusion mechanisms used in model learning and compares their effectiveness against short- and long-term occlusion. These mechanisms include: training-sample quality strategies such as hard negative mining, effective sample management, and generation of occlusion-like hard positive samples, which supply the model with sufficient discriminative information; passive stable learning via temporal-consistency learning and adaptive appearance learning, together with active learning strategies suited to tracking, such as multi-domain attributes, target awareness, distractor awareness, and feature fusion, which build robust models able to withstand scene interference and target deformation; and update strategies such as hand-crafted confidence evaluation, adaptive decision making, temporal memory banks, and adaptively estimated templates, which balance the adaptability and stability of models tracking targets whose state changes online. By comparing representative trackers under occlusion as well as background clutter, out-of-view, in-plane and out-of-plane rotation, and deformation scenarios, the anti-occlusion effectiveness of each strategy is analyzed in detail; the analysis indicates that data processing and learning-strategy design improve occlusion resistance more than update strategies do, and it identifies which strategies suit long-term occlusion, background clutter, out-of-view, and other attributes, as well as strategies applicable across multiple complex scenes. Effective anti-occlusion strategies are summarized, and research directions are proposed, including backbone replacement and transferring prior information such as scene understanding and motion patterns to the tracking task.

14.
杨诚 《计算机应用》2017,37(10):2866-2870
Mainstream online advertising click-through rate (CTR) prediction algorithms improve prediction accuracy mainly by mining correlations between users and ads from large-scale log data with machine learning, but they do not adequately account for the effect of users' real-time behavior on CTR. Analysis of large-scale real online advertising logs shows that, within a session, the dynamics of a user's CTR are highly correlated with the user's preceding feedback behavior, and different behaviors affect real-time CTR differently. Based on these findings, a CTR prediction algorithm driven by real-time user feedback is proposed. First, the correlation between user feedback and CTR prediction accuracy is quantitatively analyzed from large-scale real online advertising log data; then users' feedback behaviors are turned into features; finally, user behavior is modeled with machine learning, and ad delivery is adjusted dynamically in real time according to the feedback, improving the CTR prediction accuracy of the online advertising system. Experimental results show that real-time user feedback features are highly correlated with user CTR: compared with a conventional prediction model without real-time feedback information, the algorithm improves AUC (Area Under the Curve) by 0.83% and RIG (Relative Information Gain) by 6.68% on the test set, demonstrating that real-time feedback features significantly improve CTR prediction accuracy.
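A minimal sketch of the modeling step, assuming session-level feedback counters (recent clicks and skips) are simply appended to the feature vector of a logistic CTR model; the data are synthetic and the feature names are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
base = rng.normal(size=(n, 5))                # stand-in ad/user features
recent_clicks = rng.poisson(1.0, size=n)      # feedback in current session
recent_skips = rng.poisson(2.0, size=n)
logit = base[:, 0] + 0.8 * recent_clicks - 0.5 * recent_skips - 1.0
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X = np.column_stack([base, recent_clicks, recent_skips])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```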

15.
Visual saliency is an important research topic in the field of computer vision due to its numerous possible applications. It helps to focus on regions of interest instead of processing the whole image or video data. Detecting visual saliency in still images has been widely addressed in the literature with several formulations. However, visual saliency detection in videos has attracted little attention, and is a more challenging task due to additional temporal information. A common approach for obtaining a spatio-temporal saliency map is to combine a static saliency map and a dynamic saliency map. In our work, we model the dynamic textures in a dynamic scene with local binary patterns to compute the dynamic saliency map, and we use color features to compute the static saliency map. Both saliency maps are computed using a bio-inspired mechanism of the human visual system with a discriminant formulation known as center-surround saliency, and are appropriately fused. The proposed model has been extensively evaluated with diverse publicly available datasets which contain several videos of dynamic scenes, and comparison with state-of-the-art methods shows that it achieves competitive results.
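A minimal sketch of the two cues, assuming a single grayscale frame: an LBP map as the texture cue (the paper models dynamic textures over time, and uses color for the static map), a centre-surround difference as the discriminant operator, and naive averaging as the fusion; all parameters are assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from skimage.feature import local_binary_pattern

def center_surround(feat, center=3, surround=15):
    """|small-window mean - large-window mean| as a saliency response."""
    return np.abs(uniform_filter(feat, center) - uniform_filter(feat, surround))

def saliency_map(gray):
    texture = center_surround(local_binary_pattern(gray, P=8, R=1,
                                                   method="uniform"))
    intensity = center_surround(gray.astype(float))
    s = 0.5 * texture + 0.5 * intensity        # naive fusion of the two maps
    return (s - s.min()) / (np.ptp(s) + 1e-8)  # normalise to [0, 1]

# Toy usage on a random grayscale frame.
frame = (np.random.default_rng(0).random((120, 160)) * 255).astype(np.uint8)
print(saliency_map(frame).shape)               # -> (120, 160)
```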

16.
Cross-camera person tracking (person re-identification) is an important topic in computer vision and in building public-safety video surveillance systems. With the growth of large-scale datasets and extensive research on deep networks, deep learning has achieved good results on this problem. In practice, however, besides the appearance variation caused by different cameras and viewpoints in surveillance video, the available datasets for cross-camera person tracking are small overall and the labeled training samples are even scarcer, which limits the performance of deep-learning-based approaches. This paper proposes a cross-camera person tracking algorithm based on improved deep transfer learning: a mature model trained on a large dataset is fine-tuned and transferred to the target dataset and optimized with the target data, so that it extracts features better suited to the new dataset. During training, the triplet loss is improved by pulling same-identity samples closer and pushing different-identity samples apart, while setting a maximum-distance threshold between positive samples so that the clusters formed in feature space do not grow too large, which aids optimization. The algorithm reduces deep model training time, avoids the shortcomings of insufficient data on small datasets, and improves the accuracy of cross-camera person tracking. Comparative experiments on five benchmark datasets show that the improved algorithm achieves good results.
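A minimal sketch of a triplet loss with the extra positive-distance cap described above (the exact formulation in the paper may differ; margin and cap values are assumptions).

```python
import torch
import torch.nn.functional as F

def capped_triplet_loss(anchor, positive, negative, margin=0.3, pos_cap=0.5):
    d_ap = F.pairwise_distance(anchor, positive)   # pull same IDs together
    d_an = F.pairwise_distance(anchor, negative)   # push different IDs apart
    triplet = F.relu(d_ap - d_an + margin)
    # Extra term: penalise positive pairs farther apart than pos_cap,
    # keeping each identity's cluster in feature space compact.
    compact = F.relu(d_ap - pos_cap)
    return (triplet + compact).mean()

# Toy usage with random 128-D embeddings for a batch of 8 triplets.
a, p, n = (torch.randn(8, 128) for _ in range(3))
print(capped_triplet_loss(a, p, n).item())
```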

17.
We develop hierarchical, probabilistic models for objects, the parts composing them, and the visual scenes surrounding them. Our approach couples topic models originally developed for text analysis with spatial transformations, and thus consistently accounts for geometric constraints. By building integrated scene models, we may discover contextual relationships, and better exploit partially labeled training images. We first consider images of isolated objects, and show that sharing parts among object categories improves detection accuracy when learning from few examples. Turning to multiple object scenes, we propose nonparametric models which use Dirichlet processes to automatically learn the number of parts underlying each object category, and objects composing each scene. The resulting transformed Dirichlet process (TDP) leads to Monte Carlo algorithms which simultaneously segment and recognize objects in street and office scenes.

18.
We propose an efficient approach for authoring dynamic and realistic waterfall scenes based on an acquired video sequence. Traditional video based techniques generate new images by synthesizing 2D samples, i.e., texture sprites chosen from a video sequence. However, they are limited to one fixed viewpoint and cannot provide arbitrary walkthrough into 3D scenes. Our approach extends this scheme by synthesizing dynamic 2D texture sprites and projecting them into 3D space. We first generate a set of basis texture sprites, which capture the representative appearance and motions of waterfall scenes contained in the video sequence. To model the shape and motion of a new waterfall scene, we interactively construct a set of flow lines taking account of physical principles. Along each flow line, the basis texture sprites are manipulated and animated dynamically, yielding a sequence of dynamic texture sprites in 3D space. These texture sprites are displayed using the point splatting technique, which can be accelerated efficiently by graphics hardware. By choosing varied basis texture sprites, waterfall scenes with different appearance and shapes can be conveniently simulated. The experimental results demonstrate that our approach achieves realistic effects and real-time frame rates on consumer PC platforms. Copyright © 2006 John Wiley & Sons, Ltd.

19.
20.