Found 20 similar documents (search time: 46 ms)
1.
Motion detection with nonstationary background   (cited 4 times: 0 self-citations, 4 by others)
Abstract. This paper proposes a new background subtraction method for detecting moving foreground objects from a nonstationary background.
While background subtraction has traditionally worked well for a stationary background, the same cannot be implied for a nonstationary
viewing sensor. To a limited extent, motion compensation for the nonstationary background can be applied. However, in practice,
it is difficult to realize the motion compensation to sufficient pixel accuracy, and the traditional background subtraction
algorithm will fail for a moving scene. The problem is further complicated when the moving target to be detected/tracked is
small, since the pixel error in motion that is compensating the background will subsume the small target. A spatial distribution
of Gaussians (SDG) model is proposed to deal with moving object detection having motion compensation that is only approximately
extracted. The distribution of each background pixel is temporally and spatially modeled. Based on this statistical model,
a pixel in the current frame is then classified as belonging to the foreground or background. For this system to perform under
lighting and environmental changes over an extended period of time, the background distribution must be updated with each
incoming frame. A new background restoration and adaptation algorithm is developed for the nonstationary background. Test
cases involving the detection of small moving objects within a highly textured background and with a pan-tilt tracking system
are demonstrated successfully.
Received: 30 July 2001 / Accepted: 20 April 2002
Correspondence to: Chin-Seng Chua
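A minimal sketch of the per-pixel statistical test underlying this kind of background subtraction, assuming a single temporal Gaussian per pixel (the paper's SDG model additionally spreads each distribution spatially to absorb residual motion-compensation error; the learning rate and threshold below are illustrative, not the paper's values):

```python
import numpy as np

def update_background(mean, var, frame, alpha=0.05):
    """Running per-pixel Gaussian update of the background model."""
    mean = (1 - alpha) * mean + alpha * frame
    var = (1 - alpha) * var + alpha * (frame - mean) ** 2
    return mean, var

def classify_foreground(mean, var, frame, k=2.5):
    """Pixels more than k standard deviations from the mean are foreground."""
    return np.abs(frame - mean) > k * np.sqrt(var)

# Toy example: a static background with one bright "small target" pixel.
rng = np.random.default_rng(0)
mean = np.full((8, 8), 100.0)
var = np.full((8, 8), 4.0)
frame = mean + rng.normal(0, 1, (8, 8))
frame[2, 3] = 200.0                       # the small moving target
mask = classify_foreground(mean, var, frame)
print(mask[2, 3], int(mask.sum()))        # True 1
```

Updating the model with each incoming frame (as the abstract describes) is then a loop over `update_background` followed by `classify_foreground`.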
2.
Kolmogorov V, Criminisi A, Blake A, Cross G, Rother C 《IEEE transactions on pattern analysis and machine intelligence》2006,28(9):1480-1492
This paper describes models and algorithms for the real-time segmentation of foreground from background layers in stereo video sequences. Automatic separation of layers from color/contrast or from stereo alone is known to be error-prone. Here, color, contrast, and stereo matching information are fused to infer layers accurately and efficiently. The first algorithm, layered dynamic programming (LDP), solves stereo in an extended six-state space that represents both foreground/background layers and occluded regions. The stereo-match likelihood is then fused with a contrast-sensitive color model that is learned on-the-fly and stereo disparities are obtained by dynamic programming. The second algorithm, layered graph cut (LGC), does not directly solve stereo. Instead, the stereo match likelihood is marginalized over disparities to evaluate foreground and background hypotheses and then fused with a contrast-sensitive color model like the one used in LDP. Segmentation is solved efficiently by ternary graph cut. Both algorithms are evaluated with respect to ground truth data and found to have similar performance, substantially better than either stereo or color/contrast alone. However, their characteristics with respect to computational efficiency are rather different. The algorithms are demonstrated in the application of background substitution and shown to give good quality composite video output.
3.
This paper proposes an algorithm for converting 2D badminton-match video into 3D video. In this kind of video the foreground is the part of greatest interest, and accurately extracting the foreground objects from the background is the key to obtaining the depth map. An improved graph-cut algorithm is used to extract the foreground, and a background depth model is constructed from the scene structure to obtain the background depth map. On top of the background depth map, depth values are assigned to the foreground objects according to their distance from the camera, yielding the foreground depth map. The background and foreground depth maps are then fused into a complete depth map. Finally, depth-image-based rendering (DIBR) is used to synthesize the stereoscopic image pairs for 3D display. Experimental results show that the generated stereo pairs exhibit a good 3D effect.
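The depth-map fusion and view synthesis steps can be sketched as follows; a minimal sketch, assuming a row-wise background depth ramp, a constant-depth foreground, and toy camera constants (none of these values come from the paper):

```python
import numpy as np

def fuse_depth(bg_depth, fg_depth, fg_mask):
    """Use the foreground depth wherever the segmentation mask is set."""
    return np.where(fg_mask, fg_depth, bg_depth)

def dibr_shift(image, depth, baseline=0.01, focal=500.0):
    """Toy depth-image-based rendering: shift each pixel horizontally by a
    disparity proportional to inverse depth; disocclusion holes stay zero."""
    h, w = depth.shape
    out = np.zeros_like(image)
    disparity = np.round(baseline * focal / depth).astype(int)
    for y in range(h):
        for x in range(w):
            nx = x - disparity[y, x]
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
    return out

bg = np.tile(np.linspace(10.0, 4.0, 6)[:, None], (1, 6))  # court recedes row by row
fg = np.full((6, 6), 3.0)                                  # player is nearer the camera
mask = np.zeros((6, 6), dtype=bool)
mask[2:4, 2:4] = True                                      # graph-cut foreground mask
depth = fuse_depth(bg, fg, mask)
view = dibr_shift(np.ones((6, 6)), depth)
print(depth[2, 2], depth[0, 0])                            # 3.0 10.0
```

A production DIBR pipeline would additionally fill the disocclusion holes; this sketch only illustrates the fusion and warp.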
4.
We propose a system that simultaneously utilizes the stereo disparity and optical flow information of real-time stereo grayscale
multiresolution images for the recognition of objects and gestures in human interactions. For real-time calculation of the
disparity and optical flow information of a stereo image, the system first creates pyramid images using a Gaussian filter.
The system then determines the disparity and optical flow of a low-density image and extracts attention regions in a high-density
image. The three foremost regions are recognized using higher-order local autocorrelation features and linear discriminant
analysis. As the recognition method is view based, the system can process the face and hand recognitions simultaneously in
real time. The recognition features are independent of parallel translations, so the system can use unstable extractions from
stereo depth information. We demonstrate that the system can discriminate the users, monitor the basic movements of the user,
smoothly learn an object presented by users, and can communicate with users by hand signs learned in advance.
Received: 31 January 2000 / Accepted: 1 May 2001
Correspondence to: I. Yoda (e-mail: yoda@ieee.org, Tel.: +81-298-615941, Fax: +81-298-613313)
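The pyramid construction described above can be sketched with a separable 3-tap blur-and-subsample step (the kernel weights and pyramid depth are illustrative assumptions, not the paper's):

```python
import numpy as np

def gaussian_blur_1d(row, kernel=(0.25, 0.5, 0.25)):
    """3-tap Gaussian smoothing of one row or column."""
    return np.convolve(row, kernel, mode="same")

def pyramid_level(img):
    """Blur with a separable Gaussian, then subsample by two in each axis."""
    blurred = np.apply_along_axis(gaussian_blur_1d, 0, img)
    blurred = np.apply_along_axis(gaussian_blur_1d, 1, blurred)
    return blurred[::2, ::2]

def build_pyramid(img, levels=3):
    """Coarse-to-fine image pyramid: level 0 is the input image."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(pyramid_level(pyr[-1]))
    return pyr

pyr = build_pyramid(np.ones((16, 16)))
print([p.shape for p in pyr])  # [(16, 16), (8, 8), (4, 4)]
```

Disparity and optical flow are then computed on the coarse (low-density) level, and attention regions are refined on the fine level, as the abstract describes.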
5.
A system to navigate a robot into a ship structure   (cited 1 time: 0 self-citations, 1 by others)
Markus Vincze, Minu Ayromlou, Carlos Beltran, Antonios Gasteratos, Simon Hoffgaard, Ole Madsen, Wolfgang Ponweiser, Michael Zillich 《Machine Vision and Applications》2003,14(1):15-25
Abstract. A prototype system has been built to navigate a walking robot into a ship structure. The 8-legged robot is equipped with
an active stereo head. From the CAD-model of the ship good view points are selected, such that the head can look at locations
with sufficient edge features, which are extracted automatically for each view. The pose of the robot is estimated from the
features detected by two vision approaches. One approach searches in stereo images for junctions and measures the 3-D position.
The other method uses a monocular image and tracks 2-D edge features. Robust tracking is achieved with a method of edge projected
integration of cues (EPIC). Two inclinometers are used to stabilise the head while the robot moves. The results of the final
demonstration to navigate the robot within centimetre accuracy are given.
6.
Zhengyou Zhang 《Machine Vision and Applications》1997,10(1):27-34
This paper describes a complete stereovision system, which was originally developed for planetary applications, but can be
used for other applications such as object modeling. A new effective on-site calibration technique has been developed, which
can make use of the information from the surrounding environment as well as the information from the calibration apparatus.
A correlation-based stereo algorithm is used, which can produce sufficiently dense range maps with an algorithmic structure
for fast implementations. A technique based on iterative closest-point matching has been developed for registration of successive
depth maps and computation of the displacements between successive positions. A statistical method based on the distance distribution
is integrated into this registration technique, which allows us to deal with such important problems as outliers, occlusion,
appearance, and disappearance. Finally, the registered maps are expressed in the same coordinate system and are fused, erroneous
data are eliminated through consistency checking, and a global digital elevation map is built incrementally.
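The registration step based on iterative closest-point matching can be sketched in 2-D as follows (the paper works on 3-D depth maps and adds a distance-distribution test for outliers and occlusion, which this minimal sketch omits):

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation and translation aligning src to dst (SVD/Kabsch)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=10):
    """Iterative closest point: match each source point to its nearest
    destination point, solve for the rigid motion, apply it, and repeat."""
    cur = src.copy()
    for _ in range(iters):
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
        R, t = best_rigid_transform(cur, dst[d2.argmin(axis=1)])
        cur = cur @ R.T + t
    return cur

# Recover a small, noise-free rigid displacement between two point sets.
theta = 0.05
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 1.0]])
dst = src @ R_true.T + np.array([0.05, -0.02])
aligned = icp(src, dst)
print(np.abs(aligned - dst).max() < 1e-6)  # True
```

With the displacement small relative to point spacing, the nearest-neighbour correspondences are correct from the first iteration and the alignment is exact up to floating-point error.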
7.
Yoshimitsu Aoki Shuji Hashimoto Masahiko Terajima Akihiko Nakasima 《The Visual computer》2001,17(2):121-131
We propose a prototype of a facial surgery simulation system for surgical planning and the prediction of facial deformation.
We use a physics-based human head model. Our head model has a 3D hierarchical structure that consists of soft tissue and the
skull, constructed from the exact 3D CT patient data. Anatomic points measured on X-ray images from both frontal and side
views are used to fit the model to the patient's head.
The purpose of this research is to analyze the relationship between changes of mandibular position and facial morphology
after orthognathic surgery, and to simulate the exact postoperative 3D facial shape. In the experiment, we used our model
to predict the facial shape after surgery for patients with mandibular prognathism. Comparing the simulation results and the
actual facial images after the surgery shows that the proposed method is practical.
8.
We present a scheme for reliable and accurate surface reconstruction from stereoscopic images containing only fine texture
and no stable high-level features. Partial shape information is used to improve surface computation: first by fitting an approximate,
global, parametric model, and then by refining this model via local correspondence processes. This scheme eliminates the window
size selection problem in existing area-based stereo correspondence schemes. These ideas are integrated in a practical vision
system that is being used by environmental scientists to study wind erosion of bulk material such as coal ore being transported
in open rail cars.
Received: 14 August 1995 / Accepted: 27 May 1997
9.
Abstract. In this paper, we present a system that was developed for the European Space Agency (ESA) for the support of planetary exploration.
The system that is sent to the planetary surface consists of a rover and a lander. The lander contains a stereo head equipped
with a pan-tilt mechanism. This vision system is used both for modeling the terrain and for localization of the rover. Both
tasks are necessary for the navigation of the rover. Due to the stress that occurs during the flight, a recalibration of the
stereo-vision system is required once it is deployed on the planet. Practical limitations make it unfeasible to use a known
calibration pattern for this purpose; therefore, a new calibration procedure had to be developed that could work on images
of the planetary environment. This automatic procedure recovers the relative orientation of the cameras and the pan and tilt
axes, as well as the exterior orientation for all the images. The same images are subsequently used to reconstruct the 3-D
structure of the terrain. For this purpose, a dense stereo-matching algorithm is used that (after rectification) computes
a disparity map. Finally, all the disparity maps are merged into a single digital terrain model. In this paper, a simple and
elegant procedure is proposed that achieves that goal. The fact that the same images can be used for both calibration and
3-D reconstruction is important, since, in general, the communication bandwidth is very limited. In addition to navigation
and path planning, the 3-D model of the terrain is also used for virtual-reality simulations of the mission, wherein the model
is texture mapped with the original images. The system has been implemented, and the first tests on the ESA planetary terrain
testbed were successful.
10.
Automatic high-resolution optoelectronic photogrammetric 3D surface geometry acquisition system   (cited 1 time: 0 self-citations, 1 by others)
A fast, high-resolution, automatic, non-contact 3D surface geometry measuring system using a photogrammetric optoelectronic
technique based on lateral-photoeffect diode detectors has been developed. Designed for the acquisition of surface geometries
such as machined surfaces, biological surfaces, and deformed parts, the system can be used in design, manufacturing, inspection,
and range finding. A laser beam is focused and scanned onto the surface of the object to be measured. Two cameras in stereo
positions capture the reflected light from the surface at 10 kHz. Photogrammetric triangulation quickly transforms the pair
of 2D signals created by the camera detectors into 3D coordinates of the light spot. Because only one small spot on the object
is illuminated at a time, the stereo correspondence problem is solved in real time. The resolution is determined by a 12-bit
A/D converter and can be improved up to 25 600 × 25 600 by oversampling. The irregular 3D data can be regularized for use with image-based algorithms.
Received: 8 October 1996 / Accepted: 3 February 1997
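Because only one spot is lit at a time, the triangulation reduces to a single-point depth computation. A minimal sketch under a simplified rectified two-camera geometry (the focal length, baseline, and coordinates below are illustrative, not the system's):

```python
def triangulate(xl, xr, y, focal, baseline):
    """Rectified two-camera triangulation of a single light spot.
    Disparity d = xl - xr gives depth Z = f * B / d; X and Y then follow
    from similar triangles."""
    d = xl - xr
    Z = focal * baseline / d
    X = xl * Z / focal
    Y = y * Z / focal
    return X, Y, Z

# Spot seen at x = 20 px in the left detector and x = 10 px in the right.
X, Y, Z = triangulate(xl=20.0, xr=10.0, y=5.0, focal=500.0, baseline=0.1)
print(Z)  # 5.0
```

At a 10 kHz spot rate, this per-spot computation is what makes the real-time correspondence-free operation described above possible.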
11.
Practical volumetric sculpting   (cited 3 times: 1 self-citation, 2 by others)
This paper presents a sculpture metaphor for rapid shape prototyping. The sculpted shape is the isosurface of a spatially sampled scalar field. The user can deposit
material wherever he desires in space and then iteratively refine it, using a tool to add, remove, paint, or smooth some material.
We allow the use of free-form tools that can be designed inside the application. We also propose a technique to mimic local
deformations so that we can use the tool as a stamp to make imprints on an existing shape. We focus on the rendering quality
too, exploiting lighting variations and environment textures that simulate good-quality highlights on the surface. Both greatly
enhance the shape estimation, which is a crucial step in this iterative design process, in our opinion. The use of stereo
also greatly eases the understanding of spatial relationships. Our current implementation is based on GLUT and can run the
application both on Unix-based systems, such as Irix and Linux, and on Windows systems. We obtain interactive response times,
strongly related to the size of the tool. The performance issues and limitations are discussed.
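The add/remove tool operating on a sampled scalar field can be sketched as follows (the grid size, falloff profile, and isovalue are illustrative assumptions; the paper's free-form tools are more general than this spherical one):

```python
import numpy as np

def apply_tool(field, center, radius, strength=1.0, mode="add"):
    """Deposit or carve material: a smooth radial falloff is added to
    (or subtracted from) the sampled scalar field around the tool center."""
    zi, yi, xi = np.indices(field.shape)
    d = np.sqrt((zi - center[0]) ** 2 + (yi - center[1]) ** 2
                + (xi - center[2]) ** 2)
    falloff = strength * np.clip(1.0 - d / radius, 0.0, None)
    return field + falloff if mode == "add" else field - falloff

field = np.zeros((16, 16, 16))
field = apply_tool(field, center=(8, 8, 8), radius=5)
inside = field > 0.5      # the sculpted shape is the 0.5-isosurface
print(inside[8, 8, 8], inside[0, 0, 0])  # True False
```

Interactive response then hinges on updating only the voxels inside the tool's bounding box, which is why performance scales with tool size.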
12.
Katrin Franke, Mario Köppen 《International Journal on Document Analysis and Recognition》2001,3(4):218-231
Computer-based forensic handwriting analysis requires sophisticated methods for the pre-processing of digitized paper documents,
in order to provide high-quality digitized handwriting, which represents the original handwritten product as accurately as
possible. Due to the requirement of processing a huge amount of different document types, neither a standardized queue of
processing stages, fixed parameter sets nor fixed image operations are qualified for such pre-processing methods. Thus, we
present an open layered framework that covers adaptation abilities at the parameter, operator, and algorithm levels. Moreover,
an embedded module, which uses genetic programming, might generate specific filters for background removal on-the-fly. The
framework is understood as an assistance system for forensic handwriting experts and has been in use by the Bundeskriminalamt,
the federal police bureau in Germany, for two years. In the following, the layered framework will be presented, fundamental
document-independent filters for textured, homogeneous background removal and for foreground removal will be described, as
well as aspects of the implementation. Results of the framework-application will also be given.
Received July 12, 2000 / Revised October 13, 2000
13.
A connection between two hosts across a wide-area network may consist of many sessions over time, each called an incarnation. A connection is synchronized using a connection establishment protocol, based on a handshake mechanism, to allow reliable exchange of data. This paper identifies the precise level of handshake needed under different assumptions
on the nodes and on the network, using a formal model for connection management. In particular, the following parameters are
studied: the size of the memory at the nodes, the information retained between incarnations, and the existence of time constraints
on the network. Among the results we obtain are: (1) If both nodes have bounded memory, no incarnation management protocol
exists. (2) If the nodes have unbounded memory, then a two-way handshake incarnation management protocol exists. (3) If the
nodes have unbounded memory, and the server does not retain connection-specific information between incarnations, then a three-way
handshake incarnation management protocol exists. On the other hand, a two-way handshake incarnation management protocol does
not exist, even if some global information is retained. (4) If a bound on maximum packet lifetime (MPL) is known, then a two-way
handshake incarnation management protocol exists, in which the server does not retain connection-specific information between
incarnations.
Received: July 1995 / Accepted: July 1997
14.
The background-aware correlation filter algorithm models the target's background together with the foreground [1], training on negative samples that contain background information to obtain a highly robust kernelized correlation filter that separates the tracked target from the background. This monocular correlation-filter tracker achieves real-time tracking, but at the cost of losing depth. This paper proposes an algorithm that combines binocular stereo depth extraction with monocular background-aware correlation filtering. While maintaining real-time tracking, the algorithm computes disparity maps for selected frames of the video sequence using an improved binocular semi-global matching (SGM) stereo algorithm and feeds back the depth of the tracked target. Experimental results show that the algorithm runs in real time and localizes 3D coordinates accurately.
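The depth-feedback step, converting a disparity map into the tracked target's metric depth, can be sketched as follows (the bounding-box interface, camera constants, and median aggregation are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def target_depth(disparity, bbox, focal, baseline):
    """Median disparity inside the tracker's bounding box, converted to
    metric depth Z = f * B / d (the median tolerates a few failed matches)."""
    x, y, w, h = bbox
    patch = disparity[y:y + h, x:x + w]
    d = np.median(patch[patch > 0])   # ignore invalid (zero) disparities
    return focal * baseline / d

disp = np.zeros((10, 10))
disp[3:7, 3:7] = 25.0                 # disparities on the tracked target
disp[4, 4] = 0.0                      # one failed stereo match
print(target_depth(disp, (3, 3, 4, 4), focal=500.0, baseline=0.1))  # 2.0
```

Running this only on selected frames, as the abstract describes, keeps the combined tracker real-time while still reporting target depth.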
15.
Automatic mineral identification using evolutionary computation technology is discussed. Thin sections of mineral samples
are photographed digitally using a computer-controlled rotating polarizer stage on a petrographic microscope. A suite of image
processing functions is applied to the images. Filtered image data for identified mineral grains is then selected for use
as training data for a genetic programming system, which automatically synthesizes computer programs that identify these grains.
The evolved programs use a decision-tree structure that compares the mineral image values with one another, resulting in a thresholding
analysis of the multi-dimensional colour and textural space of the mineral images.
Received: 18 October 1999 / Accepted: 20 January 2001
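A hand-written stand-in for the kind of decision-tree program such a genetic programming system might evolve; the feature layout, thresholds, and mineral labels below are purely illustrative assumptions, not evolved results:

```python
def classify_grain(features):
    """Decision-tree classification of one grain: channel values are
    compared against thresholds and against each other."""
    r, g, b, tex = features          # colour channels and a texture score
    if r > g and r - b > 40:         # reddish-brown grains
        return "biotite" if tex > 0.5 else "hornblende"
    return "quartz" if tex < 0.2 else "feldspar"

print(classify_grain((180, 90, 60, 0.7)))  # biotite
```

The genetic programming system searches over exactly this space of comparisons and thresholds, using identified grains as training data.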
16.
Yin P, Criminisi A, Winn J, Essa I 《IEEE transactions on pattern analysis and machine intelligence》2011,33(1):30-42
This paper presents an automatic segmentation algorithm for video frames captured by a (monocular) webcam that closely approximates depth segmentation from a stereo camera. The frames are segmented into foreground and background layers that comprise a subject (participant) and other objects and individuals. The algorithm produces correct segmentations even in the presence of large background motion with a nearly stationary foreground. This research makes three key contributions: First, we introduce a novel motion representation, referred to as "motons," inspired by research in object recognition. Second, we propose estimating the segmentation likelihood from the spatial context of motion. The estimation is efficiently learned by random forests. Third, we introduce a general taxonomy of tree-based classifiers that facilitates both theoretical and experimental comparisons of several known classification algorithms and generates new ones. In our bilayer segmentation algorithm, diverse visual cues such as motion, motion context, color, contrast, and spatial priors are fused by means of a conditional random field (CRF) model. Segmentation is then achieved by binary min-cut. Experiments on many sequences of our videochat application demonstrate that our algorithm, which requires no initialization, is effective in a variety of scenes, and the segmentation results are comparable to those obtained by stereo systems.
17.
We present an autonomous mobile robot navigation system using stereo fish-eye lenses for navigation in an indoor structured
environment and for generating a model of the imaged scene. The system estimates the three-dimensional (3D) position of significant
features in the scene, and by estimating its relative position to the features, navigates through narrow passages and makes
turns at corridor ends. Fish-eye lenses are used to provide a large field of view, which images objects close to the robot
and helps in making smooth transitions in the direction of motion. Calibration is performed for the lens-camera setup and
the distortion is corrected to obtain accurate quantitative measurements. A vision-based algorithm that uses the vanishing
points of extracted segments from a scene in a few 3D orientations provides an accurate estimate of the robot orientation.
This is used, in addition to 3D recovery via stereo correspondence, to maintain the robot motion in a purely translational
path, as well as to remove the effects of any drifts from this path from each acquired image. Horizontal segments are used
as a qualitative estimate of change in the motion direction and correspondence of vertical segment provides precise 3D information
about objects close to the robot. Assuming detected linear edges in the scene as boundaries of planar surfaces, the 3D model
of the scene is generated. The robot system is implemented and tested in a structured environment at our research center.
Results from the robot navigation in real environments are presented and discussed.
Received: 25 September 1996 / Accepted: 20 October 1996
18.
Stefan Manegold, Peter A. Boncz, Martin L. Kersten 《The VLDB Journal: The International Journal on Very Large Data Bases》2000,9(3):231-246
In the past decade, advances in the speed of commodity CPUs have far out-paced advances in memory latency. Main-memory access
is therefore increasingly a performance bottleneck for many computer applications, including database systems. In this article,
we use a simple scan test to show the severe impact of this bottleneck. The insights gained are translated into guidelines
for database architecture, in terms of both data structures and algorithms. We discuss how vertically fragmented data structures
optimize cache performance on sequential data access. We then focus on equi-join, typically a random-access operation, and
introduce radix algorithms for partitioned hash-join. The performance of these algorithms is quantified using a detailed analytical
model that incorporates memory access cost. Experiments that validate this model were performed on the Monet database system.
We obtained exact statistics on events such as TLB misses and L1 and L2 cache misses by using hardware performance counters
found in modern CPUs. Using our cost model, we show how the carefully tuned memory access pattern of our radix algorithms
makes them perform well, which is confirmed by experimental results.
Received April 20, 2000 / Accepted June 23, 2000
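The core idea of the radix-partitioned hash-join, splitting both relations on the low-order bits of the join key so each cluster's hash table stays cache-resident, can be sketched as follows (this minimal single-pass sketch omits the multi-pass partitioning and cost model the paper develops):

```python
def radix_partition(tuples, bits, key=lambda t: t[0]):
    """Split tuples into 2**bits clusters on the low-order key bits."""
    parts = [[] for _ in range(1 << bits)]
    mask = (1 << bits) - 1
    for t in tuples:
        parts[key(t) & mask].append(t)
    return parts

def radix_hash_join(R, S, bits=2):
    """Partition both relations on the join key, then hash-join only the
    matching clusters, keeping each hash table small."""
    out = []
    for r_part, s_part in zip(radix_partition(R, bits),
                              radix_partition(S, bits)):
        table = {}
        for k, v in r_part:
            table.setdefault(k, []).append(v)
        for k, v in s_part:
            for rv in table.get(k, []):
                out.append((k, rv, v))
    return out

R = [(1, "a"), (2, "b"), (5, "c")]
S = [(1, "x"), (5, "y"), (3, "z")]
print(sorted(radix_hash_join(R, S)))  # [(1, 'a', 'x'), (5, 'c', 'y')]
```

Tuples with different low-order key bits can never join, so restricting probes to matching clusters changes the access pattern from random probes over one large table to sequential passes over many small ones, which is the cache behaviour the article quantifies.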
19.
Stereo Matching with Transparency and Matting   (cited 2 times: 2 self-citations, 0 by others)
This paper formulates and solves a new variant of the stereo correspondence problem: simultaneously recovering the disparities, true colors, and opacities of visible surface elements. This problem arises in newer applications of stereo reconstruction, such as view interpolation and the layering of real imagery with synthetic graphics for special effects and virtual studio applications. While this problem is intrinsically more difficult than traditional stereo correspondence, where only the disparities are being recovered, it provides a principled way of dealing with commonly occurring problems such as occlusions and the handling of mixed (foreground/background) pixels near depth discontinuities. It also provides a novel means for separating foreground and background objects (matting), without the use of a special blue screen. We formulate the problem as the recovery of colors and opacities in a generalized 3D (x, y, d) disparity space, and solve the problem using a combination of initial evidence aggregation followed by iterative energy minimization.
20.
Data overload is a generic and tremendously difficult problem that has only grown with each new wave of technological capabilities.
As a generic and persistent problem, three observations are in need of explanation: Why is data overload so difficult to address?
Why has each wave of technology exacerbated, rather than resolved, data overload? How are people, as adaptive responsible
agents in context, able to cope with the challenge of data overload? In this paper, first we examine three different characterisations
that have been offered to capture the nature of the data overload problem and how they lead to different proposed solutions.
As a result, we propose that (a) data overload is difficult because of the context sensitivity problem – meaning lies, not
in data, but in relationships of data to interests and expectations and (b) new waves of technology exacerbate data overload
when they ignore or try to finesse context sensitivity. The paper then summarises the mechanisms of human perception and cognition
that enable people to focus on the relevant subset of the available data despite the fact that what is interesting depends
on context. By focusing attention on the root issues that make data overload a difficult problem and on people’s fundamental
competence, we have identified a set of constraints that all potential solutions must meet. Notable among these constraints
is the idea that organisation precedes selectivity. These constraints point toward regions of the solution space that have
been little explored. In order to place data in context, designers need to display data in a conceptual space that depicts
the relationships, events and contrasts that are informative in a field of practice.