Similar Documents
20 similar documents found (search time: 31 ms)
1.
When an image is viewed at varying resolutions, it is known to create discrete perceptual jumps or transitions amid the continuous intensity changes. In this paper, we study a perceptual scale-space theory which differs from the traditional image scale-space theory in two aspects. (i) In representation, the perceptual scale-space adopts a full generative model. From a Gaussian pyramid it computes a sketch pyramid where each layer is a primal sketch representation (Guo et al. in Comput. Vis. Image Underst. 106(1):5–19, 2007)—an attribute graph whose elements are image primitives for the image structures. Each primal sketch graph generates the image in the Gaussian pyramid, and the changes between the primal sketch graphs in adjacent layers are represented by a set of basic and composite graph operators to account for the perceptual transitions. (ii) In computation, the sketch pyramid and graph operators are inferred, as hidden variables, from the images through Bayesian inference by a stochastic algorithm, in contrast to the deterministic transforms or feature extraction, such as computing zero-crossings, extremal points, and inflection points in the image scale-space. Studying the perceptual transitions under the Bayesian framework makes it convenient to use statistical modeling and learning tools for (a) modeling the Gestalt properties of the sketch graph, such as continuity and parallelism; (b) learning the most frequent graph operators, i.e., perceptual transitions, in image scaling; and (c) learning the prior probabilities of the graph operators conditioned on their local neighboring sketch graph structures. In experiments, we learn the parameters and decision thresholds through human experiments, and we show that the sketch pyramid is a more parsimonious representation than a multi-resolution Gaussian/wavelet pyramid. We also demonstrate an application to adaptive image display—showing a large image on a small screen (e.g., a PDA) through a selective tour of its image pyramid. In this application, the sketch pyramid provides a means for calculating the information gain of zooming into different areas of an image by counting the number of operators expanding the primal sketches, so that the maximum information is displayed in a given number of frames. A short version was published in ICCV05 (Wang et al. 2005).
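As a minimal sketch of the Gaussian pyramid from which the sketch pyramid above is computed, the following NumPy code builds one layer per scale by blur-and-downsample; the smoothing scale and factor-of-2 downsampling are illustrative assumptions, not the authors' exact settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, levels=4, sigma=1.0):
    """Build a Gaussian pyramid by repeated blur-and-downsample.

    Each layer would be the input to one primal-sketch computation in the
    framework above; sigma and the factor-of-2 downsampling are
    illustrative choices, not the paper's exact parameters.
    """
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        blurred = gaussian_filter(pyramid[-1], sigma)
        pyramid.append(blurred[::2, ::2])  # halve the resolution
    return pyramid
```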

2.
We present a novel region-based curve evolution algorithm which has three primary contributions: (i) non-parametric estimation of probability distributions using the recently developed NP windows method; (ii) an inequality-constrained least squares method to model the image histogram with a mixture of non-parametric probability distributions; and (iii) accommodation of the partial volume effect, which is primarily due to low-resolution images and often poses a significant challenge in medical image analysis (our primary application area). We first approximate the image intensity histogram as a mixture of non-parametric probability density functions (PDFs), justifying its use with respect to medical image analysis. The individual densities in the mixture are estimated using the recent NP windows PDF estimation method, which builds a continuous representation of discrete signals. A Bayesian framework is then formulated in which likelihood probabilities are given by the non-parametric PDFs and prior probabilities are calculated using an inequality-constrained least squares method. The non-parametric PDFs are then learnt and the segmentation solution is spatially regularised using a level sets framework. The log ratio of the posterior probabilities is used to drive the level set evolution. As background to our approach, we recall related developments in level set methods. Results are presented for a set of synthetic and natural images, as well as simulated and real medical images of various anatomical organs, and demonstrate the effectiveness of the proposed algorithm.
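A simplified sketch of one region-based level-set update driven by the log ratio of posterior probabilities, as described above; the PDFs here are generic callables (e.g., histogram estimates) rather than the NP windows method, and the step size `dt` and smoothing weight `mu` are illustrative assumptions.

```python
import numpy as np

def level_set_step(phi, image, p_in, p_out, dt=0.5, mu=0.1):
    """One explicit update of a region-based level set.

    phi   : level-set function (inside where phi > 0)
    p_in  : callable returning P(intensity | inside region)
    p_out : callable returning P(intensity | outside region)
    The data term is the log posterior ratio mentioned in the abstract;
    mu weights a curvature-like smoothing term (illustrative).
    """
    eps = 1e-12
    data = np.log(p_in(image) + eps) - np.log(p_out(image) + eps)
    gy, gx = np.gradient(phi)
    grad_mag = np.sqrt(gx**2 + gy**2) + eps
    # Divergence of the normalized gradient approximates curvature.
    curvature = np.gradient(gx / grad_mag)[1] + np.gradient(gy / grad_mag)[0]
    return phi + dt * (data * grad_mag + mu * curvature)
```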

3.
In this paper, a new systolic multiprocessor architecture for soft tomography algorithms is presented that exploits the intrinsic parallelism and hardware resources available in recent Field Programmable Gate Array (FPGA) architectures. Soft tomography algorithms such as Electrical Capacitance Tomography (ECT), Magnetic Inductance Tomography (MIT), and Electrical Impedance Tomography (EIT), while using different sensors and data acquisition modules, share common computation requirements consisting of intensive matrix multiplications and fast, frequent memory accesses. Using the variable bit-width, fixed-point multiplier array available in the DSP blocks, which cooperatively perform the partial matrix products with their associated Arithmetic and Logic Units (ALUs), together with the distributed memory available in the Stratix V FPGA, a dedicated scalable architecture is proposed to host the Landweber algorithm. The experimental results indicate that 16,949 frames of 32 × 32 pixels can be reconstructed per second when each matrix element is represented with 18 bits and a clock frequency of 400 MHz is used, which is more than enough for most process imaging applications. In addition, the accuracy of the image reconstruction using 18 bits per operand is found to be acceptable, as it exceeds 86%. Accuracy of up to 99% can be achieved with 36 bits per operand, which leads to an image reconstruction throughput of 1272 frames/s (for a 32 × 32 image).
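For reference, the Landweber iteration hosted by the architecture above is shown here as a plain NumPy sketch in floating point (not the 18- or 36-bit fixed-point arithmetic of the FPGA design); the iteration count, relaxation factor, and box constraint are illustrative assumptions.

```python
import numpy as np

def landweber(S, c, n_iter=200, relax=None):
    """Landweber iteration g_{k+1} = g_k + relax * S^T (c - S g_k).

    S : sensitivity matrix (measurements x pixels)
    c : measurement vector
    Floating-point reference; the FPGA design performs the same
    matrix products in fixed-point arithmetic.
    """
    if relax is None:
        # A step size below 2 / ||S||_2^2 guarantees convergence.
        relax = 1.0 / np.linalg.norm(S, 2) ** 2
    g = np.zeros(S.shape[1])
    for _ in range(n_iter):
        g = g + relax * S.T @ (c - S @ g)
        g = np.clip(g, 0.0, 1.0)  # common box constraint in soft tomography
    return g
```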

4.
An algorithm is proposed for the design of "on-line" learning controllers to control a discrete stochastic plant. The subjective probabilities of applying control actions from a finite set of allowable actions using a random strategy, after any plant-environment situation (called an "event") is observed, are modified through the algorithm. The subjective probability for the optimal action is proved to approach one with probability one for any observed event. The optimized performance index is the conditional expectation of the instantaneous performance evaluations with respect to the observed events and the allowable actions. The algorithm is described through two transformations, T1 and T2. After the "ordering transformation" T1 is applied on the estimates of the performance indexes of the allowable actions, the "learning transformation" T2 modifies the subjective probabilities. The cases of discrete and continuous features are considered. In the latter, the Potential Function Method is employed. The algorithm is compared with a linear reinforcement scheme and computer simulation results are presented.
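The abstract does not spell out the learning transformation T2, so the sketch below uses a standard linear reward-inaction update as a hypothetical stand-in, only to illustrate how subjective action probabilities can be pushed toward the action whose estimated performance index is currently best (the output of an ordering step like T1).

```python
import numpy as np

def reinforce_probabilities(p, best_action, step=0.1):
    """Move subjective action probabilities toward the best-ranked action.

    A generic linear reward-inaction rule, used here only as an
    illustrative stand-in for the paper's transformation T2; `step`
    is an assumed learning-rate parameter.
    """
    p = np.asarray(p, dtype=float)
    p = (1.0 - step) * p          # shrink all probabilities
    p[best_action] += step        # reinforce the current best action
    return p / p.sum()            # guard against rounding drift
```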

5.
Fractal image compression (FIC) is a very popular coding technique used in image/video applications due to its simplicity and superior performance. The major drawback of FIC is that it is a very time-consuming algorithm, especially when a full search is attempted; hence, it is very challenging to achieve real-time operation if the algorithm is implemented on general-purpose processors. In this paper, a new parallel architecture with a bit-width reduction scheme is implemented. The hardware is synthesized on an Altera Cyclone II FPGA and its architecture is optimized at the circuit level in order to achieve real-time operation. The performance of the proposed architecture is evaluated in terms of runtime, peak signal-to-noise ratio (PSNR) and compression efficiency. On average, a speedup of 3 was attainable through bit-width reduction while the PSNR was maintained at an acceptable level. Empirical results demonstrate that this firmware is competitive with other existing hardware, with PSNR averaging 29.9 dB, 5.82% compression efficiency, and a runtime equivalent to a video speed of 101 frames per second (fps).
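The quality figure quoted above (29.9 dB on average) is PSNR in its standard definition for 8-bit images; the snippet below is that textbook formula, not code from the paper.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two 8-bit images."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```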

6.
An autoadaptive neuro-fuzzy segmentation and edge detection architecture is presented. The system consists of a multilayer perceptron (MLP)-like network that performs image segmentation by adaptive thresholding of the input image using labels automatically pre-selected by a fuzzy clustering technique. The proposed architecture is feedforward, but unlike the conventional MLP the learning is unsupervised. The output status of the network is described as a fuzzy set. Fuzzy entropy is used as a measure of the error of the segmentation system as well as a criterion for determining potential edge pixels. The proposed system is capable of performing automatic multilevel segmentation of images, based solely on information contained in the image itself. No a priori assumptions whatsoever are made about the image (type, features, contents, stochastic model, etc.). Such a "universal" algorithm is most useful for applications that must work with different (and possibly initially unknown) types of images. The proposed system can be readily employed "as is," or as a basic building block of a more sophisticated and/or application-specific image segmentation algorithm. By monitoring the fuzzy entropy relaxation process, the system is able to detect edge pixels.
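A minimal sketch of the fuzzy-entropy error measure the system monitors, using the standard Shannon-style fuzzy entropy over per-pixel membership values; the membership values themselves come from the network's output layer and are not reproduced here.

```python
import numpy as np

def fuzzy_entropy(membership):
    """Average fuzzy entropy of an array of membership values in [0, 1].

    Entropy is maximal at membership 0.5 (the most ambiguous pixels)
    and zero at 0 or 1, which is why it can flag potential edge pixels.
    """
    m = np.clip(np.asarray(membership, dtype=float), 1e-12, 1 - 1e-12)
    h = -(m * np.log(m) + (1 - m) * np.log(1 - m))
    return h.mean() / np.log(2)  # normalize so the maximum is 1
```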

7.
In this paper we present an algorithm for creating region-adjacency-graph (RAG) pyramids on TurboNet, an experimental parallel computer system. Each level of these hierarchies of irregular tessellations is generated by independent stochastic processes that adapt the structure of the pyramid to the content of the image. RAGs can be used in multiresolution image analysis to extract connected components from labeled images. The implementation of the algorithm is discussed and performance results are presented for three different communication techniques supported by TurboNet's hybrid architecture. The results indicate that efficient communication is vital to good performance of the algorithm. © 1997 by John Wiley & Sons, Ltd.

8.
In this paper, a statistical model called statistical local spatial relations (SLSR) is presented as a novel learning model that combines spatial and statistical information for semantic image classification. The model is inspired by probabilistic Latent Semantic Analysis (PLSA) for text mining. In text analysis, PLSA is used to discover topics in a corpus using the bag-of-words document representation. In SLSR, we treat image categories as topics, so an image containing instances of multiple categories can be modeled as a mixture of topics. More significantly, SLSR introduces spatial relation information as a factor which is not present in PLSA. SLSR has rotation, scale, translation and affine invariance properties and can handle partial occlusion. Using the Dirichlet process and a variational Expectation-Maximization learning algorithm, SLSR is developed into an image classification algorithm. SLSR uses an unsupervised process which captures both spatial relations and statistical information simultaneously. Experiments on several standard data sets show that SLSR is a promising model for semantic image classification problems.

Dongfeng Han received the B.Sc. (2002) and M.S. (2005) degrees in computer science and technology from Jilin University, Changchun, P. R. China. Since 2005 he has been pursuing the Ph.D. degree in computer science and technology at Jilin University. His research interests include computer vision, image processing, machine learning and pattern recognition. Wenhui Li received the Ph.D. degree in computer science from Jilin University in 1996. He is now a professor at Jilin University. His research interests include computer vision, computer graphics and virtual reality. Zongcheng Li is an undergraduate student at Shandong University of Technology, P. R. China. His research interests include computer vision and image processing.

9.
In this paper, an information-theoretic approach for multimodal image registration is presented. In the proposed approach, image registration is carried out by maximizing a Tsallis entropy-based divergence using a modified simultaneous perturbation stochastic approximation algorithm. This divergence measure achieves its maximum value when the conditional intensity probabilities of the transformed target image given the reference image are degenerate distributions. Experimental results are provided to demonstrate the registration accuracy of the proposed approach in comparison to existing entropic image alignment techniques. The feasibility of the proposed algorithm is demonstrated on medical images from magnetic resonance imaging, computed tomography, and positron emission tomography.
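For context, the Tsallis entropy underlying the divergence measure above is shown below for a discrete distribution; the entropic index `q = 1.5` is only an illustrative choice, not the paper's setting.

```python
import numpy as np

def tsallis_entropy(p, q=1.5):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1).

    Reduces to Shannon entropy as q -> 1; q = 1.5 is illustrative.
    """
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if abs(q - 1.0) < 1e-9:
        return -np.sum(p * np.log(p))
    return (1.0 - np.sum(p ** q)) / (q - 1.0)
```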

10.
11.
In this paper we present a novel hardware architecture for real-time image compression implementing a fast, searchless iterated function system (SIFS) fractal coding method. In the proposed method and corresponding hardware architecture, domain blocks are fixed to a spatially neighboring area of range blocks in a manner similar to that given by Furao and Hasegawa. A quadtree structure, covering from 32 × 32 blocks down to 2 × 2 blocks, and even to single pixels, is used for partitioning. Coding of 2 × 2 blocks and single pixels is unique among current fractal coders. The hardware architecture contains units for domain construction, zig-zag transforms, range and domain mean computation, and a parallel domain-range match capable of concurrently generating a fractal code for all quadtree levels. With this efficient, parallel hardware architecture, the fractal encoding speed is improved dramatically. Additionally, attained compression performance remains comparable to traditional search-based and other searchless methods. Experimental results, with the proposed hardware architecture implemented on an Altera APEX20K FPGA, show that the fractal encoder can encode a 512 × 512 × 8 image in approximately 8.36 ms operating at 32.05 MHz. Therefore, this architecture is seen as a feasible solution to real-time fractal image compression.
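The core domain-range match in fractal coding fits a contrast `s` and offset `o` so that `s*D + o` approximates the range block in the least-squares sense; the sketch below is the textbook computation, not the paper's parallel hardware datapath, and assumes the domain block has already been downsampled to the range size.

```python
import numpy as np

def match_domain_to_range(domain, rng):
    """Least-squares contrast s and offset o minimizing ||s*D + o - R||^2.

    Returns (s, o, error) for same-sized domain and range blocks.
    """
    d = domain.astype(float).ravel()
    r = rng.astype(float).ravel()
    d_zero = d - d.mean()
    denom = np.dot(d_zero, d_zero)
    s = 0.0 if denom == 0 else np.dot(d_zero, r - r.mean()) / denom
    o = r.mean() - s * d.mean()
    err = np.sum((s * d + o - r) ** 2)
    return s, o, err
```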

12.
In this paper we present a Bayesian framework for parsing images into their constituent visual patterns. The parsing algorithm optimizes the posterior probability and outputs a scene representation as a parsing graph, in a spirit similar to parsing sentences in speech and natural language. The algorithm constructs the parsing graph and re-configures it dynamically using a set of moves, which are mostly reversible Markov chain jumps. This computational framework integrates two popular inference approaches—generative (top-down) methods and discriminative (bottom-up) methods. The former formulates the posterior probability in terms of generative models for images defined by likelihood functions and priors. The latter computes discriminative probabilities based on a sequence (cascade) of bottom-up tests/filters. In our Markov chain algorithm design, the posterior probability, defined by the generative models, is the invariant (target) probability of the Markov chain, and the discriminative probabilities are used to construct proposal probabilities to drive the Markov chain. Intuitively, the bottom-up discriminative probabilities activate top-down generative models. In this paper, we focus on two types of visual patterns—generic visual patterns, such as texture and shading, and object patterns, including human faces and text. These types of patterns compete and cooperate to explain the image, and so image parsing unifies image segmentation, object detection, and recognition; if we use generic visual patterns only, then image parsing reduces to image segmentation (Tu and Zhu, 2002, IEEE Trans. PAMI, 24(5):657–673). We illustrate our algorithm on natural images of complex city scenes and show examples where image segmentation can be improved by allowing object-specific knowledge to disambiguate low-level segmentation cues, and conversely where object detection can be improved by using generic visual patterns to explain away shadows and occlusions.
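At the heart of the Markov chain design described above is a Metropolis-Hastings acceptance test in which proposal probabilities drive reversible jumps between parses. The sketch below shows only that generic acceptance step, with the bottom-up discriminative proposals abstracted as precomputed log-probabilities; it is not the authors' full set of parsing-graph moves.

```python
import numpy as np

def mh_accept(log_post_current, log_post_proposed,
              log_q_forward, log_q_backward,
              rng=np.random.default_rng()):
    """Metropolis-Hastings acceptance for one reversible jump.

    log_post_*     : log posterior of the current / proposed parse
    log_q_forward  : log proposal probability of current -> proposed
    log_q_backward : log proposal probability of the reverse move
    Returns True if the proposed parse should be accepted.
    """
    log_alpha = (log_post_proposed - log_post_current
                 + log_q_backward - log_q_forward)
    return np.log(rng.uniform()) < min(0.0, log_alpha)
```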

13.
14.
In this paper, we propose a parallel algorithm for data classification and apply it to the segmentation of Magnetic Resonance Images (MRI). The studied classification method is the well-known c-means method. Parallel architectures are introduced into the classification task in order to reduce the running time of the corresponding algorithms, so that classification can serve as a fast pre-processing step. The proposed algorithm is designed to be implemented on a parallel machine, the reconfigurable mesh computer (RMC). The image of size m × n to be processed must be stored on an RMC of the same size, one pixel per processing element (PE).
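A sequential reference of the hard c-means (k-means) classification that the RMC performs in parallel, one pixel per processing element; the number of clusters, iteration count, and intensity-only feature are illustrative assumptions.

```python
import numpy as np

def c_means(pixels, c=3, n_iter=20, seed=0):
    """Hard c-means (k-means) on a flat array of pixel intensities.

    Sequential reference of the classification step; c and n_iter are
    illustrative choices, not the paper's configuration.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(pixels, dtype=float).ravel()
    centers = rng.choice(x, size=c, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        for k in range(c):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean()
    return labels, centers
```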

15.
This paper presents a study of the parallelism of a Principal Component Analysis (PCA) algorithm and its adaptation to a manycore MPPA (Massively Parallel Processor Array) architecture, which gathers 256 cores distributed among 16 clusters. This study focuses on porting hyperspectral image processing onto manycore platforms by optimizing the processing to fulfill real-time constraints fixed by the image capture rate of the hyperspectral sensor. Real time is a challenging objective for hyperspectral image processing, as hyperspectral images consist of extremely large volumes of data, and this problem is often sidestepped by reducing image size before the processing itself starts. To tackle the challenge, this paper proposes an analysis of the intrinsic parallelism of the different stages of the PCA algorithm with the objective of exploiting the parallelization possibilities offered by an MPPA manycore architecture. Furthermore, the impact on internal communication when increasing the level of parallelism is also analyzed. Experimenting with medical images obtained from two different surgical use cases, an average speedup of 20 is achieved. Internal communications are shown to rapidly become the bottleneck that limits the achievable speedup offered by the PCA parallelization. As a result of this study, PCA processing time is reduced to less than 6 s, a time compatible with the targeted brain surgery application, which requires one frame per minute.
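A single-core NumPy reference of the PCA stages whose parallelism the study analyzes (mean removal, covariance, eigendecomposition and projection); the number of retained components is an illustrative assumption, and the MPPA port distributes these same stages across clusters.

```python
import numpy as np

def pca_reduce(cube, n_components=3):
    """PCA on a hyperspectral cube of shape (rows, cols, bands).

    Sequential reference of the stages discussed above; n_components
    is an illustrative choice.
    """
    rows, cols, bands = cube.shape
    x = cube.reshape(-1, bands).astype(float)
    x -= x.mean(axis=0)                       # band-wise mean removal
    cov = x.T @ x / (x.shape[0] - 1)          # bands x bands covariance
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n_components]  # leading principal directions
    return (x @ top).reshape(rows, cols, n_components)
```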

16.
Evolvable hardware is a system that modifies its architecture and behavior to adapt to changes in the environment. It is formed by reconfigurable processing elements driven by an evolutionary algorithm. In this paper, we study a reconfigurable HexCell-based systolic array architecture for evolvable systems on FPGA. HexCell is a processing element with a tileable, hexagonal-shaped cell for reconfigurable systolic arrays on FPGAs. The cell has three input ports that feed into an internal functional unit connected to three output ports. The functional unit is configured using dynamic partial reconfiguration (DPR), whereas the output ports are configured using a virtual reconfiguration circuit (VRC). Our proposed architecture combines the merits of both DPR and VRC to achieve fast reconfiguration and accelerated evolution. A HexCell-based 4 × 4 array was implemented on FPGA and utilized 32.5% of the look-up tables, 31.3% of the registers, and 1.4% of the block RAMs of an Artix-7 (XC7Z020), while a same-size conventional array consumed 8.7%, 5.1%, and 20.7% of the same FPGA, respectively. As a case study, we used an adaptive image filter as a test application. Results showed that the best filters generated by our proposed architecture were generally fitter than those generated by the conventional state-of-the-art systolic array on the selected application. Also, performing 900,000 evaluations on the HexCell array was 2.6× faster than on the conventional one.

17.
We present a hierarchical architecture and learning algorithm for visual recognition and other visual inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Using properties of biological vision for guidance, we posit a stochastic generative world model and from it develop a simplified world model (SWM) based on a tractable variational approximation that is designed to enforce sparse coding. Recent developments in computational methods for learning overcomplete representations (Lewicki & Sejnowski, 2000; Teh, Welling, Osindero, & Hinton, 2003) suggest that overcompleteness can be useful for visual tasks, and we use an overcomplete dictionary learning algorithm (Kreutz-Delgado et al., 2003) as a preprocessing stage to produce accurate, sparse codings of images. Inference is performed by constructing a dynamic multilayer network with feedforward, feedback, and lateral connections, which is trained to approximate the SWM. Learning is done with a variant of the back-propagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient, we take advantage of the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.
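The overcomplete sparse coding used as a preprocessing stage above can be approximated with a generic iterative shrinkage-thresholding (ISTA) solver; the sketch below is a standard stand-in and not the Kreutz-Delgado et al. dictionary-learning algorithm itself, with `lam` and `n_iter` as illustrative parameters.

```python
import numpy as np

def sparse_code_ista(x, D, lam=0.1, n_iter=100):
    """Sparse code z minimizing 0.5*||x - D z||^2 + lam*||z||_1 via ISTA.

    D is an overcomplete dictionary (signal_dim x n_atoms).  A generic
    shrinkage solver standing in for the cited dictionary-learning method.
    """
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)
        z = z - step * grad
        z = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return z
```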

18.
An algorithm is proposed for the design of "on-line" learning controllers to control a discrete stochastic plant. The subjective probabilities of applying control actions from a finite set of allowable actions using a random strategy, after any plant-environment situation (called an "event") is observed, are modified through the algorithm. The subjective probability for the optimal action is proved to approach one with probability one for any observed event. The optimized performance index is the conditional expectation of the instantaneous performance evaluations with respect to the observed events and the allowable actions. The algorithm is described through two transformations, T1 and T2. After the "ordering transformation" T1 is applied on the estimates of the performance indexes of the allowable actions, the "learning transformation" T2 modifies the subjective probabilities. The cases of discrete and continuous features are considered. In the latter, the Potential Function Method is employed. The algorithm is compared with a linear reinforcement scheme and computer simulation results are presented.

19.
We consider stochastic neural networks, the objective of which is robust prediction for spatial control. We develop neural structures and operations, in which the representations of the environment are preprocessed and provided in quantized format to the prediction layer, and in which the response of each neuron is binary. We also identify the pertinent stochastic network parameters and subsequently develop a supervised learning algorithm for them. The on-line learning algorithm is based on the Kullback-Leibler performance criterion, it induces backpropagation, and it guarantees fast convergence to the prediction probabilities induced by the environment, with probability one.

20.
This paper proposes a computer-aided diagnosis tool for the early detection of atherosclerosis. This pathology is responsible for major cardiovascular diseases, which are the main cause of death worldwide. Among preventive measures, the intima-media thickness (IMT) of the common carotid artery stands out as an early indicator of atherosclerosis and cardiovascular risk. In particular, IMT is evaluated by means of ultrasound scans. Usually, during the radiological examination, the specialist detects the optimal measurement area, identifies the layers of the arterial wall and manually marks pairs of points on the image to estimate the thickness of the artery. This manual procedure therefore entails subjectivity and variability in the IMT evaluation. Instead, this article suggests a fully automatic segmentation technique for ultrasound images of the common carotid artery. The proposed methodology is based on machine learning and artificial neural networks for the recognition of IMT intensity patterns in the images. For this purpose, a deep learning strategy has been developed to obtain abstract and efficient data representations by means of auto-encoders with multiple hidden layers. In particular, the considered deep architecture has been designed under the concept of the extreme learning machine (ELM). The correct identification of the arterial layers is achieved in a totally user-independent and repeatable manner, which not only improves the IMT measurement in daily clinical practice but also facilitates clinical research. A database consisting of 67 ultrasound images has been used in the validation of the suggested system, in which the resulting automatic contours for each image have been compared with the average of four manual segmentations performed by two different observers (ground truth). Specifically, the IMT measured by the proposed algorithm is 0.625 ± 0.167 mm (mean ± standard deviation), whereas the corresponding ground-truth value is 0.619 ± 0.176 mm. Thus, our method shows a difference between automatic and manual measures of only 5.79 ± 34.42 μm. Furthermore, different quantitative evaluations reported in this paper indicate that this procedure outperforms other methods presented in the literature.
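A minimal sketch of a single-hidden-layer extreme learning machine, the building block that architectures like the one above stack: hidden weights are random and fixed, and only the output weights are solved in closed form. The hidden-layer size and sigmoid activation are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def train_elm(X, Y, n_hidden=100, seed=0):
    """Train a single-hidden-layer extreme learning machine.

    Hidden weights W and biases b are random and fixed; only the output
    weights beta are learned, via a least-squares (pseudoinverse) solve.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # hidden-layer activations
    beta = np.linalg.pinv(H) @ Y              # closed-form output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```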
