期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A new feature selection method for Gaussian mixture clustering

Hong Zeng Author Vitae Author Vitae 《Pattern recognition》2009,42(2):243-250

With the wide applications of Gaussian mixture clustering, e.g., in semantic video classification [H. Luo, J. Fan, J. Xiao, X. Zhu, Semantic principal video shot classification via mixture Gaussian, in: Proceedings of the 2003 International Conference on Multimedia and Expo, vol. 2, 2003, pp. 189-192], it is a nontrivial task to select the useful features in Gaussian mixture clustering without class labels. This paper, therefore, proposes a new feature selection method, through which not only the most relevant features are identified, but the redundant features are also eliminated so that the smallest relevant feature subset can be found. We integrate this method with our recently proposed Gaussian mixture clustering approach, namely rival penalized expectation-maximization (RPEM) algorithm [Y.M. Cheung, A rival penalized EM algorithm towards maximizing weighted likelihood for density mixture clustering with automatic model selection, in: Proceedings of the 17th International Conference on Pattern Recognition, 2004, pp. 633-636; Y.M. Cheung, Maximum weighted likelihood via rival penalized EM for density mixture clustering with automatic model selection, IEEE Trans. Knowl. Data Eng. 17(6) (2005) 750-761], which is able to determine the number of components (i.e., the model order selection) in a Gaussian mixture automatically. Subsequently, the data clustering, model selection, and the feature selection are all performed in a single learning process. Experimental results have shown the efficacy of the proposed approach. 相似文献

2.

The Supervised Learning Gaussian Mixture Model

下载免费PDF全文

Ma Jiyong Gao Wen 《计算机科学技术学报》1998,13(5):471-474

The traditional Gaussian Mixture Model(GMM)for pattern recognition is an unsupervised learning method.The parameters in the model are derived only by the training samples in one class without taking into account the effect of sample distributions of other classes,hence,its recognition accuracy is not ideal sometimes.This paper introduces an approach for estimating the parameters in GMM in a supervising way.The Supervised Learning Gaussian Mixture Model(SLGMM)improves the recognition accuracy of the GMM.An experimental example has shown its effectiveness.The experimental results have shown that the recognition accuracy derived by the approach is higher than those obtained by the Vector Quantization(VQ)approach,the Radial Basis Function (RBF) network model,the Learning Vector Quantization (LVQ) approach and the GMM.In addition,the training time of the approach is less than that of Multilayer Perceptrom(MLP). 相似文献

3.

A multi-manifold semi-supervised Gaussian mixture model for pattern classification

Xianglei Xing Yao Yu Hua Jiang Sidan Du 《Pattern recognition letters》2013,34(16):2118-2125

Semi-supervised Gaussian mixture model (SGMM) has been successfully applied to a wide range of engineering and scientific fields, including text classification, image retrieval, and biometric identification. Recently, many studies have shown that naturally occurring data may reside on or near manifold structures in ambient space. In this paper, we study the use of SGMM for data sets containing multiple separated or intersecting manifold structures. We propose a new multi-manifold regularized, semi-supervised Gaussian mixture model (M2SGMM) for classifying multiple manifolds. Specifically, we model the data manifold using a similarity graph with local and geometrical consistency properties. The geometrical similarity is measured by a novel application of local tangent space. We regularize the model parameters of the SGMM by incorporating the enhanced Laplacian of the graph. Experiments demonstrate the effectiveness of the proposed approach. 相似文献

4.

Gaussian mixture modeling of histograms for contrast enhancement

Yu-Ren LaiKuo-Liang Chung Guei-Yin LinChyou-Hwa Chen 《Expert systems with applications》2012,39(8):6720-6728

The current major theme in contrast enhancement is to partition the input histogram into multiple sub-histograms before final equalization of each sub-histogram is performed. This paper presents a novel contrast enhancement method based on Gaussian mixture modeling of image histograms, which provides a sound theoretical underpinning of the partitioning process. Our method comprises five major steps. First, the number of Gaussian functions to be used in the model is determined using a cost function of input histogram partitioning. Then the parameters of a Gaussian mixture model are estimated to find the best fit to the input histogram under a threshold. A binary search strategy is then applied to find the intersection points between the Gaussian functions. The intersection points thus found are used to partition the input histogram into a new set of sub-histograms, on which the classical histogram equalization (HE) is performed. Finally, a brightness preservation operation is performed to adjust the histogram produced in the previous step into a final one. Based on three representative test images, the experimental results demonstrate the contrast enhancement advantage of the proposed method when compared to twelve state-of-the-art methods in the literature. 相似文献

5.

A finite mixture model for simultaneous high-dimensional clustering, localized feature selection and outlier rejection

Nizar Bouguila Khaled Almakadmeh 《Expert systems with applications》2012,39(7):6641-6656

Model-based approaches and in particular finite mixture models are widely used for data clustering which is a crucial step in several applications of practical importance. Indeed, many pattern recognition, computer vision and image processing applications can be approached as feature space clustering problems. For complex high-dimensional data, however, the use of these approaches presents several challenges such as the presence of many irrelevant features which may affect the speed and also compromise the accuracy of the used learning algorithm. Another problem is the presence of outliers which potentially influence the resulting model’s parameters. For this purpose, we propose and discuss an algorithm that partitions a given data set without a priori information about the number of clusters, the saliency of the features or the number of outliers. We illustrate the performance of our approach using different applications involving synthetic data, real data and objects shape clustering. 相似文献

6.

The BYY annealing learning algorithm for Gaussian mixture with automated model selection

Jinwen Ma^{Author Vitae} Jianfeng Liu Author Vitae 《Pattern recognition》2007,40(7):2029-2037

Bayesian Ying-Yang (BYY) learning has provided a new mechanism that makes parameter learning with automated model selection via maximizing a harmony function on a backward architecture of the BYY system for the Gaussian mixture. However, since there are a large number of local maxima for the harmony function, any local searching algorithm, such as the hard-cut EM algorithm, does not work well. In order to overcome this difficulty, we propose a simulated annealing learning algorithm to search the global maximum of the harmony function, being expressed as a kind of deterministic annealing EM procedure. It is demonstrated by the simulation experiments that this BYY annealing learning algorithm can efficiently and automatically determine the number of clusters or Gaussians during the learning process. Moreover, the BYY annealing learning algorithm is successfully applied to two real-life data sets, including Iris data classification and unsupervised color image segmentation. 相似文献

7.

A convex hull approach in conjunction with Gaussian mixture model for salient object detection

《Digital Signal Processing》2016

The capability of humans in distinguishing salient objects from background is at par excellence. The researchers are yet to develop a model that matches the detection accuracy as well as computation time taken by the humans. In this paper we attempted to improve the detection accuracy without capitalizing much of computation time. The model utilizes the fact that maximal amount of information is present at the corners and edges of an object in the image. Firstly the keypoints are extracted from the image by using multi-scale Harris and multi-scale Gabor functions. Then the image is roughly segmented into two regions: a salient region and a background region, by constructing a convex hull over these keypoints. Finally the pixels of the two regions are considered as samples to be drawn from a multivariate kernel function whose parameters are estimated using expectation maximization algorithm, to yield a saliency map. The performance of the proposed model is evaluated in terms of precision, recall, F-measure, area under curve and computation time using six publicly available image datasets. Experimental results demonstrate that the proposed model outperformed the existing state-of-the-art methods in terms of recall, F-measure and area under curve on all the six datasets, and precision on four datasets. The proposed method also takes comparatively less computation time in comparison to many existing methods. 相似文献

8.

Combination of wavelet and SIFT features for image classification using trained Gaussian mixture model

WANG Ke-jun REN Zhen XIONG Xin-yan 《通讯和计算机》2008,5(10):41-46

This paper presents an effective combination of Wavelet-based features and SIFT features. For the combined feature patches extracted from images we then adopt the PCA transformation to reduce the dimensionality of their feature vectors. And the reduced vectors are used to train Gaussian Mixture Models （GMMs） in which the mixture weights and Gaussian parameters are updated iteratively. We performed the method on Caltech datasets and compared the results with several other methods. It shown that the combination of salient feature vectors and GMM gives a much better improvement in image classification. 相似文献

9.

Gaussian mixture model classification: A projection pursuit approach

Daniela G. Calò 《Computational statistics & data analysis》2007,52(1):471-482

Gaussian mixture models (GMM) are commonly employed in nonparametric supervised classification. In high-dimensional problems it is often the case that information relevant to the separation of the classes is contained in a few directions. A GMM fitting procedure oriented to supervised classification is proposed, with the aim of reducing the number of free parameters. It resorts to projection pursuit as a dimension reduction method and combines it with GM modelling of class-conditional densities. In its derivation, issues regarding the forward and backward projection pursuit algorithms are discussed. The proposed procedure avoids the “curse of dimensionality”, is able to model structure in subspaces and regularizes the classification model. Its performance is illustrated on a simulation experiment and on a real data set, in comparison with other GMM-based classification methods. 相似文献

10.

基于混合高斯模型的窄带目标跟踪方法

曾绮雯葛辉良《传感器与微系统》2017,36(3)

基于混合高斯模型的轨迹分布融合方法适用于窄带目标跟踪系统.这种算法针对宽带跟踪结果的不精确,目标模糊,窄带跟踪需要依赖人工实现的问题,提出了一种基于混合高斯模型的自动窄带目标跟踪技术.该方法首先将目标方位分布看做是混合高斯模型,利用期望最大化算法估计混合高斯模型中的参数,然后利用混合高斯模型对目标方位进行聚类,最后利用平均加权法对目标方位进行融合,得到清晰稳定的目标跟踪结果. 相似文献

11.

基于协方差的高斯混合模型参数学习算法

廖晓锋范修斌姜青山《计算机科学》2013,40(Z11):77-81

对混合高斯模型参数估计问题的算法通常是基于期望最大(Expectation Maximization)给出的。在混合高斯模型的因素协方差矩阵已知、因素各分量独立的前提下,给出了基于协方差矩阵的机器学习算法,简称CVB(Covariance Based)算法,并进行了一定的数学分析。最后给出了与期望最大算法的实验结果比较。实验结果表明,在该条件下,基于协方差的算法优于期望最大算法。相似文献

12.

Feature selection using localized generalization error for supervised classification problems using RBFNN

Wing W.Y. Ng Author Vitae Daniel S. Yeung Author Vitae Michael Firth^{Author Vitae} 《Pattern recognition》2008,41(12):3706-3719

A pattern classification problem usually involves using high-dimensional features that make the classifier very complex and difficult to train. With no feature reduction, both training accuracy and generalization capability will suffer. This paper proposes a novel hybrid filter-wrapper-type feature subset selection methodology using a localized generalization error model. The localized generalization error model for a radial basis function neural network bounds from above the generalization error for unseen samples located within a neighborhood of the training samples. Iteratively, the feature making the smallest contribution to the generalization error bound is removed. Moreover, the novel feature selection method is independent of the sample size and is computationally fast. The experimental results show that the proposed method consistently removes large percentages of features with statistically insignificant loss of testing accuracy for unseen samples. In the experiments for two of the datasets, the classifiers built using feature subsets with 90% of features removed by our proposed approach yield average testing accuracies higher than those trained using the full set of features. Finally, we corroborate the efficacy of the model by using it to predict corporate bankruptcies in the US. 相似文献

13.

Statistical mixture model for documents skew angle estimation

Amir Egozi Its’hak Dinstein 《Pattern recognition letters》2011,32(14):1912-1921

We present a statistical approach to skew detection, where the distribution of textual features of document images is modeled as a mixture of straight lines in Gaussian noise. The Expectation Maximization (EM) algorithm is used to estimate the parameters of the statistical model and the estimated skew angle is extracted from the estimated parameters. Experiments demonstrate that our method is favorably comparable to other existing methods in terms of accuracy and efficiency. 相似文献

14.

一种改进的混合高斯模型背景估计方法 总被引：1，自引：0，他引：1

蒋明潘姣丽《微型机与应用》2011,30(11):31-33,36

传统混合高斯模型一般为每个像素分配固定的高斯分布个数,从而造成背景形成速度的减慢和系统资源的浪费;同时也存在着高斯模型背景建模中的缓慢或滞留运动物体造成目标误判现象的问题(即空洞问题)。为此,提出了一种有效的两阶段视频图像处理方法。该方法在第一阶段根据像素点的优先级大小自动地调节高斯分布的数目,在第二阶段首先对像素点进行所属区域的划分,进而对目标区域和非目标区域采取不同的更新手段。实验表明,采用两阶段视频图像处理方法明显地改善了背景建模的速度,有效解决了提取目标出现的空洞问题。相似文献

15.

于混合高斯莫型与联合特征的行人检测方法

郑锐邵宗凯《传感器与微系统》2017,36(7)

针对视频环境下行人检测多数采用窗口滑动方法识别慢、不能快速找到行人窗口的缺点,提出了一种基于组合算法的行人目标识别方法,利用高斯混合模型方法提取视频中的运动前景,划定一个泛目标窗口,再使用HOG-l bp联合特征训练的分类器对泛目标窗口进行分类,得到分类结果,对行人目标进行标记.经实验验证:该方法相对于当前行人检测方法,检测速度和正确率都取得了很好的效果. 相似文献

16.

Higher order feature selection for text classification

Jan Bakus Mohamed S. Kamel 《Knowledge and Information Systems》2006,9(4):468-491

In this paper. we present the MIFS-C variant of the mutual information feature-selection algorithms. We present an algorithm to find the optimal value of the redundancy parameter, which is a key parameter in the MIFS-type algorithms. Furthermore, we present an algorithm that speeds up the execution time of all the MIFS variants. Overall, the presented MIFS-C has comparable classification accuracy (in some cases even better) compared with other MIFS algorithms, while its running time is faster. We compared this feature selector with other feature selectors, and found that it performs better in most cases. The MIFS-C performed especially well for the breakeven and F-measure because the algorithm can be tuned to optimise these evaluation measures. Jan Bakus received the B.A.Sc. and M.A.Sc. degrees in electrical engineering from the University of Waterloo, Waterloo, ON, Canada, in 1996 and 1998, respectively, and Ph.D. degree in systems design engineering in 2005. He is currently working at Maplesoft, Waterloo, ON, Canada as an applications engineer, where he is responsible for the development of application specific toolboxes for the Maple scientific computing software. His research interests are in the area of feature selection for text classification, text classification, text clustering, and information retrieval. He is the recipient of the Carl Pollock Fellowship award from the University of Waterloo and the Datatel Scholars Foundation scholarship from Datatel. Mohamed S. Kamel holds a Ph.D. in computer science from the University of Toronto, Canada. He is at present Professor and Director of the Pattern Analysis and Machine Intelligence Laboratory in the Department of Electrical and Computing Engineering, University of Waterloo, Canada. Professor Kamel holds a Canada Research Chair in Cooperative Intelligent Systems. Dr. Kamel's research interests are in machine intelligence, neural networks and pattern recognition with applications in robotics and manufacturing. He has authored and coauthored over 200 papers in journals and conference proceedings, 2 patents and numerous technical and industrial project reports. Under his supervision, 53 Ph.D. and M.A.Sc. students have completed their degrees. Dr. Kamel is a member of ACM, AAAI, CIPS and APEO and has been named s Fellow of IEEE (2005). He is the editor-in-chief of the International Journal of Robotics and Automation, Associate Editor of the IEEE SMC, Part A, the International Journal of Image and Graphics, Pattern Recognition Letters and is a member of the editorial board of the Intelligent Automation and Soft Computing. He has served as a consultant to many Companies, including NCR, IBM, Nortel, VRP and CSA. He is a member of the board of directors and cofounder of Virtek Vision International in Waterloo. 相似文献

17.

Automated framework for classification and selection of software design patterns

《Applied Soft Computing》2019

Though, Unified Modeling Language (UML), Ontology, and Text categorization approaches have been used to automate the classification and selection of design pattern(s). However, there are certain issues such as time and effort for formal specification of new patterns, system context-awareness, and lack of knowledge which needs to be addressed. We propose a framework (i.e. Three-phase method) to discuss these issues, which can aid novice developers to organize and select the correct design pattern(s) for a given design problem in a systematic way. Subsequently, we propose an evaluation model to gauge the efficacy of the proposed framework via certain unsupervised learning techniques. We performed three case studies to describe the working procedure of the proposed framework in the context of three widely used design pattern catalogs and 103 design problems. We find the significant results of Fuzzy c-means and Partition Around Medoids (PAM) as compared to other unsupervised learning techniques. The promising results encourage the applicability of the proposed framework in terms of design patterns organization and selection with respect to a given design problem. 相似文献

18.

Simultaneous feature selection and classification using kernel-penalized support vector machines 总被引：2，自引：0，他引：2

Sebastián Maldonado Jayanta Basak 《Information Sciences》2011,181(1):115-128

We introduce an embedded method that simultaneously selects relevant features during classifier construction by penalizing each feature’s use in the dual formulation of support vector machines (SVM). This approach called kernel-penalized SVM (KP-SVM) optimizes the shape of an anisotropic RBF Kernel eliminating features that have low relevance for the classifier. Additionally, KP-SVM employs an explicit stopping condition, avoiding the elimination of features that would negatively affect the classifier’s performance. We performed experiments on four real-world benchmark problems comparing our approach with well-known feature selection techniques. KP-SVM outperformed the alternative approaches and determined consistently fewer relevant features. 相似文献

19.

一种快速、贪心的高斯混合模型EM算法研究 总被引：1，自引：0，他引：1

下载免费PDF全文

邢长征苑聪《计算机工程与应用》2015,51(20):111-115

针对传统EM算法存在初始模型成分数目需要预先指定以及收敛速度随样本数目的增长而急剧减慢等问题,提出了一种快速、贪心的高斯混合模型EM算法。该算法采用贪心的策略以及对隐含参数设置适当阈值的方法,使算法能够快速收敛,从而在很少的迭代次数内获取高斯混合模型的模型成分数。该算法通过与传统EM算法、无监督EM算法和鲁棒EM算法的聚类结果进行比较,实验结果证明该算法具有很强的鲁棒性,并且能够提高算法的效率以及模型成分数的准确性。相似文献

20.

Distributed feature selection: An application to microarray data classification

《Applied Soft Computing》2015

Feature selection is often required as a preliminary step for many pattern recognition problems. However, most of the existing algorithms only work in a centralized fashion, i.e. using the whole dataset at once. In this research a new method for distributing the feature selection process is proposed. It distributes the data by features, i.e. according to a vertical distribution, and then performs a merging procedure which updates the feature subset according to improvements in the classification accuracy. The effectiveness of our proposal is tested on microarray data, which has brought a difficult challenge for researchers due to the high number of gene expression contained and the small samples size. The results on eight microarray datasets show that the execution time is considerably shortened whereas the performance is maintained or even improved compared to the standard algorithms applied to the non-partitioned datasets. 相似文献