Similar Documents
1.
Sliced inverse regression (SIR) is an important method for reducing the dimensionality of input variables. Its goal is to estimate the effective dimension reduction directions. In classification settings, SIR is closely related to Fisher discriminant analysis. Motivated by reproducing kernel theory, we propose a notion of nonlinear effective dimension reduction and develop a nonlinear extension of SIR called kernel SIR (KSIR). Both SIR and KSIR are based on principal component analysis. Alternatively, based on principal coordinate analysis, we propose the dual versions of SIR and KSIR, which we refer to as sliced coordinate analysis (SCA) and kernel sliced coordinate analysis (KSCA), respectively. In the classification setting, we also call them discriminant coordinate analysis and kernel discriminant coordinate analysis. The computational complexities of SIR and KSIR depend on the dimensionality of the input vector and the number of input vectors, respectively, while those of SCA and KSCA both depend on the number of slices in the output. Thus, SCA and KSCA are very efficient dimension reduction methods.
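As a concrete reference point for the SIR half of this abstract, the following is a minimal NumPy sketch of classical sliced inverse regression (an illustrative reconstruction, not the authors' code; the function name, slice count, and numerical tolerance are arbitrary choices):

```python
import numpy as np

def sliced_inverse_regression(X, y, n_slices=10, n_directions=2):
    """Minimal SIR sketch. X: (n, p) predictors, y: (n,) response.
    Returns a p x n_directions matrix whose columns estimate the
    effective dimension reduction directions."""
    n, p = X.shape
    # Standardize the predictors: Z = (X - mean) @ Sigma^{-1/2}
    w, V = np.linalg.eigh(np.cov(X, rowvar=False))
    inv_sqrt = V @ np.diag(1.0 / np.sqrt(np.maximum(w, 1e-12))) @ V.T
    Z = (X - X.mean(axis=0)) @ inv_sqrt
    # Slice on the order statistics of y; average Z within each slice
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # SIR = PCA on the slice means: top eigenvectors of M, mapped back
    _, vecs = np.linalg.eigh(M)
    return inv_sqrt @ vecs[:, ::-1][:, :n_directions]
```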

2.
High-dimensional data sets often contain redundancy and suffer from the curse of dimensionality, so a covering model constructed directly on them cannot adequately reflect the distribution of the data. To address this, we propose an approximate convex hull covering model based on sparse dimensionality reduction. First, a homotopy algorithm is used to solve the l_1 optimization problem in sparse representation; the sparsity constraint automatically yields a reasonable number of neighbors, from which a graph is built. LPP (Locality Preserving Projections) is then applied to obtain a locality-preserving projection, achieving fast and effective dimensionality reduction of the high-dimensional space. Finally, one-class classification is carried out in the low-dimensional space by constructing an approximate convex hull cover. Experimental results on the UCI repository, the MNIST handwritten digit database, and the MIT-CBCL face recognition database confirm the effectiveness of the method: compared with existing one-class classification algorithms, the proposed covering model achieves higher classification accuracy.

3.
Dimensionality reduction is used to preserve significant properties of data in a low-dimensional space. In particular, data representation in a lower dimension is needed in applications where information comes from multiple high-dimensional sources. Data integration, however, is a challenge in itself. In this contribution, we consider a general framework for performing dimensionality reduction that takes into account the heterogeneity of the data. We propose a novel approach, called Deep Kernel Dimensionality Reduction, which is designed to learn layers of new compact data representations simultaneously. The method can also be used to learn shared representations between modalities. We show by experiments on standard benchmarks and on real large-scale biomedical data sets that the proposed method embeds data in a new compact, meaningful representation and leads to a lower classification error than state-of-the-art methods.

4.
Fisher linear discriminant analysis is a well-known technique for dimensionality reduction and classification. The method was first formulated by Fisher in 1936. In this paper we concentrate on three different formulations of the multi-dimensional problem. We provide a mathematical explanation of why two of the formulations are equivalent and prove that this equivalence extends to a broader class of objective functions. The second contribution is a rate of convergence for a fixed point method for solving the third model.
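For orientation, one standard formulation of the multi-class Fisher criterion reduces to the generalized eigenproblem S_b v = lambda S_w v; a hedged SciPy sketch follows (the function name and the ridge term are illustrative additions, not from the paper):

```python
import numpy as np
from scipy.linalg import eigh

def fisher_directions(X, y, n_components=2):
    """Multi-class Fisher LDA via the generalized symmetric
    eigenproblem S_b v = lambda S_w v."""
    p = X.shape[1]
    grand_mean = X.mean(axis=0)
    S_w, S_b = np.zeros((p, p)), np.zeros((p, p))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_w += (Xc - mc).T @ (Xc - mc)         # within-class scatter
        d = (mc - grand_mean)[:, None]
        S_b += Xc.shape[0] * (d @ d.T)         # between-class scatter
    # Small ridge keeps S_w positive definite when features are collinear
    vals, vecs = eigh(S_b, S_w + 1e-8 * np.eye(p))
    return vecs[:, ::-1][:, :n_components]     # descending eigenvalues
```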

5.
We develop a supervised dimension reduction method that integrates the idea of localization from manifold learning with the sliced inverse regression framework. We call our method localized sliced inverse regression (LSIR), since it takes into account the local structure of the explanatory variables. The resulting projection from LSIR is a linear subspace of the explanatory variables that captures the nonlinear structure relevant to predicting the response. LSIR applies to both classification and regression problems and can easily be extended to incorporate ancillary unlabeled data in semi-supervised learning. We illustrate the utility of LSIR on real and simulated data. Computer code and datasets from the simulations are available online.

6.
We consider informative dimension reduction for regression problems with random predictors. Based on the conditional specification of the model, we develop a methodology for replacing the predictors with a smaller number of functions of the predictors. We apply the method to the case where the inverse conditional model is in the linear exponential family. For such an inverse model and the usual Normal forward regression model it is shown that, for any number of predictors, the sufficient summary has dimension two or less. In addition, we develop a test of dimensionality. The relationship of our method with the existing dimension reduction theory based on the marginal distribution of the predictors is discussed.

7.
We propose the use of Möbius transformations, defined in the context of Clifford algebras, for geometrically manipulating point cloud data lying in a vector space of arbitrary dimension. We present this method as an application to signal classification in a dimensionality reduction framework. We first discuss a general situation in which data analysis problems arise in signal processing. In this context, we introduce the construction of special Möbius transformations on vector spaces \({\mathbb{R}^n}\), customized for a classification setting. A computational experiment is presented indicating the potential and shortcomings of this framework.
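The paper's construction lives in Clifford algebras; as a much simpler orientation, the classical building block of a Möbius transformation on R^n is inversion in a sphere, sketched below (a toy under stated assumptions, not the authors' customized maps; the function name is mine):

```python
import numpy as np

def sphere_inversion(x, center, radius):
    """Inversion in the sphere S(center, radius) in R^n: the basic
    non-affine building block of Moebius transformations, which are
    compositions of such inversions with similarities."""
    d = np.asarray(x, dtype=float) - center
    return center + (radius ** 2 / np.dot(d, d)) * d
```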

8.
Natural integral equations for three-dimensional harmonic problems and their numerical solution
邬吉明, 余德浩. 《计算数学》, 1998, 20(4): 419-430
1. Introduction. Many boundary value problems for elliptic partial differential equations can be reduced, via different routes, to boundary integral equations, from which various boundary element methods have been developed. The natural boundary element method, initiated and developed by the Chinese scholars 冯康 (Feng Kang) and 余德浩 (Yu Dehao), is one of them [1]. Compared with the classical boundary element method, this approach has distinctive advantages: it enjoys higher numerical stability and can be coupled naturally and directly with the traditional finite element method, since both are based on the same variational principle [2]. In recent years, problems on unbounded domains have received much attention [2-7]; coupling algorithms and domain decomposition algorithms based on the natural boundary reduction are an effective means of handling such problems. So far, however, research on the natural boundary reduction has been confined to two-dimensional problems, while three-dimensional problems clearly stand in even greater need of corresponding methods, and their results…

9.
In high-dimensional classification problems, one is often interested in finding a few important discriminant directions in order to reduce the dimensionality. Fisher's linear discriminant analysis (LDA) is a commonly used method. Although LDA is guaranteed to find the best directions when each class has a Gaussian density with a common covariance matrix, it can fail if the class densities are more general. Using a likelihood-based interpretation of Fisher's LDA criterion, we develop a general method for finding important discriminant directions without assuming the class densities belong to any particular parametric family. We also show that our method can be easily integrated with projection pursuit density estimation to produce a powerful procedure for (reduced-rank) nonparametric discriminant analysis.

10.
For a given set of high-dimensional data points, the goal of dimensionality reduction or manifold learning is to find a low-dimensional parametrization for them. Usually it is easy to carry out this parametrization within a small region, producing a collection of local coordinate systems. Alignment is the process of stitching those local systems together to produce a global coordinate system; it is done through the computation of a partial eigendecomposition of a so-called alignment matrix. In this paper, we present an analysis of the alignment process, giving conditions under which the null space of the alignment matrix recovers the global coordinate system up to an affine transformation. We also propose a post-processing step that can determine the global coordinate system up to a rigid motion. This in turn shows that the Local Tangent Space Alignment (LTSA) method can recover a locally isometric embedding up to a rigid motion. AMS subject classification (2000): 65F15, 62H30, 15A18
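LTSA as analyzed here is available off the shelf; a short usage sketch with scikit-learn follows (the dataset and neighborhood size are arbitrary illustrative choices, not taken from the paper):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, _ = make_swiss_roll(n_samples=1000, random_state=0)
# method="ltsa" selects Local Tangent Space Alignment; the global
# coordinates come from the bottom eigenvectors (approximate null
# space) of the alignment matrix discussed in the paper.
ltsa = LocallyLinearEmbedding(n_neighbors=12, n_components=2,
                              method="ltsa")
Y = ltsa.fit_transform(X)
```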

11.
The ‘Signal plus Noise’ model for nonparametric regression can be extended to the case of observations taken at the vertices of a graph. This model includes many familiar regression problems. This article discusses the use of the edges of a graph to measure roughness in penalized regression. Distance between estimate and observation is measured at every vertex in the L2 norm, and roughness is penalized on every edge in the L1 norm. Thus the ideas of total variation penalization can be extended to a graph. The resulting minimization problem presents special computational challenges, so we describe a new and fast algorithm and demonstrate its use with examples.

The examples include image analysis, a simulation applicable to discrete spatial variation, and classification. In our examples, penalized regression improves upon kernel smoothing in terms of identifying local extreme values on planar graphs. In all examples we use fully automatic procedures for setting the smoothing parameters. Supplemental materials are available online.
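The article's contribution is a fast specialized algorithm; purely to make the model concrete, here is a generic convex-solver sketch of the L2-loss, L1-edge-penalty objective (the cvxpy formulation, function name, and all parameter values are illustrative assumptions, not the authors' method):

```python
import numpy as np
import cvxpy as cp

def graph_tv_regression(y, edges, lam=1.0):
    """Squared error at every vertex plus an L1 (total-variation)
    roughness penalty on every edge of the graph."""
    f = cp.Variable(len(y))
    i, j = np.asarray(edges).T
    obj = cp.Minimize(cp.sum_squares(y - f) + lam * cp.norm1(f[i] - f[j]))
    cp.Problem(obj).solve()
    return f.value

# Toy usage: denoise a noisy step signal on a path graph
y = np.r_[np.zeros(50), np.ones(50)] + 0.3 * np.random.randn(100)
edges = [(k, k + 1) for k in range(99)]
f_hat = graph_tv_regression(y, edges, lam=2.0)
```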

12.
This paper introduces locally linear embedding (LLE), a nonlinear dimensionality reduction method, and demonstrates through worked examples and a comparison with PCA the advantage of LLE in handling nonlinear high-dimensional data.
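A minimal sketch of the LLE-versus-PCA comparison described above, using scikit-learn (the S-curve dataset and the neighbor count are illustrative choices, not taken from the paper):

```python
from sklearn.datasets import make_s_curve
from sklearn.decomposition import PCA
from sklearn.manifold import LocallyLinearEmbedding

X, t = make_s_curve(n_samples=1000, random_state=0)
Y_pca = PCA(n_components=2).fit_transform(X)            # linear baseline
Y_lle = LocallyLinearEmbedding(n_neighbors=10,
                               n_components=2).fit_transform(X)
# Coloring both embeddings by t shows LLE unrolling the curved
# manifold, whereas the linear PCA projection folds it onto itself.
```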

13.
This study shows how data envelopment analysis (DEA) can be used to reduce the vertical dimensionality of certain data mining databases. The study illustrates the basic concepts using a real-world graduate admissions decision task. It is well known that cost-sensitive mixed integer programming (MIP) problems are NP-complete. This study shows that heuristic solutions for cost-sensitive classification problems can be obtained by solving a simple goal programming problem that reduces the vertical dimension of the original learning dataset. Using simulated datasets and a misclassification-cost performance metric, the performance of the proposed goal programming heuristic is compared with the extended DEA-discriminant analysis MIP approach. The holdout sample results of our experiments show that the proposed heuristic approach outperforms the extended DEA-discriminant analysis MIP approach.

14.
When clustering multivariate observations that follow a mixture of Gaussian distributions, it happens rather frequently that projections of the observations onto a linear subspace of lower dimensionality, called the discriminant space (DS), contain all the statistical information about the cluster structure of the model. In this case, the actual reduction of data dimensionality substantially facilitates the solution of various classification problems. In the paper, attention is devoted to statistical testing of hypotheses about the DS and its dimension. The characterization of the DS and methods for its identification are also briefly discussed.

15.
We present our recent work on both linear and nonlinear data reduction methods and algorithms: for the linear case we discuss results on the structure analysis of the SVD of column-partitioned matrices and sparse low-rank approximation; for the nonlinear case we investigate methods for nonlinear dimensionality reduction and manifold learning. The problems we address have attracted a great deal of interest in data mining and machine learning.
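As background for the low-rank approximation theme, the elementary truncated-SVD step is sketched below (a textbook operation by the Eckart-Young theorem, not the paper's column-partitioned structure analysis; the function name is mine):

```python
import numpy as np

def best_rank_k(A, k):
    """Best rank-k approximation of A in Frobenius norm, obtained by
    keeping the k largest singular triplets."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]
```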

16.
We present a heuristic optimization method for stochastic production-inventory systems that defy analytical modelling and optimization. The proposed heuristic takes advantage of simulation while at the same time minimizing the impact of the curse of dimensionality by using regression analysis. The heuristic was developed and tested for an oil and gas company, which decided to adopt the heuristic as the optimization method for a supply-chain design project. To explore the performance of the heuristic in general settings, we conducted a simulation experiment on 900 test problems. We found that the average cost error of using the proposed heuristic was reasonably low for practical applications.

17.
This paper gives an overview of the eigenvalue problems encountered in areas of data mining that are related to dimension reduction. Given some input high-dimensional data, the goal of dimension reduction is to map them to a low-dimensional space such that certain properties of the original data are preserved. Optimizing these properties among the reduced data can be typically posed as a trace optimization problem that leads to an eigenvalue problem. There is a rich variety of such problems and the goal of this paper is to unravel relationships between them as well as to discuss effective solution techniques. First, we make a distinction between projective methods that determine an explicit linear mapping from the high-dimensional space to the low-dimensional space, and nonlinear methods where the mapping between the two is nonlinear and implicit. Then, we show that all the eigenvalue problems solved in the context of explicit linear projections can be viewed as the projected analogues of the nonlinear or implicit projections. We also discuss kernels as a means of unifying linear and nonlinear methods and revisit some of the equivalences between methods established in this way. Finally, we provide some illustrative examples to showcase the behavior and the particular characteristics of the various dimension reduction techniques on real-world data sets.
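The unifying observation that trace optimization under orthogonality constraints is solved by an eigenproblem can be stated in a few lines of NumPy (a hedged illustration of the classical Ky Fan result, not code from the paper):

```python
import numpy as np

def trace_maximizer(A, k):
    """max tr(V.T @ A @ V) over V with V.T @ V = I_k is attained by
    the top-k eigenvectors of the symmetric matrix A (Ky Fan)."""
    _, vecs = np.linalg.eigh(A)   # eigenvalues in ascending order
    return vecs[:, -k:]
```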

18.
In regression analysis, one is often particularly interested in the conditional mean, the conditional variance, and higher-order conditional moments. In this paper we focus on estimating the central k-th conditional moment subspace in the setting of high-dimensional dependent predictors. To this end, we first introduce the notion of the central k-th conditional moment subspace and study its basic properties. For complex data with high-dimensional dependent predictors, in order to avoid computing the inverse of the predictor covariance matrix, we propose estimating the central k-th conditional moment subspace by partial least squares....
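The paper's estimator is specialized; as a generic starting point, partial least squares itself already avoids inverting the predictor covariance matrix, e.g. with scikit-learn (all data and parameter choices below are illustrative assumptions, not from the paper):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
# Dependent (correlated) high-dimensional predictors
L = rng.normal(size=(30, 30))
X = rng.normal(size=(500, 30)) @ L
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=500)
# PLS extracts directions from cross-covariances; no inversion of
# the predictor covariance matrix is required.
pls = PLSRegression(n_components=2).fit(X, y)
directions = pls.x_weights_   # candidate basis for the reduction
```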

19.
This paper presents an approach that is useful for regression analysis when the set of observations on which the regression is evaluated is heterogeneous. The proposed procedure consists of two stages. First, a fuzzy classification of the set of observations is determined. This yields homogeneous classes of observations of hyperellipsoidal shape. Then, for each fuzzy class, a so-called linear fuzzy regression is evaluated.

The paper gives the method of calculating the linear fuzzy regression coefficients, a generalized version of the least squares method, together with the formula for the values of the coefficients. Some properties of linear fuzzy regression are analyzed. It is proved that in the one- and two-dimensional cases the formulae are analogous to those for ordinary regression. A measure of goodness-of-fit and a method for determining the number of fuzzy classes are also given.

The examples presented indicate the superiority of fuzzy regression over ordinary regression in the case of heterogeneous observations.


20.
Convex Nonparametric Least Squares (CNLS) is a nonparametric regression method that does not require a priori specification of the functional form. The CNLS problem is solved by mathematical programming techniques; however, since the problem size grows quadratically with the number of observations, standard quadratic programming (QP) and nonlinear programming (NLP) algorithms are inadequate for handling large samples, and the computational burden becomes significant even for relatively small samples. This study proposes a generic algorithm that improves the computational performance in small samples and is able to solve problems that were previously unattainable. A Monte Carlo simulation is performed to evaluate the performance of six variants of the proposed algorithm. The experimental results indicate that the most effective variant can be identified given the sample size and the dimensionality. The computational benefits of the new algorithm are demonstrated by an empirical application that proved insurmountable for the standard QP and NLP algorithms.
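To make the quadratic growth in problem size concrete, here is a textbook CNLS quadratic program for a concave, monotone regression function, modeled with cvxpy (a sketch under stated assumptions, not the study's proposed algorithm; the function name is mine):

```python
import numpy as np
import cvxpy as cp

def cnls_fit(X, y):
    """CNLS QP with one hyperplane (alpha_i, beta_i) per observation.
    The O(n^2) Afriat-type concavity constraints are exactly why the
    problem size grows quadratically with the sample size n."""
    n, d = X.shape
    alpha = cp.Variable(n)
    beta = cp.Variable((n, d), nonneg=True)   # nonneg => monotone fit
    fitted = alpha + cp.sum(cp.multiply(beta, X), axis=1)
    # Concavity: the hyperplane at i lies above the fit at every j
    cons = [alpha[i] + X[j] @ beta[i] >= fitted[j]
            for i in range(n) for j in range(n) if i != j]
    cp.Problem(cp.Minimize(cp.sum_squares(y - fitted)), cons).solve()
    return alpha.value, beta.value
```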
