Similar Documents
20 similar documents found (search time: 15 ms)
1.
郭小萍, 袁杰, 李元 《自动化学报》2014,40(1):135-142
For batch processes with non-Gaussian, nonlinear, and multimode characteristics, a process monitoring method based on a k-nearest-neighbor statistic of feature quantities is proposed. First, the raw data of the batch process under normal operating conditions are projected into their feature space, the principal component scores T and the squared prediction error SPE are extracted, and the sum of squared distances to the k nearest neighbors of these feature quantities is computed. Then, kernel density estimation is used to obtain the probability density function and determine the statistical monitoring control limits. The feature-space quantities T and SPE comprehensively represent the useful information in the raw data. Building the monitoring model on the k nearest neighbors of the feature quantities saves storage space, raises the ratio of modeling samples to variables, and speeds up the detection of abnormal conditions. In addition, modeling with local neighbor data handles the nonlinearity and multimode nature of the process, while kernel density estimation handles the non-Gaussian distribution of the process data. Finally, a successful application to a semiconductor manufacturing process demonstrates the effectiveness of the proposed method.
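A minimal sketch of the monitoring statistic described in this abstract: the sum of squared distances to the k nearest neighbors is computed for each training feature vector, and a control limit is read off a kernel density estimate. The two-dimensional stand-in features, the 99% limit, and all names below are illustrative assumptions, not the authors' exact procedure.

```python
# Sketch of a k-NN monitoring statistic with a KDE-based control limit
# (illustrative only; not the paper's exact procedure).
import numpy as np
from sklearn.neighbors import NearestNeighbors
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
features_train = rng.normal(size=(200, 2))   # stand-in for [T, SPE] features

k = 5
nn = NearestNeighbors(n_neighbors=k + 1).fit(features_train)
dist, _ = nn.kneighbors(features_train)      # first column is the point itself
d2_train = (dist[:, 1:] ** 2).sum(axis=1)    # sum of squared k-NN distances

kde = gaussian_kde(d2_train)
grid = np.linspace(0, d2_train.max() * 3, 2000)
cdf = np.cumsum(kde(grid)); cdf /= cdf[-1]
limit = grid[np.searchsorted(cdf, 0.99)]     # 99% control limit (assumed level)

def is_abnormal(x_new):
    """Flag a new feature vector whose k-NN statistic exceeds the limit."""
    d, _ = nn.kneighbors(np.atleast_2d(x_new), n_neighbors=k)
    return (d ** 2).sum() > limit

print(is_abnormal(np.array([4.0, 4.0])))     # far from training data -> True
```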

2.
Nearest neighbor (NN) classification assumes locally constant class conditional probabilities, and suffers from bias in high dimensions with a small sample set. In this paper, we propose a novel cam weighted distance to ameliorate the curse of dimensionality. Different from the existing neighborhood-based methods which only analyze a small space emanating from the query sample, the proposed nearest neighbor classification using the cam weighted distance (CamNN) optimizes the distance measure based on the analysis of inter-prototype relationship. Our motivation comes from the observation that the prototypes are not isolated. Prototypes with different surroundings should have different effects in the classification. The proposed cam weighted distance is orientation and scale adaptive to take advantage of the relevant information of inter-prototype relationship, so that a better classification performance can be achieved. Experiments show that CamNN significantly outperforms one nearest neighbor classification (1-NN) and k-nearest neighbor classification (k-NN) in most benchmarks, while its computational complexity is comparable with that of 1-NN classification.

3.
The k nearest neighbor is a lazy learning algorithm that is inefficient in the classification phase because it needs to compare the query sample with all training samples. A recently proposed template reduction method uses only samples near the decision boundary for classification and removes those far from it. However, when class distributions overlap, more border samples are retained, which leads to inefficient performance in the classification phase. Because the number of samples that can be removed is limited, using an appropriate feature reduction method is a logical choice for improving classification time. This paper proposes a new prototype reduction method for the k nearest neighbor algorithm based on template reduction and ViSOM. A useful property of ViSOM is that it displays the topology of the data on a two-dimensional feature map, giving users an intuitive way to observe and analyze data. An efficient classification framework is then presented that combines the feature reduction method and the prototype selection algorithm. It needs only a very small data size for classification while preserving the recognition rate. In the experiments, both synthetic and real datasets are used to evaluate the performance. Experimental results demonstrate that the proposed method obtains a speedup ratio above 70% and a compression ratio above 90% while maintaining performance similar to kNN.
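One common way to realize the "samples near the decision boundary" idea from this abstract is sketched below: a training sample is kept only if its k-neighborhood contains an opposite-class sample. This is an illustrative assumption; the paper's exact retention criterion and its ViSOM stage are not reproduced.

```python
# Sketch of a template-reduction step: retain a sample only if an
# opposite-class sample appears among its k nearest neighbors, i.e.
# it lies near the decision boundary (ViSOM stage not shown).
import numpy as np

def border_samples(X, y, k=5):
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]   # skip the sample itself
        if np.any(y[neighbors] != y[i]):     # mixed neighborhood -> border
            keep.append(i)
    return np.array(keep)

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(len(border_samples(X, y)))             # only a fraction is retained
```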

4.
The K-nearest neighbor (K-NN) algorithm is a classification algorithm widely used in machine learning, statistical pattern recognition, and data mining. The ordered weighted averaging (OWA) distance based CxK nearest neighbor algorithm is a K-NN variant built on the OWA distance. The aim of this study is two-fold: i) to run the algorithm with two different fuzzy metric measures, the Diamond distance and a weighted dissimilarity measure composed of spread distances and center distances, and ii) to evaluate the effects of the different metric measures. K neighbors are searched for in each class, and the OWA distance is used to aggregate the information. By using different weights, the OWA distance can behave like the single, complete, and average linkage inter-cluster distances. The experimental study is performed on three well-known classification data sets (iris, glass, and wine), with N-fold cross-validation used to evaluate performance. The single linkage approach yields significantly different results under the two metric measures.
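A minimal sketch of the class-wise OWA aggregation: for each class, the k nearest distances are sorted and combined with an OWA weight vector, and the query goes to the class with the smallest aggregate. Plain Euclidean distance stands in for the paper's fuzzy metrics, the weight vectors only mimic single/complete/average linkage, and all names are assumptions.

```python
# Sketch of OWA-distance-based CxK nearest neighbor classification.
# Euclidean distance stands in for the paper's fuzzy metrics (assumption).
import numpy as np

def owa_knn_predict(X, y, query, k=3, weights=None):
    """Assign `query` to the class minimizing the OWA-aggregated distance
    of its k nearest same-class neighbors (each class needs >= k samples)."""
    if weights is None:                      # uniform ~ average linkage
        weights = np.full(k, 1.0 / k)
    best_class, best_score = None, np.inf
    for c in np.unique(y):
        d = np.linalg.norm(X[y == c] - query, axis=1)
        d_sorted = np.sort(d)[:k]            # k nearest within this class
        score = np.dot(weights, d_sorted)    # OWA aggregation
        if score < best_score:
            best_class, best_score = c, score
    return best_class

X = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
y = np.array([0, 0, 0, 1, 1, 1])
# weights [1,0,0] ~ single linkage; [0,0,1] ~ complete; uniform ~ average
print(owa_knn_predict(X, y, np.array([0.5, 0.5]), weights=np.array([1., 0., 0.])))
```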

5.
A reliable and precise classification of tumors is essential for successful treatment of cancer, and gene selection is an important step toward improved diagnostics. A modified SFFS (sequential forward floating selection) algorithm based on the weighted Mahalanobis distance, called MSWM, is proposed to identify optimal informative gene subsets that account for joint discriminatory power and thus allow accurate discrimination. First, the one-dimensional weighted Mahalanobis distance is used to perform a preliminary selection of genes; the modified SFFS method with the multidimensional weighted Mahalanobis distance then yields the optimal informative gene subset for tumor classification. Finally, the k nearest neighbor and naive Bayes methods are used to classify tumors based on the optimal gene subset selected by MSWM. To validate its efficiency, the proposed MSWM method is applied to two different DNA microarray datasets. Our empirical study shows that MSWM achieves better classification effectiveness than the BWR (the ratio of between-groups to within-groups sum of squares) and IVGA_I (independent variable group analysis I) methods, suggesting that MSWM is able to obtain informative gene subsets that account for the genes' joint discriminatory power in tumor classification.

6.
In this paper, we present a fast and versatile algorithm which can rapidly perform a variety of nearest neighbor searches. Efficiency improvement is achieved by utilizing the distance lower bound to avoid the calculation of the distance itself if the lower bound is already larger than the global minimum distance. At the preprocessing stage, the proposed algorithm constructs a lower bound tree (LB-tree) by agglomeratively clustering all the sample points to be searched. Given a query point, the lower bound of its distance to each sample point can be calculated by using the internal nodes of the LB-tree. To reduce the number of lower bounds actually calculated, the winner-update search strategy is used for traversing the tree. For further efficiency improvement, data transformation can be applied to the sample and the query points. In addition to finding the nearest neighbor, the proposed algorithm can also (i) provide the k-nearest neighbors progressively; (ii) find the nearest neighbors within a specified distance threshold; and (iii) identify neighbors whose distances to the query are sufficiently close to the minimum distance of the nearest neighbor. Our experiments have shown that the proposed algorithm can save substantial computation, particularly when the distance of the query point to its nearest neighbor is relatively small compared with its distance to most other samples (which is the case for many object recognition problems).
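The LB-tree itself is involved, but the core idea, skipping an exact distance computation whenever a cheap lower bound already exceeds the best distance found so far, can be illustrated with a single-pivot triangle-inequality bound. The pivot choice and names below are assumptions, not the paper's data structure.

```python
# Sketch of lower-bound pruning in nearest-neighbor search: the exact
# distance is computed only when a cheap triangle-inequality lower
# bound cannot rule the candidate out.
import numpy as np

def nn_with_pruning(samples, query):
    pivot = samples[0]                        # one pivot (the LB-tree uses many)
    d_pivot = np.linalg.norm(samples - pivot, axis=1)  # precomputable offline
    dq = np.linalg.norm(query - pivot)
    best_i, best_d, skipped = -1, np.inf, 0
    for i, s in enumerate(samples):
        lower = abs(dq - d_pivot[i])          # |d(q,p) - d(s,p)| <= d(q,s)
        if lower >= best_d:
            skipped += 1                      # pruned without a distance call
            continue
        d = np.linalg.norm(query - s)
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, skipped

rng = np.random.default_rng(2)
pts = rng.normal(size=(1000, 8))
i, d, skipped = nn_with_pruning(pts, pts[42] + 0.01)
print(i, round(d, 4), skipped)               # many candidates are skipped
```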

7.
This study investigates stock market index prediction, an interesting and important research topic in investment and its applications, since effective trading strategies can yield higher profits and returns at lower risk. To realize accurate prediction, various methods have been tried, among which machine learning methods have drawn attention and been developed. In this paper, we propose a hybridized framework of a feature-weighted support vector machine and a feature-weighted K-nearest neighbor to effectively predict stock market indices. We first establish a detailed theory of the feature-weighted SVM for data classification, assigning different weights to different features according to their classification importance. To obtain the weights, we estimate the importance of each feature by computing its information gain. Lastly, we use feature-weighted K-nearest neighbor to predict future stock market indices by computing the k weighted nearest neighbors from the historical dataset. Experimental results on two well-known Chinese stock market indices, the Shanghai and Shenzhen stock exchange indices, are presented to test the performance of the established model. The proposed model achieves good prediction capability for the Shanghai Stock Exchange Composite Index and the Shenzhen Stock Exchange Component Index over the short, medium, and long term, and the algorithm can also be adapted to the prediction of other stock market indices.
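A minimal sketch of the feature-weighted KNN component: a mutual-information estimate (used here as a stand-in for the paper's information-gain computation) supplies per-feature weights for a weighted Euclidean distance. The data and names are illustrative, and the feature-weighted SVM part is not shown.

```python
# Sketch of feature-weighted k-NN: feature weights come from an
# information-gain-style estimate (mutual_info_classif as a stand-in).
import numpy as np
from collections import Counter
from sklearn.feature_selection import mutual_info_classif

def fit_weights(X, y):
    w = mutual_info_classif(X, y, random_state=0)
    return w / w.sum()                       # normalize the feature weights

def weighted_knn_predict(X, y, w, query, k=5):
    # weighted Euclidean distance emphasizes informative features
    d = np.sqrt(((X - query) ** 2 * w).sum(axis=1))
    nearest = np.argsort(d)[:k]
    return Counter(y[nearest]).most_common(1)[0][0]

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = (X[:, 0] > 0).astype(int)                # only feature 0 is informative
w = fit_weights(X, y)
print(w.round(3), weighted_knn_predict(X, y, w, rng.normal(size=4)))
```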

8.
Initializing a student model for individualized tutoring in educational applications is a difficult task, since very little is known about a new student. On the other hand, fast and efficient initialization of the student model is necessary. Otherwise the tutoring system may lose its credibility in the first interactions with the student. In this paper we describe a framework for the initialization of student models in Web-based educational applications. The framework is called ISM. The basic idea of ISM is to set initial values for all aspects of student models using an innovative combination of stereotypes and the distance weighted k-nearest neighbor algorithm. In particular, a student is first assigned to a stereotype category concerning her/his knowledge level of the domain being taught. Then, the model of the new student is initialized by applying the distance weighted k-nearest neighbor algorithm among the students that belong to the same stereotype category with the new student. ISM has been applied in a language learning system, which has been used as a test-bed. The quality of the student models created using ISM has been evaluated in an experiment involving classroom students and their teachers. The results from this experiment showed that the initialization of student models was improved using the ISM framework.
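The distance-weighted k-NN step could look like the sketch below: among students of the same stereotype category, a new student's unknown model value is estimated from the k closest known students, weighted by inverse distance. The feature and target choices are hypothetical.

```python
# Sketch of the initialization step: within one stereotype category,
# predict a new student's model value by distance-weighted k-NN.
import numpy as np

def dw_knn_estimate(features, targets, query, k=3, eps=1e-9):
    d = np.linalg.norm(features - query, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + eps)                 # closer students weigh more
    return (w @ targets[idx]) / w.sum()

features = np.array([[0.9, 0.8], [0.7, 0.6], [0.2, 0.1]])  # known students
targets = np.array([0.85, 0.65, 0.15])       # e.g., mastery of one concept
print(dw_knn_estimate(features, targets, np.array([0.8, 0.7])))
```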

9.
The nearest neighbor classification method assigns an unclassified point to the class of the nearest case in a set of previously classified points. This rule is independent of the underlying joint distribution of the sample points and their classifications. An extension of this approach is the k-NN method, in which the unclassified point is classified by a voting criterion among the k nearest points. The method we present here extends the k-NN idea: it searches each class for the k points nearest to the unclassified point, and assigns the point to the class that minimizes the mean distance between the unclassified point and those k nearest points. Since all classes take part in the final selection process, we have called the new approach k Nearest Neighbor Equality (k-NNE). The experimental results we obtained show the suitability of the k-NNE algorithm, and its effectiveness suggests that it could be added to the current list of distance-based classifiers.
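Because the abstract fully specifies the rule, k-NNE admits a direct sketch: for each class, take the k nearest points and assign the query to the class with the smallest mean distance. The names and data below are illustrative.

```python
# Minimal sketch of k Nearest Neighbor Equality (k-NNE): find the k
# nearest points *within each class* and pick the class whose k nearest
# points have the smallest mean distance to the query.
import numpy as np

def knne_predict(X, y, query, k=3):
    best_class, best_mean = None, np.inf
    for c in np.unique(y):
        d = np.sort(np.linalg.norm(X[y == c] - query, axis=1))[:k]
        if d.mean() < best_mean:             # class-wise mean distance
            best_class, best_mean = c, d.mean()
    return best_class

X = np.array([[0., 0.], [1., 0.], [0., 1.], [4., 4.], [5., 4.], [4., 5.]])
y = np.array([0, 0, 0, 1, 1, 1])
print(knne_predict(X, y, np.array([0.6, 0.4]), k=2))   # -> 0
```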

10.
This article proposes an optimized instance-based learning approach for prediction of the compressive strength of high performance concrete based on mix data, such as water to binder ratio, water content, super-plasticizer content, fly ash content, etc. The base algorithm used in this study is the k nearest neighbor algorithm, which is an instance-based machine learning algorithm. Five different models were developed and analyzed to investigate the effects of the number of neighbors, the distance function and the attribute weights on the performance of the models. For each model a modified version of the differential evolution algorithm was used to find the optimal model parameters. Moreover, two different models based on generalized regression neural network and stepwise regression were also developed. The performances of the models were evaluated using a set of high strength concrete mix data. The results of this study indicate that the optimized models outperform those derived from the standard k nearest neighbor algorithm, and that the proposed models perform better than the generalized regression neural network, stepwise regression and modular neural network models.

11.
Feature selection models for neighborhood information systems suffer from the need to set the neighborhood parameter manually. To address this, the distances from each sample to its nearest same-class sample and its nearest different-class sample are computed and used to define the sample's nearest neighborhood, thereby determining the size of the information granule. The nearest-neighbor concept is then extended to information theory, yielding nearest-neighbor mutual information. On this basis, a feature selection algorithm based on nearest-neighbor mutual information is constructed with a forward greedy search strategy. Experiments on two different base classifiers and eight UCI data sets show that, compared with several currently popular algorithms, the model achieves higher classification performance with fewer features.
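A small sketch of the first step described above: each sample's distance to its nearest same-class neighbor (nearest hit) and nearest different-class neighbor (nearest miss), which the paper uses to size information granules without a hand-set neighborhood parameter. The mutual-information and greedy-search stages are not shown, and the names are assumptions.

```python
# Sketch of nearest-hit / nearest-miss distances used to size each
# sample's neighborhood (assumes every class has at least two samples).
import numpy as np

def hit_miss_distances(X, y, i):
    d = np.linalg.norm(X - X[i], axis=1)
    d[i] = np.inf                            # exclude the sample itself
    nearest_hit = d[y == y[i]].min()         # nearest same-class sample
    nearest_miss = d[y != y[i]].min()        # nearest different-class sample
    return nearest_hit, nearest_miss

X = np.array([[0., 0.], [0.2, 0.], [1., 1.]])
y = np.array([0, 0, 1])
print(hit_miss_distances(X, y, 0))           # (0.2, ~1.414)
```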

12.
Tianyang Dong, Lulu Yuan, Qiang Cheng, Bin Cao, Jing Fan 《World Wide Web》2019,22(4):1765-1797

Recently, k-nearest neighbor (KNN) query processing over moving objects in road networks, e.g., for taxi hailing and ride sharing, has attracted more and more attention. However, as far as we know, existing KNN queries take distance as the major criterion for nearest neighbor objects, without taking direction into consideration. The main issue with existing methods is that moving objects change their locations and directions frequently over time, so information updates cannot be processed in time and the methods risk retrieving incorrect KNN results. They may fail to meet users' needs in certain scenarios, especially when querying k-nearest neighbors for moving objects in a road network. In order to find the top k nearest objects moving toward a query point, this paper presents a novel algorithm for direction-aware KNN (DAKNN) queries over moving objects in a road network. In this method, an R-tree and a simple grid are first used as the underlying index structures, where the R-tree indexes the static road network and the grid indexes the moving objects. The notion of "azimuth" is then introduced to represent the moving direction of objects in the road network, and a novel local network expansion method is presented to quickly judge the direction of the moving objects. By considering whether a moving object is moving farther away from or getting closer to the query point, objects that definitely cannot be in the KNN result set are effectively excluded. This reduces the communication cost while simplifying the computation of the moving direction between moving objects and the query point. Comprehensive experiments show that our algorithm achieves real-time, efficient retrieval of objects moving toward a query point in a road network.
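A minimal sketch of the direction test: an object is kept as a candidate if the angle between its azimuth and the bearing from the object to the query point is below 90 degrees. The Euclidean plane stands in for the road network, the threshold and names are assumptions, and the paper's local network expansion is not reproduced.

```python
# Sketch of the direction-aware filter: keep only objects whose heading
# (azimuth) points toward the query point.
import math

def bearing(from_xy, to_xy):
    """Azimuth of the vector from one point to another, in radians."""
    return math.atan2(to_xy[1] - from_xy[1], to_xy[0] - from_xy[0])

def moving_toward(obj_xy, obj_azimuth, query_xy, max_angle=math.pi / 2):
    diff = abs(bearing(obj_xy, query_xy) - obj_azimuth)
    diff = min(diff, 2 * math.pi - diff)      # wrap the angle to [0, pi]
    return diff < max_angle

# Object at (0,0) heading east (azimuth 0) is moving toward query (5,1):
print(moving_toward((0, 0), 0.0, (5, 1)))     # True
print(moving_toward((0, 0), math.pi, (5, 1))) # heading west -> False
```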


13.
To overcome the shortcomings of traditional localization based on received signal strength, a new Internet-of-Things localization model based on K-neighbor node coverage is proposed. The model has two phases: neighbor selection and localization. An unknown node first selects its K nearest neighbor nodes by adjusting its transmit power level, minimizing the influence of distant nodes on localization. In the localization phase, the unknown node computes weights from the received signal strengths of the K beacon nodes and estimates its coordinates as the weighted sum of the beacons' coordinates. A self-correction method based on the K neighbors' errors then compensates the coordinates. The model effectively mitigates environmental influences on localization, and the localization algorithm is simple, avoiding complex computation. Experiments show that the model achieves high localization accuracy.
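A sketch of the weighted localization step: coordinates of the K nearest beacons are combined with weights derived from received signal strength, so stronger (closer) beacons contribute more. The specific weight function below is an assumed monotone form, and the error self-correction step is not shown.

```python
# Sketch of the weighted localization step: the position estimate is a
# signal-strength-weighted sum of the K beacon coordinates.
import numpy as np

def rssi_weighted_position(beacons, rssi_dbm):
    """beacons: (K,2) coordinates of the K nearest beacon nodes;
    rssi_dbm: (K,) received signal strengths in dBm (less negative = closer)."""
    w = 1.0 / (-np.asarray(rssi_dbm))        # simple monotone weight (assumed)
    w /= w.sum()
    return w @ np.asarray(beacons)           # weighted sum of coordinates

beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
rssi = [-40.0, -70.0, -75.0]                 # strongest signal from beacon 0
print(rssi_weighted_position(beacons, rssi)) # estimate pulled toward (0, 0)
```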

14.
The problem of selecting a subset of relevant features is classic and is found in many branches of science, pattern recognition among them. In this paper, we propose a new feature selection criterion based on low-loss nearest neighbor classification and a novel feature selection algorithm that optimizes the margin of nearest neighbor classification by minimizing its loss function. A theoretical analysis based on an energy-based model is also presented, and experiments conducted on several benchmark real-world data sets and on facial data sets for gender classification show that the proposed feature selection method outperforms other classic ones.

15.
Algorithms based on Nested Generalized Exemplar (NGE) theory (Salzberg, 1991) classify new data points by computing their distance to the nearest generalized exemplar (i.e., either a point or an axis-parallel rectangle). They combine the distance-based character of nearest neighbor (NN) classifiers with the axis-parallel rectangle representation employed in many rule-learning systems. An implementation of NGE was compared to the k-nearest neighbor (kNN) algorithm in 11 domains and found to be significantly inferior to kNN in 9 of them. Several modifications of NGE were studied to understand the cause of its poor performance. These show that its performance can be substantially improved by preventing NGE from creating overlapping rectangles, while still allowing complete nesting of rectangles. Performance can be further improved by modifying the distance metric to allow weights on each of the features (Salzberg, 1991). Best results were obtained in this study when the weights were computed using mutual information between the features and the output class. The best version of NGE developed is a batch algorithm (BNGE FWMI) that has no user-tunable parameters. BNGE FWMI's performance is comparable to the first-nearest neighbor algorithm (also incorporating feature weights). However, the k-nearest neighbor algorithm is still significantly superior to BNGE FWMI in 7 of the 11 domains, and inferior to it in only 2. We conclude that, even with our improvements, the NGE approach is very sensitive to the shape of the decision boundaries in classification problems. In domains where the decision boundaries are axis-parallel, the NGE approach can produce excellent generalization with interpretable hypotheses. In all domains tested, NGE algorithms require much less memory to store generalized exemplars than is required by NN algorithms.
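The distance from a query point to a generalized exemplar, an axis-parallel rectangle, is the geometric core of NGE and is easy to sketch: per-axis gaps are zero inside the interval and grow linearly outside, optionally scaled by feature weights (as in the feature-weighted variants). The names below are assumptions.

```python
# Sketch of the NGE distance from a query point to an axis-parallel
# rectangle (zero if the point lies inside), with optional feature weights.
import numpy as np

def rect_distance(query, lo, hi, feature_weights=None):
    q, lo, hi = map(np.asarray, (query, lo, hi))
    w = np.ones_like(q) if feature_weights is None else np.asarray(feature_weights)
    # per-axis gap: 0 inside the interval, else distance to the nearer edge
    gap = np.maximum(0.0, np.maximum(lo - q, q - hi))
    return np.sqrt(((w * gap) ** 2).sum())

print(rect_distance([2.0, 2.0], lo=[0, 0], hi=[1, 1]))   # sqrt(2), outside
print(rect_distance([0.5, 0.5], lo=[0, 0], hi=[1, 1]))   # 0.0, inside
```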

16.
Zhao Guodong, Wu Yan 《Neural Processing Letters》2019,50(2):1257-1279

As is well known, supervised feature extraction seeks a discriminative low-dimensional space in which samples of the same class cluster tightly and samples of different classes stay far apart. For most algorithms it is difficult, during the transformation, to push samples located on the class margin or inside another class (called hard samples in this paper) toward their own class. These hard samples frequently degrade the performance of most methods, so handling them is very important; however, few methods in the past few years have been specifically proposed to solve this problem. In this study, large margin nearest neighbor (LMNN) and weighted local modularity (WLM) from complex networks are introduced to deal with these hard samples: they push hard samples toward their class quickly and shrink samples with the same label toward the class as a whole, yielding small within-class distances and large margins between classes. Combining WLM with LMNN, a novel feature extraction method named WLMLMNN is proposed, which takes into account both the global and local consistencies of the input data in the projected space. Comparative experiments with other popular methods on various real-world data sets demonstrate the effectiveness of the proposed method.


17.
This paper studies similarity measures for predicting the Pareto dominance of multi-objective optimization problems with nearest-neighbor classification. Based on an analysis of each decision component's contribution rate to each objective component, equivalent sub-vectors of the decision vector are defined, where an equivalent sub-vector consists of the decision components sharing the same contribution rate. A similarity measure based on the weighted sum of minimum cross-distances between equivalent sub-vectors is proposed. For each objective component, the similarity between the query data and the N known samples is evaluated independently; each sample is assigned a rank in [0, N-1] in ascending order of its similarity value, and the nearest-neighbor sample is determined by the criterion of minimum rank sum over all objectives. When determining the similarity of decision vectors, the equivalent-sub-vector minimum cross-distance weighted-sum measure and the multi-objective nearest-neighbor search introduce knowledge of the mapping from decision space to objective space, so the decision-variable similarity measure more faithfully reflects the similarity of the objective vectors. Nearest-neighbor classification experiments on the Pareto dominance of typical multi-objective optimization problems show that the proposed method significantly improves classification accuracy.
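A sketch of the rank-sum neighbor selection described above: distances to the N known samples are ranked independently per objective, and the sample minimizing the rank sum is taken as the nearest neighbor. The equivalent-sub-vector cross-distance itself is not reconstructed here; the per-objective distances are taken as given.

```python
# Sketch of rank-sum nearest-neighbor selection: rank samples 0..N-1 per
# objective (ascending distance) and pick the smallest rank sum.
import numpy as np

def rank_sum_nearest(per_objective_dist):
    """per_objective_dist: (m, N) array, one row of distances per objective."""
    ranks = np.argsort(np.argsort(per_objective_dist, axis=1), axis=1)
    return int(np.argmin(ranks.sum(axis=0)))  # index of the nearest sample

# 2 objectives, 4 known samples: sample 2 is consistently close
d = np.array([[0.9, 0.4, 0.1, 0.7],
              [0.8, 0.5, 0.2, 0.3]])
print(rank_sum_nearest(d))                    # -> 2
```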

18.
A modified counter-propagation (CP) algorithm with supervised learning vector quantizer (LVQ) and dynamic node allocation has been developed for rapid classification of molecular sequences. The molecular sequences were encoded into neural input vectors using an n-gram hashing method for word extraction and a singular value decomposition (SVD) method for vector compression. The neural networks used were three-layered, forward-only CP networks that performed nearest neighbor classification. Several factors affecting the CP performance were evaluated, including weight initialization, Kohonen layer dimensioning, winner selection and weight update mechanisms. The performance of the modified CP network was compared with the back-propagation (BP) neural network and the k-nearest neighbor method. The major advantages of the CP network are its training and classification speed and its capability to extract statistical properties of the input data. The combined BP and CP networks can classify nucleic acid or protein sequences with close to 100% accuracy at a rate about one order of magnitude faster than other currently available methods.

19.
In multimedia databases, k-nearest neighbor queries are popular and frequently contain non-spatial predicates. Among the available techniques for such queries, the incremental nearest neighbor algorithm proposed by Hjaltason and Samet is known as the most useful algorithm [16], because if more than k neighbors are needed it can provide the next neighbor to the upper operator without restarting the query from scratch. However, the R-tree in their algorithm has no facility for partially pruning tuple candidates that will turn out not to satisfy the remaining predicates, which makes their algorithm inefficient. In this paper, we propose an RS-tree-based incremental nearest neighbor algorithm complementary to theirs. The RS-tree used in our algorithm is a hybrid of the R-tree and the S-tree, its buddy tree, based on the hierarchical signature file. Experimental results show that our RS-tree enhances the performance of Hjaltason and Samet's algorithm.

20.

With the popularity of software-defined radio and cognitive radio technologies in wireless communication, radio frequency devices have to adapt to changing conditions and adjust their transmitting parameters, such as transmitting power, operating frequency, and modulation scheme. Automatic modulation classification thus becomes an essential feature in scenarios where the receiver has little or no knowledge of the transmitter's parameters. This paper presents k-nearest neighbor (KNN) based classification of M-QAM and M-PSK modulation schemes using higher-order cumulants as the input feature set. Genetic programming is used to enhance the performance of the KNN classifier by creating super features from the data set. Simulation results show improved accuracy at comparatively low signal-to-noise ratios for all the considered modulations.
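A sketch of the feature pipeline: fourth-order cumulants of the received complex baseband samples feed a stock KNN classifier. The cumulant estimators follow the standard definitions for zero-mean complex signals; the noise level, symbol counts, and signal models are assumptions, and the genetic-programming super-feature step is not shown.

```python
# Sketch: higher-order cumulant features for two modulations, classified
# with k-NN (GP-based super-feature construction omitted).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def cumulant_features(x):
    """Fourth-order cumulants C40, C42 of a zero-mean complex signal."""
    m20 = np.mean(x ** 2)
    m21 = np.mean(np.abs(x) ** 2)
    m40 = np.mean(x ** 4)
    m42 = np.mean(np.abs(x) ** 4)
    c40 = m40 - 3 * m20 ** 2                  # C40 = M40 - 3*M20^2
    c42 = m42 - np.abs(m20) ** 2 - 2 * m21 ** 2
    return [float(np.abs(c40)), float(c42)]

rng = np.random.default_rng(3)
def noisy(symbols):                           # add complex Gaussian noise
    n = len(symbols)
    return symbols + 0.1 * (rng.normal(size=n) + 1j * rng.normal(size=n))
def qpsk(n):
    return noisy(rng.choice([1+1j, 1-1j, -1+1j, -1-1j], size=n) / np.sqrt(2))
def bpsk(n):
    return noisy(rng.choice([1+0j, -1+0j], size=n))

X = [cumulant_features(qpsk(4096)) for _ in range(20)] + \
    [cumulant_features(bpsk(4096)) for _ in range(20)]
y = [0] * 20 + [1] * 20                       # 0 = QPSK, 1 = BPSK
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([cumulant_features(qpsk(4096))]))   # -> [0]
```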

