Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Yin Kexin, Zhai Junren, Xie Aifeng, Zhu Jianqi 《Pattern Analysis & Applications》2023,26(2):631-643
Feature selection algorithms based on three-way interaction information have been widely studied. However, most of these traditional algorithms only consider...

2.
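This paper analyzes the characteristics and shortcomings of two feature selection methods for text classification: the chi-square statistic (CHI) and expected cross entropy (ECE). To avoid the poor classification performance of traditional CHI and ECE on imbalanced datasets, an improved CHI method (pCHI) is given by introducing an adjustment factor and removing negatively correlated factors, and a weighting scheme (ωECE) compensates for ECE's tendency to select high-frequency features with weak discriminative power. Combining the two improved methods, a feature selection method based on improved CHI and weighted ECE (pCHIωECE) is further proposed. Comparative experiments show that the precision and F1 values of pCHIωECE are better than those of CHI, ECE, pCHI and ωECE, and that its dimensionality reduction is more stable.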
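
For orientation, the standard CHI score that this abstract starts from can be sketched as below. This is the textbook 2x2 contingency-table statistic, not the improved pCHI, whose adjustment factor the abstract does not specify. The function name and argument names are illustrative assumptions.

```python
def chi_square(a, b, c, d):
    """Textbook chi-square score of a term for a class.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table (empty row or column) carries no evidence.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


# A term that appears in every in-class document and nowhere else.
print(chi_square(10, 0, 0, 10))
# A term spread evenly over both classes.
print(chi_square(5, 5, 5, 5))
```

Because the difference a*d - c*b is squared, the standard score cannot distinguish a term that is positively associated with the class from one that is negatively associated; the sign of a*d - c*b carries that information.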

3.
Target decomposition is an important method for ship detection in polarimetric synthetic aperture radar (SAR) imagery. Parameters such as the polarization entropy and alpha angle deduced from the coherency matrix eigenvalue decomposition capture the differences between the target and background from different views separately. However, under the conditions of a relatively high resolution and a rough sea, the contrast between ship and sea reduces in the aforementioned space. Based on the analyses of target decomposition theory and the target’s scattering mechanism, multi-polarization parameters can be used to characterize different scattering behaviours of the ship target and sea clutter. Moreover, each parameter has its own diverse significance in the practical detection problem. This article proposes a feature selection and weighted support vector machine (FSWSVM) classifier-based algorithm to detect ships in polarimetric SAR (PolSAR) imagery. First, the method constructs a feature vector that consists of multi-polarization parameters. Then, different polarization parameters are refined and weighted according to their significance in the support vector machine (SVM) classifier. Finally, ships are classified from the sea background and other false alarms by the classifier. The validation results on National Aeronautics and Space Administration/Jet Propulsion Laboratory (NASA/JPL) airborne synthetic aperture radar (AIRSAR) and Radarsat-2 quad polarimetric data illustrate that the method detects ship targets more precisely and reduces false alarms effectively.

4.
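A Boosting-based weight determination method is introduced into the minimax probability machine to construct a feature-weighted minimax probability machine (FWMPM). Boosting is used to compute the importance of each feature for the classification task. This importance serves as the weight of the corresponding feature of the original data, and the inner product and Euclidean distance in the kernel function are computed with these weights, which reduces the influence of weakly relevant features on the minimax probability machine. Experimental results and theoretical analysis show that the method achieves better classification performance than the standard minimax probability machine.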
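
A minimal sketch of what weighting the inner product and Euclidean distance inside a kernel can look like. The abstract does not say how the weights enter the kernel, so the per-feature multiplication, the RBF form, and all names here are assumptions.

```python
import math


def weighted_dot(x, y, w):
    """Inner product in which feature i contributes w[i] * x[i] * y[i]."""
    return sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))


def weighted_rbf(x, y, w, gamma=1.0):
    """RBF kernel built on a feature-weighted squared Euclidean distance."""
    d2 = sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y))
    return math.exp(-gamma * d2)


# With weight 0 on the second feature, differences in it are ignored.
print(weighted_dot([1, 2], [3, 4], [1, 0]))
print(weighted_rbf([0, 5], [0, 9], [1, 0]))
```

Setting a feature's weight close to zero removes its contribution to both quantities, which is how a weakly relevant feature would lose influence under this assumed scheme.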

5.
P.  P. 《Pattern recognition》2002,35(12):2749-2759
A software package developed for feature selection in statistical pattern recognition is presented. The software tool includes several classical and new methods suitable for dimensionality reduction, classification and data representation. Examples of solved problems are given, as well as observations regarding the behavior of criterion functions.

6.
The phenomena of two-way selection among people are universal. In the paper, we depict the model of two-way selection in complex weighted networks and give its...

7.
Chih-Fong Tsai 《Knowledge》2009,22(2):120-127
For many corporations, assessing the credit of investment targets and the possibility of bankruptcy is a vital issue before investment. Data mining and machine learning techniques have been applied to solve the bankruptcy prediction and credit scoring problems. Feature selection is an important data mining step that selects more representative data from a given dataset to improve the final prediction performance, yet it is unknown which feature selection method performs better. Therefore, this paper compares five well-known feature selection methods used in bankruptcy prediction, namely the t-test, correlation matrix, stepwise regression, principal component analysis (PCA) and factor analysis (FA), to examine their prediction performance. Multi-layer perceptron (MLP) neural networks are used as the prediction model. Five related datasets are used in order to provide a reliable conclusion. According to the experimental results, the t-test feature selection method outperforms the others on the two performance measurements.
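
A t-test filter of the kind compared in this abstract can be sketched as follows. The abstract does not state which t-test variant or threshold was used, so Welch's unequal-variance statistic, the ranking by absolute value, and the function names are assumptions. Each class needs at least two samples.

```python
import math
from statistics import mean, variance


def welch_t(xs, ys):
    """Welch's t statistic for two samples (unequal variances allowed)."""
    se = math.sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    return (mean(xs) - mean(ys)) / se if se > 0 else 0.0


def rank_features(rows, labels):
    """Return feature indices ordered by |t| between class 1 and class 0."""
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append((abs(welch_t(pos, neg)), j))
    return [j for _, j in sorted(scores, reverse=True)]


# Feature 0 separates the classes; feature 1 has identical values in both.
rows = [[1, 5], [2, 4], [3, 6], [7, 5], [8, 4], [9, 6]]
labels = [0, 0, 0, 1, 1, 1]
print(rank_features(rows, labels))
```

Keeping the top-ranked indices, or those whose statistic passes a chosen significance level, gives the reduced feature set passed to the downstream classifier.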

8.
The results of experiments with a novel criterion for absolute non-parametric feature selection are reported. The basic idea of the new technique involves the use of computer graphics and the human pattern recognition ability to interactively choose a number of features, this number not being necessarily determined in advance, from a larger set of measurements. The triangulation method, recently proposed in the cluster analysis literature for mapping points from l-space to 2-space, is used to yield a simple and efficient algorithm for feature selection by interactive clustering. It is shown that a subset of features can thus be chosen which allows a significant reduction in storage and time while still keeping the probability of error in classification within reasonable bounds.

9.
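A feature selection method using PGA   Cited by: 1 (self-citations: 1, citations by others: 0)
Feature selection is one of the core steps of a text classification system. However, existing feature selection methods are all serial, and their time efficiency is low when applied to massive Chinese text data, so using parallel strategies to improve the efficiency of feature selection has become a research focus. This paper designs in detail a parallel genetic algorithm for feature selection. The algorithm uses a genetic algorithm to search for features and a parallel strategy to evaluate feature subsets: the fitness of the individuals in the population is computed simultaneously on multiple computing nodes, so that a more representative feature subset is obtained more quickly. Experimental results show that the method is effective.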
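
The parallel step described here, scoring every individual of the population at the same time, can be sketched as below. The abstract does not describe the paper's encoding, fitness function or node layout, so the 0/1 feature masks, the thread pool standing in for computing nodes, and all names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_population(population, fitness, workers=4):
    """Score every individual concurrently; results keep population order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))


def best_individual(population, fitness, workers=4):
    """Return the individual with the highest fitness score."""
    scores = evaluate_population(population, fitness, workers)
    return max(zip(scores, population))[1]


# Each individual is a 0/1 mask over three features.
# sum() is only a stand-in for a real fitness such as classifier accuracy.
population = [[1, 0, 1], [0, 0, 0], [1, 1, 1]]
print(evaluate_population(population, sum))
print(best_individual(population, sum))
```

In CPython a thread pool gives no speed-up for CPU-bound fitness functions. A process pool or a cluster scheduler would play the role of the multiple computing nodes, while the map-over-population structure stays the same.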

10.
Multi-instance learning was first proposed by Dietterich et al. (Artificial Intelligence 89(1–2):31–71, 1997) when they were investigating the problem of drug activity prediction. Here, the training set is composed of labeled bags, each of which consists of many unlabeled instances. The goal of this learning framework is to learn a classifier from the training set that correctly labels unseen bags. Since Dietterich et al., many studies of this learning framework have been carried out and many new algorithms have been proposed, for example DD, EM-DD and Citation-kNN. All of these algorithms work on the full data set. But, as in single-instance learning, different features in the training set have different effects on classifier training. In this paper, we study the problem of feature selection in multi-instance learning. We extend the data reliability measure so that it selects the key features in the multi-instance scenario.

11.
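During image feature matching, captured images are easily affected by noise, illumination, scale and other factors, which leads to matching results with poor robustness and a high mismatch rate. To address this, a feature matching method based on a weighted similarity measure (WSM) is proposed. The method first uses the feature matching based on grid multi-density clustering (FM_GMC) algorithm to partition the original image into feature cluster blocks. Then, within each feature cluster block, it extracts edge feature points with Canny and uses scale-invariant feature trans...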

12.
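Zheng Linjiang, Liu Xu, Yi Bing 《计算机应用》 (Journal of Computer Applications) 2017,37(8):2381-2386
Current real-time map matching algorithms struggle to guarantee both high accuracy and high real-time performance. To address this, an improved real-time map matching algorithm based on dynamic weights is proposed. First, the algorithm considers the constraints among adjacent Global Positioning System (GPS) trajectory points in time, speed and direction, as well as the road network topology, and, based on an analysis of spatio-temporal characteristics, builds a weight model composed of a distance weight, a bearing weight, a direction weight and a connectivity weight. Then, a dynamic weight coefficient model is built from the attribute information of the GPS trajectory points themselves. Finally, the best matching road segment is selected according to the confidence level. Tests on three Chongqing city bus trajectories with a total length of 36 km show that the proposed algorithm reaches an average matching accuracy of 97.31% and an average matching delay of 17.9 ms per trajectory point. The new algorithm achieves high matching accuracy and real-time performance, and outperforms the compared algorithms at Y-shaped junctions and on parallel road segments.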

13.
In data analysis tasks, we are often confronted with very high dimensional data. Depending on the purpose of a data analysis study, feature selection finds and selects the relevant subset of the original features. Many feature selection algorithms have been proposed in classical data analysis, but very few in symbolic data analysis (SDA), which extends classical data analysis by using rich objects instead of simple matrices. Compared with the data used in classical data analysis, a symbolic object can describe not only individuals but also, most of the time, a cluster of individuals. In this paper we present an unsupervised feature selection algorithm on probabilistic symbolic objects (PSOs), with the purpose of discrimination. A PSO is a symbolic object that describes a cluster of individuals by modal variables, using the relative frequency distribution associated with each value. This paper presents new dissimilarity measures between PSOs, which are used as feature selection criteria, and explains how to reduce the complexity of the algorithm by using the discrimination matrix.

14.
Feature selection plays an important role in data mining and pattern recognition, especially for large scale data. In past years, various metrics have been proposed to measure the relevance between different features. Since mutual information is nonlinear and can effectively represent the dependencies of features, it is one of the most widely used measurements in feature selection. For these reasons, many promising feature selection algorithms based on mutual information with different parameters have been developed. In this paper, a general criterion function for mutual information in feature selectors is first introduced, which brings together most of the information measurements used in previous algorithms. In traditional selectors, mutual information is estimated on the whole sampling space. This, however, cannot exactly represent the relevance among features. To cope with this problem, the second purpose of this paper is to propose a new feature selection algorithm based on dynamic mutual information, which is only estimated on unlabeled instances. To verify the effectiveness of our method, several experiments are carried out on sixteen UCI datasets using four typical classifiers. The experimental results indicate that our algorithm achieved better results than other methods in most cases.
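
As a point of reference for the measure used here, mutual information between a discrete feature and a discrete label can be estimated from co-occurrence counts as sketched below. This is the plain plug-in estimate over whatever sample is passed in. The dynamic variant in the abstract would, under this reading, call it again on the shrinking set of instances that remain, which is an assumption rather than the paper's stated procedure.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x, y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y).
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


# A feature identical to the label shares one full bit with it.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))
# A feature independent of the label shares none.
print(mutual_information([0, 1, 0, 1], [0, 0, 1, 1]))
```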

15.
16.
The human ability to organize and make sense out of complex arrays of information is the best example of flexible pattern recognition. Nevertheless, relatively little is known about the processes which underlie these organizational abilities. A model is proposed which describes the organizational rules or criteria employed by human listeners when comparing members of a set of complex sounds. The model assumes that feature selection is based on a Karhunen-Loève expansion of the low-level representations of sound samples. Theoretical and psychophysical analyses were performed on a set of sixteen complex sounds, revealing similar feature representations. It was concluded that the proposed model provides a reasonable first approximation to the organizational rules employed by listeners in a signal comparison task.

17.
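A causal-network multi-sensor fusion system based on attribute selection   Cited by: 1 (self-citations: 0, citations by others: 1)
To address the problems that rough set "reduction" encounters in practical applications, a definition of "statistical reduction" and a corresponding attribute search algorithm are proposed. The algorithm is used to reduce the attributes of a water pollution monitoring information table. The results show that, compared with conventional algorithms, the result obtained covers the largest number of objects and is less prone to mismatch. A causal network model of this data fusion system is then built from the reduction result. Experiments show that, while maintaining the accuracy of prototype search, the new model compresses the search space and improves search efficiency. In addition, a rough set expression for causal connection strength is derived to facilitate the construction of the causal network.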

18.
Cohen S, Dror G, Ruppin E 《Neural computation》2007,19(7):1939-1961
We present and study the contribution-selection algorithm (CSA), a novel algorithm for feature selection. The algorithm is based on the multiperturbation Shapley analysis (MSA), a framework that relies on game theory to estimate usefulness. The algorithm iteratively estimates the usefulness of features and selects them accordingly, using either forward selection or backward elimination. It can optimize various performance measures over unseen data such as accuracy, balanced error rate, and area under receiver-operator-characteristic curve. Empirical comparison with several other existing feature selection methods shows that the backward elimination variant of CSA leads to the most accurate classification results on an array of data sets.
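
The game-theoretic notion of usefulness behind this abstract is the Shapley value: a feature's average marginal contribution over all orderings of the features. The sketch below computes it exactly for a small feature set. It is not the MSA estimator or the CSA procedure, which the abstract does not detail. The `value` callback, which would return a performance score for a feature subset, and the other names are assumptions.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values; cost grows factorially with len(players)."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            extended = coalition | {p}
            # Marginal contribution of p when it joins this coalition.
            totals[p] += value(extended) - value(coalition)
            coalition = extended
    return {p: t / len(orders) for p, t in totals.items()}


# Toy value function: each feature adds a fixed amount to the score.
gains = {"f1": 1.0, "f2": 2.0, "f3": 0.0}
print(shapley_values(list(gains), lambda s: sum(gains[p] for p in s)))
```

For an additive value function like this toy one, each feature's Shapley value equals its own gain. With a real classifier score, interactions between features make the values differ from single-feature scores, and sampling orderings instead of enumerating them keeps the cost manageable.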

19.
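A feature selection method based on parallel binary immune quantum particle swarm optimization   Cited by: 1 (self-citations: 1, citations by others: 0)
To speed up text mining algorithms and reduce their memory footprint, a feature selection method based on parallel binary immune quantum particle swarm optimization is proposed. The method uses binary immune quantum particle swarm optimization to search for feature subsets and a parallel algorithm to improve time efficiency, so that a more representative feature subset is obtained more quickly. Experimental results show that the algorithm is effective.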

20.
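In text classification systems, feature selection is an effective dimensionality reduction method. After analyzing several commonly used evaluation functions for feature selection, the paper applies a weight computation function to feature selection and proposes a new evaluation function based on an improved TFIDF method. The new function introduces class information into the feature terms and extracts class-related feature terms, remedying a shortcoming of TFIDF. Experiments show that the method is simple and feasible and helps improve the effectiveness of the selected feature subset.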
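
For context, the baseline TFIDF weight that this abstract sets out to improve can be sketched as follows. This is the common form, relative term frequency times log inverse document frequency. The class-aware improvement itself is not specified in the abstract and is not reproduced, and the function name is an assumption.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


docs = [["a", "b"], ["a", "c"]]
print(tf_idf(docs))
```

A term that occurs in every document, like "a" here, gets weight zero regardless of how the documents are labeled. The baseline formula uses no class information at all, which is the gap the abstract's class-aware variant targets.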
