Similar Literature
20 similar documents found (search time: 31 ms)
1.
One of the most popular criteria for model selection is the Bayesian Information Criterion (BIC). It is based on an asymptotic approximation using Bayes rule when the sample size tends to infinity and the dimension of the model is fixed. Although it works well in classical applications, it performs less satisfactorily in high-dimensional problems, i.e. when the number of regressors is very large compared to the sample size. For this reason, an alternative version of the BIC has been proposed for the problem of mapping quantitative trait loci (QTLs) in genetics, where QTLs are located by model selection in a regression model with an extremely large number of potential regressors. Since the assumption of normally distributed errors is often unrealistic in such settings, we extend the idea underlying the modified BIC to the context of robust regression.
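For reference, the standard BIC this entry starts from penalizes the maximized likelihood \(\hat{L}\) of a model with \(k\) free parameters fitted to \(n\) observations as

```latex
\mathrm{BIC} = -2\ln\hat{L} + k\ln n ,
```

which for linear regression with Gaussian errors reduces, up to an additive constant, to \(n\ln(\mathrm{RSS}/n) + k\ln n\). The modified versions developed for QTL mapping keep this form but enlarge the penalty on \(k\), roughly by an additional term of order \(k\ln m\), where \(m\) is the number of candidate regressors; the exact constants vary between proposals, so this is only a schematic summary, not the formula of the paper itself.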

2.
Data available in software engineering applications often contains variability, and it is not obvious which variables help in prediction. Most work on software defect prediction focuses on selecting the best prediction technique; for this purpose, deep learning and ensemble models have shown promising results. In contrast, very little research deals with cleaning the training data and selecting the best parameter values from the data. Data available for training the models may have high variability, and this variability can decrease model accuracy. To deal with this problem, we used the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) to select the best variables for training the model. A simple artificial neural network (ANN) with one input layer, one output layer, and two hidden layers was used for training instead of a very deep and complex model. First, the variables were narrowed down to a smaller number using correlation values. Then subsets for all possible variable combinations were formed. Finally, an ANN model was trained for each subset, and the best model was selected on the basis of the smallest AIC and BIC values. It was found that the combination of only two variables, ns and entropy, is best for software defect prediction, as it gives the minimum AIC and BIC values, while nm and npt is the worst combination, giving the maximum AIC and BIC values.
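The search procedure this entry describes (score every candidate variable subset, keep the one with the smallest AIC/BIC) can be sketched as follows. For brevity the sketch uses ordinary least squares in place of the paper's ANN, and the data, variable count, and coefficients are synthetic illustrations, not the paper's dataset:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

# Synthetic data: 4 candidate variables, only x0 and x1 drive the response.
n = 500
X = rng.normal(size=(n, 4))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

def aic_bic(X_sub, y):
    """Gaussian AIC/BIC computed from the residual sum of squares of an OLS fit."""
    n, k = X_sub.shape
    beta, *_ = np.linalg.lstsq(X_sub, y, rcond=None)
    rss = np.sum((y - X_sub @ beta) ** 2)
    ll_term = n * np.log(rss / n)          # -2 * log-likelihood, up to a constant
    return ll_term + 2 * k, ll_term + k * np.log(n)

# Score every non-empty subset of the candidate variables.
results = {}
for size in range(1, 5):
    for subset in combinations(range(4), size):
        results[subset] = aic_bic(X[:, subset], y)

best_aic = min(results, key=lambda s: results[s][0])
best_bic = min(results, key=lambda s: results[s][1])
print("AIC picks", best_aic, "BIC picks", best_bic)
```

Because both criteria drop the constant terms of the Gaussian log-likelihood, only differences between subsets are meaningful; the selected subset should contain the truly predictive variables, with BIC's heavier penalty favoring the smaller model.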

3.
This study addresses the problem of choosing the most suitable probabilistic model selection criterion for unsupervised learning of visual context of a dynamic scene using mixture models. A rectified Bayesian Information Criterion (BICr) and a Completed Likelihood Akaike’s Information Criterion (CL-AIC) are formulated to estimate the optimal model order (complexity) for a given visual scene. Both criteria are designed to overcome poor model selection by existing popular criteria when the data sample size varies from small to large and the true mixture distribution kernel functions differ from the assumed ones. Extensive experiments on learning visual context for dynamic scene modelling are carried out to demonstrate the effectiveness of BICr and CL-AIC, compared to that of existing popular model selection criteria including BIC, AIC and Integrated Completed Likelihood (ICL). Our study suggests that for learning visual context using a mixture model, BICr is the most appropriate criterion given sparse data, while CL-AIC should be chosen given moderate or large data sample sizes.
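A minimal illustration of BIC-based model-order selection for mixtures, the baseline that BICr and CL-AIC refine: a plain 1-D Gaussian mixture fitted by EM, with BIC comparing one component against two. The data and settings are invented for the sketch and are not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)

# Bimodal 1-D data: two well-separated Gaussian components.
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(8.0, 1.0, 200)])
n = x.size

def gaussian_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# K = 1: the maximum-likelihood fit is just the sample mean and variance.
mu, var = x.mean(), x.var()
ll1 = np.log(gaussian_pdf(x, mu, var)).sum()
bic1 = -2 * ll1 + 2 * np.log(n)            # 2 free parameters

# K = 2: a short EM loop for a two-component mixture.
w = np.array([0.5, 0.5])
mus = np.array([x.min(), x.max()])          # spread-out initialisation
vars_ = np.array([x.var(), x.var()])
for _ in range(100):
    dens = w * gaussian_pdf(x[:, None], mus, vars_)   # (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)     # responsibilities
    nk = resp.sum(axis=0)
    w = nk / n
    mus = (resp * x[:, None]).sum(axis=0) / nk
    vars_ = (resp * (x[:, None] - mus) ** 2).sum(axis=0) / nk
ll2 = np.log((w * gaussian_pdf(x[:, None], mus, vars_)).sum(axis=1)).sum()
bic2 = -2 * ll2 + 5 * np.log(n)            # 5 free parameters: 2 means, 2 variances, 1 weight
print("BIC(K=1) =", bic1, " BIC(K=2) =", bic2)
```

On clearly bimodal data the two-component BIC should be the smaller of the two; the entry's point is precisely that such plain BIC selection degrades when the sample is sparse or the kernel family is misspecified.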

4.
A TANC-BIC Structure Learning Algorithm
程泽凯  林士敏 《微机发展》2004,14(11):10-12
The tree-augmented naive Bayesian classifier (TANC) is one of the most widely used Bayesian classifiers, and its classification performance is superior to that of the naive Bayesian classifier (NBC). Existing TANC structure learning algorithms are based on dependence analysis and use the mutual-information measure. Since the Bayesian information criterion (BIC) has been successful in score-and-search based Bayesian network structure learning, this paper uses the BIC measure to quantify the dependence between attribute nodes and proposes a new TANC-BIC structure learning algorithm. The TANC-BIC algorithm was implemented on the MBNC experimental platform, with classification accuracy as the performance metric. Experimental results show that the TANC-BIC algorithm is effective.
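A generic sketch of using BIC to score dependence between discrete attribute nodes, the ingredient this entry builds on. This is the standard decomposable BIC score for discrete Bayesian networks, not the paper's full TANC-BIC algorithm, and the binary data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic binary data: X1 copies X0 with 10% noise, so the structure with
# an edge X0 -> X1 should receive a higher BIC score than independence.
N = 1000
x0 = rng.integers(0, 2, N)
x1 = np.where(rng.random(N) < 0.9, x0, 1 - x0)
data = np.column_stack([x0, x1])
arities = [2, 2]          # number of states of each variable

def bic_score(data, structure):
    """BIC score of a discrete Bayesian network: maximised log-likelihood
    minus (free parameters / 2) * log N, summed node by node."""
    N = data.shape[0]
    score = 0.0
    for node, parents in structure.items():
        q = 1
        for p in parents:
            q *= arities[p]                 # number of parent configurations
        r = arities[node]
        counts = np.zeros((q, r))
        for row in data:
            j = 0
            for p in parents:
                j = j * arities[p] + row[p]
            counts[j, row[node]] += 1
        nij = counts.sum(axis=1, keepdims=True)
        nij[nij == 0] = 1                   # empty parent configs contribute nothing
        probs = counts / nij
        mask = counts > 0
        score += (counts[mask] * np.log(probs[mask])).sum()
        score -= 0.5 * q * (r - 1) * np.log(N)
    return score

score_indep = bic_score(data, {0: [], 1: []})    # no edges
score_dep = bic_score(data, {0: [], 1: [0]})     # edge X0 -> X1
print("independent:", score_indep, " with edge:", score_dep)
```

Comparing such scores across candidate edges is what lets a score-and-search learner rank attribute dependencies, the role mutual information plays in the original TANC algorithms.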

5.
To address the large number of redundant segmentation points produced by the sliding variable-length-window BIC algorithm, this paper proposes an audio segmentation algorithm that combines the variance of wavelet sub-band average energy with BIC. The algorithm first partitions the continuous audio stream into audio segments using the variance of wavelet sub-band average energy, and then detects acoustic change points within each segment using an improved sliding variable-length-window BIC algorithm. Experiments show that the algorithm achieves good segmentation results: compared with the sliding variable-length-window BIC algorithm alone, its precision, recall, and overall performance are all improved.
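For context, the BIC change-point test underlying this family of segmentation algorithms is commonly written as follows (the standard formulation of Chen and Gopalakrishnan; the improved variant in this entry may differ in details). For a window \(Z\) of \(n\) feature frames of dimension \(d\), split at frame \(i\) into \(X\) (the first \(n_1\) frames) and \(Y\) (the remaining \(n_2\) frames), with full covariance estimates \(\Sigma_Z, \Sigma_X, \Sigma_Y\):

```latex
\Delta\mathrm{BIC}(i) = \frac{n}{2}\ln|\Sigma_Z|
  - \frac{n_1}{2}\ln|\Sigma_X| - \frac{n_2}{2}\ln|\Sigma_Y|
  - \frac{\lambda}{2}\left(d + \frac{d(d+1)}{2}\right)\ln n
```

A positive \(\Delta\mathrm{BIC}(i)\) favors an acoustic change at frame \(i\); \(\lambda\) is the penalty weight (1 in the original formulation). Spurious positives of this test are exactly the redundant change points the entry's wavelet pre-segmentation aims to suppress.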

6.
M.  J. 《Neurocomputing》2008,71(7-9):1321-1329
The Bayesian information criterion (BIC) is widely used by the neural-network community for model selection tasks, although its convergence properties are not always theoretically established. In this paper we focus on estimating the number of components in a mixture of multilayer perceptrons and on proving the convergence of the BIC criterion in this framework. The penalized marginal likelihood for mixture models and hidden Markov models introduced by Keribin [Consistent estimation of the order of mixture models, Sankhya Indian J. Stat. 62 (2000) 49–66] and Gassiat [Likelihood ratio inequalities with applications to various mixtures, Ann. Inst. Henri Poincare 38 (2002) 897–906], respectively, is extended to mixtures of multilayer perceptrons, for which a penalized-likelihood criterion is proposed. We prove its convergence under hypotheses that essentially involve the bracketing entropy of the generalized score-function class, and illustrate it with some numerical examples.

7.
Canonical correlation analysis (CCA) is a widely used multivariate method for assessing the association between two sets of variables. However, when the number of variables far exceeds the number of subjects, as in large-scale genomic studies, the traditional CCA method is not appropriate. In addition, when the variables are highly correlated, the sample covariance matrices become unstable or undefined. To overcome these two issues, sparse canonical correlation analysis (SCCA) for multiple data sets has been proposed using a Lasso-type penalty. However, these methods do not have direct control over the sparsity of the solution. An additional step that uses a Bayesian Information Criterion (BIC) has also been suggested to further filter out unimportant features. In this paper, a comparison of four penalty functions (Lasso, Elastic-net, smoothly clipped absolute deviation (SCAD), and Hard-threshold) for SCCA, with and without the BIC filtering step, is carried out using both real and simulated genotypic and mRNA expression data. This study indicates that the SCAD penalty with a BIC filter would be a preferable penalty function for applying SCCA to genomic data.

8.
An Improved BIC Speaker Segmentation Algorithm
郑继明  张萍 《计算机工程》2010,36(17):240-242
To address the problem of detecting speaker change points in multi-speaker audio, an improved BIC speaker segmentation algorithm is proposed. A fixed-window BIC algorithm first segments the audio stream, and then a recursion-based segmentation algorithm together with a variable-length-window BIC algorithm confirms the candidate change points. Experimental results show that, compared with other BIC algorithms, the proposed algorithm achieves higher precision, recall, and overall performance.

9.
This paper proposes a model-parameter estimation method based on language-model confusion, combined with the Bayesian information criterion (BIC) for model selection, thereby avoiding the need for a large amount of annotated data. On the NIST-07 language recognition tasks with 30 s, 10 s, and 3 s test utterances, results are reported under the maximum likelihood (ML) criterion and the maximum mutual information...

10.
王书海  刘刚  綦朝晖 《计算机工程》2008,34(15):229-230
To address the high false-negative and false-positive rates of intrusion detection systems, this paper uses the Bayesian information criterion (BIC) scoring function as a measure, combined with a hill-climbing search algorithm, to relax the strong independence assumption of the naive Bayesian network model, and proposes a BIC-scored Bayesian network model that better matches real situations. The model is validated and its performance analyzed; experimental results show that the BIC-scored Bayesian network model achieves a high recognition rate for DoS attacks and probe attacks whose behavioral features drift gradually.

11.
Hidden Markov random fields appear naturally in problems such as image segmentation, where an unknown class assignment has to be estimated from the observations at each pixel. Choosing the probabilistic model that best accounts for the observations is an important first step for the quality of the subsequent estimation and analysis. A commonly used selection criterion is the Bayesian Information Criterion (BIC) of Schwarz (1978), but for hidden Markov random fields, its exact computation is not tractable due to the dependence structure induced by the Markov model. We propose approximations of BIC based on the mean field principle of statistical physics. Mean field theory approximates Markov random fields by systems of independent variables, leading to tractable computations. Using this principle, we first derive a class of criteria by approximating the Markov distribution in the usual BIC expression as a penalized likelihood. We then rewrite BIC in terms of normalizing constants, also called partition functions, instead of Markov distributions. This enables us to use finer mean field approximations and to derive other criteria using optimal lower bounds for the normalizing constants. To illustrate the performance of our partition-function-based approximation of BIC as a model selection criterion, we focus on the preliminary issue of choosing the number of classes before the segmentation task. Experiments on simulated and real data show that our criterion is promising: it takes spatial information into account through the Markov model and improves the results obtained with BIC for independent mixture models.

12.
Since there is currently no standard method for building compact and efficient speech recognition systems, this paper proposes a new approach that trades off recognition accuracy against model complexity through the weighting coefficient in the Bayesian information criterion (BIC), and uses an improved particle swarm optimization (PSO) algorithm to optimize the topology of the acoustic model, thereby building a compact and efficient speech recognition system. Experiments on TIDigits show that, compared with a baseline system of the same complexity built by the conventional method, the new system improves sentence accuracy by 7.85%; compared with a baseline of the same recognition accuracy, it reduces model complexity by 51.4%. The new system thus achieves a high recognition rate at low complexity.

13.
The advent of mixture models has opened the possibility of flexible models which are practical to work with. Practitioners typically assume that the data are generated from a Gaussian mixture. The inverted Dirichlet mixture has been shown to be a better alternative to the Gaussian mixture and to be of significant value in a variety of applications involving positive data. The inverted Dirichlet is, however, often undesirable, since it forces an assumption of positive correlation. Our focus here is to develop a Bayesian alternative to both the Gaussian and the inverted Dirichlet mixtures when dealing with positive data. The alternative that we propose is based on the generalized inverted Dirichlet distribution, which offers high flexibility and ease of use, as we show in this paper. Moreover, it has a more general covariance structure than the inverted Dirichlet. The proposed mixture model is subjected to a fully Bayesian analysis based on Markov chain Monte Carlo (MCMC) simulation methods, namely Gibbs sampling and Metropolis–Hastings, used to compute the posterior distribution of the parameters, and on the Bayesian information criterion (BIC), used for model selection. The adoption of this purely Bayesian learning approach is motivated by the fact that Bayesian inference allows one to deal with uncertainty in a unified and consistent manner. We evaluate our approach on two challenging applications concerning object classification and forgery detection.

14.
Node order is one of the most important factors in learning the structure of a Bayesian network (BN) for probabilistic reasoning. To improve BN structure learning, we propose a node-order learning algorithm based on the frequently used Bayesian information criterion (BIC) score function. The algorithm dramatically reduces the space of node orders and makes the results of BN learning more stable and effective. Specifically, we first find the most dependent node for each individual node, prove analytically that the dependencies are undirected, and then construct undirected subgraphs (UGs). Secondly, the UGs are examined and connected into a single undirected graph, UGC. The relation between the number of subgraphs and the number of nodes is analyzed. Thirdly, we provide rules for orienting all edges in UGC, which converts it into a directed acyclic graph (DAG). Further, we derive the DAG's topological order and describe the BIC-based node-order learning algorithm. Its complexity analysis shows that the algorithm runs in linear time with respect to the number of samples and in polynomial time with respect to the number of variables. Finally, experimental results demonstrate significant performance improvement in comparison with other methods.

15.
We propose a method for assessing mixture models in a cluster analysis setting using the integrated completed likelihood. For this purpose, the observed data are assigned to unknown clusters using a maximum a posteriori operator. Then, the integrated completed likelihood (ICL) is approximated using the Bayesian information criterion (BIC). Numerical experiments on simulated and real data show that the resulting ICL criterion performs well both for choosing a mixture model and a relevant number of clusters. In particular, ICL appears to be more robust than BIC to violations of some of the mixture model assumptions, and it can select a number of clusters leading to a sensible partitioning of the data.
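The BIC-based approximation of ICL referred to above is usually written as follows (in the "maximize the criterion" convention, where \(\hat{t}_{ik}\) is the estimated posterior probability that observation \(i\) belongs to cluster \(k\) and \(\nu_K\) is the number of free parameters of the \(K\)-component model):

```latex
\mathrm{BIC}(K) = \log L(\hat\theta_K) - \frac{\nu_K}{2}\log n, \qquad
\mathrm{ICL}(K) = \mathrm{BIC}(K) - \mathrm{ENT}(K), \qquad
\mathrm{ENT}(K) = -\sum_{i=1}^{n}\sum_{k=1}^{K}\hat t_{ik}\log\hat t_{ik}
```

Since \(\mathrm{ENT}(K) \ge 0\) and grows when clusters overlap, ICL penalizes poorly separated mixture components, which is consistent with the robustness to model misspecification reported in this entry.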

16.
We develop an interactive color image segmentation method in this paper. The method builds on Markov random fields (MRFs) and Dempster–Shafer (D–S) evidence theory to obtain segmentation results by considering both likelihood information and prior information under a Bayesian framework. The method first uses the expectation-maximization (EM) algorithm to estimate the parameters of the user-input regions, with the Bayesian information criterion (BIC) used for model selection. Then the beliefs of each pixel are assigned by a predefined scheme. The result is obtained by iteratively fusing the pixel likelihood information and the pixel contextual information until convergence. The method is initially designed for two-label segmentation, but it can easily be generalized to multi-label segmentation. Experimental results show that the proposed method is comparable to other prevalent interactive image segmentation algorithms in most two-label segmentation tasks, both qualitatively and quantitatively.

17.
Comparison of model selection for regression
Cherkassky V  Ma Y 《Neural computation》2003,15(7):1691-1714
We discuss empirical comparison of analytical methods for model selection. Currently, there is no consensus on the best method for finite-sample estimation problems, even for the simple case of linear estimators. This article presents empirical comparisons between classical statistical methods - the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) - and the structural risk minimization (SRM) method, based on Vapnik-Chervonenkis (VC) theory, for regression problems. Our study is motivated by the empirical comparisons in Hastie, Tibshirani, and Friedman (2001), which claim that the SRM method performs poorly for model selection and suggest that AIC yields superior predictive performance. Hence, we present empirical comparisons for various data sets and different types of estimators (linear, subset selection, and k-nearest-neighbor regression). Our results demonstrate the practical advantages of VC-based model selection; it consistently outperforms AIC for all data sets. In our study, the SRM and BIC methods show similar predictive performance. This discrepancy (between empirical results obtained using the same data) is caused by methodological drawbacks in Hastie et al. (2001), especially their loose interpretation and application of the SRM method. Hence, we discuss methodological issues important for meaningful comparisons and practical application of the SRM method. We also point out the importance of accurate estimation of model complexity (VC-dimension) for empirical comparisons and propose a new practical estimate of model complexity for k-nearest-neighbor regression.

18.
An optimization criterion is presented for discriminant analysis. The criterion extends the optimization criteria of the classical Linear Discriminant Analysis (LDA) through the use of the pseudoinverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and sample size, overcoming a limitation of classical LDA. The optimization problem can be solved analytically by applying the Generalized Singular Value Decomposition (GSVD) technique. The pseudoinverse has been suggested and used for undersampled problems in the past, where the data dimension exceeds the number of data points. The criterion proposed in this paper provides a theoretical justification for this procedure. An approximation algorithm for the GSVD-based approach is also presented. It reduces the computational complexity by finding subclusters of each cluster and uses their centroids to capture the structure of each cluster. This reduced problem yields much smaller matrices to which the GSVD can be applied efficiently. Experiments on text data, with up to 7,000 dimensions, show that the approximation algorithm produces results that are close to those produced by the exact algorithm.

19.
Regression models are used in the geosciences to extrapolate data and identify significant predictors of a response variable. Criterion approaches based on the residual sum of squares (RSS), such as the Akaike Information Criterion, Bayesian Information Criterion (BIC), Deviance Information Criterion, or Mallows' Cp, can be used to compare non-nested models and identify an optimal subset of covariates. Computational limitations arise when the number of observations or candidate covariates is large, both in comparing all possible combinations of the available covariates and in characterizing the covariance of the residuals for each examined model when the residuals are autocorrelated, as is often the case in spatial and temporal regression analysis. This paper presents computationally efficient algorithms for identifying the optimal model under any RSS-based model selection criterion. The proposed dual criterion optimal branch and bound (DCO B&B) algorithm is guaranteed to identify the optimal model, while a single criterion heuristic (SCH) B&B algorithm provides further computational savings and approximates the optimal solution. These algorithms are applicable both to multiple linear regression (MLR) and to response variables with correlated residuals. We also propose an approach for iterative model selection, where a single set of covariance parameters is used in each iteration rather than a different set of parameters for each examined model. Simulation experiments evaluate the performance of the algorithms using MLR and geostatistical regression as prototypical regression tools and BIC as a prototypical model selection approach. Results show massive computational savings for the DCO B&B algorithm relative to an exhaustive search. The SCH B&B provides a good approximation of the optimal model in most cases, while the DCO B&B with iterative covariance parameter optimization yields the closest approximation to the DCO B&B algorithm while providing additional computational savings.

20.
Clustering problems are central to many knowledge discovery and data mining tasks. However, most existing clustering methods can only work with fixed-dimensional representations of data patterns. In this paper, we study the clustering of data patterns that are represented as sequences or time series, possibly of different lengths. We propose a model-based approach to this problem using mixtures of autoregressive moving average (ARMA) models. We derive an expectation-maximization (EM) algorithm for learning the mixing coefficients as well as the parameters of the component models. To address the model selection problem, we use the Bayesian information criterion (BIC) to determine the number of clusters in the data. Experiments are conducted on a number of simulated and real datasets. Results from the experiments show that our method compares favorably with methods proposed previously for similar time-series clustering tasks.


Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号