Similar documents
 20 similar documents found (search time: 15 ms)
1.
For a linear noisy system, assume an estimate $\hat{\beta}$ of the unknown parameter vector was obtained from certain past data on the inputs and outputs of the system. The prediction of the output for any input vector $x$ is a linear combination of $x$ and $\hat{\beta}$. The question of concern is: for what values of the inputs would the error in their predicted outputs be the smallest? It is found that they are the inputs which lie on the fitted plane of the orthogonal regression among the past data of the clean inputs.

2.
Due to the difficulties of outlier and skewed data, the prediction of breast cancer survivability has presented many challenges in the field of data mining and pattern recognition, especially in medical research. To solve these problems, we have proposed a hybrid approach to generating higher quality data sets in the creation of improved breast cancer survival prediction models. This approach comprises two main steps: (1) utilization of an outlier filtering approach based on C-Support Vector Classification (C-SVC) to identify and eliminate outlier instances; and (2) application of an over-sampling approach using over-sampling with replacement to increase the number of instances in the minority class. In order to assess the capability and effectiveness of the proposed approach, several measurement methods including basic performance (e.g., accuracy, sensitivity, and specificity), Area Under the receiver operating characteristic Curve (AUC) and F-measure were utilized. Moreover, a 10-fold cross-validation method was used to reduce the bias and variance of the results of breast cancer survivability prediction models. Results have indicated that the proposed approach improves the performance of breast cancer survivability prediction models by up to 28.34% due to the improved training data space.
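
The over-sampling step and the basic performance measures named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `oversample_with_replacement` and `binary_metrics` are hypothetical, classes are balanced up to the size of the largest class (the abstract does not give a target ratio), and the C-SVC outlier filter and the AUC are not covered.

```python
import random
from collections import Counter


def oversample_with_replacement(rows, labels, seed=0):
    """Copy randomly drawn minority-class rows until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # assumption: balance up to the majority class size
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sampling with replacement: the same instance may be drawn more than once.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and F-measure from a binary confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of failing when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "f_measure": ratio(2 * precision * sensitivity, precision + sensitivity),
    }
```

When such a step is combined with cross-validation, it is usually applied to the training folds only, so that duplicated instances do not leak into the held-out fold; the abstract does not say how the authors handled this.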

3.
Multimedia Tools and Applications - Breast cancer is one of the most common types of cancer among Jordanian women. Recently, healthcare organizations in Jordan have adopted electronic health...

4.
We have in previous studies reported our findings and concern about the reliability and validity of the evaluation procedures used in comparative studies on competing effort prediction models. In particular, we have raised concerns about the use of accuracy statistics to rank and select models. Our concern is strengthened by the observed lack of consistent findings. This study offers more insights into the causes of conclusion instability by elaborating on the findings of our previous work concerning the reliability and validity of the evaluation procedures. We show that model selection based on the accuracy statistics MMRE, MMER, MBRE, and MIBRE contributes to conclusion instability as well as selection of inferior models. We argue and show that the evaluation procedure must include an evaluation of whether the functional form of the prediction model makes sense to better prevent selection of inferior models.
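
For readers unfamiliar with the four accuracy statistics named above, the sketch below computes them using the definitions commonly given in the effort-estimation literature: each is the mean of a relative error whose denominator is the actual value (MMRE), the predicted value (MMER), the smaller of the two (MBRE), or the larger of the two (MIBRE). The abstract itself does not spell these out, so the definitions and the function name `relative_error_stats` are assumptions.

```python
def relative_error_stats(actual, predicted):
    """Mean relative-error statistics; all values are assumed to be positive efforts."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    n = len(actual)
    pairs = list(zip(actual, predicted))
    return {
        # Error relative to the actual value.
        "MMRE": sum(abs(a - p) / a for a, p in pairs) / n,
        # Error relative to the predicted value.
        "MMER": sum(abs(a - p) / p for a, p in pairs) / n,
        # Balanced relative error: relative to the smaller of the two.
        "MBRE": sum(abs(a - p) / min(a, p) for a, p in pairs) / n,
        # Inverted balanced relative error: relative to the larger of the two.
        "MIBRE": sum(abs(a - p) / max(a, p) for a, p in pairs) / n,
    }
```

With one project underestimated by a factor of two (actual 100, predicted 50) and one overestimated by a factor of two (actual 200, predicted 400), the per-project relative errors behind MMRE are 0.5 and 1.0, while those behind MMER are 1.0 and 0.5. The asymmetry shows how the choice of statistic can change which model looks better.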

5.
This paper introduces a number of reliability criteria for computer-aided diagnostic systems for breast cancer. These criteria are then used to analyze some published neural network systems. It is also shown that the property of monotonicity for the data is rather natural in this medical domain, and it has the potential to significantly improve the reliability of breast cancer diagnosis while maintaining a general representation power. A central part of this paper is devoted to the representation/narrow vicinity hypothesis, upon which existing computer-aided diagnostic methods heavily rely. The paper also develops a framework for determining the validity of this hypothesis. The same framework can be used to construct a diagnostic procedure with improved reliability.

6.
This paper addresses the problem of constructing reliable interval predictors directly from observed data. Differently from standard predictor models, interval predictors return a prediction interval as opposed to a single prediction value. We show that, in a stationary and independent observations framework, the reliability of the model (that is, the probability that the future system output falls in the predicted interval) is guaranteed a priori by an explicit and non-asymptotic formula, with no further assumptions on the structure of the unknown mechanism that generates the data. This fact stems from a key result derived in this paper, which relates, at a fundamental level, the reliability of the model to its complexity and to the amount of available information (number of observed data).

7.
Several tools have been developed for the estimation of software reliability. However, they are highly specialized in the approaches they implement and the particular phase of the software life-cycle in which they are applicable. There is an increasing need for a tool that can be used to track the quality of a software product during the software life-cycle, right from the architectural phase all the way up to the operational phase of the software. Also, the conventional techniques for software reliability evaluation, which treat the software as a monolithic entity, are inadequate to assess the reliability of heterogeneous systems, which consist of a large number of globally distributed components. Architecture-based approaches are essential to assess the reliability and performance of such systems. This paper presents the high-level design of a software reliability estimation and prediction tool (SREPT) that offers a unified framework consisting of techniques (including the architecture-based approach) to assist in the evaluation of software reliability during all phases of the software life-cycle.

8.
Estimating the risk of relapse for breast cancer patients is necessary, since it affects the choice of treatment. This problem involves analysing data of times to relapse of patients and relating them to prognostic variables. Some of the times to relapse will usually be censored. We investigate various ways of using neural network models to extend traditional statistical models in this situation. Such models are better able than linear logistic or Cox regression models to model both non-linear effects of prognostic factors and interactions between them. With the dataset used in our study, however, the prediction of the risk of relapse is not significantly improved when using a neural network model. Predicting the risk that a patient will relapse within three years, say, is possible from this data, but not when any relapse will happen.

9.
Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given.

10.
Traditional parametric software reliability growth models (SRGMs) are based on some assumptions or distributions, and no single such model can produce accurate prediction results in all circumstances. Non-parametric models like the artificial neural network (ANN) based models can predict software reliability based on only fault history data without any assumptions. In this paper, we first propose a robust feedforward neural network (FFNN) based dynamic weighted combination model (PFFNNDWCM) for software reliability prediction. Four well-known traditional SRGMs are combined based on the dynamically evaluated weights determined by the learning algorithm of the proposed FFNN. Based on this proposed FFNN architecture, we also propose a robust recurrent neural network (RNN) based dynamic weighted combination model (PRNNDWCM) to predict the software reliability more justifiably. A real-coded genetic algorithm (GA) is proposed to train the ANNs. The predictability of the proposed models is compared with that of the existing ANN based software reliability models through three real software failure data sets. We also compare the performances of the proposed models with the models that can be developed by combining three or two of the four SRGMs. Comparative studies demonstrate that the PFFNNDWCM and PRNNDWCM present more accurate fitting and predictive capability than the other existing ANN based models. Numerical and graphical explanations show that PRNNDWCM is promising for software reliability prediction since its fitting and prediction error is much lower than that of the PFFNNDWCM.

11.
When a woman diagnosed as having breast cancer has a tumour removed, it is important to try and predict whether she is likely to relapse within, say, the next three years. In this paper, the performance of a neural network classifier trained on a number of prognostic indicators is shown to be better than that of the clinical experts working with the same information. To obtain meaningful statistics with the relatively small dataset available, the network is trained using a modified form of the leave-one-out method. A procedure is also introduced for investigating how much independent information each input parameter contributes. This shows that, in this type of retrospective study, the type of therapy given to the woman does not significantly affect the network's prediction of whether or not she will relapse within three years. Finally, since this problem, in common with many other medical problems, is plagued by a shortage of data, the final section of the paper reports on an investigation of whether or not multi-centre databases might be feasible.
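
The plain leave-one-out procedure that this abstract builds on can be sketched as below. The paper's modification is not described in the abstract and is not reproduced here, and a 1-nearest-neighbour rule stands in for the neural network purely to keep the example self-contained. The function names are hypothetical.

```python
def nearest_neighbour_predict(train_x, train_y, query):
    """Return the label of the training point closest to `query` (squared Euclidean distance)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[best]


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample in turn, fit on the rest, and score the held-out sample."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_neighbour_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)
```

Every sample is used for testing exactly once and for training in all other rounds, which is why the method suits the small datasets the abstract mentions.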

12.

Soccer match attendance is an example of group behavior with noisy context that can only be approximated by a limited set of quantifiable factors. However, match attendance is representative of a wider spectrum of context-based behaviors for which only the aggregate effect of otherwise individual decisions is observable. Modeling of such behaviors is desirable from the perspective of economics, psychology, and other social studies with prospective use in simulators, games, product planning, and advertising. In this paper, we evaluate the efficiency of different neural network architectures as models of context in attendance behavior by comparing the achieved prediction accuracy of a multilayer perceptron (MLP), an Elman recurrent neural network (RNN), a time-lagged feedforward neural network (TLFN), and a radial basis function network (RBFN) against a multiple linear regression model, an autoregressive moving average model with exogenous inputs, and a naive cumulative mean model. We show that the MLP, TLFN, and RNN are superior to the RBFN and achieve comparable prediction accuracy on datasets of three teams from the English Football League Championship, which indicates weak importance of context transition modeled by the TLFN and the RNN. The experiments demonstrate that all neural network models outperform linear predictors by a significant margin. We show that neural models built on individual datasets achieve better performance than a generalized neural model constructed from pooled data. We analyze the input parameter influences extracted from trained networks and show that there is an agreement between nonlinear and linear measures about the most significant attributes.


13.
This paper is focused on choosing a sufficient number of runs of a coupling Markov chain that makes it possible to generate, with a high confidence level, hypotheses such that at least one of them is inserted into any test example with high probability of positive prediction. The proposed technique is based on the Vapnik–Chervonenkis resampling method.

14.
There exist several methods for binary classification of gene expression data sets. However, in the majority of published methods, little effort has been made to minimize classifier complexity. In view of the small number of samples available in most gene expression data sets, there is a strong motivation for minimizing the number of free parameters that must be fitted to the data. In this paper, a method is introduced for evolving (using an evolutionary algorithm) simple classifiers involving a minimal subset of the available genes. The classifiers obtained by this method perform well, reaching 97% correct classification of clinical outcome on training samples from the breast cancer data set published by van't Veer, and up to 89% correct classification on validation samples from the same data set, easily outperforming previously published results.

15.
Early breast cancer recurrence is indicative of poor response to adjuvant therapy and poses threats to patients’ lives. Most existing prediction models for breast cancer recurrence are regression-based models and difficult to interpret. We apply a Decision Tree algorithm to the clinical information of a cohort of non-metastatic invasive breast cancer patients, to establish a classifier that categorizes patients based on whether they develop early recurrence and on similarities of their clinical and pathological diagnoses. The classifier predicts whether a patient developed early disease recurrence and is estimated to be about 70% accurate. For an independent validation cohort of 65 patients, the classifier predicts correctly for 55 patients. The classifier also groups patients based on intrinsic properties of their diseases and, for each subgroup, lists the disease characteristics in a hierarchical order, according to their relevance to early relapse. Overall, it identifies pathological nodal stage, percentage of intra-tumor stroma and components of the TGFβ-Smad signaling pathway as highly relevant factors for early breast cancer recurrence. Since most of the disease characteristics used by this classifier are results of standardized tests, routinely collected during breast cancer diagnosis, the classifier can easily be adopted in various research and clinical settings.

16.
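To study the application of the AdaBoost algorithm to breast cancer prediction, a breast cancer diagnosis dataset was collected and split into test data and training data in a fixed proportion. The AdaBoost, GaussianNB, and KNeighbors models were each tested, with accuracy as the criterion for judging model performance. When the test data made up 30% of the dataset, the AdaBoost model predicted breast cancer better than the other models, ...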

17.
18.
The task of breast density quantification is becoming increasingly relevant due to its association with breast cancer risk. In this work, a semi-automated and a fully automated tool to assess breast density from full-field digitized mammograms are presented. The first tool is based on a supervised interactive thresholding procedure for segmenting dense from fatty tissue and is used with a twofold goal: for assessing mammographic density (MD) in a more objective and accurate way than via visual-based methods and for labeling the mammograms that are later employed to train the fully automated tool. Although most automated methods rely on supervised approaches based on a global labeling of the mammogram, the proposed method relies on pixel-level labeling, allowing better tissue classification and density measurement on a continuous scale. The fully automated method presented combines a classification scheme based on local features and thresholding operations that improve the performance of the classifier. A dataset of 655 mammograms was used to test the concordance of both approaches in measuring MD. Three expert radiologists measured MD in each of the mammograms using the semi-automated tool (DM-Scan). It was then measured by the fully automated system and the correlation between both methods was computed. The relation between MD and breast cancer was then analyzed using a case–control dataset consisting of 230 mammograms. The Intraclass Correlation Coefficient (ICC) was used to compute reliability among raters and between techniques. The results obtained showed an average ICC = 0.922 among raters when using the semi-automated tool, whilst the average correlation between the semi-automated and automated measures was ICC = 0.838. In the case–control study, the results obtained showed Odds Ratios (OR) of 1.38 and 1.50 per 10% increase in MD when using the semi-automated and fully automated approaches respectively. It can therefore be concluded that the automated and semi-automated MD assessments present a good correlation. Both methods also found an association between MD and breast cancer risk, which warrants the use of the proposed tools for breast cancer risk prediction and clinical decision making. A full version of the DM-Scan is freely available.
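
To make the reported "OR per 10% increase in MD" concrete, the sketch below rescales an odds ratio to a different density increase. It assumes a log-linear relationship between MD and the odds of cancer, as in a logistic regression with MD as a continuous covariate. The abstract does not state which model produced the figures, so this is an assumption, and the helper name `odds_ratio_for_increase` is hypothetical.

```python
def odds_ratio_for_increase(or_per_step, increase, step=10.0):
    """Scale an odds ratio quoted per `step` units to an `increase` of any size.

    Assumes log-odds are linear in the covariate, so odds ratios multiply:
    OR(increase) = OR(step) ** (increase / step).
    """
    return or_per_step ** (increase / step)
```

Under that assumption, a 20% increase in MD would correspond to odds ratios of 1.38 ** 2 = 1.9044 for the semi-automated tool and 1.50 ** 2 = 2.25 for the fully automated one.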

19.
Multimedia Tools and Applications - Breast cancer is the second most common cause of death among women. There are some existing techniques for identifying breast cancer and one of them is...

20.
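Existing imputation methods for handling incomplete data introduce new noise into the dataset. To address this problem, a prediction model based on maximum-margin theory is proposed that predicts directly from samples containing missing features. Experiments verify that this model outperforms commonly used imputation models.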
