Similar Literature
20 similar documents found (search time: 750 ms)
1.
In the current context of an aging population in many developed countries, the issue of healthy aging is at the forefront of political, scientific, and technological concerns. The frailty accompanying the late years of elderly people (>70 years old) deserves special consideration due to its great economic and personal costs and the workload it imposes on the health care system. Hospital readmissions within a short time after discharge are one source of concern, and much effort is being devoted to their prediction for better care of the elderly and optimized resource management. In this paper, we consider the prediction of readmissions for patients who are evaluated positively on the frailty scales. The computational experiments are carried out on a gender-balanced cohort of 645 patients recruited at the University Hospital of Alava. We report machine-learning results for predicting readmission within the standard limit of 30 days. We apply an upsampling technique to correct for class imbalance. Results are positive, encouraging further research and the creation of larger cohorts through international efforts.
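The class-imbalance correction mentioned here can be illustrated with a minimal upsampling sketch; the `readmitted` label name and the resample-to-parity rule are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of minority-class upsampling before training, assuming a
# binary "readmitted" label; column names are hypothetical, not from the paper.
import pandas as pd
from sklearn.utils import resample

def upsample_minority(df: pd.DataFrame, label: str = "readmitted") -> pd.DataFrame:
    counts = df[label].value_counts()
    majority = df[df[label] == counts.idxmax()]
    minority = df[df[label] == counts.idxmin()]
    # Sample the minority class with replacement until both classes match.
    minority_up = resample(minority, replace=True,
                           n_samples=len(majority), random_state=0)
    return pd.concat([majority, minority_up]).sample(frac=1, random_state=0)
```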

2.
Emergency Departments (ED) suffer heavy overload due to the lack of primary care services. Increasing geriatric admissions pose specific problems that contribute to this overload. One consequence is a rise in patients returning a short time after discharge, i.e., readmissions, sometimes requiring hospitalization. In the latter case, the patient's problem was not solved during the first admission and the condition has worsened. The time threshold that defines a patient's return as a readmission varies; therefore, we have considered several such thresholds in our prediction experiments. Prediction of hospitalization following ED readmission is posed over a heavily imbalanced class distribution, so we have considered several approaches for dealing with imbalanced datasets and several base classifiers, as well as performance measures that support a critical comparison between approaches. Experiments are carried out on real data from a university hospital in Santiago, Chile, covering a period of 3 years and including pediatric and adult admissions to the ED. Our results encourage the development of real-life applications of the data-balancing and classification approach for predicting hospitalization after readmission.
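Because plain accuracy is misleading under heavy class imbalance, a comparison like the one described typically relies on imbalance-aware measures. A small sketch follows; the specific metric choices are an assumption, since the abstract does not list the paper's exact measures.

```python
# Sketch of imbalance-aware evaluation for comparing classifiers, assuming
# scikit-learn style predictions; metric choice (balanced accuracy, minority
# F1, ROC AUC) follows common practice, not necessarily the paper's measures.
from sklearn.metrics import balanced_accuracy_score, f1_score, roc_auc_score

def imbalance_report(y_true, y_pred, y_score):
    return {
        "balanced_accuracy": balanced_accuracy_score(y_true, y_pred),
        "f1_minority": f1_score(y_true, y_pred, pos_label=1),
        "roc_auc": roc_auc_score(y_true, y_score),
    }
```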

3.
This study examined the relationship between healthcare information systems (HIS) and hospital costs. It also analyzed which organizational factors affect the adoption of information systems in hospitals. The study included 577 hospitals in the statistical analysis. HIS adoption was measured by three indicators based on the number of application systems in three core hospital functions (administration, management, and clinical). American Hospital Association and Dorenfest IHDS data were merged to create the sample. Structural equation modeling was applied to estimate the parameters of the model. HIS was negatively associated with total expenses, although the association was not statistically significant. Internal hospital variables such as location, size, and system-hospital membership were significantly related to the extent of HIS adoption.

4.
For a linear multilevel model with 2 levels, with equal numbers of level-1 units per level-2 unit and a random intercept only, different empirical Bayes estimators of the random intercept are examined: the classical empirical Bayes estimator, the Morris version of the empirical Bayes estimator, and Rao's estimator. It is unclear which of these estimators performs best in terms of Bayes risk. Of the three, the Rao estimator is optimal when the covariance matrix of the random coefficients is allowed to be negative definite; in the multilevel model, however, this matrix is restricted to be positive semi-definite. The Morris version replaces the weights of the empirical Bayes estimator by unbiased estimates, but this correction is based on known level-1 variances, which in many empirical settings are unknown. A fourth estimator is proposed: a variant of Rao's estimator that restricts the estimated covariance matrix of the random coefficients to be positive semi-definite. Since there are no closed-form expressions for the estimators involved (except for the Rao estimator), Monte Carlo simulations are used to evaluate the performance of these different empirical Bayes estimators. Only for small sample sizes are there clear differences between the estimators; consequently, for larger sample sizes the formula for the Bayes risk of the Rao estimator can be used to calculate the Bayes risk of the other estimators.
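For orientation, the classical empirical Bayes estimator of the random intercept in this balanced two-level setting has the standard shrinkage form (textbook notation, not quoted from the paper):

$$\hat{u}_j = \hat{\lambda}\,(\bar{y}_{\cdot j} - \hat{\mu}), \qquad \hat{\lambda} = \frac{\hat{\tau}^2}{\hat{\tau}^2 + \hat{\sigma}^2 / n},$$

where $n$ is the number of level-1 units per group, $\hat{\tau}^2$ the estimated intercept variance, and $\hat{\sigma}^2$ the level-1 variance; the classical, Morris, and Rao variants differ essentially in how the weight $\hat{\lambda}$ is estimated.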

5.
Phase-type distributions represent the time to absorption for a finite state Markov chain in continuous time, generalising the exponential distribution and providing a flexible and useful modelling tool. We present a new reversible jump Markov chain Monte Carlo scheme for performing a fully Bayesian analysis of the popular Coxian subclass of phase-type models; the convenient Coxian representation involves fewer parameters than a more general phase-type model. The key novelty of our approach is that we model covariate dependence in the mean whilst using the Coxian phase-type model as a very general residual distribution. Such incorporation of covariates into the model has not previously been attempted in the Bayesian literature. A further novelty is that we also propose a reversible jump scheme for investigating structural changes to the model brought about by the introduction of Erlang phases. Our approach addresses more questions of inference than previous Bayesian treatments of this model and is automatic in nature. We analyse an example dataset comprising lengths of hospital stays of a sample of patients collected from two Australian hospitals to produce a model for a patient’s expected length of stay which incorporates the effects of several covariates. This leads to interesting conclusions about what contributes to length of hospital stay with implications for hospital planning. We compare our results with an alternative classical analysis of these data.
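As background, a Coxian phase-type distribution with $m$ transient phases starts in phase 1 and either advances to the next phase or is absorbed; its sub-generator has the standard bidiagonal form (a textbook definition, not taken from the paper):

$$S = \begin{pmatrix} -(\lambda_1 + \mu_1) & \lambda_1 & & \\ & -(\lambda_2 + \mu_2) & \lambda_2 & \\ & & \ddots & \ddots \\ & & & -\mu_m \end{pmatrix},$$

so the density of the absorption time is $f(t) = \boldsymbol{\alpha}\, e^{S t}\, \mathbf{s}_0$ with $\boldsymbol{\alpha} = (1, 0, \ldots, 0)$ and $\mathbf{s}_0 = -S\mathbf{1}$. An Erlang block, as mentioned in the abstract, corresponds to a run of phases with a common rate and no intermediate absorption ($\mu_i = 0$).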

6.
A statistical software-testing model is proposed in which white-box factors have a role. The model combines test-adequacy notions with statistical analysis, and in so doing provides a rudimentary treatment of the dependencies between test results caused by the execution of common code during the tests. The model is used to estimate the probability of failure on demand for software performing safety shutdown functions on large plants, and concerns the case where extensive test results are available on the latest version of the software, none of which have resulted in software failure. According to the model, there are circumstances in which some current statistical models for dynamic software testing are too conservative and others are not conservative, depending on the software architecture.
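For context, the black-box baseline that such models refine is standard: if $N$ statistically independent demands have been tested with no failures, the $(1-\alpha)$ upper confidence bound on the probability of failure on demand $p$ satisfies $(1-p)^N = \alpha$, i.e.

$$p \;\leq\; 1 - \alpha^{1/N}.$$

This is a classical result, not the paper's model; the paper's contribution is to adjust such reasoning for the dependencies between tests induced by shared code.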

7.
The age of a building influences its form and fabric composition, and this in turn is critical to inferring its energy performance. However, this data is often unknown. In this paper, we present a methodology to automatically identify the construction period of houses for the purpose of urban energy modelling and simulation. We describe two major stages to achieving this: a per-building classification model and a post-classification analysis to improve the accuracy of the class inferences. In the first stage, we extract measures of morphology and neighbourhood characteristics from readily available topographic mapping, a high-resolution Digital Surface Model, and statistical boundary data. These measures are then used as features within a random forest classifier to infer an age category for each building. We evaluate various predictive model combinations based on scenarios of available data, using 5-fold cross-validation to train and tune the classifier hyper-parameters on a sample of city properties. A separate sample estimated the best-performing cross-validated model as achieving 77% accuracy. In the second stage, we improve the inferred per-building age classification (for a spatially contiguous neighbourhood test sample) by aggregating prediction probabilities using different methods of spatial reasoning. We report on three methods for achieving this, based on adjacency relations, near-neighbour graph analysis, and graph-cuts label optimisation. We show that post-processing can improve the accuracy by up to 8 percentage points.
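The first stage can be sketched as a cross-validated random forest; the feature matrix, label set, and tuning grid below are illustrative assumptions, since the paper's exact features come from topographic mapping, a DSM, and statistical boundary data.

```python
# Illustrative sketch of the per-building stage: a random forest over building
# morphology features with 5-fold cross-validated hyper-parameter tuning.
# Feature and grid choices are hypothetical, not the paper's configuration.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

def fit_age_classifier(X, y):
    grid = {"n_estimators": [200, 500], "max_depth": [None, 10, 20]}
    search = GridSearchCV(RandomForestClassifier(random_state=0),
                          grid, cv=5, scoring="accuracy")
    search.fit(X, y)  # X: per-building features, y: age-period labels
    return search.best_estimator_
```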

8.
This paper presents a method for dealing with parameter uncertainty in system design which is based on the study of the statistical properties of an ensemble of systems defined by a given structure and by a priori parameter distributions rather than point parameter estimates. It is assumed that the model of the actual system is a random member of the ensemble. The object of the analysis is to design or modify the properties of the ensemble to ensure a high probability of adequate performance of the actual system. The primary statistical function employed is the sample distribution function. This function is used to estimate the true population distribution of a scalar variable chosen to measure the system property of interest. The sample distribution function is constructed from random samples of this figure of merit generated by a suitable digital computer programme. The accuracy of the estimation of the population distribution by the sample distribution is determined by application of statistical results of Kolmogorov and Rényi.
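The Kolmogorov-type accuracy statement invoked at the end can be made concrete with the Dvoretzky–Kiefer–Wolfowitz bound (a standard result consistent with, but not quoted from, the abstract): for the sample distribution function $F_n$ built from $n$ independent samples of the figure of merit,

$$\Pr\!\left( \sup_x \left| F_n(x) - F(x) \right| > \varepsilon \right) \;\leq\; 2 e^{-2 n \varepsilon^2},$$

which dictates how many Monte Carlo samples are needed to reach a given accuracy at a given confidence.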

9.
With the aging of the population, the incidence of heart failure is rising, and unplanned readmissions of heart-failure patients increasingly degrade patients' quality of life and drive up medical costs, making this an urgent problem. For the readmission-risk prediction problem, this paper proposes an ADE-Stacking-based model for predicting the risk of unplanned readmission of heart-failure patients. The model consists of two parts: construction of an ensemble-learning (Stacking) model and parameter optimization. The ensemble learner combines the strengths of multiple weak classifiers, giving the model better generalization and accuracy, while the parameter-optimization part searches with a differential evolution algorithm improved by an adaptive shrinkage factor F, to boost tuning performance. The model is trained and tested on a heart-failure readmission dataset; the results show that it outperforms random forest, XGBoost, support vector machines, and other machine-learning algorithms commonly used in risk-prediction models.
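A hedged sketch of the adaptive shrinkage factor idea in differential evolution follows: F decays across generations so the search shifts from exploration to exploitation. The linear decay rule, bounds, and DE/rand/1 scheme are assumptions for illustration; the paper's exact ADE adaptation rule is not given in the abstract.

```python
# Sketch of one generation of differential evolution with an adaptive
# (shrinking) scale factor F; the decay rule and constants are assumptions.
import numpy as np

def de_step(pop, fitness, gen, max_gen, F_max=0.9, F_min=0.4, CR=0.9, rng=None):
    rng = rng or np.random.default_rng(0)
    F = F_max - (F_max - F_min) * gen / max_gen   # adaptive shrinkage factor
    n, d = pop.shape
    new_pop = pop.copy()
    for i in range(n):
        a, b, c = rng.choice([j for j in range(n) if j != i], 3, replace=False)
        mutant = pop[a] + F * (pop[b] - pop[c])   # DE/rand/1 mutation
        cross = rng.random(d) < CR                # binomial crossover mask
        trial = np.where(cross, mutant, pop[i])
        if fitness(trial) < fitness(pop[i]):      # greedy selection (minimize)
            new_pop[i] = trial
    return new_pop
```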

10.
The bivariate random effects model has been advocated for the meta-analysis of diagnostic accuracy despite scarce information regarding its statistical performance for non-comparative categorical outcomes. Four staggered simulation experiments using a full-factorial design were conducted to assess such performance over a wide range of scenarios. The number of studies, the number of individuals per study, diagnostic accuracy values, heterogeneity, correlation, and disease prevalence were evaluated as factors. Univariate and bivariate random effects were estimated using NLMIXED with trust region optimization. Bias, accuracy, and coverage probability were evaluated as performance metrics among 1000 replicates in 272 different scenarios. Number of studies, individuals per study, and heterogeneity were the most influential meta-analytic factors affecting most metrics in all parameters for both random effects models. More studies improved all metrics while low heterogeneity benefited fixed and random effects but not the correlation. About twenty studies are required to obtain random effects estimates with good statistical properties in the presence of moderate heterogeneity, while only the univariate model should be used when few studies are summarized. In general, the bivariate model is advantageous for meta-analyses of diagnostic accuracy with complete data only when the correlation is of interest.
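For reference, the bivariate random-effects model in question is conventionally written (the standard Reitsma-style formulation; the paper's exact parameterization may differ) as a joint normal distribution over each study's logit sensitivity and specificity:

$$\begin{pmatrix} \operatorname{logit}(Se_i) \\ \operatorname{logit}(Sp_i) \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mu_{Se} \\ \mu_{Sp} \end{pmatrix}, \begin{pmatrix} \tau_{Se}^2 & \rho\,\tau_{Se}\tau_{Sp} \\ \rho\,\tau_{Se}\tau_{Sp} & \tau_{Sp}^2 \end{pmatrix} \right),$$

so the correlation $\rho$ discussed in the conclusions is the between-study correlation of sensitivity and specificity.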

11.
Imagine numerous clients, each holding personal data; individual inputs are heavily perturbed, and a server is concerned only with the collective, statistically essential facets of this data. Privacy has become highly critical in many data mining methods, and as a result, various privacy-preserving data analysis technologies have emerged. We use the randomization process to reconstruct composite data attributes accurately, and privacy measures to estimate how much distortion is required to guarantee privacy. Several viable privacy protections exist; determining which one is best is still a work in progress. This paper discusses the difficulty of measuring privacy while also offering numerous random sampling procedures and results for statistical and categorical data. Furthermore, it investigates the use of randomness with perturbations in privacy preservation. According to the research, random objects (most notably random matrices) have "predictable" frequency patterns. It shows how crucial information can be recovered from a sample perturbed by random noise using a spectral selection strategy based on random matrices. The conceptual framework of this filtering approach posits, and extensive practical findings indicate, that sparse data distortions preserve only relatively modest privacy in various situations. As a result, the research framework is efficient and effective in maintaining data privacy and security.
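A minimal sketch of the randomization setting described here: each client adds zero-mean noise with a known variance, and the server recovers aggregate statistics of the original attribute. The Gaussian noise model and the moment-correction step are illustrative assumptions, not the paper's specific scheme.

```python
# Sketch of randomization-based privacy: clients perturb values with zero-mean
# noise of known variance; the server reconstructs aggregate statistics.
import numpy as np

def perturb(values, noise_std=1.0, seed=0):
    rng = np.random.default_rng(seed)
    return values + rng.normal(0.0, noise_std, size=len(values))

def server_estimates(perturbed, noise_std=1.0):
    mean = perturbed.mean()                   # noise is zero-mean
    var = perturbed.var() - noise_std ** 2    # subtract the known noise variance
    return mean, max(var, 0.0)
```

The spectral-filtering result in the abstract is the flip side of this design: if the noise spectrum is too thin ("sparse distortion"), an adversary can separate signal from noise and undo the protection.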

12.
Random forest is an ensemble-classifier technique that offers better predictive and classification performance than single classifiers such as decision trees, but it also has drawbacks: the randomness inherent in the method makes its predictions fluctuate, and the large sample size and high dimensionality of the original dataset increase the training time of the random-forest ensemble. To address these problems, an optimized random-forest model is proposed: the dataset is preprocessed and reduced with PCA, using the cumulative contribution rate (cumulative explained-variance ratio) to choose the number of components, and the final predictions are classified against a selected optimal threshold. This improves the model's training speed, prediction accuracy, and stability. Experiments show that the method delivers superior predictive performance.
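The described pipeline can be sketched directly in scikit-learn, where a fractional `n_components` keeps the smallest number of principal components whose cumulative explained variance (the "cumulative contribution rate") exceeds the threshold; the 0.95 cutoff and forest size below are assumptions, not values from the paper.

```python
# Sketch of the optimized pipeline: standardize, PCA by cumulative
# explained-variance ratio, then a random forest on the reduced features.
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def pca_forest(cum_contribution: float = 0.95):
    # PCA(n_components=<float in (0,1)>) retains just enough components to
    # reach the requested cumulative contribution rate.
    return make_pipeline(StandardScaler(),
                         PCA(n_components=cum_contribution),
                         RandomForestClassifier(n_estimators=300, random_state=0))
```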

13.
Mixed-effects linear regression models have become more widely used for analysis of repeatedly measured outcomes in clinical trials over the past decade. There are formulae and tables for estimating sample sizes required to detect the main effects of treatment and the treatment by time interactions for those models. A formula is proposed to estimate the sample size required to detect an interaction between two binary variables in a factorial design with repeated measures of a continuous outcome. The formula is based, in part, on the fact that the variance of an interaction is fourfold that of the main effect. A simulation study examines the statistical power associated with the resulting sample sizes in a mixed-effects linear regression model with a random intercept. The simulation varies the magnitude (Δ) of the standardized main effects and interactions, the intraclass correlation coefficient (ρ), and the number (k) of repeated measures within-subject. The results of the simulation study verify that the sample size required to detect a 2×2 interaction in a mixed-effects linear regression model is fourfold that to detect a main effect of the same magnitude.
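In outline (a standard form consistent with the abstract's description, not necessarily the paper's exact formula): with $k$ repeated measures, compound symmetry with intraclass correlation $\rho$, and a standardized effect size $\Delta$, a two-group main-effect comparison needs about

$$n_{\text{main}} = \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^2\,\bigl(1 + (k-1)\rho\bigr)}{k\,\Delta^2}$$

subjects per group, and because the variance of a $2\times2$ interaction contrast is four times that of a main-effect contrast, the interaction requires $n_{\text{int}} = 4\, n_{\text{main}}$.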

14.
Acute coronary syndrome (ACS) is a leading cause of mortality and morbidity in the Arabian Gulf. In this study, in-hospital mortality amongst patients admitted with ACS to Arabian Gulf hospitals is predicted using a comprehensive modelling framework that combines powerful machine-learning methods: support vector machines (SVM), Naïve Bayes (NB), artificial neural networks (NN), and decision trees (DT). The performance of the machine-learning methods is compared with that of a commonly used statistical method, logistic regression (LR). The study follows the current practice of computing mortality risk using risk scores such as the Global Registry of Acute Coronary Events (GRACE) score, which has not been validated for Arabian Gulf patients. Cardiac registry data of 7,000 patients from 65 hospitals located in Arabian Gulf countries are used for the study, within a contemporary data-analytics framework. A k-fold (k = 10) cross-validation is utilized to generate training and validation samples from the GRACE dataset. Machine-learning-based predictive models are often biased by imbalanced training data. To mitigate the data imbalance caused by the scarcity of in-hospital mortality observations, we utilize specialized methods such as random undersampling (RUS) and the synthetic minority oversampling technique (SMOTE). A detailed simulation experiment is carried out to build models with each of the five predictive methods (LR, NN, NB, SVM, and DT) for each of the k-fold subsamples generated under three schemes: no data balancing, RUS, and SMOTE. We implement an information fusion method that computes weighted impact scores for individual medical-history attributes from each of the simulated predictive models, producing a collective recommendation based on an impact score specific to each predictor. Finally, we group the predictors into high-, medium-, and low-risk factors for in-hospital mortality due to ACS using fuzzy c-means clustering. Our study revealed that patients with a medical history of peripheral artery disease, congestive heart failure, cardiovascular/transient ischemic attack, valvular disease, and coronary artery bypass grafting, amongst others, have the highest risk of in-hospital mortality.
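The three balancing schemes can be sketched with imbalanced-learn pipelines evaluated under 10-fold cross-validation; using logistic regression as the base model and ROC AUC as the score are choices for illustration, not the paper's full experimental design.

```python
# Hedged sketch of the three imbalance schemes: the same classifier trained
# with no resampling, RUS, and SMOTE inside 10-fold cross-validation.
# imblearn pipelines resample only the training folds, never the test fold.
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from imblearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def scheme_scores(X, y):
    schemes = {
        "none": [],
        "rus": [("rus", RandomUnderSampler(random_state=0))],
        "smote": [("smote", SMOTE(random_state=0))],
    }
    return {name: cross_val_score(
                Pipeline(steps + [("clf", LogisticRegression(max_iter=1000))]),
                X, y, cv=10, scoring="roc_auc").mean()
            for name, steps in schemes.items()}
```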

15.
A new image segmentation algorithm is presented, based on recursive Bayes smoothing of images modeled by Markov random fields and corrupted by independent additive noise. The Bayes smoothing algorithm yields the a posteriori distribution of the scene value at each pixel, given the total noisy image, in a recursive way. The a posteriori distribution together with a criterion of optimality then determine a Bayes estimate of the scene. The algorithm presented is an extension of a 1-D Bayes smoothing algorithm to 2-D and it gives the optimum Bayes estimate for the scene value at each pixel. Computational concerns in 2-D, however, necessitate certain simplifying assumptions on the model and approximations on the implementation of the algorithm. In particular, the scene (noiseless image) is modeled as a Markov mesh random field, a special class of Markov random fields, and the Bayes smoothing algorithm is applied on overlapping strips (horizontal/vertical) of the image consisting of several rows (columns). It is assumed that the signal (scene values) vector sequence along the strip is a vector Markov chain. Since signal correlation in one of the dimensions is not fully used along the edges of the strip, estimates are generated only along the middle sections of the strips. The overlapping strips are chosen such that the union of the middle sections of the strips gives the whole image. The Bayes smoothing algorithm presented here is valid for scene random fields consisting of multilevel (discrete) or continuous random variables.

16.

Due to its vital role in the healthcare system, performance evaluation of hospitals is indispensable. In addition, hospitals try to achieve desired and efficient conditions by careful planning based on their present facilities. Several studies have been conducted on hospital evaluation, but nearly none has taken into consideration the differences in the nature of performance with respect to hospitals' managerial structure, funding, and the type of services they provide. Furthermore, hospitals' outputs have not been estimated with respect to the cause-and-effect relationships between inputs and outputs needed to achieve efficient conditions. In the present study, first, a new approach for hospital evaluation is presented that accounts for the differences in the nature of hospitals' performance by categorizing them into groups. Then, optimal outputs for each hospital within its own group are derived using results obtained from multi-group data envelopment analysis and the method of fuzzy cognitive maps. The activation Hebbian learning (AHL) algorithm is adapted to the concept of efficiency and used to estimate the outputs of inefficient hospitals. In the study, 27 hospitals located in the provincial capitals of northwest Iran are categorized into four groups: general governmental hospitals, specialty governmental hospitals, private hospitals, and social security hospitals. Optimal outputs are then estimated for the inefficient hospitals using the proposed modified AHL algorithm. The results indicate that when hospitals are evaluated in groups, their efficiency scores change. Moreover, the cause-and-effect relationships between inputs and outputs in each group can help decision and policy makers estimate the optimal outputs that would make inefficient hospitals efficient.

17.
Artificial bee colony (ABC) is one of the newest additions to the class of population-based nature-inspired algorithms. In the present study we suggest some modifications to the structure of basic ABC to further improve its performance. The corresponding algorithms are named Intermediate ABC (I-ABC) and I-ABC greedy. In I-ABC, the potential food sources are generated by using the intermediate positions between uniformly generated random numbers and random numbers generated by opposition-based learning (OBL). I-ABC greedy is a variation of I-ABC in which the search is always forced to move towards the solution vector having the best fitness value in the population. While the use of OBL provides a priori information about the search space, the component of greediness improves the convergence rate. The performance of the proposed I-ABC and I-ABC greedy is investigated on a comprehensive set of 13 classical benchmark functions, 25 composite functions included in the special session of CEC 2005, and eleven shifted functions proposed in the special sessions of CEC 2008, ISDA 2009, CEC 2010 and SOCO 2010. The efficiency of the proposed algorithms is also validated on two real-life problems: frequency modulation sound parameter estimation and estimation of software cost model parameters. Numerical results and statistical analysis demonstrate that the proposed algorithms are quite competent in dealing with different types of problems.
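The opposition-based initialization idea can be sketched as follows; taking a random convex combination as the "intermediate position" rule is an assumption for illustration, since the abstract does not specify how the intermediate points are formed.

```python
# Sketch of I-ABC style initialization: uniform random food sources, their
# opposition-based learning (OBL) opposites, and intermediate points between
# the two. The convex-combination rule below is an assumption.
import numpy as np

def iabc_init(n_sources, lo, hi, rng=None):
    rng = rng or np.random.default_rng(0)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    x = lo + rng.random((n_sources, lo.size)) * (hi - lo)  # uniform positions
    x_opp = lo + hi - x                                    # OBL opposites
    w = rng.random((n_sources, lo.size))
    return w * x + (1 - w) * x_opp                         # intermediate points
```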

18.
In this study, regression models are evaluated for grouped survival data when the effect of censoring time is considered in the model and the regression structure is modeled through four link functions. The methodology for grouped survival data is based on life tables, and the times are grouped into k intervals so that ties are eliminated. Data modeling is thus performed with discrete lifetime regression models. The model parameters are estimated using the maximum likelihood and jackknife methods. To detect influential observations in the proposed models, we use diagnostic measures based on case deletion, termed global influence, and measures based on small perturbations of the data or the model, referred to as local influence; the total local influence estimate is also employed. Various simulation studies are performed to compare the performance of the four link functions of the regression models for grouped survival data under different parameter settings, sample sizes, and numbers of intervals. Finally, a data set is analyzed using the proposed regression models.
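As background, a grouped (discrete-time) survival model links the conditional hazard $h_{ij}$ of subject $i$ in interval $j$ to covariates through $g(h_{ij}) = \gamma_j + \mathbf{x}_i^\top \boldsymbol{\beta}$. Four link functions commonly used in this setting (the abstract does not list the paper's exact set) are

$$\operatorname{logit}(h), \qquad \Phi^{-1}(h), \qquad \log\{-\log(1-h)\}, \qquad -\log(-\log h),$$

where the complementary log-log link $\log\{-\log(1-h)\}$ recovers a grouped proportional-hazards model.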

19.
师小琳. 《计算机应用》 (Journal of Computer Applications), 2011, 31(8): 2059-2061
The autoregressive (AR) model and the Clarke model (CLARKE R H. A statistical theory of mobile-radio reception. Bell System Technical Journal, 1968, 47(6): 957-1000) are commonly used to generate Rayleigh fading channels. Simulation results show, however, that the AR model has unavoidable numerical defects and therefore cannot produce ideal Rayleigh fading channel characteristics. An improved model based on the Clarke model is proposed here. The new model reasonably restricts the range of the random distribution of the angles of arrival of the incident waves, obtaining better statistical properties for the channel model while preserving the randomness of the angle-of-arrival distribution. Simulation results show that the improved model has better statistical performance than the Clarke model and can be used to generate near-ideal Rayleigh fading channels.
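A Clarke-style sum-of-sinusoids generator can be sketched as below; the `aoa_span` parameter restricting the angle-of-arrival range stands in for the paper's modification and is an assumption, as is the path count.

```python
# Hedged sketch of a Clarke-style Rayleigh fading generator: N incident waves
# with random angles of arrival and phases. Shrinking `aoa_span` below 2*pi
# illustrates the improved model's restricted angle-of-arrival range.
import numpy as np

def clarke_fading(t, fd, n_paths=64, aoa_span=2 * np.pi, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.uniform(-aoa_span / 2, aoa_span / 2, n_paths)  # angles of arrival
    phi = rng.uniform(0, 2 * np.pi, n_paths)                   # random phases
    doppler = 2 * np.pi * fd * np.cos(theta)                   # per-path Doppler shift
    g = np.exp(1j * (np.outer(t, doppler) + phi)).sum(axis=1)
    return g / np.sqrt(n_paths)  # complex envelope; |g| is approximately Rayleigh
```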

20.
To store exponentially growing datasets efficiently, distributed ordered tables are commonly used. Bulk insertion is a very common operation in database systems, so executing bulk inserts efficiently in a distributed ordered table is important. The existing method performs bulk inserts reasonably well by running a planning phase before insertion, but it requires all the new data to be available, the key requirement being an accurate estimate of the new data's distribution. This paper proposes an online-planned bulk-insert operation that does not wait for all data to arrive before planning: based on a sample of the new data, kernel density estimation is used to estimate the new data's distribution accurately, and a method is provided for computing confidence intervals of the estimated distribution. Once the confidence interval of the estimate falls within a system-specified threshold, the planning step can be executed. On the experimental dataset, the system needs to receive only 0.1% of the data as a sample to obtain an estimated distribution with 95% probability of an error within 0.05.
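The online-planning trigger can be sketched with kernel density estimation plus a distribution-free confidence band; using the DKW inequality for the band, and the 95%/0.05 thresholds, are assumptions mirroring the abstract, not the paper's exact confidence-interval method.

```python
# Sketch: estimate the key distribution of incoming data from a small sample
# with Gaussian KDE, and start planning once a DKW-style confidence band on
# the empirical CDF is tight enough.
import numpy as np
from scipy.stats import gaussian_kde

def plan_ready(sample, alpha=0.05, eps=0.05):
    n = len(sample)
    band = np.sqrt(np.log(2 / alpha) / (2 * n))  # DKW half-width for the ECDF
    kde = gaussian_kde(sample)                   # smooth density of incoming keys
    return band <= eps, kde                      # plan once the band is within eps
```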
