Similar literature
 20 similar documents found (search time: 500 ms)
1.
Background: Conclusion Instability in software effort estimation (SEE) refers to the inconsistent results produced by a diversity of predictors using different datasets. This is largely due to the “ranking instability” problem, which is highly related to the evaluation criteria and the subset of the data being used. Aim: To determine stable rankings of different predictors. Method: 90 predictors are used with 20 datasets and evaluated using 7 performance measures, whose results are subject to Wilcoxon rank test (95%). These results are called the “aggregate results”. The aggregate results are challenged by a sanity check, which focuses on a single error measure (MRE) and uses a newly developed evaluation algorithm called CLUSTER. These results are called the “specific results.” Results: Aggregate results show that: (1) It is now possible to draw stable conclusions about the relative performance of SEE predictors; (2) Regression trees or analogy-based methods are the best performers. The aggregate results are also confirmed by the specific results of the sanity check. Conclusion: This study offers means to address the conclusion instability issue in SEE, which is an important finding for empirical software engineering.

2.
Stable rankings for different effort models   Cited by: 1 (self-citations: 0, citations by others: 1)
There exists a large and growing number of proposed estimation methods but little conclusive evidence ranking one method over another. Prior effort estimation studies suffered from “conclusion instability”, where the rankings offered to different methods were not stable across (a) different evaluation criteria; (b) different data sources; or (c) different random selections of that data. This paper reports a study of 158 effort estimation methods on data sets based on COCOMO features. Four “best” methods were detected that were consistently better than the “rest” of the other 154 methods. These rankings of “best” and “rest” methods were stable across (a) three different evaluation criteria applied to (b) multiple data sets from two different sources that were (c) divided into hundreds of randomly selected subsets using four different random seeds. Hence, while there exists no single universal “best” effort estimation method, there appears to exist a small number (four) of most useful methods. This result both complicates and simplifies effort estimation research. The complication is that any future effort estimation analysis should be preceded by a “selection study” that finds the best local estimator. However, the simplification is that such a study need not be labor intensive, at least for COCOMO style data sets.

3.
By and large, given the inherent subjectivity in defining and measuring factors used in algorithmic effort estimation methods, when algorithmic methods produce consistent estimates it seems reasonable to assume that this is in part due to estimator experience. Further, software development factors are usually assumed to have different degrees of influence on actual effort. For example, no specific allowances for program language or problem domain were made in the original COCOMO model or in Albrecht's Function Points, whilst allowances for development mode in COCOMO and function type complexity for Albrecht's Function Points are crucial. However, work has been conducted that concluded that 4GLs are associated with higher productivity than 3GLs. Clearly, we can support such conclusions about productivity, since, for example, it usually requires less effort to develop a database using a purposely designed DBMS product than it does using a 3GL. However, in general, for a given problem domain an appropriate development language and platform will be selected. Hence, we might feel that an appropriate development language will not be a factor that influences estimate consistency unduly, given that an estimator has experience of the problem domain. However, algorithmic methods usually require calibration to different problem domains. Calibration may be needed because the method was originally designed using data from another type of domain. Furthermore, estimators' estimation consistency within problem domains may be affected for one or more reasons. Intuitively, reasons might include: estimators lack estimation experience in some domains; or the development team(s) may have different levels of experience in different domains, which the estimator finds difficult to take into account. We demonstrate how, in general, the influence of problem domain may be assessed using a Hierarchical Bayesian inference procedure. 
We also show how values can be derived to account for variations in estimate consistency in problem domains.

4.
In project management, effective cost estimation is one of the most crucial activities for managing resources efficiently, because it predicts the cost required to fulfill a given task. However, finding the best estimation results in software development is challenging. Thus, accurate estimation of software development effort is always a concern for many companies. In this paper, we propose a novel software development effort estimation model based on both the constructive cost model II (COCOMO II) and an artificial neural network (ANN). The artificial neural network enhances the COCOMO model, and the value of the baseline effort constant A is calibrated for use in the proposed model equation. Three state-of-the-art publicly available datasets are used for experiments. The neural network is trained by iteratively processing a training set with the feedforward backpropagation procedure. The proposed model is then evaluated on the test set, and the estimated effort is compared with the actual effort. Experimental results show that the effort estimated by the proposed model is very close to the real effort, thus enhancing the reliability and improving the accuracy of software effort estimation.

5.
Software effort estimation is an important but difficult task. Existing algorithmic models often fail to predict effort accurately and consistently. To address this, we developed a computational approach to software effort estimation. cEstor is a case-based reasoning engine developed from an analysis of expert reasoning. cEstor's architecture explicitly separates case-independent productivity adaptation knowledge (rules) from case-specific representations of prior projects encountered (cases). Using new data from actual projects, uncalibrated cEstor generated estimates which compare favorably to those of the referent expert, calibrated Function Points and calibrated COCOMO. The estimates were better than those produced by uncalibrated Basic COCOMO and Intermediate COCOMO. The roles of specific knowledge components in cEstor (cases, adaptation rules, and retrieval heuristics) were also examined. The results indicate that case-independent productivity adaptation rules affect the consistency of estimates and appropriate case selection affects the accuracy of estimates, but the combination of an adaptation rule set and unrestricted case base can yield the best estimates. Retrieval heuristics based on source lines of code and a Function Count heuristic based on summing over differences in parameter values, were found to be equivalent in accuracy and consistency, and both performed better than a heuristic based on Function Count totals.

6.
In this paper, we present a model for software effort (person-month) estimation based on a three-level Bayesian network, 15 components of COCOMO, and software size. The Bayesian network works with discrete intervals for nodes. However, we treat the intervals of all nodes of the network as fuzzy numbers. We also obtain the optimal updating coefficient of effort estimation, based on the concept of optimal control, using a genetic algorithm and particle swarm optimization for the COCOMO NASA database. In other words, the estimated value of effort is modified by determining the optimal coefficient. In addition, we estimate the software effort while considering software quality in terms of the number of defects detected and removed in three steps: requirements specification, design, and coding. If the number of defects exceeds the specified threshold, the model returns to the current step and an additional effort is added to the estimated effort. The results of the model indicate that the optimal updating coefficient obtained by the genetic algorithm increases the accuracy of estimation significantly. Comparison of the proposed model with other models also indicates that its accuracy is higher.

7.
In most software development organizations, there is seldom a one-to-one mapping between software developers and development tasks. It is frequently necessary to concurrently assign individuals to multiple tasks and to assign more than one individual to work cooperatively on a single task. A principal goal in making such assignments should be to minimize the effort required to complete each task. But what impact does the manner in which developers are assigned to tasks have on the effort requirements? This paper identifies four task assignment factors: team size, concurrency, intensity, and fragmentation. These four factors are shown to improve the predictive ability of the well-known intermediate COCOMO cost estimation model. A parsimonious effort estimation model is also derived that utilizes a subset of the task assignment factors and unadjusted function points. For the data examined, this parsimonious model is shown to have goodness of fit and quality of estimation superior to that of the COCOMO model, while utilizing fewer cost factors.

8.
Current software cost estimation models, such as the 1981 Constructive Cost Model (COCOMO) for software cost estimation and its 1987 Ada COCOMO update, have been experiencing increasing difficulties in estimating the costs of software developed to new life cycle processes and capabilities. These include non-sequential and rapid-development process models; reuse-driven approaches involving commercial off-the-shelf (COTS) packages, re-engineering, applications composition, and applications generation capabilities; object-oriented approaches supported by distributed middleware; and software process maturity initiatives. This paper summarizes research in deriving a baseline COCOMO 2.0 model tailored to these new forms of software development, including rationale for the model decisions. The major new modeling capabilities of COCOMO 2.0 are a tailorable family of software sizing models, involving Object Points, Function Points, and Source Lines of Code; nonlinear models for software reuse and re-engineering; an exponent-driver approach for modeling relative software diseconomies of scale; and several additions, deletions and updates to previous COCOMO effort-multiplier cost drivers. This model is serving as a framework for an extensive current data collection and analysis effort to further refine and calibrate the model's estimation capabilities.

9.
Context: Along with expert judgment, analogy-based estimation, and algorithmic methods (such as Function point analysis and COCOMO), Least Squares Regression (LSR) has been one of the most commonly studied software effort estimation methods. However, an effort estimation model using LSR, a single LSR model, is highly affected by the data distribution. Specifically, if the data set is scattered and the data do not sit closely on the single LSR model line (do not closely map to a linear structure) then the model usually shows poor performance. In order to overcome this drawback of the LSR model, a data partitioning-based approach can be considered as one of the solutions to alleviate the effect of data distribution. Even though clustering-based approaches have been introduced, they still have potential problems in providing accurate and stable effort estimates. Objective: In this paper, we propose a new data partitioning-based approach to achieve more accurate and stable effort estimates via LSR. This approach also provides an effort prediction interval that is useful to describe the uncertainty of the estimates. Method: Empirical experiments are performed to evaluate the performance of the proposed approach by comparing with the basic LSR approach and clustering-based approaches, based on industrial data sets (two subsets of the ISBSG (Release 9) data set and one industrial data set collected from a banking institution). Results: The experimental results show that the proposed approach not only improves the accuracy of effort estimation more significantly than that of other approaches, but it also achieves robust and stable results according to the degree of data partitioning. Conclusion: Compared with the other considered approaches, the proposed approach shows a superior performance by alleviating the effect of data distribution that is a major practical issue in software effort estimation.

10.
Several popular cost estimation models like COCOMO and function points use adjustment variables, such as software complexity and platform, to modify original estimates and arrive at final estimates. Using data on 666 programs from 15 software projects, this study empirically tests a research model that studies the influence of three adjustment variables (software complexity, computer platform, and program type, i.e. batch or online programs) on software effort. The results confirm that all three adjustment variables have a significant effect on effort. Further, multiple comparison of means also points to two other results for the data examined. Batch programs involve significantly higher software effort than online programs. Programs rated as complex have significantly higher effort than programs rated as average.

11.
Accurate estimation of software development effort is strongly associated with the success or failure of software projects. The clear lack of convincing accuracy and flexibility in this area has attracted the attention of researchers over the past few years. Despite improvements achieved in effort estimating, there is no strong agreement as to which individual model is the best. Recent studies have found that an accurate estimation of development effort in software projects is unreachable in global space, meaning that proposing a high performance estimation model for use in different types of software projects is likely impossible. In this paper, a localized multi-estimator model, called LMES, is proposed in which software projects are classified based on underlying attributes. Different clusters of projects are then locally investigated so that the most accurate estimators are selected for each cluster. Unlike prior models, LMES does not rely on only one individual estimator in a cluster of projects. Rather, an exhaustive investigation is conducted to find the best combination of estimators to assign to each cluster. The investigation domain includes 10 estimators combined using four combination methods, which results in 4017 different combinations. ISBSG, Maxwell and COCOMO datasets are utilized for evaluation purposes, which include a total of 573 real software projects. The promising results show that the estimate accuracy is improved through localization of estimation process and allocation of appropriate estimators. Besides increased accuracy, the significant contribution of LMES is its adaptability and flexibility to deal with the complexity and uncertainty that exist in the field of software development effort estimation.

12.
To date most research in software effort estimation has not taken chronology into account when selecting projects for training and validation sets. A chronological split represents the use of a project’s starting and completion dates, such that any model that estimates effort for a new project p only uses as its training set projects that have been completed prior to p’s starting date. A study in 2009 (“S3”) investigated the use of chronological split taking into account a project’s age. The research question investigated was whether the use of a training set containing only the most recent past projects (a “moving window” of recent projects) would lead to more accurate estimates when compared to using the entire history of past projects completed prior to the starting date of a new project. S3 found that moving windows could improve the accuracy of estimates. The study described herein replicates S3 using three different and independent data sets. Estimation models were built using regression, and accuracy was measured using absolute residuals. The results contradict S3, as they do not show any gain in estimation accuracy when using windows for effort estimation. This is a surprising result: the intuition that recent data should be more helpful than old data for effort estimation is not supported. Several factors, which are discussed in this paper, might have contributed to such contradicting results. Some of our future work entails replicating this work using other datasets, to understand better when using windows is a suitable choice for software companies.
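The chronological split and moving window described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the `Project` record, the `training_set` function, and the choice to measure the window as a count of the most recently completed projects are hypothetical and are not taken from the study itself.

```python
from datetime import date
from typing import NamedTuple, Optional


class Project(NamedTuple):
    # Hypothetical record layout, for illustration only.
    name: str
    start: date
    end: date
    effort: float  # e.g. person-hours


def training_set(history: list, new_project: Project,
                 window: Optional[int] = None) -> list:
    """Return the projects completed before new_project starts.

    With window=None the entire eligible history is used; otherwise only
    the `window` most recently completed projects are kept.
    """
    completed = sorted(
        (p for p in history if p.end < new_project.start),
        key=lambda p: p.end,
    )
    if window is None:
        return completed
    # max() keeps the slice valid when the window exceeds the history size.
    return completed[max(0, len(completed) - window):]
```

Calling `training_set(history, p)` gives the full-history training set, and `training_set(history, p, window=n)` gives the moving-window variant, so the two settings compared in the abstract differ only in one argument.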

13.
The ability to accurately and consistently estimate software development efforts is required by project managers in planning and conducting software development activities. Since software effort drivers are vague and uncertain, software effort estimates, especially in the early stages of the development life cycle, are prone to a certain degree of estimation errors. A software effort estimation model which adopts a fuzzy inference method provides a solution to fit the uncertain and vague properties of software effort drivers. The present paper proposes a fuzzy neural network (FNN) approach for embedding an artificial neural network into fuzzy inference processes in order to derive the software effort estimates. The artificial neural network is utilized to determine the significant fuzzy rules in fuzzy inference processes. We demonstrated our approach by using data from the 63 historical projects in the well-known COCOMO model. Empirical results showed that applying FNN for software effort estimates resulted in slightly smaller mean magnitude of relative error (MMRE) and probability of a project having a relative error of less than or equal to 0.25 (Pred(0.25)) as compared with the results obtained by just using artificial neural network and the original model. The proposed model can also provide objective fuzzy effort estimation rule sets by adopting the learning mechanism of the artificial neural network.
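The two accuracy measures named in this abstract, MMRE and Pred(0.25), can be computed as in the minimal sketch below. It assumes the usual definition of the magnitude of relative error, MRE = |actual - estimate| / actual; the function names are illustrative and do not come from the paper.

```python
def mre(actual: float, estimate: float) -> float:
    """Magnitude of relative error for one project (actual must be non-zero)."""
    return abs(actual - estimate) / actual


def mmre(actuals: list, estimates: list) -> float:
    """Mean magnitude of relative error over all projects."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def pred(actuals: list, estimates: list, level: float = 0.25) -> float:
    """Fraction of projects whose MRE is less than or equal to `level`."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(1 for x in errors if x <= level) / len(errors)
```

A lower MMRE and a higher Pred(0.25) both indicate more accurate estimates, so the two measures move in opposite directions as a model improves.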

14.
15.
Analogy-based software effort estimation using Fuzzy numbers   Cited by: 1 (self-citations: 0, citations by others: 1)

Background

Early stage software effort estimation is a crucial task for project bidding and feasibility studies. Since data collected during the early stages of a software development lifecycle is always imprecise and uncertain, it is very hard to deliver accurate estimates. Analogy-based estimation, which is one of the popular estimation methods, is rarely used during the early stage of a project because of uncertainty associated with attribute measurement and data availability.

Aims

We have integrated analogy-based estimation with Fuzzy numbers in order to improve the performance of software project effort estimation during the early stages of a software development lifecycle, using all available early data. Particularly, this paper proposes a new software project similarity measure and a new adaptation technique based on Fuzzy numbers.

Method

Empirical evaluations with Jack-knifing procedure have been carried out using five benchmark data sets of software projects, namely, ISBSG, Desharnais, Kemerer, Albrecht and COCOMO, and results are reported. The results are compared to those obtained by methods employed in the literature using case-based reasoning and stepwise regression.

Results

In all data sets, the empirical evaluations have shown that the proposed similarity measure and adaptation technique were able to significantly improve the performance of analogy-based estimation during the early stages of software development. The results have also shown that the proposed method outperforms some well-known estimation techniques such as case-based reasoning and stepwise regression.

Conclusions

It is concluded that the proposed estimation model could form a useful approach for early stage estimation, especially when the data are largely uncertain.

16.
Models are developed to estimate lines of code and function counts directly from user application features of process control systems early in the software development lifecycle. Since the application features are known with a reasonable degree of confidence during early stages of development, it is possible to extend the use of the constructive cost model (COCOMO) and function-points-based approach for early software cost estimation. Alternative feature-based models that estimate size and effort using application features and productivity factors are developed. The feature-based models have been shown to estimate software effort with the least error.

17.
Social choice deals with aggregating the preferences of a number of voters into a collective preference. We will use this idea for software project effort estimation, substituting the voters by project attributes. Therefore, instead of supplying numeric values for various project attributes that are then used in regression or similar methods, a new project only needs to be placed into one ranking per attribute, necessitating only ordinal values. Using the resulting aggregate ranking the new project is again placed between other projects whose actual expended effort can be used to derive an estimation. In this paper we will present this method and extensions using weightings derived from genetic algorithms. We detail a validation based on several well-known data sets and show that estimation accuracy similar to classic methods can be achieved with considerably lower demands on input data.

18.
A critical issue in software project management is the accurate estimation of size, effort, resources, cost, and time spent in the development process. Underestimates may lead to time pressures that may compromise full functional development and the software testing process. Likewise, overestimates can result in noncompetitive budgets. In this paper, artificial neural network and stepwise regression based predictive models are investigated, aiming at offering alternative methods for those who do not believe in estimation models. The results presented in this paper compare the performance of both methods and indicate that these techniques are competitive with the APF, SLIM, and COCOMO methods.

19.
Context: In the software industry, project managers usually rely on their previous experience to estimate the number of man-hours required for each software project. The accuracy of such estimates is a key factor for the efficient application of human resources. Machine learning techniques such as radial basis function (RBF) neural networks, multi-layer perceptron (MLP) neural networks, support vector regression (SVR), bagging predictors and regression-based trees have recently been applied for estimating software development effort. Some works have demonstrated that the level of accuracy in software effort estimates strongly depends on the values of the parameters of these methods. In addition, it has been shown that the selection of the input features may also have an important influence on estimation accuracy. Objective: This paper proposes and investigates the use of a genetic algorithm method to simultaneously (1) select an optimal input feature subset and (2) optimize the parameters of machine learning methods, aiming at a higher accuracy level for the software effort estimates. Method: Simulations are carried out using six benchmark data sets of software projects, namely, Desharnais, NASA, COCOMO, Albrecht, Kemerer and Koten and Gray. The results are compared to those obtained by methods proposed in the literature using neural networks, support vector machines, multiple additive regression trees, bagging, and Bayesian statistical models. Results: In all data sets, the simulations have shown that the proposed GA-based method was able to improve the performance of the machine learning methods. The simulations have also demonstrated that the proposed method outperforms some methods reported in the recent literature for software effort estimation. Furthermore, the use of GA for feature selection considerably reduced the number of input features for five of the data sets used in our analysis. Conclusions: The combination of input feature selection and parameter optimization of machine learning methods improves the accuracy of software development effort estimates. In addition, this reduces model complexity, which may help in understanding the relevance of each input feature. Therefore, some input parameters can be ignored without loss of accuracy in the estimations.

20.
Estimation by analogy (EBA) predicts effort for a new project by aggregating effort information of similar projects from a given historical data set. Existing research results have shown that a careful selection and weighting of attributes may improve the performance of the estimation methods. This paper continues along that research line and considers weighting of attributes in order to improve the estimation accuracy. More specifically, the impact of weighting (and selection) of attributes is studied as extensions to our former EBA method AQUA, which has shown promising results and also allows estimation in the case of data sets that have non-quantitative attributes and missing values. The new resulting method is called AQUA+. For attribute weighting, a qualitative analysis pre-step using rough set analysis (RSA) is performed. RSA is a proven machine learning technique for classification of objects. We exploit the RSA results in different ways and define four heuristics for attribute weighting. AQUA+ was evaluated in two ways: (1) comparison between AQUA+ and AQUA, along with the comparative analysis between the proposed four heuristics for AQUA+, (2) comparison of AQUA+ with other EBA methods. The main evaluation results are: (1) better estimation accuracy was obtained by AQUA+ compared to AQUA over all six data sets; and (2) AQUA+ obtained results better than, or very close to, those of other EBA methods for the three data sets applied to all the EBA methods. In conclusion, the proposed attribute weighting method using RSA can improve the estimation accuracy of the EBA method AQUA+ according to the empirical studies over six data sets. Testing more data sets is necessary to get results that are more statistically significant.
Guenther Ruhe
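The general idea of estimation by analogy with weighted attributes, as outlined in the abstract above, can be sketched roughly as follows. This is a generic nearest-neighbour illustration, not the AQUA+ method: the weighted Euclidean distance, the mean as the aggregation step, the fixed `k`, and the assumption that all attributes are numeric and already scaled to [0, 1] are simplifications introduced here.

```python
import math


def estimate_by_analogy(history: list, target: tuple,
                        weights: tuple, k: int = 2) -> float:
    """Estimate effort as the mean effort of the k most similar past projects.

    history: list of (attributes, effort) pairs, attributes scaled to [0, 1].
    weights: one non-negative weight per attribute (hypothetical values).
    """
    if not history or k < 1:
        raise ValueError("need at least one past project and k >= 1")

    def distance(a: tuple, b: tuple) -> float:
        # Weighted Euclidean distance: larger weights make an attribute
        # count more when judging similarity.
        return math.sqrt(sum(w * (x - y) ** 2
                             for w, x, y in zip(weights, a, b)))

    nearest = sorted(history, key=lambda item: distance(item[0], target))[:k]
    return sum(effort for _, effort in nearest) / len(nearest)
```

Setting a weight to zero removes the corresponding attribute from the similarity judgment, which is how attribute selection can be treated as a special case of attribute weighting.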
