首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Previous research shows that class size can influence the associations between object-oriented (OO) metrics and fault-proneness and therefore proposes that it should be controlled as a confounding variable when validating OO metrics on fault-proneness. Otherwise, their true associations may be distorted. However, it has not been determined whether this practice is equally applicable to other external quality attributes. In this paper, we use three size metrics, two of which are available during the high-level design phase, to examine the potentially confounding effect of class size on the associations between OO metrics and change-proneness. The OO metrics that are investigated include cohesion, coupling, and inheritance metrics. Our results, based on Eclipse, indicate that: 1) The confounding effect of class size on the associations between OO metrics and change-proneness, in general, exists, regardless of whichever size metric is used; 2) the confounding effect of class size generally leads to an overestimate of the associations between OO metrics and change-proneness; and 3) for many OO metrics, the confounding effect of class size completely accounts for their associations with change-proneness or results in a change of the direction of the associations. These results strongly suggest that studies validating OO metrics on change-proneness should also consider class size as a confounding variable.  相似文献   

2.
In the last decade, empirical studies on object-oriented design metrics have shown some of them to be useful for predicting the fault-proneness of classes in object-oriented software systems. This research did not, however, distinguish among faults according to the severity of impact. It would be valuable to know how object-oriented design metrics and class fault-proneness are related when fault severity is taken into account. In this paper, we use logistic regression and machine learning methods to empirically investigate the usefulness of object-oriented design metrics, specifically, a subset of the Chidamber and Kemerer suite, in predicting fault-proneness when taking fault severity into account. Our results, based on a public domain NASA data set, indicate that 1) most of these design metrics are statistically related to fault-proneness of classes across fault severity, and 2) the prediction capabilities of the investigated metrics greatly depend on the severity of faults. More specifically, these design metrics are able to predict low severity faults in fault-prone classes better than high severity faults in fault-prone classes  相似文献   

3.
Empirical validation of software metrics suites to predict fault proneness in object-oriented (OO) components is essential to ensure their practical use in industrial settings. In this paper, we empirically validate three OO metrics suites for their ability to predict software quality in terms of fault-proneness: the Chidamber and Kemerer (CK) metrics, Abreu's Metrics for Object-Oriented Design (MOOD), and Bansiya and Davis' Quality Metrics for Object-Oriented Design (QMOOD). Some CK class metrics have previously been shown to be good predictors of initial OO software quality. However, the other two suites have not been heavily validated except by their original proposers. Here, we explore the ability of these three metrics suites to predict fault-prone classes using defect data for six versions of Rhino, an open-source implementation of JavaScript written in Java. We conclude that the CK and QMOOD suites contain similar components and produce statistical models that are effective in detecting error-prone classes. We also conclude that the class components in the MOOD metrics suite are not good class fault-proneness predictors. Analyzing multivariate binary logistic regression models across six Rhino versions indicates these models may be useful in assessing quality in OO classes produced using modern highly iterative or agile software development processes.  相似文献   

4.
ContextIn a large object-oriented software system, packages play the role of modules which group related classes together to provide well-identified services to the rest of the system. In this context, it is widely believed that modularization has a large influence on the quality of packages. Recently, Sarkar, Kak, and Rama proposed a set of new metrics to characterize the modularization quality of packages from important perspectives such as inter-module call traffic, state access violations, fragile base-class design, programming to interface, and plugin pollution. These package-modularization metrics are quite different from traditional package-level metrics, which measure software quality mainly from size, extensibility, responsibility, independence, abstractness, and instability perspectives. As such, it is expected that these package-modularization metrics should be useful predictors for fault-proneness. However, little is currently known on their actual usefulness for fault-proneness prediction, especially compared with traditional package-level metrics.ObjectiveIn this paper, we examine the role of these new package-modularization metrics for determining software fault-proneness in object-oriented systems.MethodWe first use principal component analysis to analyze whether these new package-modularization metrics capture additional information compared with traditional package-level metrics. Second, we employ univariate prediction models to investigate how these new package-modularization metrics are related to fault-proneness. Finally, we build multivariate prediction models to examine the ability of these new package-modularization metrics for predicting fault-prone packages.ResultsOur results, based on six open-source object-oriented software systems, show that: (1) these new package-modularization metrics provide new and complementary views of software complexity compared with traditional package-level metrics; (2) most of these new package-modularization metrics have a significant association with fault-proneness in an expected direction; and (3) these new package-modularization metrics can substantially improve the effectiveness of fault-proneness prediction when used with traditional package-level metrics together.ConclusionsThe package-modularization metrics proposed by Sarkar, Kak, and Rama are useful for practitioners to develop quality software systems.  相似文献   

5.
Previous studies have demonstrated the relationship between coupling and external software quality attributes, such as fault-proneness, and the application of coupling to software maintenance tasks, such as impact analysis. These previous studies concentrate on class coupling. However, there is a growing focus on the study of features in software, and features are often implemented across multiple classes, meaning class-level coupling measures are not applicable. We ask the pertinent question, “Is measuring coupling at the feature-level also useful?” We define new feature coupling metrics based on structural and textual source code information and extend the unified framework for coupling measurement to include these new metrics. We also conduct three extensive case studies to evaluate these new metrics and answer this research question. The first study examines the relationship between feature coupling and fault-proneness, the second assesses feature coupling in the context of impact analysis, and the third study surveys developers to determine if the metrics align with what they consider to be coupled features. All three studies provide evidence that feature coupling metrics are indeed useful new measures that capture coupling at a higher level of abstraction than classes and can be useful for finding bugs, guiding testing effort, and assessing change impact.  相似文献   

6.
软件易变性预测主要通过软件的内部特性,即软件度量值,来刻画、预测的,是软件工程中热点方向之一,在提高软件质量和控制软件成本方面起着非常重要的作用。虽然软件易变性预测在学术界取得了一系列的成绩,但在工业界尚未有成功应用的案例。本文从简单相关性分析与偏相关性分析和关联规则挖掘的角度出发甄别面向对象度量与软件易变性间相关性的真伪,明确了在软件易变性上下文中类规模对面向对象度量有潜在影响。  相似文献   

7.
In this paper, we present an FP-like approach, named class point, which was conceived to estimate the size of object-oriented products. In particular, two measures are proposed, which are theoretically validated showing that they satisfy well-known properties necessary for size measures. An initial, empirical validation is also performed, meant to assess the usefulness and effectiveness of the proposed measures to predict the development effort of object-oriented systems. Moreover, a comparative analysis is carried out, taking into account several other size measures.  相似文献   

8.
It has been proposed by El Emam et al. (ibid. vol.27 (7), 2001) that size should be taken into account as a confounding variable when validating object-oriented metrics. We take issue with this perspective since the ability to measure size does not temporally precede the ability to measure many of the object-oriented metrics that have been proposed. Hence, the condition that a confounding variable must occur causally prior to another explanatory variable is not met. In addition, when specifying multivariate models of defects that incorporate object-oriented metrics, entering size as an explanatory variable may result in misspecified models that lack internal consistency. Examples are given where this misspecification occurs.  相似文献   

9.
Open source software systems are becoming increasingly important these days. Many companies are investing in open source projects and lots of them are also using such software in their own work. But, because open source software is often developed with a different management style than the industrial ones, the quality and reliability of the code needs to be studied. Hence, the characteristics of the source code of these projects need to be measured to obtain more information about it. This paper describes how we calculated the object-oriented metrics given by Chidamber and Kemerer to illustrate how fault-proneness detection of the source code of the open source Web and e-mail suite called Mozilla can be carried out. We checked the values obtained against the number of bugs found in its bug database - called Bugzilla - using regression and machine learning methods to validate the usefulness of these metrics for fault-proneness prediction. We also compared the metrics of several versions of Mozilla to see how the predicted fault-proneness of the software system changed during its development cycle.  相似文献   

10.
Empirical validation of code metrics has a long history of success. Many metrics have been shown to be good predictors of external features, such as correlation to bugs. Our study provides an alternative explanation to such validation, attributing it to the confounding effect of size. In contradiction to received wisdom, we argue that the validity of a metric can be explained by its correlation to the size of the code artifact. In fact, this work came about in view of our failure in the quest of finding a metric that is both valid and free of this confounding effect. Our main discovery is that, with the appropriate (non-parametric) transformations, the validity of a metric can be accurately (with R-squared values being at times as high as 0.97) predicted from its correlation with size. The reported results are with respect to a suite of 26 metrics, that includes the famous Chidamber and Kemerer metrics. Concretely, it is shown that the more a metric is correlated with size, the more able it is to predict external features values, and vice-versa. We consider two methods for controlling for size, by linear transformations. As it turns out, metrics controlled for size, tend to eliminate their predictive capabilities. We also show that the famous Chidamber and Kemerer metrics are no better than other metrics in our suite. Overall, our results suggest code size is the only “unique” valid metric.  相似文献   

11.
With the increasing use of object-oriented methods in new software development, there is a growing need to both document and improve current practice in object-oriented design and development. In response to this need, a number of researchers have developed various metrics for object-oriented systems as proposed aids to the management of these systems. In this research, an analysis of a set of metrics proposed by Chidamber and Kemerer (1994) is performed in order to assess their usefulness for practising managers. First, an informal introduction to the metrics is provided by way of an extended example of their managerial use. Second, exploratory analyses of empirical data relating the metrics to productivity, rework effort and design effort on three commercial object-oriented systems are provided. The empirical results suggest that the metrics provide significant explanatory power for variations in these economic variables, over and above that provided by traditional measures, such as size in lines of code, and after controlling for the effects of individual developers  相似文献   

12.
Many studies use logistic regression models to investigate the ability of complexity metrics to predict fault-prone classes. However, it is not uncommon to see the inappropriate use of performance indictors such as odds ratio in previous studies. In particular, a recent study by Olague et al. uses the odds ratio associated with one unit increase in a metric to compare the relative magnitude of the associations between individual metrics and fault-proneness. In addition, the percents of concordant, discordant, and tied pairs are used to evaluate the predictive effectiveness of a univariate logistic regression model. Their results suggest that lesser known complexity metrics such as standard deviation method complexity (SDMC) and average method complexity (AMC) are better predictors than the two commonly used metrics: lines of code (LOC) and weighted method McCabe complexity (WMC). In this paper, however, we show that (1) the odds ratio associated with one standard deviation increase, rather than one unit increase, in a metric should be used to compare the relative magnitudes of the effects of individual metrics on fault-proneness. Otherwise, misleading results may be obtained; and that (2) the connection of the percents of concordant, discordant, and tied pairs with the predictive effectiveness of a univariate logistic regression model is false, as they indeed do not depend on the model. Furthermore, we use the data collected from three versions of Eclipse to re-examine the ability of complexity metrics to predict fault-proneness. Our experimental results reveal that: (1) many metrics exhibit moderate or almost moderate ability in discriminating between fault-prone and not fault-prone classes; (2) LOC and WMC are indeed better fault-proneness predictors than SDMC and AMC; and (3) the explanatory power of other complexity metrics in addition to LOC is limited.  相似文献   

13.
A growing body of literature suggests that there is an optimal size for software components. This means that components that are too small or too big will have a higher defect content (i.e., there is a U-shaped curve relating defect content to size). The U-shaped curve has become known as the "Goldilocks Conjecture." Recently, a cognitive theory has been proposed to explain this phenomenon and it has been expanded to characterize object-oriented software. This conjecture has wide implications for software engineering practice. It suggests 1) that designers should deliberately strive to design classes that are of the optimal size, 2) that program decomposition is harmful, and 3) that there exists a maximum (threshold) class size that should not be exceeded to ensure fewer faults in the software. The purpose of the current paper is to evaluate this conjecture for object-oriented systems. We first demonstrate that the claims of an optimal component/class size (1) above) and of smaller components/classes having a greater defect content (2) above) are due to a mathematical artifact in the analyses performed previously. We then empirically test the threshold effect claims of this conjecture (3) above). To our knowledge, the empirical test of size threshold effects for object-oriented systems has not been performed thus far. We performed an initial study with an industrial C++ system and repeated it twice on another C++ system and on a commercial Java application. Our results provide unambiguous evidence that there is no threshold effect of class size. We obtained the same result for three systems using four different size measures. These findings suggest that there is a simple continuous relationship between class size and faults, and that, optimal class size, smaller classes are better and threshold effects conjectures have no sound theoretical nor empirical basis  相似文献   

14.
It is widely known that a small number of modules in any system are likely to contain the majority of faults. Early identification and consequent attention to such modules may mitigate or prevent many defects. The objective of this study is to use product metrics to build a prediction model of the number of change requests (CRs) that are likely to occur in individual modules during testing. The study first empirically validates eight product metrics, while considering the confounding effects of code size (lines of code). Next, a prediction model of CR outcomes is developed with the validated metrics by utilizing a negative binomial regression that allows over-dispersion. In total, 816 modules written in the Chill programming language were analyzed in a large-scale telecommunication system. There is a positive association between the number of CRs and four product metrics (number of unique operators, unique operands, signals, and library calls) after considering the confounding effect of code size. A prediction model that includes only code size and the number of unique operands provides the best empirical fit.  相似文献   

15.
Applying machine learning to software fault-proneness prediction   总被引:1,自引:0,他引:1  
The importance of software testing to quality assurance cannot be overemphasized. The estimation of a module’s fault-proneness is important for minimizing cost and improving the effectiveness of the software testing process. Unfortunately, no general technique for estimating software fault-proneness is available. The observed correlation between some software metrics and fault-proneness has resulted in a variety of predictive models based on multiple metrics. Much work has concentrated on how to select the software metrics that are most likely to indicate fault-proneness. In this paper, we propose the use of machine learning for this purpose. Specifically, given historical data on software metric values and number of reported errors, an Artificial Neural Network (ANN) is trained. Then, in order to determine the importance of each software metric in predicting fault-proneness, a sensitivity analysis is performed on the trained ANN. The software metrics that are deemed to be the most critical are then used as the basis of an ANN-based predictive model of a continuous measure of fault-proneness. We also view fault-proneness prediction as a binary classification task (i.e., a module can either contain errors or be error-free) and use Support Vector Machines (SVM) as a state-of-the-art classification method. We perform a comparative experimental study of the effectiveness of ANNs and SVMs on a data set obtained from NASA’s Metrics Data Program data repository.  相似文献   

16.
A number of papers have investigated the relationships between design metrics and the detection of faults in object-oriented software. Several of these studies have shown that such models can be accurate in predicting faulty classes within one particular software product. In practice, however, prediction models are built on certain products to be used on subsequent software development projects. How accurate can these models be, considering the inevitable differences that may exist across projects and systems? Organizations typically learn and change. From a more general standpoint, can we obtain any evidence that such models are economically viable tools to focus validation and verification effort? This paper attempts to answer these questions by devising a general but tailorable cost-benefit model and by using fault and design data collected on two mid-size Java systems developed in the same environment. Another contribution of the paper is the use of a novel exploratory analysis technique - MARS (multivariate adaptive regression splines) to build such fault-proneness models, whose functional form is a-priori unknown. The results indicate that a model built on one system can be accurately used to rank classes within another system according to their fault proneness. The downside, however, is that, because of system differences, the predicted fault probabilities are not representative of the system predicted. However, our cost-benefit model demonstrates that the MARS fault-proneness model is potentially viable, from an economical standpoint. The linear model is not nearly as good, thus suggesting a more complex model is required.  相似文献   

17.
Antipatterns are poor design choices that are conjectured to make object-oriented systems harder to maintain. We investigate the impact of antipatterns on classes in object-oriented systems by studying the relation between the presence of antipatterns and the change- and fault-proneness of the classes. We detect 13 antipatterns in 54 releases of ArgoUML, Eclipse, Mylyn, and Rhino, and analyse (1) to what extent classes participating in antipatterns have higher odds to change or to be subject to fault-fixing than other classes, (2) to what extent these odds (if higher) are due to the sizes of the classes or to the presence of antipatterns, and (3) what kinds of changes affect classes participating in antipatterns. We show that, in almost all releases of the four systems, classes participating in antipatterns are more change-and fault-prone than others. We also show that size alone cannot explain the higher odds of classes with antipatterns to underwent a (fault-fixing) change than other classes. Finally, we show that structural changes affect more classes with antipatterns than others. We provide qualitative explanations of the increase of change- and fault-proneness in classes participating in antipatterns using release notes and bug reports. The obtained results justify a posteriori previous work on the specification and detection of antipatterns and could help to better focus quality assurance and testing activities.  相似文献   

18.
Network measures are useful for predicting fault-prone modules. However, existing work has not distinguished faults according to their severity. In practice, high severity faults cause serious problems and require further attention. In this study, we explored the utility of network measures in high severity faultproneness prediction. We constructed software source code networks for four open-source projects by extracting the dependencies between modules. We then used univariate logistic regression to investigate the associations between each network measure and fault-proneness at a high severity level. We built multivariate prediction models to examine their explanatory ability for fault-proneness, as well as evaluated their predictive effectiveness compared to code metrics under forward-release and cross-project predictions. The results revealed the following: (1) most network measures are significantly related to high severity fault-proneness; (2) network measures generally have comparable explanatory abilities and predictive powers to those of code metrics; and (3) network measures are very unstable for cross-project predictions. These results indicate that network measures are of practical value in high severity fault-proneness prediction.  相似文献   

19.
Predicting the fault-proneness labels of software program modules is an emerging software quality assurance activity and the quality of datasets collected from previous software version affects the performance of fault prediction models. In this paper, we propose an outlier detection approach using metrics thresholds and class labels to identify class outliers. We evaluate our approach on public NASA datasets from PROMISE repository. Experiments reveal that this novel outlier detection method improves the performance of robust software fault prediction models based on Naive Bayes and Random Forests machine learning algorithms.  相似文献   

20.
We present an empirical validation of object-oriented size estimation models. In previous work we proposed object oriented function points (OOFP), an adaptation of the function points approach to object-oriented systems. In a small pilot study, we used the OOFP method to estimate lines of code (LOC). In this paper we extend the empirical validation of OOFP substantially, using a larger data set and comparing OOFP with alternative predictors of LOC. The aim of the paper is to gain an understanding of which factors contribute to accurate size prediction for OO software, and to position OOFP within that knowledge. A cross validation approach was adopted to build and evaluate linear models where the independent variable was either a traditional OO entity (classes, methods, association, inheritance, or a combination of them) or an OOFP-related measure. Using the full OOFP process, the best size predictor achieved a normalized mean squared error of 38%. By removing function point weighting tables from the OOFP process, and carefully analyzing collected data points and developer practices, we identified several factors that influence size estimation. Our empirical evidence demonstrates that by controlling these factors size estimates could be substantially improved, decreasing the normalized mean squared error to 15%—in relative terms, a 56% reduction.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号