期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Analyzing software measurement data with clustering techniques 总被引：1，自引：0，他引：1

Zhong S. Khoshgoftaar T.M. Seliya N. 《Intelligent Systems, IEEE》2004,19(2):20-27

For software quality estimation, software development practitioners typically construct quality-classification or fault prediction models using software metrics and fault data from a previous system release or a similar software project. Engineers then use these models to predict the fault proneness of software modules in development. Software quality estimation using supervised-learning approaches is difficult without software fault measurement data from similar projects or earlier system releases. Cluster analysis with expert input is a viable unsupervised-learning solution for predicting software modules' fault proneness and potential noisy modules. Data analysts and software engineering experts can collaborate more closely to construct and collect more informative software metrics. 相似文献

2.

Empirical Validation of Three Software Metrics Suites to Predict Fault-Proneness of Object-Oriented Classes Developed Using Highly Iterative or Agile Software Development Processes 总被引：2，自引：0，他引：2

Olague H.M. Etzkorn L.H. Gholston S. Quattlebaum S. 《IEEE transactions on pattern analysis and machine intelligence》2007,33(6):402-419

Empirical validation of software metrics suites to predict fault proneness in object-oriented (OO) components is essential to ensure their practical use in industrial settings. In this paper, we empirically validate three OO metrics suites for their ability to predict software quality in terms of fault-proneness: the Chidamber and Kemerer (CK) metrics, Abreu's Metrics for Object-Oriented Design (MOOD), and Bansiya and Davis' Quality Metrics for Object-Oriented Design (QMOOD). Some CK class metrics have previously been shown to be good predictors of initial OO software quality. However, the other two suites have not been heavily validated except by their original proposers. Here, we explore the ability of these three metrics suites to predict fault-prone classes using defect data for six versions of Rhino, an open-source implementation of JavaScript written in Java. We conclude that the CK and QMOOD suites contain similar components and produce statistical models that are effective in detecting error-prone classes. We also conclude that the class components in the MOOD metrics suite are not good class fault-proneness predictors. Analyzing multivariate binary logistic regression models across six Rhino versions indicates these models may be useful in assessing quality in OO classes produced using modern highly iterative or agile software development processes. 相似文献

3.

An empirical validation of object-oriented metrics in two different iterative software processes 总被引：1，自引：0，他引：1

Alshayeb M. Wei Li 《IEEE transactions on pattern analysis and machine intelligence》2003,29(11):1043-1049

Object-oriented (OO) metrics are used mainly to predict software engineering activities/efforts such as maintenance effort, error proneness, and error rate. There have been discussions about the effectiveness of metrics in different contexts. In this paper, we present an empirical study of OO metrics in two iterative processes: the short-cycled agile process and the long-cycled framework evolution process. We find that OO metrics are effective in predicting design efforts and source lines of code added, changed, and deleted in the short-cycled agile process and ineffective in predicting the same aspects in the long-cycled framework process. This leads us to believe that OO metrics' predictive capability is limited to the design and implementation changes during the development iterations, not the long-term evolution of an established system in different releases. 相似文献

4.

A systematic review of machine learning techniques for software fault prediction

《Applied Soft Computing》2015

BackgroundSoftware fault prediction is the process of developing models that can be used by the software practitioners in the early phases of software development life cycle for detecting faulty constructs such as modules or classes. There are various machine learning techniques used in the past for predicting faults.MethodIn this study we perform a systematic review of studies from January 1991 to October 2013 in the literature that use the machine learning techniques for software fault prediction. We assess the performance capability of the machine learning techniques in existing research for software fault prediction. We also compare the performance of the machine learning techniques with the statistical techniques and other machine learning techniques. Further the strengths and weaknesses of machine learning techniques are summarized.ResultsIn this paper we have identified 64 primary studies and seven categories of the machine learning techniques. The results prove the prediction capability of the machine learning techniques for classifying module/class as fault prone or not fault prone. The models using the machine learning techniques for estimating software fault proneness outperform the traditional statistical models.ConclusionBased on the results obtained from the systematic review, we conclude that the machine learning techniques have the ability for predicting software fault proneness and can be used by software practitioners and researchers. However, the application of the machine learning techniques in software fault prediction is still limited and more number of studies should be carried out in order to obtain well formed and generalizable results. We provide future guidelines to practitioners and researchers based on the results obtained in this work. 相似文献

5.

Estimating software readiness using predictive models

Tong-Seng Quah 《Information Sciences》2009,179(4):430-5009

In this study, defect tracking is used as a proxy method to predict software readiness. The number of remaining defects in an application under development is one of the most important factors that allow one to decide if a piece of software is ready to be released. By comparing predicted number of faults and number of faults discovered in testing, software manager can decide whether the software is likely ready to be released or not.The predictive model developed in this research can predict: (i) the number of faults (defects) likely to exist, (ii) the estimated number of code changes required to correct a fault and (iii) the estimated amount of time (in minutes) needed to make the changes in respective classes of the application. The model uses product metrics as independent variables to do predictions. These metrics are selected depending on the nature of source code with regards to architecture layers, types of faults and contribution factors of these metrics. The use of neural network model with genetic training strategy is introduced to improve prediction results for estimating software readiness in this study. This genetic-net combines a genetic algorithm with a statistical estimator to produce a model which also shows the usefulness of inputs.The model is divided into three parts: (1) prediction model for presentation logic tier (2) prediction model for business tier and (3) prediction model for data access tier. Existing object-oriented metrics and complexity software metrics are used in the business tier prediction model. New sets of metrics have been proposed for the presentation logic tier and data access tier. These metrics are validated using data extracted from real world applications. The trained models can be used as tools to assist software mangers in making software release decisions. 相似文献

6.

Multiple kernel ensemble learning for software defect prediction

Tiejian Wang Zhiwu Zhang Xiaoyuan Jing Liqiang Zhang 《Automated Software Engineering》2016,23(4):569-590

Software defect prediction aims to predict the defect proneness of new software modules with the historical defect data so as to improve the quality of a software system. Software historical defect data has a complicated structure and a marked characteristic of class-imbalance; how to fully analyze and utilize the existing historical defect data and build more precise and effective classifiers has attracted considerable researchers’ interest from both academia and industry. Multiple kernel learning and ensemble learning are effective techniques in the field of machine learning. Multiple kernel learning can map the historical defect data to a higher-dimensional feature space and make them express better, and ensemble learning can use a series of weak classifiers to reduce the bias generated by the majority class and obtain better predictive performance. In this paper, we propose to use the multiple kernel learning to predict software defect. By using the characteristics of the metrics mined from the open source software, we get a multiple kernel classifier through ensemble learning method, which has the advantages of both multiple kernel learning and ensemble learning. We thus propose a multiple kernel ensemble learning (MKEL) approach for software defect classification and prediction. Considering the cost of risk in software defect prediction, we design a new sample weight vector updating strategy to reduce the cost of risk caused by misclassifying defective modules as non-defective ones. We employ the widely used NASA MDP datasets as test data to evaluate the performance of all compared methods; experimental results show that MKEL outperforms several representative state-of-the-art defect prediction methods. 相似文献

7.

Accuracy of software quality models over multiple releases

Taghi M. Khoshgoftaar Edward B. Allen Wendell D. Jones John P. Hudepohl 《Annals of Software Engineering》2000,9(1-2):103-116

Many evolving mission‐critical systems must have high software reliability. However, it is often difficult to identify fault‐prone modules early enough in a development cycle to guide software enhancement efforts effectively and efficiently. Software quality models can yield timely predictions of membership in the fault‐prone class on a module‐by‐module basis, enabling one to target enhancement techniques. However, it is an open empirical question, “Can a software quality model remain useful over several releases?” Most prior software quality studies have examined only one release of a system, evaluating the model with modules from the same release. We conducted a case study of a large legacy telecommunications system where measurements on one software release were used to build models, and three subsequent releases of the same system were used to evaluate model accuracy. This is a realistic assessment of model accuracy, closely simulating actual use of a software quality model. A module was considered fault‐prone if any of its faults were discovered by customers. These faults are extremely expensive due to consequent loss of service and emergency repair efforts. We found that the model maintained useful accuracy over several releases. These findings are initial empirical evidence that software quality models can remain useful as a system is maintained by a stable software development process. 相似文献

8.

Dynamic coupling measurement for object-oriented software 总被引：1，自引：0，他引：1

Arisholm E. Briand L.C. Foyen A. 《IEEE transactions on pattern analysis and machine intelligence》2004,30(8):491-506

The relationships between coupling and external quality factors of object-oriented software have been studied extensively for the past few years. For example, several studies have identified clear empirical relationships between class-level coupling and class fault-proneness. A common way to define and measure coupling is through structural properties and static code analysis. However, because of polymorphism, dynamic binding, and the common presence of unused ("dead") code in commercial software, the resulting coupling measures are imprecise as they do not perfectly reflect the actual coupling taking place among classes at runtime. For example, when using static analysis to measure coupling, it is difficult and sometimes impossible to determine what actual methods can be invoked from a client class if those methods are overridden in the subclasses of the server classes. Coupling measurement has traditionally been performed using static code analysis, because most of the existing work was done on nonobject oriented code and because dynamic code analysis is more expensive and complex to perform. For modern software systems, however, this focus on static analysis can be problematic because although dynamic binding existed before the advent of object-orientation, its usage has increased significantly in the last decade. We describe how coupling can be defined and precisely measured based on dynamic analysis of systems. We refer to this type of coupling as dynamic coupling. An empirical evaluation of the proposed dynamic coupling measures is reported in which we study the relationship of these measures with the change proneness of classes. Data from maintenance releases of a large Java system are used for this purpose. Preliminary results suggest that some dynamic coupling measures are significant indicators of change proneness and that they complement existing coupling measures based on static analysis. 相似文献

9.

A systematic review of software fault prediction studies

Cagatay Catal Banu Diri 《Expert systems with applications》2009,36(4):7346-7354

This paper provides a systematic review of previous software fault prediction studies with a specific focus on metrics, methods, and datasets. The review uses 74 software fault prediction papers in 11 journals and several conference proceedings. According to the review results, the usage percentage of public datasets increased significantly and the usage percentage of machine learning algorithms increased slightly since 2005. In addition, method-level metrics are still the most dominant metrics in fault prediction research area and machine learning algorithms are still the most popular methods for fault prediction. Researchers working on software fault prediction area should continue to use public datasets and machine learning algorithms to build better fault predictors. The usage percentage of class-level is beyond acceptable levels and they should be used much more than they are now in order to predict the faults earlier in design phase of software life cycle. 相似文献

10.

Software defect prediction using Bayesian networks

Ahmet Okutan Olcay Taner Yıldız 《Empirical Software Engineering》2014,19(1):154-181

There are lots of different software metrics discovered and used for defect prediction in the literature. Instead of dealing with so many metrics, it would be practical and easy if we could determine the set of metrics that are most important and focus on them more to predict defectiveness. We use Bayesian networks to determine the probabilistic influential relationships among software metrics and defect proneness. In addition to the metrics used in Promise data repository, we define two more metrics, i.e. NOD for the number of developers and LOCQ for the source code quality. We extract these metrics by inspecting the source code repositories of the selected Promise data repository data sets. At the end of our modeling, we learn the marginal defect proneness probability of the whole software system, the set of most effective metrics, and the influential relationships among metrics and defectiveness. Our experiments on nine open source Promise data repository data sets show that response for class (RFC), lines of code (LOC), and lack of coding quality (LOCQ) are the most effective metrics whereas coupling between objects (CBO), weighted method per class (WMC), and lack of cohesion of methods (LCOM) are less effective metrics on defect proneness. Furthermore, number of children (NOC) and depth of inheritance tree (DIT) have very limited effect and are untrustworthy. On the other hand, based on the experiments on Poi, Tomcat, and Xalan data sets, we observe that there is a positive correlation between the number of developers (NOD) and the level of defectiveness. However, further investigation involving a greater number of projects is needed to confirm our findings. 相似文献

11.

Empirical analysis of search based algorithms to identify change prone classes of open source software

《Computer Languages, Systems and Structures》2017

There are numerous reasons leading to change in software such as changing requirements, changing technology, increasing customer demands, fixing of defects etc. Thus, identifying and analyzing the change-prone classes of the software during software evolution is gaining wide importance in the field of software engineering. This would help software developers to judiciously allocate the resources used for testing and maintenance. Software metrics can be used for constructing various classification models which can be used for timely identification of change prone classes. Search based algorithms which form a subset of machine learning algorithms can be utilized for constructing prediction models to identify change prone classes of software. Search based algorithms use a fitness function to find the best optimal solution among all the possible solutions. In this work, we analyze the effectiveness of hybridized search based algorithms for change prediction. In other words, the aim of this work is to find whether search based algorithms are capable for accurate model construction to predict change prone classes. We have also constructed models using machine learning techniques and compared the performance of these models with the models constructed using Search Based Algorithms. The validation is carried out on two open source Apache projects, Rave and Commons Math. The results prove the effectiveness of hybridized search based algorithms in predicting change prone classes of software. Thus, they can be utilized by the software developers to produce an efficient and better developed software. 相似文献

12.

Evaluating the change of software fault behavior with dataset attributes based on categorical correlation

Izzat Alsmadi Hassan Najadat 《Advances in Engineering Software》2011,42(8):535-546

Utilization of data mining in software engineering has been the subject of several research papers. Majority of subjects of those paper were in making use of historical data for decision making activities such as cost estimation and product or project attributes prediction and estimation. The ability to predict software fault modules and the ability to correlate relations between faulty modules and product attributes using statistics is the subject of this paper. Correlations and relations between the attributes and the categorical variable or the class are studied through generating a pool of records from each dataset and then select two samples every time from the dataset and compare them. The correlation between the two selected records is studied in terms of changing from faulty to non-faulty or the opposite for the module defect attribute and the value change between the two records in each evaluated attribute (e.g. equal, larger or smaller). The goal was to study if there are certain attributes that are consistently affecting changing the state of the module from faulty to none, or the opposite. Results indicated that such technique can be very useful in studying the correlations between each attribute and the defect status attribute. Another prediction algorithm is developed based on statistics of the module and the overall dataset. The algorithm gave each attribute true class and faulty class predictions. We found that dividing prediction capability for each attribute into those two (i.e. correct and faulty module prediction) facilitate understanding the impact of attribute values on the class and hence improve the overall prediction relative to previous studies and data mining algorithms. Results were evaluated and compared with other algorithms and previous studies. ROC metrics were used to evaluate the performance of the developed metrics. Results from those metrics showed that accuracy or prediction performance calculated traditionally using accurately predicted records divided by the total number of records in the dataset does not necessarily give the best indicator of a good metric or algorithm predictability. Those predictions may give wrong implication if other metrics are not considered with them. The ROC metrics were able to show some other important aspects of performance or accuracy. 相似文献

13.

Assessing the applicability of fault-proneness models across object-oriented software projects 总被引：1，自引：0，他引：1

Briand L.C. Melo W.L. Wust J. 《IEEE transactions on pattern analysis and machine intelligence》2002,28(7):706-720

A number of papers have investigated the relationships between design metrics and the detection of faults in object-oriented software. Several of these studies have shown that such models can be accurate in predicting faulty classes within one particular software product. In practice, however, prediction models are built on certain products to be used on subsequent software development projects. How accurate can these models be, considering the inevitable differences that may exist across projects and systems? Organizations typically learn and change. From a more general standpoint, can we obtain any evidence that such models are economically viable tools to focus validation and verification effort? This paper attempts to answer these questions by devising a general but tailorable cost-benefit model and by using fault and design data collected on two mid-size Java systems developed in the same environment. Another contribution of the paper is the use of a novel exploratory analysis technique - MARS (multivariate adaptive regression splines) to build such fault-proneness models, whose functional form is a-priori unknown. The results indicate that a model built on one system can be accurately used to rank classes within another system according to their fault proneness. The downside, however, is that, because of system differences, the predicted fault probabilities are not representative of the system predicted. However, our cost-benefit model demonstrates that the MARS fault-proneness model is potentially viable, from an economical standpoint. The linear model is not nearly as good, thus suggesting a more complex model is required. 相似文献

14.

Accuracy of software quality models over multiple releases 总被引：1，自引：0，他引：1

Taghi M. Khoshgoftaar Edward B. Allen Wendell D. Jones John P. Hudepohl 《Annals of Software Engineering》2000,9(1-4):103-116

Many evolving mission‐critical systems must have high software reliability. However, it is often difficult to identify fault‐prone modules early enough in a development cycle to guide software enhancement efforts effectively and efficiently. Software quality models can yield timely predictions of membership in the fault‐prone class on a module‐by‐module basis, enabling one to target enhancement techniques. However, it is an open empirical question, “Can a software quality model remain useful over several releases?” Most prior software quality studies have examined only one release of a system, evaluating the model with modules from the same release. We conducted a case study of a large legacy telecommunications system where measurements on one software release were used to build models, and three subsequent releases of the same system were used to evaluate model accuracy. This is a realistic assessment of model accuracy, closely simulating actual use of a software quality model. A module was considered fault‐prone if any of its faults were discovered by customers. These faults are extremely expensive due to consequent loss of service and emergency repair efforts. We found that the model maintained useful accuracy over several releases. These findings are initial empirical evidence that software quality models can remain useful as a system is maintained by a stable software development process. This revised version was published online in June 2006 with corrections to the Cover Date. 相似文献

15.

Empirical Analysis of Object-Oriented Design Metrics for Predicting High and Low Severity Faults 总被引：4，自引：0，他引：4

《IEEE transactions on pattern analysis and machine intelligence》2006,32(10):771-789

In the last decade, empirical studies on object-oriented design metrics have shown some of them to be useful for predicting the fault-proneness of classes in object-oriented software systems. This research did not, however, distinguish among faults according to the severity of impact. It would be valuable to know how object-oriented design metrics and class fault-proneness are related when fault severity is taken into account. In this paper, we use logistic regression and machine learning methods to empirically investigate the usefulness of object-oriented design metrics, specifically, a subset of the Chidamber and Kemerer suite, in predicting fault-proneness when taking fault severity into account. Our results, based on a public domain NASA data set, indicate that 1) most of these design metrics are statistically related to fault-proneness of classes across fault severity, and 2) the prediction capabilities of the investigated metrics greatly depend on the severity of faults. More specifically, these design metrics are able to predict low severity faults in fault-prone classes better than high severity faults in fault-prone classes 相似文献

16.

An empirical investigation of an object-oriented software system 总被引：2，自引：0，他引：2

《IEEE transactions on pattern analysis and machine intelligence》2000,26(8):786-796

The paper describes an empirical investigation into an industrial object oriented (OO) system comprised of 133000 lines of C++. The system was a subsystem of a telecommunications product and was developed using the Shlaer-Mellor method (S. Shlaer and S.J. Mellor, 1988; 1992). From this study, we found that there was little use of OO constructs such as inheritance, and therefore polymorphism. It was also found that there was a significant difference in the defect densities between those classes that participated in inheritance structures and those that did not, with the former being approximately three times more defect-prone. We were able to construct useful prediction systems for size and number of defects based upon simple counts such as the number of states and events per class. Although these prediction systems are only likely to have local significance, there is a more general principle that software developers can consider building their own local prediction systems. Moreover, we believe this is possible, even in the absence of the suites of metrics that have been advocated by researchers into OO technology. As a consequence, measurement technology may be accessible to a wider group of potential users 相似文献

17.

Metrics applicable to software design

Xiaobao Yu David Alex Lamb 《Annals of Software Engineering》1995,1(1):23-41

Applying metrics to software is a way to measure and improve software quality. Many metrics apply to software implementations (code), so they cannot be used early in the life cycle. We survey ten modularity and structural complexity metrics applicable to software designs, and summarize the results of empirical validation studies when they are available. We present a database schema from which most of these metrics can be computed. Aspects of each metric require further study and refinement. However, using some of them during design may still be beneficial. 相似文献

18.

Bayesian statistical effort prediction models for data-centred 4GL software development

《Information and Software Technology》2006,48(11):1056-1067

Constructing an accurate effort prediction model is a challenge in Software Engineering. This paper presents three Bayesian statistical software effort prediction models for database-oriented software systems, which are developed using a specific 4GL toolsuite. The models consist of specification-based software size metrics and development team's productivity metric. The models are constructed based on the subjective knowledge of human expert and calibrated using empirical data collected from 17 software systems developed in the target environment. The models' predictive accuracy is evaluated using subsets of the same data, which were not used for the models' calibration. The results show that the models have achieved very good predictive accuracy in terms of MMRE and pred measures. Hence, it is confirmed that the Bayesian statistical models can predict effort successfully in the target environment. In comparison with commonly used multiple linear regression models, the Bayesian statistical models'predictive accuracy is equivalent in general. However, when the number of software systems used for the models' calibration becomes smaller than five, the predictive accuracy of the best Bayesian statistical models are significantly better than the multiple linear regression model. This result suggests that the Bayesian statistical models would be a better choice when software organizations/practitioners do not posses sufficient empirical data for the models' calibration. The authors expect these findings to encourage more researchers to investigate the use of Bayesian statistical models for predicting software effort. 相似文献

19.

Handling class imbalance problem in software maintainability prediction: an empirical investigation

Ruchika MALHOTRA Kusum LATA 《Frontiers of Computer Science》2022,16(4):164205

As the complexity of software systems is increasing; software maintenance is becoming a challenge for software practitioners. The prediction of classes that require high maintainability effort is of utmost necessity to develop cost-effective and high-quality software. In research of software engineering predictive modeling, various software maintainability prediction (SMP) models are evolved to forecast maintainability. To develop a maintainability prediction model, software practitioners may come across situations in which classes or modules requiring high maintainability effort are far less than those requiring low maintainability effort. This condition gives rise to a class imbalance problem (CIP). In this situation, the minority classes’ prediction, i.e., the classes demanding high maintainability effort, is a challenge. Therefore, in this direction, this study investigates three techniques for handling the CIP on ten open-source software to predict software maintainability. This empirical investigation supports the use of resampling with replacement technique (RR) for treating CIP and develop useful models for SMP. 相似文献

20.

Communication metrics for software development

Dutoit A.H. Bruegge B. 《IEEE transactions on pattern analysis and machine intelligence》1998,24(8):615-628

Presents empirical evidence that metrics on communication artifacts generated by groupware tools can be used to gain significant insight into the development process that produced them. We describe a test-bed for developing and testing communication metrics, a senior-level software engineering project course at Carnegie Mellon University, in which we conducted several studies and experiments from 1991-1996 with more than 400 participants. Such a test-bed is an ideal environment for empirical software engineering, providing sufficient realism while allowing for controlled observation of important project parameters. We describe three proof-of-concept experiments to illustrate the value of communication metrics in software development projects. Finally, we propose a statistical framework based on structural equations for validating these communication metrics 相似文献