Mining software metrics from the Jazz repository
Abstract
This paper describes the extraction of source code metrics from the Jazz repository and the systematic application of data mining techniques to identify which of those metrics are most useful for predicting whether an attempt to build a working instance of the software product will succeed or fail. Results are presented from a study that used the J48 classification method in conjunction with a number of attribute selection strategies applied to a set of source code metrics. These strategies involve investigating different slices of code from the version control system and cross-dataset classification of the various significant metrics, in an attempt to work around the multicollinearity implicit in the available data. The results indicate that only a relatively small number of the software metrics considered have any significance for predicting the outcome of a build. These significant metrics are outlined and the implications of the results are discussed, particularly the relative difficulty of predicting failed build attempts.
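
As a rough illustration of what ranking metrics by their usefulness for predicting build outcome can look like, the sketch below scores each metric by information gain against a success/failure label. It is not the study's procedure: the abstract does not name the attribute selection strategies used, and J48 (Weka's implementation of C4.5) splits on gain ratio rather than plain information gain. The metric names, values and build labels are hypothetical toy data, and the helper names (`entropy`, `information_gain`, `best_gain`, `rank_metrics`) are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a numeric metric at `threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty tells us nothing
    remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - remainder


def best_gain(values, labels):
    """Highest gain over thresholds placed midway between distinct sorted values."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max((information_gain(values, labels, t) for t in candidates), default=0.0)


def rank_metrics(table, labels):
    """Return (metric name, gain) pairs, most informative metric first."""
    scored = ((name, best_gain(column, labels)) for name, column in table.items())
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: one entry per build attempt (not taken from the study).
builds = ["ok", "ok", "fail", "ok", "fail", "ok"]
metrics = {
    "cyclomatic_complexity": [3, 4, 12, 5, 15, 2],
    "lines_of_code": [100, 900, 400, 120, 800, 500],
}

ranking = rank_metrics(metrics, builds)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data one metric separates the failed builds perfectly and so ranks first, while the other does not. With strongly correlated metrics, as the abstract notes for the real data, a univariate ranking like this one would tend to score redundant metrics similarly, which is one reason to compare several selection strategies.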