The value and validity of software effort estimation models built from a multiple organization data set
The objective of this research is to empirically assess the value and validity of a multi-organization data set in the building of prediction models for several ‘local’ software organizations; that is, smaller organizations that might have a few project records but that are interested in improving their ability to accurately predict software project effort. Evidence to date in the research literature is mixed, owing not to problems with the underlying research ideas but to limitations in the analytical processes employed:
• the majority of previous studies have used only a single organization as the ‘local’ sample, introducing the potential for bias;
• the degree to which the conclusions of these studies might apply more generally cannot be determined because of a lack of transparency in the data analysis processes used.
This research aims to provide a more robust and visible test of the utility of the largest multi-organization data set currently available – that from the ISBSG – in terms of enabling smaller-scale organizations to build relevant and accurate models for project-level effort prediction. Stepwise regression is employed to construct ‘local’, ‘global’ and ‘refined global’ models of effort that are then validated against actual project data from eight organizations. The results indicate that local data, that is, data collected for a single organization, is almost always more effective as a basis for the construction of a predictive model than data sourced from a global repository. That said, the models produced from the global data set, while less accurate than those built with local data, may be sufficiently accurate in the absence of reliable local data – an issue that could be investigated in future research. The study concludes with recommendations for both software engineering practice – in setting out a more dynamic scenario for the management of software development – and research – in terms of implications for the collection and analysis of software engineering data.