The impact of sampling and rule set size on generated fuzzy inference system predictive accuracy: analysis of a software engineering data set
aut.researcher | MacDonell, Stephen Gerard | |
dc.contributor.author | MacDonell, SG | |
dc.date.accessioned | 2012-03-10T09:16:52Z | |
dc.date.available | 2012-03-10T09:16:52Z | |
dc.date.copyright | 2011 | |
dc.date.issued | 2011 | |
dc.description.abstract | Software project management makes extensive use of predictive modeling to estimate product size, defect proneness and development effort. Although uncertainty is acknowledged in these tasks, fuzzy inference systems, designed to cope well with uncertainty, have received only limited attention in the software engineering domain. In this study we empirically investigate the impact of two choices on the predictive accuracy of generated fuzzy inference systems when applied to a software engineering data set: sampling of observations for training and testing; and the size of the rule set generated using fuzzy c-means clustering. Over ten samples we found no consistent pattern of predictive performance for a given rule set size. We did find, however, that a rule set compiled from multiple samples generally resulted in more accurate predictions than single sample rule sets. More generally, the results provide further evidence of the sensitivity of empirical analysis outcomes to specific model-building decisions. | |
dc.identifier.citation | Proceedings of the 12th Engineering Applications of Neural Networks (EANN)/7th Artificial Intelligence Applications and Innovations (AIAI) Joint Conferences, Corfu, Greece, pages 360 - 369 | |
dc.identifier.doi | 10.1007/978-3-642-23960-1_43 | |
dc.identifier.uri | https://hdl.handle.net/10292/3468 | |
dc.publisher | Springer | |
dc.relation.uri | http://dx.doi.org/10.1007/978-3-642-23960-1_43 | |
dc.rights | An author may self-archive an author-created version of his/her article on his/her own website and/or in his/her institutional repository. He/she may also deposit this version on his/her funder’s or funder’s designated repository at the funder’s request or as a result of a legal obligation, provided it is not made publicly available until 12 months after official publication. He/she may not use the publisher’s PDF version, which is posted on www.springerlink.com, for the purpose of self-archiving or deposit. Furthermore, the author may only post his/her version provided acknowledgement is given to the original source of publication and a link is inserted to the published article on Springer’s website. The link must be accompanied by the following text: “The final publication is available at www.springerlink.com”. (Please also see Publisher’s Version and Citation) | |
dc.rights.accessrights | OpenAccess | |
dc.subject | Fuzzy inference | |
dc.subject | Prediction | |
dc.subject | Software size | |
dc.subject | Source code | |
dc.subject | Sampling | |
dc.subject | Sensitivity analysis | |
dc.title | The impact of sampling and rule set size on generated fuzzy inference system predictive accuracy: analysis of a software engineering data set | |
dc.type | Conference Contribution | |
pubs.organisational-data | /AUT | |
pubs.organisational-data | /AUT/Design & Creative Technologies | |
pubs.organisational-data | /AUT/Design & Creative Technologies/School of Computing & Mathematical Science | |
pubs.organisational-data | /AUT/PBRF Researchers | |
pubs.organisational-data | /AUT/PBRF Researchers/Design & Creative Technologies PBRF Researchers | |
pubs.organisational-data | /AUT/PBRF Researchers/Design & Creative Technologies PBRF Researchers/DCT C & M Computing |