The impact of sampling and rule set size on generated fuzzy inference system predictive accuracy: analysis of a software engineering data set

aut.researcher: MacDonell, Stephen Gerard
dc.contributor.author: MacDonell, SG
dc.date.accessioned: 2012-03-10T09:16:52Z
dc.date.available: 2012-03-10T09:16:52Z
dc.date.copyright: 2011
dc.date.issued: 2011
dc.description.abstract: Software project management makes extensive use of predictive modeling to estimate product size, defect proneness and development effort. Although uncertainty is acknowledged in these tasks, fuzzy inference systems, designed to cope well with uncertainty, have received only limited attention in the software engineering domain. In this study we empirically investigate the impact of two choices on the predictive accuracy of generated fuzzy inference systems when applied to a software engineering data set: sampling of observations for training and testing; and the size of the rule set generated using fuzzy c-means clustering. Over ten samples we found no consistent pattern of predictive performance for a given rule set size. We did find, however, that a rule set compiled from multiple samples generally resulted in more accurate predictions than single-sample rule sets. More generally, the results provide further evidence of the sensitivity of empirical analysis outcomes to specific model-building decisions.
dc.identifier.citation: Proceedings of the 12th Engineering Applications of Neural Networks (EANN)/7th Artificial Intelligence Applications and Innovations (AIAI) Joint Conferences, Corfu, Greece, pages 360-369
dc.identifier.doi: 10.1007/978-3-642-23960-1_43
dc.identifier.uri: https://hdl.handle.net/10292/3468
dc.publisher: Springer
dc.relation.uri: http://dx.doi.org/10.1007/978-3-642-23960-1_43
dc.rights: An author may self-archive an author-created version of his/her article on his/her own website and/or in his/her institutional repository. He/she may also deposit this version on his/her funder’s or funder’s designated repository at the funder’s request or as a result of a legal obligation, provided it is not made publicly available until 12 months after official publication. He/she may not use the publisher's PDF version, which is posted on www.springerlink.com, for the purpose of self-archiving or deposit. Furthermore, the author may only post his/her version provided acknowledgement is given to the original source of publication and a link is inserted to the published article on Springer's website. The link must be accompanied by the following text: "The final publication is available at www.springerlink.com". (Please also see Publisher’s Version and Citation)
dc.rights.accessrights: OpenAccess
dc.subject: Fuzzy inference
dc.subject: Prediction
dc.subject: Software size
dc.subject: Source code
dc.subject: Sampling
dc.subject: Sensitivity analysis
dc.title: The impact of sampling and rule set size on generated fuzzy inference system predictive accuracy: analysis of a software engineering data set
dc.type: Conference Contribution
pubs.organisational-data: /AUT
pubs.organisational-data: /AUT/Design & Creative Technologies
pubs.organisational-data: /AUT/Design & Creative Technologies/School of Computing & Mathematical Science
pubs.organisational-data: /AUT/PBRF Researchers
pubs.organisational-data: /AUT/PBRF Researchers/Design & Creative Technologies PBRF Researchers
pubs.organisational-data: /AUT/PBRF Researchers/Design & Creative Technologies PBRF Researchers/DCT C & M Computing
Files
Original bundle
Name: MacDonell (2011) CISE-LNCS.pdf
Size: 257.49 KB
Format: Adobe Portable Document Format
License bundle
Name: licence.htm
Size: 29.98 KB
Format: Unknown data format