A Taxonomy of Data Quality Challenges in Empirical Software Engineering

aut.relation.endpage: 106
aut.relation.startpage: 97
dark.contributor.author: Bosu, MF (en_NZ)
dark.contributor.author: Macdonell, SG (en_NZ)
dc.date.accessioned: 2016-08-22T22:17:09Z
dc.date.available: 2016-08-22T22:17:09Z
dc.date.copyright: 2013 (en_NZ)
dc.date.issued: 2013 (en_NZ)
dc.description.abstract: Reliable empirical models such as those used in software effort estimation or defect prediction are inherently dependent on the data from which they are built. As demands for process and product improvement continue to grow, the quality of the data used in measurement and prediction systems warrants increasingly close scrutiny. In this paper we propose a taxonomy of data quality challenges in empirical software engineering, based on an extensive review of prior research. We consider current assessment techniques for each quality issue and proposed mechanisms to address these issues, where available. Our taxonomy classifies data quality issues into three broad areas: first, characteristics of data that mean they are not fit for modeling, second, data set characteristics that lead to concerns about the suitability of applying a given model to another data set, and third, factors that prevent or limit data accessibility and trust. We identify this latter area as of particular need in terms of further research. © 2013 IEEE. (en_NZ)
dc.identifier.citation: Proceedings of the 22nd Australian Software Engineering Conference (ASWEC2013), Melbourne, Australia, pp. 97-106. doi: 10.1109/ASWEC.2013.21 (en_NZ)
dc.identifier.doi: 10.1109/ASWEC.2013.21 (en_NZ)
dc.identifier.uri: https://hdl.handle.net/10292/10005
dc.language: eng (en_NZ)
dc.publisher: IEEE
dc.relation.uri: http://dx.doi.org/10.1109/ASWEC.2013.21
dc.rights: Copyright © 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.rights.accessrights: OpenAccess (en_NZ)
dc.subject: Accessibility (en_NZ)
dc.subject: Commercial sensitivity (en_NZ)
dc.subject: Data quality (en_NZ)
dc.subject: Empirical software engineering (en_NZ)
dc.subject: Provenance (en_NZ)
dc.subject: Trustworthiness (en_NZ)
dc.title: A Taxonomy of Data Quality Challenges in Empirical Software Engineering (en_NZ)
dc.type: Journal Article
pubs.elements-id: 157172
pubs.organisational-data: /AUT
pubs.organisational-data: /AUT/Design & Creative Technologies
Files
Original bundle
Name: Bosu and MacDonell (2013b) ASWEC.pdf
Size: 215.89 KB
Format: Adobe Portable Document Format