A fuzzy logic approach to computer software source code authorship analysis

Date
1998
Authors
Kilgour, RI
Gray, AR
Sallis, PJ
MacDonell, SG
Item type
Conference Contribution
Publisher
Springer-Verlag
Abstract

Software source code authorship analysis has become an important area in recent years, with promising applications in both the legal sector (such as proof of ownership and software forensics) and the education sector (such as plagiarism detection and assessing style). Authorship analysis encompasses the sub-areas of author discrimination, author characterization, and similarity detection (also referred to as plagiarism detection). While a large number of metrics have been proposed for this task, many borrowed or adapted from the area of computational linguistics, certain types of information remain difficult to capture through quantitative measurement. Here it is proposed that existing numerical metrics should be supplemented with fuzzy-logic linguistic variables to capture more subjective elements of authorship, such as the degree to which comments match the actual source code’s behavior. These variables avoid the need for complex and subjective rules, replacing these with an expert’s judgement. Fuzzy-logic models may also help to overcome problems with small data sets for calibrating such models. Using authorship discrimination as a test case, the utility of objective and fuzzy measures, singly and in combination, is assessed, as well as the consistency of the measures between counters.
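
As an illustration only, a fuzzy-logic linguistic variable of the kind described above might be sketched as follows. The abstract does not specify the membership functions, term names, or scale used in the paper, so the triangular shapes, the terms "low", "medium" and "high", the 0 to 1 score, and the names `triangular`, `COMMENT_MATCH_TERMS` and `fuzzify` are all assumptions made for this example.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a and c and peak b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical terms for "degree to which comments match the code's behavior",
# rated by an expert on an assumed 0..1 scale. Each tuple is (foot, peak, foot).
COMMENT_MATCH_TERMS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}


def fuzzify(score, terms=COMMENT_MATCH_TERMS):
    """Map an expert's score to a degree of membership in each linguistic term."""
    return {name: triangular(score, *abc) for name, abc in terms.items()}


if __name__ == "__main__":
    # An expert rating of 0.75 belongs partly to "medium" and partly to "high".
    print(fuzzify(0.75))
```

Under these assumptions, the resulting membership degrees could sit alongside conventional numerical metrics as additional inputs to an author discrimination model.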

Source
In Proceedings of the 1997 International Conference on Neural Information Processing and Intelligent Information Systems. Dunedin, New Zealand, Springer-Verlag (1997) 865-868
Rights statement
An author may self-archive an author-created version of his/her article on his/her own website and/or in his/her institutional repository. He/she may also deposit this version on his/her funder’s or funder’s designated repository at the funder’s request or as a result of a legal obligation, provided it is not made publicly available until 12 months after official publication. He/she may not use the publisher's PDF version, which is posted on www.springerlink.com, for the purpose of self-archiving or deposit. Furthermore, the author may only post his/her version provided acknowledgement is given to the original source of publication and a link is inserted to the published article on Springer's website. The link must be accompanied by the following text: "The final publication is available at www.springerlink.com". (Please also see Publisher’s Version and Citation)