Optimising the Trade-Off Between Accuracy and Privacy in Data Stream Mining Environments

aut.embargo: No (en_NZ)
aut.thirdpc.contains: No (en_NZ)
dc.contributor.advisor: Sinha, Roopak
dc.contributor.advisor: Lai, Edmund
dc.contributor.advisor: Naeem, Muhammad Asif
dc.contributor.author: Hewage, Ullusu Hewage Waruni Amali
dc.date.accessioned: 2022-12-07T23:05:33Z
dc.date.available: 2022-12-07T23:05:33Z
dc.date.copyright: 2022
dc.date.issued: 2022
dc.date.updated: 2022-12-07T22:05:35Z
dc.description.abstract: Data streams differ from static datasets in several characteristics: they are incremental, high-speed, high-volume, subject to concept drift, and dynamically adapting. This unique nature makes Privacy-Preserving Data Stream Mining (PPDSM) challenging. The trade-off between data privacy and data mining accuracy is one of the significant concerns in PPDSM, and optimising it is complicated by the nature of data streams. Although privacy-preserving methods have been proposed to optimise this trade-off in PPDSM, there is still room for improvement. Moreover, there is a lack of well-structured frameworks for accuracy-privacy optimisation. This research aims to implement an appropriate perturbation method that provides an optimal trade-off between data privacy and data mining accuracy in PPDSM, ultimately leading to a well-structured framework. We proposed seven variations of noise addition methods to achieve high privacy while maintaining high accuracy. These novel methods combine cumulative noise addition, noise resetting, and cycle-wise noise addition, inspired by the well-known logistic function. The best-performing noise addition method among the proposed variations was used to build the Accuracy Privacy optimising Framework (APOF). The foundation of APOF is that the accuracy and privacy levels depend entirely on the user, and that achieving both 100% accuracy and 100% privacy is not possible. Consequently, APOF was designed to optimise the accuracy-privacy trade-off by considering the user's privacy requirements. The optimisation is achieved through a data fitting module. Finally, we extended APOF to Enhanced-APOF to operate in data streaming environments. The logistic cumulative noise addition outperformed the other proposed noise addition methods in terms of accuracy and privacy.
The optimised accuracy-privacy trade-off was achieved through cycle-wise noise addition, with the cycles designed based on the logistic function. Using logistic cumulative noise addition as the privacy-preserving technique in APOF allowed us to draw on all of these benefits. Through APOF's data fitting module, we predicted the accuracy level corresponding to a user-defined privacy threshold with a small error. APOF allows users to fine-tune their requirements if needed and further optimise the accuracy-privacy trade-off accordingly. Experimental evidence shows that Enhanced-APOF is a well-structured framework for accuracy-privacy trade-off optimisation in a data streaming environment, as it was designed with the nature of data streams in mind. The logistic cumulative noise addition for privacy preservation, Hoeffding Adaptive Tree for classification, and data fitting for optimisation have proven to be a strong combination for achieving accuracy-privacy trade-off optimisation. (en_NZ)
dc.identifier.uri: https://hdl.handle.net/10292/15729
dc.language.iso: en (en_NZ)
dc.publisher: Auckland University of Technology
dc.rights.accessrights: OpenAccess
dc.title: Optimising the Trade-Off Between Accuracy and Privacy in Data Stream Mining Environments (en_NZ)
dc.type: Thesis (en_NZ)
thesis.degree.grantor: Auckland University of Technology
thesis.degree.level: Doctoral Theses
thesis.degree.name: Doctor of Philosophy (en_NZ)
Files
Original bundle
Name: HewageU.pdf
Size: 5.7 MB
Format: Adobe Portable Document Format
Description: Thesis
License bundle
Name: license.txt
Size: 889 B
Format: Item-specific license agreed upon to submission
Description:
Collections