Novel Methods for Distributed and Privacy-Preserving Data Stream Mining

Denham, Benjamin James

aut.embargo	No	en_NZ
aut.thirdpc.contains	No	en_NZ
dc.contributor.advisor	Pears, Russel
dc.contributor.advisor	Naeem, Muhammad Asif
dc.contributor.author	Denham, Benjamin James
dc.date.accessioned	2019-06-03T23:33:08Z
dc.date.available	2019-06-03T23:33:08Z
dc.date.copyright	2019
dc.date.issued	2019
dc.date.updated	2019-06-01T06:00:35Z
dc.description.abstract	The growing number of “big” datasets presents many opportunities for data mining but also raises a variety of new challenges. Datasets may take the form of continuous streams with constantly changing patterns, they may be too widely distributed to be centralised for analysis at a single location, or they may contain sensitive values that data owners are not willing to share due to privacy concerns. Much past research has considered these issues individually, but few existing methods can address combinations of these properties. Therefore, this research develops methods for distributed and privacy-preserving data stream mining: a novel Hierarchical Distributed Stream Miner (HDSM) that learns relationships between the features of separate streams with minimal data transmission to central locations, and two data perturbation methods for privacy-preserving stream mining based on the combination of random projection, random translation, and additive noise. Experimental evaluation of HDSM demonstrates significant improvements in classification accuracy over existing distributed stream mining approaches while minimising data transmission and computational costs. HDSM’s ability to dynamically trade off accuracy against these costs is also demonstrated. Variations of the known input-output Maximum A Posteriori (MAP) attack are developed to experimentally evaluate the data perturbation methods, and the proposed composite methods are shown to achieve a better trade-off between privacy and model accuracy than random projection alone. Finally, an approach is described for combining HDSM with data perturbation to achieve distributed privacy-preserving stream mining.	en_NZ
dc.identifier.uri	https://hdl.handle.net/10292/12536
dc.language.iso	en	en_NZ
dc.publisher	Auckland University of Technology
dc.rights.accessrights	OpenAccess
dc.subject	machine-learning	en_NZ
dc.subject	data stream mining	en_NZ
dc.subject	distributed data mining	en_NZ
dc.subject	privacy-preserving data mining	en_NZ
dc.title	Novel Methods for Distributed and Privacy-Preserving Data Stream Mining	en_NZ
dc.type	Thesis	en_NZ
thesis.degree.grantor	Auckland University of Technology
thesis.degree.level	Masters Theses
thesis.degree.name	Master of Computer and Information Sciences	en_NZ

Files

Original bundle

Name: DenhamB.pdf
Size: 1.48 MB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 897 B
Format: Item-specific license agreed upon to submission

Collections

Masters Theses