A Privacy-Preserving Word Embedding Text Classification Model Based on Privacy Boundary Constructed by Deep Belief Network

Date
2023-09-15
Authors
Ma, Bo
Lai, Edmund
Yan, Wei Qi
Wu, Jinsong
Item type
Journal Article
Publisher
Springer Science and Business Media LLC
Abstract

To extract and classify information from reports or documents effectively while protecting the privacy of the extracted results, we propose a privacy-preserving classification model, the Word Embedding Combination Privacy-preserving Support Vector Machine (WECPPSVM), for text classification. We also propose the Privacy-preserving Distribution and Independent Frequent Subsequence Extraction Algorithm (PPDIFSEA), which trains a Deep Belief Network (DBN) to calculate the degree of independence of the training data input to the classification model and from this obtains the Privacy Boundary (PB). The PB is an indispensable condition for both data sampling and privacy noise generation. The model protects privacy by injecting privacy noise into the classification result, which interferes with background-knowledge-based privacy attacks. Our quantitative analysis shows that WECPPSVM approaches mainstream text classification algorithms in classification accuracy while preserving privacy and without increasing computational complexity. In addition, the fusion study and privacy threat evaluation verify that PPDIFSEA combined with WECPPSVM achieves an acceptable level of classification accuracy and privacy protection.

Keywords
0801 Artificial Intelligence and Image Processing, 0803 Computer Software, 0805 Distributed Computing, 0806 Information Systems, Artificial Intelligence & Image Processing, Software Engineering, 4009 Electronics, sensors and digital hardware, 4603 Computer vision and multimedia computation, 4605 Data management and data science, 4606 Distributed computing and systems software
Source
Multimedia Tools and Applications, ISSN: 1380-7501 (Print); 1573-7721 (Online), Springer Science and Business Media LLC. doi: 10.1007/s11042-023-15623-3