Hybrid Model of Data Augmentation Methods for Text Classification Task

aut.relation.conference: 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management [en_NZ]
aut.relation.endpage: 197
aut.relation.startpage: 194
aut.relation.volume: 3 [en_NZ]
aut.researcher: Mohaghegh, Mahsa
dc.contributor.author: Feng, JH [en_NZ]
dc.contributor.author: Mohaghegh, M [en_NZ]
dc.date.accessioned: 2021-11-30T02:22:47Z
dc.date.available: 2021-11-30T02:22:47Z
dc.date.copyright: 2021-10-25 [en_NZ]
dc.date.issued: 2021-10-25 [en_NZ]
dc.description.abstract: Data augmentation techniques have been increasingly explored in natural language processing to create more textual data for training. However, the performance gain of existing techniques is often marginal. This paper explores how combining two EDA (Easy Data Augmentation) methods, random swap and random deletion, affects performance in text classification. The classification tasks were conducted using a CNN as the text classifier on a portion of the SST-2 (Stanford Sentiment Treebank) dataset. The results show that this hybrid model performs worse than the benchmark accuracy. The research can be continued with different combinations of methods and with experiments on larger datasets.
dc.identifier.citation: In Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KMIS, ISBN 978-989-758-533-3; ISSN 2184-3228, pages 194-197. DOI: 10.5220/0010688500003064
dc.identifier.doi: 10.5220/0010688500003064
dc.identifier.isbn: 978-989-758-533-3 [en_NZ]
dc.identifier.issn: 2184-3228 [en_NZ]
dc.identifier.uri: https://hdl.handle.net/10292/14756
dc.publisher: SCITEPRESS
dc.relation.uri: https://www.scitepress.org/Link.aspx?doi=10.5220/0010688500003064
dc.rights: Copyright (c) 2021 by SCITEPRESS – Science and Technology Publications. Creative Commons License. CC BY-NC-ND 4.0
dc.rights.accessrights: OpenAccess [en_NZ]
dc.subject: Data Augmentation; Hybrid Models; Machine Learning; Natural Language Processing
dc.title: Hybrid Model of Data Augmentation Methods for Text Classification Task [en_NZ]
dc.type: Conference Contribution
pubs.elements-id: 442485
pubs.organisational-data: /AUT
pubs.organisational-data: /AUT/Faculty of Design & Creative Technologies
pubs.organisational-data: /AUT/Faculty of Design & Creative Technologies/Faculty Central
pubs.organisational-data: /AUT/Faculty of Design & Creative Technologies/School of Engineering, Computer & Mathematical Sciences
pubs.organisational-data: /AUT/Faculty of Design & Creative Technologies/School of Engineering, Computer & Mathematical Sciences/Science, Technology, Engineering, & Mathematics Tertiary Education Centre
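Illustrative sketch (not part of the deposited record): the abstract describes a hybrid of two EDA operations, random swap and random deletion. The Python below is a minimal sketch of how such a combination might look, not the authors' implementation. The function names, the parameters n_swaps and p_delete, their default values, and the choice to swap first and delete second are all assumptions made for illustration; the record does not specify them.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions, repeated n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words  # nothing to swap
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word independently with probability p.

    If every word would be dropped, keep one random word so the
    result is never empty for a non-empty input.
    """
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def hybrid_augment(sentence, n_swaps=1, p_delete=0.1, seed=None):
    """Hypothetical hybrid: random swap followed by random deletion.

    The order of the two steps and the defaults are assumptions.
    """
    rng = random.Random(seed)
    words = sentence.split()
    swapped = random_swap(words, n_swaps, rng)
    return " ".join(random_deletion(swapped, p_delete, rng))


if __name__ == "__main__":
    print(hybrid_augment("the film is a charming and often affecting journey", seed=0))
```

Passing a fixed seed makes the augmentation reproducible, which may help when comparing augmented and benchmark training runs.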
Files
Original bundle
Name:
KMIS_2021_24_CR.pdf
Size:
235.39 KB
Format:
Adobe Portable Document Format
Description:
Conference contribution
License bundle
Name:
AUT Grant of Licence for Tuwhera Jun 2021.pdf
Size:
360.95 KB
Format:
Adobe Portable Document Format
Description: