Hybrid Model of Data Augmentation Methods for Text Classification Task
aut.relation.conference | 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management | en_NZ |
aut.relation.endpage | 197 | |
aut.relation.startpage | 194 | |
aut.relation.volume | 3 | en_NZ |
aut.researcher | Mohaghegh, Mahsa | |
dc.contributor.author | Feng, JH | en_NZ |
dc.contributor.author | Mohaghegh, M | en_NZ |
dc.date.accessioned | 2021-11-30T02:22:47Z | |
dc.date.available | 2021-11-30T02:22:47Z | |
dc.date.copyright | 2021-10-25 | en_NZ |
dc.date.issued | 2021-10-25 | en_NZ |
dc.description.abstract | Data augmentation techniques have been increasingly explored in natural language processing to create more textual data for training. However, the performance gain of existing techniques is often marginal. This paper explores how combining two EDA (Easy Data Augmentation) methods, random swap and random delete, affects text classification performance. The classification tasks were conducted using a CNN as the text classifier on a portion of the SST-2 (Stanford Sentiment Treebank) dataset. The results show that this hybrid model performs worse than the benchmark accuracy. The research can be continued with different combinations of methods and with experiments on larger datasets. | |
dc.identifier.citation | In Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KMIS, ISBN 978-989-758-533-3; ISSN 2184-3228, pages 194-197. DOI: 10.5220/0010688500003064 | |
dc.identifier.doi | 10.5220/0010688500003064 | |
dc.identifier.isbn | 978-989-758-533-3 | en_NZ |
dc.identifier.issn | 2184-3228 | en_NZ |
dc.identifier.uri | https://hdl.handle.net/10292/14756 | |
dc.publisher | SCITEPRESS | |
dc.relation.uri | https://www.scitepress.org/Link.aspx?doi=10.5220/0010688500003064 | |
dc.rights | Copyright (c) 2021 by SCITEPRESS – Science and Technology Publications. Creative Commons License. CC BY-NC-ND 4.0 | |
dc.rights.accessrights | OpenAccess | en_NZ |
dc.subject | Data Augmentation; Hybrid Models; Machine Learning; Natural Language Processing | |
dc.title | Hybrid Model of Data Augmentation Methods for Text Classification Task | en_NZ |
dc.type | Conference Contribution | |
pubs.elements-id | 442485 | |
pubs.organisational-data | /AUT | |
pubs.organisational-data | /AUT/Faculty of Design & Creative Technologies | |
pubs.organisational-data | /AUT/Faculty of Design & Creative Technologies/Faculty Central | |
pubs.organisational-data | /AUT/Faculty of Design & Creative Technologies/School of Engineering, Computer & Mathematical Sciences | |
pubs.organisational-data | /AUT/Faculty of Design & Creative Technologies/School of Engineering, Computer & Mathematical Sciences/Science, Technology, Engineering, & Mathematics Tertiary Education Centre |
Files
Original bundle
- Name: KMIS_2021_24_CR.pdf
- Size: 235.39 KB
- Format: Adobe Portable Document Format
- Description: Conference contribution
License bundle
- Name: AUT Grant of Licence for Tuwhera Jun 2021.pdf
- Size: 360.95 KB
- Format: Adobe Portable Document Format
- Description: