A Geometric and Topological View on Deep Learning Language Models
| Field | Value |
| --- | --- |
| aut.embargo | No |
| aut.thirdpc.contains | No |
| dc.contributor.advisor | Lai, Edmund |
| dc.contributor.advisor | Li, Weihua |
| dc.contributor.author | Feng, Jia Hui |
| dc.date.accessioned | 2025-06-15T22:12:47Z |
| dc.date.available | 2025-06-15T22:12:47Z |
| dc.date.issued | 2025 |
| dc.description.abstract | As Natural Language Processing (NLP) models evolve, challenges such as interpretability, misinformation, and high computational demands persist, largely because their decision-making processes remain poorly understood. This thesis examines NLP models through the frameworks of topology and geometry, addressing both the theoretical and practical aspects of these challenges. Through our topological framework analysis, we demonstrate, for the first time, the occurrence of neural collapse in NLP models on text classification tasks, showing how features within each class collapse towards their class mean while remaining separated from other classes. Our geometric analysis, based on convex hull computations and Delaunay triangulation, reveals that high-performing language models maintain distinct semantic boundaries and exhibit consistent geometric properties across different scales of analysis. Building on these insights, we developed two novel algorithms: GATFilter and GATA. GATFilter improved the quality of augmented data, achieving performance gains of up to 8% across several NLP datasets (SST2, SNIPS, TREC, Question Topic) and multiple augmentation strategies (EDA, Backtranslation, and Contextual Augmentation using BERT). GATA demonstrated exceptional performance on similar datasets while remaining computationally efficient: it showed non-linear scaling in execution time (54.25 to 167.58 seconds from 20 to 80 samples) and avoided the large initial memory loads (>3900 MB) required by methods such as Backtranslation and ContextualBERT. These findings advance NLP theory by establishing new geometric frameworks for analyzing word embeddings and model behavior, while offering practical solutions for resource-efficient text augmentation. Our work demonstrates that geometric principles can effectively balance model performance with computational efficiency. |
| dc.identifier.uri | http://hdl.handle.net/10292/19312 |
| dc.language.iso | en |
| dc.publisher | Auckland University of Technology |
| dc.rights.accessrights | OpenAccess |
| dc.title | A Geometric and Topological View on Deep Learning Language Models |
| dc.type | Thesis |
| thesis.degree.grantor | Auckland University of Technology |
| thesis.degree.name | Doctor of Philosophy |
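
The abstract mentions two kinds of representation analysis: neural collapse of class features, and convex hull / Delaunay triangulation over embeddings. Below is a minimal, illustrative Python sketch of how such measurements are commonly computed. It is not the thesis's own code: the function names, the `embeddings`/`labels` inputs, the PCA projection, and the use of SciPy and scikit-learn are assumptions made purely for illustration, and the GATFilter and GATA algorithms are not reproduced here.

```python
# Illustrative sketch only (not the thesis's code). Assumes `embeddings` is an
# (n_samples, d) array of sentence embeddings and `labels` an (n_samples,) array
# of integer class ids.
import numpy as np
from numpy.linalg import pinv
from scipy.spatial import ConvexHull, Delaunay
from sklearn.decomposition import PCA


def neural_collapse_nc1(embeddings, labels):
    """NC1-style metric: within-class variability relative to between-class variability.

    Values near zero indicate that features in each class have collapsed towards
    their class mean while the class means remain separated.
    """
    classes = np.unique(labels)
    global_mean = embeddings.mean(axis=0)
    d = embeddings.shape[1]
    sigma_w = np.zeros((d, d))  # within-class scatter
    sigma_b = np.zeros((d, d))  # between-class scatter
    for c in classes:
        x = embeddings[labels == c]
        mu = x.mean(axis=0)
        centred = x - mu
        sigma_w += (centred.T @ centred) / len(x)
        diff = (mu - global_mean)[:, None]
        sigma_b += diff @ diff.T
    sigma_w /= len(classes)
    sigma_b /= len(classes)
    return np.trace(sigma_w @ pinv(sigma_b)) / len(classes)


def class_hulls_2d(embeddings, labels):
    """Project embeddings to 2D and build a convex hull and Delaunay triangulation per class.

    Exact hulls are impractical in the original embedding dimension, so a PCA
    projection is used here purely for illustration.
    """
    points_2d = PCA(n_components=2).fit_transform(embeddings)
    hulls = {}
    for c in np.unique(labels):
        pts = points_2d[labels == c]
        hulls[c] = {
            "hull_area": ConvexHull(pts).volume,  # in 2D, .volume is the enclosed area
            "triangulation": Delaunay(pts),       # used for the membership test below
        }
    return points_2d, hulls


def boundary_overlap(points_2d, labels, hulls):
    """Fraction of (point, other-class) pairs where the point lies inside that class's hull."""
    inside, tests = 0, 0
    for c, info in hulls.items():
        other = points_2d[labels != c]
        inside += int(np.sum(info["triangulation"].find_simplex(other) >= 0))
        tests += len(other)
    return inside / tests
```

Under these assumptions, lower NC1 values and smaller boundary overlap would correspond, respectively, to the within-class collapse and the distinct semantic boundaries described in the abstract.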
