Multi-Language Datasets for Speech Recognition Based on the End-to-End Framework
aut.embargo | No | en_NZ |
aut.thirdpc.contains | No | en_NZ |
dc.contributor.advisor | Yan, Wei Qi | |
dc.contributor.author | Liang, Sendong | |
dc.date.accessioned | 2021-07-01T02:45:35Z | |
dc.date.available | 2021-07-01T02:45:35Z | |
dc.date.copyright | 2021 | |
dc.date.issued | 2021 | |
dc.date.updated | 2021-07-01T01:35:35Z | |
dc.description.abstract | This thesis examines the end-to-end framework for speech recognition using multi-language datasets. Our objective is to improve the performance of the CTC/Attention model. To compare speech recognition performance across languages, we designed and built three small datasets: Chinese, English, and Code-Switch. We compare the performance of the hybrid CTC/Attention model in these multi-language environments. Our experiments show that the end-to-end CTC/Attention model achieves performance similar to or better than that of the HMM-DNN model in both single-language and Code-Switch speaking environments. | en_NZ |
dc.identifier.uri | https://hdl.handle.net/10292/14318 | |
dc.language.iso | en | en_NZ |
dc.publisher | Auckland University of Technology | |
dc.rights.accessrights | OpenAccess | |
dc.subject | Speech recognition | en_NZ |
dc.subject | End-to-end | en_NZ |
dc.subject | Attention model | en_NZ |
dc.subject | CTC model | en_NZ |
dc.title | Multi-Language Datasets for Speech Recognition Based on the End-to-End Framework | en_NZ |
dc.type | Thesis | en_NZ |
thesis.degree.grantor | Auckland University of Technology | |
thesis.degree.level | Masters Theses | |
thesis.degree.name | Master of Computer and Information Sciences | en_NZ |