Multi-Language Datasets for Speech Recognition Based on the End-to-End Framework

aut.embargo: No (en_NZ)
aut.thirdpc.contains: No (en_NZ)
dc.contributor.advisor: Yan, Wei Qi
dc.contributor.author: Liang, Sendong
dc.date.accessioned: 2021-07-01T02:45:35Z
dc.date.available: 2021-07-01T02:45:35Z
dc.date.copyright: 2021
dc.date.issued: 2021
dc.date.updated: 2021-07-01T01:35:35Z
dc.description.abstract: This thesis examines the end-to-end framework for speech recognition using multi-language datasets, with the objective of improving the performance of the CTC/Attention model. To compare speech recognition performance across languages, we designed and built three small datasets: Chinese, English, and Code-Switch. We compare the performance of the hybrid CTC/Attention model in these multi-language settings. Our experiments show that the end-to-end CTC/Attention model achieves performance similar to or better than that of the HMM-DNN model in both single-language and Code-Switch speaking environments. (en_NZ)
dc.identifier.uri: https://hdl.handle.net/10292/14318
dc.language.iso: en (en_NZ)
dc.publisher: Auckland University of Technology
dc.rights.accessrights: OpenAccess
dc.subject: Speech recognition (en_NZ)
dc.subject: End-to-end (en_NZ)
dc.subject: Attention model (en_NZ)
dc.subject: CTC model (en_NZ)
dc.title: Multi-Language Datasets for Speech Recognition Based on the End-to-End Framework (en_NZ)
dc.type: Thesis (en_NZ)
thesis.degree.grantor: Auckland University of Technology
thesis.degree.level: Masters Theses
thesis.degree.name: Master of Computer and Information Sciences (en_NZ)
Files
Original bundle
Name: LiangS.pdf
Size: 1.46 MB
Format: Adobe Portable Document Format
Description: Thesis
License bundle
Name: license.txt
Size: 897 B
Format: Item-specific license agreed upon to submission