Yan, Wei Qi
Liang, Sendong
2021-07-01
2021-07-01
2021
2021
https://hdl.handle.net/10292/14318
This thesis examines the end-to-end framework for speech recognition using multi-language datasets, with the objective of improving the performance of the hybrid CTC/Attention model. To compare speech recognition performance across languages, we designed and built three small datasets: Chinese, English, and Code-Switch. We evaluate the hybrid CTC/Attention model in these multi-language settings and compare recognition performance between the languages. Our experiments show that the end-to-end CTC/Attention model achieves performance similar to or better than that of the HMM-DNN model in both single-language and Code-Switch speaking environments.
en
Speech recognition; End-to-end; Attention model; CTC model
Multi-Language Datasets for Speech Recognition Based on the End-to-End Framework
Thesis
OpenAccess
2021-07-01