Multi-Language Datasets for Speech Recognition Based on the End-to-End Framework

Date
2021
Authors
Liang, Sendong
Supervisor
Yan, Wei Qi
Item type
Thesis
Degree name
Master of Computer and Information Sciences
Publisher
Auckland University of Technology
Abstract

This thesis investigates the end-to-end framework for speech recognition using multi-language datasets, with the objective of improving the performance of the hybrid CTC/Attention model. To compare speech recognition performance across languages, we designed and built three small datasets: Chinese, English, and Code-Switch. We evaluate the hybrid CTC/Attention model in each of these language settings. Our experiments show that the end-to-end CTC/Attention model achieves performance similar to or better than that of the HMM-DNN model in both single-language and Code-Switch speaking environments.

Keywords
Speech recognition, End-to-end, Attention model, CTC model