Multi-Language Datasets for Speech Recognition Based on the End-to-End Framework

Authors

Liang, Sendong

Supervisor

Yan, Wei Qi

Item type

Thesis

Degree name

Master of Computer and Information Sciences

Publisher

Auckland University of Technology

Abstract

This thesis investigates the end-to-end framework for speech recognition using multi-language datasets. Our objective is to improve the performance of the hybrid CTC/Attention model. To compare speech recognition performance across languages, we designed and built three small datasets: Chinese, English, and Code-Switch. We use these datasets to evaluate the hybrid CTC/Attention model in a multi-language setting. Our experiments show that the end-to-end CTC/Attention model achieves performance similar to or better than that of the HMM-DNN model in both single-language and Code-Switch speaking environments.

Keywords

Speech recognition, End-to-end, Attention model, CTC model
