Multi-Language Datasets for Speech Recognition Based on the End-to-End Framework

Liang, Sendong

Files

Thesis (1.46 MB)

Date

2021

Authors

Liang, Sendong

Supervisor

Yan, Wei Qi

Item type

Thesis

Degree name

Master of Computer and Information Sciences

Publisher

Auckland University of Technology

Abstract

This thesis examines the end-to-end framework for speech recognition using multi-language datasets, with the objective of improving the performance of the hybrid CTC/Attention model. To compare speech recognition performance across languages, we designed and built three small datasets: Chinese, English, and Code-Switch. We then evaluate the hybrid CTC/Attention model in each of these language settings. Our experiments show that the end-to-end CTC/Attention model achieves performance similar to or better than that of the HMM-DNN model in both single-language and Code-Switch speech environments. The thesis also compares speech recognition results across the different languages.

Keywords

Speech recognition, End-to-end, Attention model, CTC model

Permanent link

https://hdl.handle.net/10292/14318

Collections

Masters Theses
