Speech Emotion Recognition Using Machine Learning — A Systematic Review

Date
2023-08-14
Authors
Madanian, Samaneh
Chen, Talen
Adeleye, Olayinka
Templeton, John Michael
Poellabauer, Christian
Parry, Dave
Schneider, Sandra L
Item type
Journal Article
Publisher
Elsevier BV
Abstract

Speech emotion recognition (SER) as a Machine Learning (ML) problem continues to garner a significant amount of research interest, especially in the affective computing domain. This is due to its increasing potential, algorithmic advancements, and applications in real-world scenarios. Human speech contains para-linguistic information that can be represented using quantitative features such as pitch, intensity, and Mel-Frequency Cepstral Coefficients (MFCC). SER is commonly achieved in three key steps: data processing, feature selection/extraction, and classification based on the underlying emotional features. The nature of these steps, coupled with the distinct features of human speech, underpins the use of ML methods for SER implementation. Recent research in affective computing has employed various ML methods for SER tasks; however, only a few works capture the underlying techniques and methods that can be used to facilitate the three core steps of SER implementation. In addition, the challenges associated with these steps and the state-of-the-art approaches used to tackle them are either ignored or sparsely discussed in these works. In this paper, we present a systematic review of research that has addressed SER tasks from an ML perspective over the last decade, with emphasis on the three SER implementation steps. Different challenges, including the low classification accuracy of speaker-independent experiments, and the solutions associated with them are discussed in detail. The review also provides guidelines for SER evaluation, with a focus on common baselines and metrics available for experimentation. This paper is expected to serve as a comprehensive guideline for SER researchers to design SER solutions using ML techniques, motivate possible improvements of existing SER models, or trigger novel techniques to enhance SER performance.
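
The three steps named in the abstract, together with a speaker-independent evaluation, can be illustrated with a minimal sketch. It is not taken from the reviewed paper. The synthetic tones stand in for recorded speech, and the speakers, emotion labels, feature scaling, and nearest-centroid classifier are all assumptions chosen to keep the example self-contained. A zero-crossing count is used as a rough pitch proxy in place of a real pitch tracker or MFCCs.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed for the synthetic signals below


def synth_utterance(freq_hz, amplitude, seconds=0.25):
    """Stand-in for recorded speech: a pure tone (assumption, not real data)."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
            for t in range(n)]


def preprocess(signal):
    """Step 1, data processing: remove the DC offset."""
    mean = sum(signal) / len(signal)
    return [s - mean for s in signal]


def extract_features(signal):
    """Step 2, feature extraction: RMS intensity and a zero-crossing pitch proxy."""
    rms = math.sqrt(sum(s * s for s in signal) / len(signal))
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    pitch_hz = crossings * SAMPLE_RATE / (2 * len(signal))
    return (rms, pitch_hz / 1000.0)  # crude scaling to comparable ranges


def nearest_centroid_fit(samples):
    """Step 3, classification: one mean feature vector per emotion label."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {label: tuple(sum(col) / len(col) for col in zip(*rows))
            for label, rows in grouped.items()}


def nearest_centroid_predict(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


def leave_one_speaker_out(dataset):
    """Speaker-independent evaluation: the test speaker never appears in training."""
    scores = {}
    for held_out in sorted({spk for spk, _, _ in dataset}):
        train = [(f, y) for spk, f, y in dataset if spk != held_out]
        test = [(f, y) for spk, f, y in dataset if spk == held_out]
        centroids = nearest_centroid_fit(train)
        hits = sum(nearest_centroid_predict(centroids, f) == y for f, y in test)
        scores[held_out] = hits / len(test)
    return scores


def build_dataset():
    """Hypothetical speakers and labels; 'excited' is louder and higher pitched."""
    base_pitch = {"A": 110, "B": 130, "C": 150}
    dataset = []
    for speaker, f0 in base_pitch.items():
        for label, pitch_scale, amplitude in (("calm", 1.0, 0.3),
                                              ("excited", 1.6, 0.8)):
            signal = preprocess(synth_utterance(f0 * pitch_scale, amplitude))
            dataset.append((speaker, extract_features(signal), label))
    return dataset


if __name__ == "__main__":
    # Per-speaker accuracy when that speaker is held out of training.
    print(leave_one_speaker_out(build_dataset()))
```

Splitting by speaker rather than by utterance is the point of the evaluation loop: a random split would let the classifier see the test speaker's voice during training. On real recordings, that kind of leakage tends to overstate accuracy relative to the speaker-independent setting the abstract highlights.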

Keywords
46 Information and Computing Sciences; 4608 Human-Centred Computing
Source
Intelligent Systems with Applications, ISSN: 2667-3053 (Print), Elsevier BV, 200266-200266. doi: 10.1016/j.iswa.2023.200266