Speech Emotion Feature Extraction and Classification

Hamed Suliman

Abstract

Speech emotion recognition has become a prominent research area in recent years due to its extensive applications across various industries. A key challenge in this field is extracting emotional states from audio signal waveforms. Emotions serve as a way for individuals to communicate their moods or mental states to others. People experience a range of emotions, including sadness, joy, neutrality, disgust, anger, surprise, fear, and calmness. This paper focuses on extracting emotional features from audio waveforms. To achieve this objective, we restructured an existing state-of-the-art multilayer perceptron (MLP) design. The design features a hidden layer with 300 units, uses a batch size of 256, and restricts training to 500 epochs. The accuracy of the MLP classifier for speech emotion recognition reaches 95.65% when tested on a limited dataset composed of RAVDESS songs, which is significantly higher than the results reported in other studies in this field.
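The abstract specifies only the classifier configuration (one hidden layer of 300 units, batch size 256, at most 500 epochs). A minimal sketch of such a setup is given below, using scikit-learn's MLPClassifier; the feature-extraction step (MFCC, chroma, and mel features via librosa) and the regularization value are assumptions for illustration and may differ from the paper's exact pipeline.

```python
# Sketch of an MLP speech-emotion classifier matching the abstract's
# configuration: one hidden layer of 300 units, batch size 256, 500 epochs.
# Feature extraction (MFCC + chroma + mel via librosa) is an assumed setup.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def extract_features(path):
    """Return a fixed-length feature vector for one audio file:
    MFCC, chroma, and mel-spectrogram features averaged over time."""
    y, sr = librosa.load(path, sr=None)
    stft = np.abs(librosa.stft(y))
    mfcc = np.mean(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40), axis=1)
    chroma = np.mean(librosa.feature.chroma_stft(S=stft, sr=sr), axis=1)
    mel = np.mean(librosa.feature.melspectrogram(y=y, sr=sr), axis=1)
    return np.concatenate([mfcc, chroma, mel])

def train_mlp(X, y):
    """Train and score the MLP on pre-extracted features X with labels y
    (e.g., emotion labels parsed from RAVDESS file names, omitted here)."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                              random_state=42)
    clf = MLPClassifier(hidden_layer_sizes=(300,),  # one hidden layer, 300 units
                        batch_size=256,             # batch size from the abstract
                        max_iter=500,               # cap training at 500 epochs
                        alpha=0.01)                 # L2 penalty (assumed value)
    clf.fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))
```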

Article Details

How to Cite
Suliman, H. (2025). Speech Emotion Feature Extraction and Classification. University of Zawia Journal of Engineering Sciences and Technology, 3(2), 205–215. https://doi.org/10.26629/uzjest.2025.17
Section
Computer Engineering