A Study on Speaker Independent Emotion Recognition from Speech Signals
In recent years, considerable research has been conducted on Speech Emotion Recognition to improve human-machine interaction. A speaker's speech reveals his or her age, gender, and emotional state, and recognising a single emotion from a speaker is a difficult challenge. In this work, the spectral feature MFCC is used for high accuracy, and the prosodic feature Pitch is used in conjunction with two other features, Cepstrum and DWT, to create an Emotion-Specific Feature Set. The database in question is the Telugu database, which covers four emotions (joyful, angry, sad, and neutral) and is recorded from two male and female speakers. Different combinations of features are used to identify the corresponding emotion; these combinations are referred to as emotion-specific features, and taking them into consideration improves the recognition rate. Feature information is extracted using DWT, Cepstrum, MFCC, and Pitch. After feature extraction, the data is classified using a back-propagation neural network, and the results are reviewed. The study found that increasing the number of nodes in the network and the number of iterations raises the recognition rate above 90%, that combining feature sets gives a better emotion recognition rate than using individual feature sets, and that the feature set combination DWT+Pitch+Cepstrum produced an individual emotion recognition rate of over 95%.
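To illustrate the classification step described above, the following is a minimal sketch of a one-hidden-layer network trained with back-propagation on four emotion classes. The feature vectors here are synthetic stand-ins (random clusters) rather than real DWT+Pitch+Cepstrum features, and all dimensions, layer sizes, and learning-rate values are illustrative assumptions, not the parameters used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: four emotion classes, each a cluster of
# synthetic "feature vectors" in place of real DWT+Pitch+Cepstrum features.
EMOTIONS = ["joyful", "angry", "sad", "neutral"]
dim, per_class = 8, 50
centers = rng.normal(0, 5, size=(4, dim))
X = np.vstack([c + rng.normal(0, 0.5, size=(per_class, dim)) for c in centers])
y = np.repeat(np.arange(4), per_class)
onehot = np.eye(4)[y]

# One-hidden-layer network trained with back-propagation (illustrative sizes).
hidden, lr = 16, 0.05
W1 = rng.normal(0, 0.1, (dim, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.1, (hidden, 4));   b2 = np.zeros(4)

for _ in range(500):
    h = np.tanh(X @ W1 + b1)                 # forward pass, hidden layer
    logits = h @ W2 + b2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)        # softmax output
    g = (p - onehot) / len(X)                # cross-entropy gradient at output
    gW2, gb2 = h.T @ g, g.sum(axis=0)
    gh = (g @ W2.T) * (1 - h ** 2)           # back-propagate through tanh
    gW1, gb1 = X.T @ gh, gh.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2           # gradient-descent update
    W1 -= lr * gW1; b1 -= lr * gb1

pred = np.argmax(np.tanh(X @ W1 + b1) @ W2 + b2, axis=1)
accuracy = (pred == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

On well-separated synthetic clusters such as these, the network reaches high training accuracy within a few hundred iterations, consistent with the abstract's observation that recognition improves with more nodes and more iterations.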
Author(s) Details
Dr. B. Rajasekhar
JNTUA, Anantapuramu, India.
Department of Electronics and Communication Engineering, Gudlavalleru Engineering College, Gudlavalleru, India.
Department of Electronics and Communication Engineering, JNTU College of Engineering, Anantapuramu, India.
View Book: https://stm.bookpi.org/AAER-V12/article/view/1279