Speech_Emotion_Recognition_CNN-LSTM

This project recognizes emotions in short speech clips from the RAVDESS database using a combined CNN-LSTM network, in conjunction with Voice Activity Detection (VAD) and an extended acoustic feature set.
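As a rough illustration of the front end, the sketch below loads a clip with librosa, trims silence as a lightweight stand-in for full VAD, and stacks MFCC, chroma, and mel-spectrogram features. The specific feature set, VAD method, and parameters are assumptions for illustration, not the documented behaviour of this repository.

```python
import numpy as np
import librosa

def extract_features(path, sr=22050, n_mfcc=40):
    """Load a clip, trim silence, and return a frame-wise feature matrix.

    Note: energy-based trimming and this particular feature stack are
    illustrative assumptions, not this project's exact pipeline.
    """
    signal, sr = librosa.load(path, sr=sr)
    # Energy-based trimming as a simple stand-in for VAD.
    signal, _ = librosa.effects.trim(signal, top_db=25)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    chroma = librosa.feature.chroma_stft(y=signal, sr=sr)
    mel = librosa.power_to_db(librosa.feature.melspectrogram(y=signal, sr=sr))
    # Stack per-frame features: shape (n_frames, n_features).
    return np.vstack([mfcc, chroma, mel]).T
```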

Prerequisites

pip install librosa
pip install tensorflow
pip install -U scikit-learn
pip install soundfile
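
The following is a minimal tf.keras sketch of a CNN-LSTM classifier of the kind described above. The layer sizes, kernel widths, and the use of 1-D convolutions over the frame axis are illustrative assumptions and may differ from the repository's actual architecture; the default of 8 classes matches the RAVDESS emotion labels.

```python
import tensorflow as tf

def build_cnn_lstm(n_frames, n_features, n_classes=8):
    """Build an assumed CNN-LSTM classifier over (n_frames, n_features) inputs."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_frames, n_features)),
        # Convolutional front end learns local time-frequency patterns.
        tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.Conv1D(128, kernel_size=5, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        # LSTM models longer-range temporal dynamics of the utterance.
        tf.keras.layers.LSTM(128),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```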

References

  • S. R. Livingstone and F. A. Russo, "The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English," PLOS ONE, vol. 13, no. 5, May 2018, doi: 10.1371/journal.pone.0196391.
  • B. McFee et al., "librosa: Audio and Music Signal Analysis in Python," 2015. [Online]. Available: https://www.youtube.com/watch?v=MhOdbtPhbLU