Source : Turkish-Music-Emotion | Main File : main.ipynb
Classification of verbal and non-verbal music from different genres of Turkish music into four discrete classes based on emotion, namely: happy, sad, angry, and relax.
The dataset was created by extracting intrinsic characteristics of Turkish music across various genres, such as Mel-Frequency Cepstral Coefficients (MFCCs), tempo, chromagram, and spectral and harmonic features. It consists of 400 instances and 50 features. The target feature has 4 classes: happy, sad, angry, and relax.
Link : Acoustic_features.csv
Duplicate instances add no information and can bias classification. Out of 400 instances, 12 were found to be duplicates and removed.
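A minimal sketch of this step with pandas (the file name follows the link above; the loading details are assumptions):

```python
import pandas as pd

# Load the acoustic features extracted from the Turkish music clips.
df = pd.read_csv("Acoustic_features.csv")

# Drop exact duplicate rows, keeping the first occurrence of each.
before = len(df)
df = df.drop_duplicates().reset_index(drop=True)
print(f"Removed {before - len(df)} duplicate instances")  # 12 for this dataset
```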
The classes of the target feature are encoded for model consumption and easy representation. LabelEncoder from scikit-learn is used, as there is no intrinsic ordering among the classes of the target feature.
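A sketch of the encoding, assuming the target column is named `Class`:

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
# LabelEncoder assigns integer codes in alphabetical order of the class labels.
df["Class"] = le.fit_transform(df["Class"])
print(dict(zip(le.classes_, le.transform(le.classes_))))
```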
Outliers are points far from the mean, and they distort it. In this dataset, outliers are detected and capped using the 3×standard-deviation method, as the dataset is approximately normal except for a few features (skewness between 1 and 3). After capping, some of those features moved close to a normal distribution (skewness between -1.5 and 1.5).
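A sketch of the capping step; `feature_cols` below is a hypothetical name for the numeric feature columns:

```python
# Cap every numeric feature at mean ± 3 standard deviations.
feature_cols = df.columns.drop("Class")
for col in feature_cols:
    mu, sigma = df[col].mean(), df[col].std()
    df[col] = df[col].clip(lower=mu - 3 * sigma, upper=mu + 3 * sigma)

# Inspect skewness after capping; most values should fall in [-1.5, 1.5].
print(df[feature_cols].skew().describe())
```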
Feature scaling brings all the features to one scale. I applied StandardScaler from scikit-learn, as the features were approximately normal.
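A sketch of the scaling step (reusing `feature_cols` from above):

```python
from sklearn.preprocessing import StandardScaler

# Standardize each feature to zero mean and unit variance.
scaler = StandardScaler()
df[feature_cols] = scaler.fit_transform(df[feature_cols])
```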
Feature selection is crucial to reduce the dimensionality of the dataset and to keep only the relevant features, enhancing the model's performance. Before selecting the optimal number of features, the dataset is split into train and test sets in a 70:30 ratio using train_test_split from scikit-learn, and feature selection is fitted only on the training set. The method has two phases (a sketch of the full split-and-select pipeline follows the list):
- Phase 1: Mutual Information (filter method)
This phase computes each feature's importance to the target variable and ranks the features in decreasing order. Based on trial and error, 47 features were retained according to their relevance to the target and to one another. mutual_info_classif from scikit-learn is used for this.
- Phase 2: Sequential Forward Selection (wrapper method)
This phase selects the group of features that gives the best accuracy. It starts by training and predicting against the target with a single feature, then a pair, then three, and so on, until the best-performing group is found; groups are scored by their accuracy. Of these, 23 optimal features were selected. SequentialFeatureSelector from scikit-learn and KNeighborsClassifier were used.
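A sketch of the two-phase pipeline described above; the split ratio, feature counts, and estimator come from the text, while `random_state`, stratification, and `cv=5` are assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import mutual_info_classif, SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = df[feature_cols].values, df["Class"].values

# 70:30 split; stratify so all four emotion classes appear in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

# Phase 1: rank features by mutual information with the target, keep the top 47.
mi = mutual_info_classif(X_train, y_train, random_state=42)
top_idx = np.argsort(mi)[::-1][:47]
X_train_mi, X_test_mi = X_train[:, top_idx], X_test[:, top_idx]

# Phase 2: forward selection with a KNN estimator, scored by accuracy.
sfs = SequentialFeatureSelector(
    KNeighborsClassifier(),
    n_features_to_select=23,
    direction="forward",
    scoring="accuracy",
    cv=5,
)
X_train_sel = sfs.fit_transform(X_train_mi, y_train)
X_test_sel = sfs.transform(X_test_mi)
```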
After selecting the optimal features from the training dataset, the same features are selected from the test dataset, and both are given to multiple classifiers.
In the model selection and training process, the reduced dataset from the previous step is given to multiple classifiers, including base classifiers, ensemble classifiers, and stacking classifiers. Of all the classifiers, RandomForestClassifier achieved the highest accuracy of 0.8462 and F1-score (macro) of 0.8478.
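A sketch of the winning model on the selected features; default hyper-parameters and the `random_state` are assumptions:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score

rf = RandomForestClassifier(random_state=42)
rf.fit(X_train_sel, y_train)
y_pred = rf.predict(X_test_sel)

print("Accuracy :", accuracy_score(y_test, y_pred))             # ~0.8462 reported
print("F1 macro :", f1_score(y_test, y_pred, average="macro"))  # ~0.8478 reported
```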
Results:

Confusion Matrix:
For evaluation of the model, I used accuracy and F1-score (macro) as the main metrics. Per the problem statement, all classes have equal priority, since this is music emotion classification, unlike imbalanced problems such as heart-disease or cancer datasets. Other metrics include precision (macro) and recall (macro). With macro averaging, the average value is enough to judge the model's performance rather than inspecting each class individually.
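A sketch of the evaluation, reusing `y_test`, `y_pred`, and the fitted `le` from the earlier sketches:

```python
from sklearn.metrics import confusion_matrix, classification_report

# Per-class counts, plus precision/recall/F1 with macro averages.
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred, target_names=le.classes_))
```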
Here are the results from various other classifiers:
- Training with base and ensemble classifiers on the selected features.

- Trained a Random Forest classifier with hyper-parameter tuning using RandomizedSearchCV (sketched after this list).
The Random Forest trained with hyper-parameter tuning gave a lower F1-score than the plain RF.

- Training using stacking classifiers (also sketched below).
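A sketch of the tuning and stacking runs; the search space, number of iterations, and the choice of base learners are all illustrative assumptions:

```python
from scipy.stats import randint
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Randomized search over a plausible Random Forest grid, scored by macro F1.
param_dist = {
    "n_estimators": randint(100, 500),
    "max_depth": randint(3, 20),
    "min_samples_split": randint(2, 10),
}
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_dist,
    n_iter=20,
    scoring="f1_macro",
    cv=5,
    random_state=42,
)
search.fit(X_train_sel, y_train)
print(search.best_params_, search.best_score_)

# A stacking ensemble over two base learners with a logistic-regression meta-model.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=42)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_train_sel, y_train)
print("Stacking accuracy:", stack.score(X_test_sel, y_test))
```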

For the given dataset, Acoustic_Features, based on its nature and number of features, we applied a feature-engineering pipeline of Mutual Information with Sequential Forward Selection (MI-SFS) and passed the selected features to RandomForestClassifier, which performed best among all classifiers with an accuracy of 84.62% and an F1-score (macro) of 84.78%.
- https://www.kaggle.com/code/nkitgupta/evaluation-metrics-for-multi-class-classification
- https://www.mdpi.com/2079-9292/12/10/2290#sec3-electronics-12-02290
- https://www.evidentlyai.com/classification-metrics/multi-class-metrics
- https://scikit-learn.org/stable/modules/preprocessing.html
- https://medium.com/@vinodkumargr/07-standardization-and-normalization-techniques-in-machine-learning-standardscaler-3890a89bddbf