The Use of Classical Classification to Distinguish Between the 16 MBTI Types Given Text Vectorized with CBOW and BERT Models vs. Classification Using an LSTM Model
- LSTM Model
- BERT Model
- CBOW Model
- MBTI Classification
Our primary objective was to develop a classification framework for the Myers-Briggs Type Indicator (MBTI) based on social media posts. We adopted a dual-pronged approach. First, we employed a Long Short-Term Memory (LSTM) neural network to categorize vectorized text into one of the 16 MBTI types. Second, we broke the 16 MBTI categories down into the four binary dimensions that define each personality type (I/E, N/S, T/F, and J/P) and applied traditional classification techniques to the text vectors produced by two Natural Language Processing (NLP) models, BERT and CBOW. In this second phase we evaluated six distinct classifiers, which allowed a thorough comparative analysis of the vectors produced by the BERT and CBOW models. Our findings contribute to a nuanced understanding of the effectiveness of different NLP models for MBTI classification, paving the way for enhanced accuracy and insight into personality prediction from social media content.
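To make the dimension-wise setup concrete, here is a minimal sketch of the second approach, assuming scikit-learn; the feature matrix `X` stands in for either the BERT or the CBOW vectors, and all function and variable names are illustrative rather than taken from our actual codebase.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# A 4-letter MBTI label encodes the four binary dimensions positionally,
# e.g. "INTJ" -> I/E = I, N/S = N, T/F = T, J/P = J.
def type_to_axes(mbti_type):
    """Split one 16-type label into four binary targets (1 = first letter)."""
    return [int(mbti_type[0] == "I"),
            int(mbti_type[1] == "N"),
            int(mbti_type[2] == "T"),
            int(mbti_type[3] == "J")]

def evaluate_per_dimension(X, types, models):
    """Fit every model on each binary axis; return per-axis accuracies + AVG."""
    Y = np.array([type_to_axes(t) for t in types])
    results = {}
    for name, model in models.items():
        row = []
        for d in range(4):  # one binary problem per MBTI dimension
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, Y[:, d], test_size=0.2, random_state=42)
            model.fit(X_tr, y_tr)
            row.append(accuracy_score(y_te, model.predict(X_te)))
        results[name] = row + [float(np.mean(row))]  # append the AVG column
    return results

# XGBClassifier and CatBoostClassifier plug in the same way from the
# xgboost and catboost packages.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
    "SGD Classifier": SGDClassifier(),
    "Random Forest": RandomForestClassifier(),
}
```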
The two tables below report the per-dimension accuracy of the six classical classifiers on the two sets of text vectors.

| Model | I/E | N/S | T/F | J/P | AVG |
| --- | --- | --- | --- | --- | --- |
| Logistic Regression | 0.7200 | 0.7570 | 0.8315 | 0.6724 | 0.7452 |
| SVC | 0.7342 | 0.7834 | 0.8424 | 0.6858 | 0.7614 |
| SGD Classifier | 0.6910 | 0.7640 | 0.8199 | 0.6538 | 0.7321 |
| Random Forest | 0.6819 | 0.7347 | 0.8007 | 0.6512 | 0.7171 |
| XGBoost | 0.7104 | 0.7672 | 0.8192 | 0.6664 | 0.7408 |
| CatBoost | 0.7297 | 0.7860 | 0.8370 | 0.6852 | 0.7594 |

| Model | I/E | N/S | T/F | J/P | AVG |
| --- | --- | --- | --- | --- | --- |
| Logistic Regression | 0.8006 | 0.7896 | 0.7311 | 0.7311 | 0.7878 |
| SVC | 0.7375 | 0.7682 | 0.8108 | 0.7032 | 0.7549 |
| SGD Classifier | 0.7801 | 0.7878 | 0.8252 | 0.7196 | 0.7781 |
| Random Forest | 0.7496 | 0.7124 | 0.7690 | 0.6856 | 0.7291 |
| XGBoost | 0.7767 | 0.7434 | 0.7933 | 0.6995 | 0.7532 |
| CatBoost | 0.7838 | 0.7635 | 0.8064 | 0.7128 | 0.7666 |
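The classifiers above operate on two kinds of text vectors. As a rough sketch of how such vectors can be produced, assuming gensim for the CBOW side and the stock `bert-base-uncased` checkpoint from Hugging Face `transformers` for the BERT side (our pipeline's exact checkpoints and pooling may differ):

```python
import numpy as np
import torch
from gensim.models import Word2Vec
from transformers import AutoModel, AutoTokenizer

def cbow_vectors(token_lists, dim=300):
    """Train a CBOW Word2Vec model (sg=0), then average word vectors per post."""
    w2v = Word2Vec(sentences=token_lists, vector_size=dim, sg=0, min_count=1)
    return np.stack([
        np.mean([w2v.wv[t] for t in toks], axis=0) if toks else np.zeros(dim)
        for toks in token_lists
    ])

def bert_vectors(texts, model_name="bert-base-uncased"):
    """Encode each post with BERT, using the [CLS] hidden state as its vector."""
    tok = AutoTokenizer.from_pretrained(model_name)
    bert = AutoModel.from_pretrained(model_name)
    bert.eval()
    vecs = []
    with torch.no_grad():
        for text in texts:
            enc = tok(text, truncation=True, max_length=512,
                      return_tensors="pt")
            vecs.append(bert(**enc).last_hidden_state[0, 0].numpy())
    return np.stack(vecs)
```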
We trained the LSTM on the full 16-type task and achieved an accuracy of 73.02% with a loss of 1.1323.
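For reference, here is a minimal Keras sketch of a 16-way LSTM classifier of the kind described; the vocabulary size, embedding width, and layer sizes are placeholders, not the trained model's actual hyperparameters.

```python
import tensorflow as tf

VOCAB_SIZE = 20_000  # placeholder vocabulary size
EMBED_DIM = 128      # placeholder embedding width

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(16, activation="softmax"),  # one unit per MBTI type
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # integer type labels
              metrics=["accuracy"])
```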
Authors: Alhossien Waly, Ali Ibrahim