Code and supplementary materials for our paper "DeepHateExplainer: Explainable Hate Speech Detection in Under-resourced Bengali Language" (Proceeding of IEEE International Conference on Data Science and Advanced Analytics~(DSAA'2021), October 6-9, 2021, Porto, Portugal). This repo will be updated with more reproducable resources, e.g., models, notebooks, etc.
In this paper, we propose an explainable approach for hate speech detection from under-resourced Bengali language, which we called DeepHateLingo. In our approach, Bengali texts are first comprehensively preprocessed, before classifying them into political, personal, geopolitical, and religious hates with a neural ensemble method of transformer architectures (i.e., Bangla BERT-base, multilingual BERT, and XLM-RoBERTa). Important (most and least) terms are then identified using sensitivity analysis and layer-wise relevance propagation (LRP), before providing human-interpretable explanations of individual predictions. Besides, we compute comprehensiveness and sufficiency scores to measure quality of explanations w.r.t faithfulness.
./input/new_data --> Contains the dataset related files.
./preprocess --> Contains the codes for preprocessing the dataset.
./results --> Contain the evaluation result based on test dataset.
./notebooks --> Contains some examples covering interpretability and ensemble prediction.
./training-script --> Contains the codes for all transformer-based classifiers.
We follow a bootstrap approach for data collection, where specific types of texts containing common slurs and terms, either directed towards a specific person or entity or generalized towards a group, are only considered. Texts were collected from Facebook, YouTube comments, and newspapers. We categorize the samples into political, personal, geopolitical, and religious hate. Three annotators (a linguist, a native Bengali speaker, and an NLP researcher) participated in the annotation process.
To reduce possible bias, unbiased contents are supplied to the annotators and each label was assigned based on a majority voting on the annotator's independent opinions. To evaluate the quality of the annotations and to ensure the decision based on the criteria of the objective, we measure inter-annotator agreement w.r.t Cohen's Kappa statistic. Overall, our Bengali Hate Speech Dataset are available as CSV files and categorized observations into political, personal, geopolitical, and religious. Examples are distributed across following hate types:
Hate type | # of examples |
---|---|
Personal | 3513 |
Geopolitical | 2364 |
Religious | 1211 |
Political | 999 |
Total | 8087 |
Then, we split the samples into train (5,887), validation (1,000), test (1,200) sets.
First clone this repo and move to the directory. Then, install necessary libraries. Also, following commands can be used:
$ git clone https://github.com/AwesomeDeepAI/DeepHateLingo.git
$ cd DeepHateLingo/
$ sudo pip3 install -r requirements.txt
For training BERT variants, BERT and
XLMRoberta like tokenizations are used. In the params
, make sure to set pretrained_model_name
used BERT variants and XLMRoberta based models in Huggingface.
python3 ./training-script/bangla-bert-base/train.py --bert_hidden 768 \
--epochs 6 \
--pretrained_model_name "sagorsarker/bangla-bert-base" \
--learning_rate 3e-5 \
--max_len 128 \
--dropout 0.3 \
--train_batch_size 16 \
--valid_batch_size 32 \
--model_layer "last_two" \
--model_specification "bangla-bert-training" \
--output "six_bangla_bert_last_two_pred_3e5.csv"
python3 ./training-script/bangla-bert-base/train.py --bert_hidden 768 \
--epochs 6 \
--pretrained_model_name "bert-base-multilingual-cased" \
--learning_rate 2e-5 \
--max_len 128 \
--dropout 0.3 \
--train_batch_size 16 \
--valid_batch_size 32 \
--model_layer "last_four" \
--model_specification "bert-base-multilingual-cased-training" \
--output "six_bert_base_multilingual_cased_last_four_pred_2e5.csv"
python3 ./training-script/bangla-bert-base/train.py --bert_hidden 768 \
--epochs 6 \
--pretrained_model_name "bert-base-multilingual-uncased" \
--learning_rate 5e-5 \
--max_len 128 \
--dropout 0.3 \
--train_batch_size 16 \
--valid_batch_size 32 \
--model_layer "last_two" \
--model_specification "bert-base-multilingual-uncased-training" \
--output "six_bert_base_multilingual_uncased_last_two_pred_5e5.csv"
python3 ./training-script/xlm-roberta/train.py --roberta_hidden 1024 \
--epochs 5 \
--pretrained_model_name "xlm-roberta-large" \
--learning_rate 2e-5 \
--max_len 128 \
--train_batch_size 16 \
--valid_batch_size 32 \
--model_layer "pool_output" \
--model_specification "xlm-roberta-large-training" \
--output "five_xlm_roberta_pool_pred_2e5.csv"
The default parametrs provided in the shell scripts can also be used. Networks will be trained on CPU by default, if GPU is not available. For training baseline DNN models such as LSTM, please refer to corresponding notebook. This requires pretrained FastText embedding model, which can be downloaded from here. Then, make sure to unzip it and make sure to set the path properly. Please note that a smaller version is provived here as the size of a FastText embedding model with dimension 300 is huge. However, we plan to provide all the pretrained and fine-tuned language models in coming weeks.
For standalone models, plase refer to 'result' folder. The following result is based on Ensemble prediction of RoBERTa, Bangla-BERT-base, and mBERT-uncased. Please note the following class encoding to interprete the class-specific classification reports:
- Personal hate: class 0
- Political hate: class 1
- Religius hate: class 2
- Geopolitical hate: class 3
Accuracy:: 0.8766666666666667
Precision:: 0.8775601042666743
Recall:: 0.8766666666666667
F1-score:: 0.8763746343903924
MCC:: 0.8205216999232595
---------------------------------------------------------
Class-wise classification report::
precision recall f1-score support
0 0.91 0.90 0.91 524
1 0.82 0.74 0.78 157
2 0.79 0.90 0.84 159
3 0.89 0.89 0.89 360
accuracy 0.88 1200
macro avg 0.85 0.86 0.85 1200
weighted avg 0.88 0.88 0.88 1200
----------------------------------------------------------
The following example is based on interpret_text library. Please refer to corresponding notebook for more detail.
The following example is based on LRP, SA, GI. Please refer to corresponding notebook for more detail.
If you use the code of this repository in your research, please consider citing the folowing papers:
Md. Rezaul Karim, Sumon Kanti Dey, Tanhim Islam, Sagor Sarker, Mehadi Hasan Menon, Kabir Hossain, Md. Azam Hossain, Stefan Decker
@inproceedings{DeepHateExplainer,
title={DeepHateExplainer: Explainable Hate Speech Detection in Under-resourced Bengali Language},
author={Md. Rezaul Karim, Sumon Kanti Dey, Tanhim Islam, Sagor Sarker, Mehadi Hasan Menon, Kabir Hossain, Md. Azam Hossain, Stefan Decker},
conference={Proceeding of IEEE International Conference on Data Science and Advanced Analytics)},
year={2021}
}
For any questions, feel free to open an issue or contact at rezaul.karim@rwth-aachen.de