Offensive Span Identification in Tamil @RANLP-2023

Offensive Language Detection in Dravidian languages (Tamil)

Faculty	Slot	Course	Course Code
Dr. Ratnavel Rajalakshmi	L33+L34 (G1 Slot)	Essentials of Data Analytics	CSE3506

Name	Register Number	Branch
Hariket Sukesh Kumar Sheth (Team Leader)	20BCE1975	CSE Core
Manasvi Maheshwari	20BAI1032	CSE AI & ML
Suraj Shah	20BRS1122	CSE Robotics

This repository contains all of the work completed for the Offensive Language Identification tasks that RANLP 2023 organised on CodaLab. To run these programs, you need the following packages:

pytorch (pip package: torch)
transformers
sadice
seaborn
sklearn (pip package: scikit-learn)
matplotlib
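
Before running the notebooks, it can help to confirm that the packages above are importable. The snippet below is a minimal sketch using only the standard library. The import names (torch for pytorch, sklearn for scikit-learn, and so on) and the helper name missing_modules are assumptions made for illustration, not something taken from the repository's scripts.

```python
from importlib.util import find_spec

# Assumed import names for the packages listed above.
REQUIRED_MODULES = ["torch", "transformers", "sadice", "seaborn", "sklearn", "matplotlib"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Any name reported as missing can then be installed with pip before opening the notebooks.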

The pretrained transformers multilingual BERT (mBERT), IndicBERT, and XLM-RoBERTa (XLMR) were used for the offensive language identification task. In addition to the original pretrained models, we used customised versions of them. These were created by freezing the base layers and adding a fully connected (fc) layer on top, trained with one of two custom loss functions: NLL loss (nll_loss) with class weights, or the self-adjusting Dice loss (sadice).
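
To make the two custom losses more concrete, here is a small pure-Python sketch that evaluates them on toy logits. It is not the repository's training code. It assumes that class weights follow the common "balanced" heuristic (n_samples / (n_classes * class_count)), that the weighted NLL loss is averaged by the total weight as PyTorch's nll_loss does with mean reduction, and that the self-adjusting Dice loss follows the formulation usually given for it, with hypothetical defaults alpha=1 and gamma=1. All function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def balanced_class_weights(labels, num_classes=2):
    """Assumed heuristic: weight_c = n_samples / (n_classes * count_c)."""
    counts = [labels.count(c) for c in range(num_classes)]
    return [len(labels) / (num_classes * count) for count in counts]


def weighted_nll_loss(batch_logits, targets, class_weights):
    """Class-weighted negative log-likelihood, normalised by the total weight."""
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, y in zip(batch_logits, targets):
        p_true = softmax(logits)[y]
        weighted_sum += class_weights[y] * -math.log(p_true)
        weight_total += class_weights[y]
    return weighted_sum / weight_total


def self_adj_dice_loss(batch_logits, targets, alpha=1.0, gamma=1.0):
    """Self-adjusting Dice loss on the true-class probability, averaged over the batch."""
    losses = []
    for logits, y in zip(batch_logits, targets):
        p = softmax(logits)[y]
        # (1 - p)^alpha down-weights examples the model already classifies confidently.
        p_adj = ((1.0 - p) ** alpha) * p
        losses.append(1.0 - (2.0 * p_adj + gamma) / (p_adj + 1.0 + gamma))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Toy, imbalanced batch: three non-offensive (0) examples and one offensive (1).
    targets = [0, 0, 0, 1]
    batch_logits = [[2.0, 0.0], [1.0, 0.5], [0.5, 0.0], [0.0, 1.0]]
    weights = balanced_class_weights(targets)
    print("class weights:", weights)
    print("weighted NLL:", weighted_nll_loss(batch_logits, targets, weights))
    print("self-adjusting Dice:", self_adj_dice_loss(batch_logits, targets))
```

In the customised models, a loss of this kind would be applied to the output of the fc layer while the frozen base layers receive no gradient updates.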

To reproduce the results, clone this repository, set your dataset path in the training scripts, and run them.

Our results for the Offensive Language Identification Task

Table: Results on Offensive Language Development Dataset

Model Name	Accuracy
mBERT Cased	0.76
XLMR	0.76
IndicBERT	0.74
XLMR with NLL Loss and Class Weights	0.64
XLMR with Sadice Loss	0.61
mBERT with Sadice Loss	0.61
mBERT with NLL Loss and Class Weights	0.58

Table: Results on Offensive Language Test Dataset

Model Name	Accuracy
mBERT Cased	0.75
XLMR	0.75
IndicBERT	0.73
XLMR with NLL Loss and Class Weights	0.64
XLMR with Sadice Loss	0.61
mBERT with Sadice Loss	0.61
mBERT with NLL Loss and Class Weights	0.59
