This project covers training a transformer-based language model from scratch for named entity recognition (NER). The goal was to automate the annotation of data describing machine learning (ML) algorithms, mainly in the form of papers. We used the SciERC dataset (Luan et al., 2018) as training/validation data.
We chose three key entity types to identify in the papers:
- Task (e.g., node clustering)
- Method (e.g., transformers)
- Material (e.g., IAM dataset)
The main files related to model training are main.py, models.py and utils.py. Exploratory data analysis, experimentation and preprocessing are done in notebooks/. A set of 50 ML papers taken from www.arxiv.org was manually annotated; the annotations are available in ml_sample_50_annotations.txt.
The model was built using the Keras library. The architecture consists of an Embedding layer, a Transformer block, and two pairs of Fully Connected and Dropout layers, for a total of 133,736 trainable parameters. The Transformer block contains a Multi-Head Attention layer followed by a Fully Connected layer, Normalization, and Dropout layers. Multi-head attention allows the model to jointly attend to information from different representation subspaces.
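The sketch below illustrates this architecture with the Keras functional API. The hyperparameters (vocabulary size, sequence length, embedding size, number of heads, feed-forward width, number of tags) are illustrative assumptions, not the exact values used in models.py.

```python
# Minimal sketch of the described architecture; hyperparameter values are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

def transformer_block(x, embed_dim=64, num_heads=2, ff_dim=64, rate=0.1):
    # Multi-head self-attention followed by a position-wise feed-forward sub-layer,
    # each with a residual connection, layer normalization and dropout.
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)(x, x)
    attn = layers.Dropout(rate)(attn)
    x = layers.LayerNormalization(epsilon=1e-6)(x + attn)
    ffn = layers.Dense(ff_dim, activation="relu")(x)
    ffn = layers.Dense(embed_dim)(ffn)
    ffn = layers.Dropout(rate)(ffn)
    return layers.LayerNormalization(epsilon=1e-6)(x + ffn)

def build_model(vocab_size=20000, maxlen=128, embed_dim=64, num_tags=7):
    # num_tags assumes a BIO tagging scheme over the three entity types plus "O".
    inputs = layers.Input(shape=(maxlen,), dtype="int32")
    x = layers.Embedding(vocab_size, embed_dim)(inputs)         # token embeddings
    x = transformer_block(x, embed_dim=embed_dim)               # Transformer block
    x = layers.Dense(64, activation="relu")(x)                  # FC + Dropout pair 1
    x = layers.Dropout(0.1)(x)
    x = layers.Dense(32, activation="relu")(x)                  # FC + Dropout pair 2
    x = layers.Dropout(0.1)(x)
    outputs = layers.Dense(num_tags, activation="softmax")(x)   # per-token tag distribution
    return tf.keras.Model(inputs, outputs)
```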
To train the model we used the sparse categorical cross-entropy loss, which is commonly used for multi-class classification problems where the labels are integer class indices and the model outputs a probability distribution over the class labels. The SciERC corpus was split into 90% training data (450 paper abstracts) and 10% validation data (50 paper abstracts). We used the Adam optimization algorithm and trained the model for 100 epochs.
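A minimal sketch of this training setup is shown below. The optimizer, loss and epoch count follow the description above; the batch size and the padded arrays x_train/y_train/x_val/y_val are assumed placeholders for the preprocessed SciERC splits.

```python
# Sketch of the training setup; x_train/y_train/x_val/y_val and batch_size are assumptions.
model = build_model()
model.compile(
    optimizer="adam",                        # Adam optimization algorithm
    loss="sparse_categorical_crossentropy",  # integer tag ids vs. softmax outputs
    metrics=["accuracy"],
)
history = model.fit(
    x_train, y_train,                        # 450 training abstracts (padded token/tag sequences)
    validation_data=(x_val, y_val),          # 50 held-out validation abstracts
    epochs=100,
    batch_size=32,
)
```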
The development process is documented in more detail in Chapter 7 of my MSc thesis (p. 81).