Augmentation of Lovelace, et al. (2021) Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network


Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network [Augmented]

Tejas Gupta, Ioana Marinescu, Levi Blinder

This repository contains the implementation and modifications for the paper (link):

Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network
Justin Lovelace, Denis Newman-Griffis, Shikhar Vashishth, Jill Fain Lehman, and Carolyn Penstein Rosé
Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing (ACL-IJCNLP) 2021

Modifications

  • To improve textual entity generation, we modified the existing transformer to use Phrase-BERT.
  • To use recent developments in ranking models for information retrieval, we implemented additional pointwise, pairwise, and listwise loss functions.

See paper in repository for more details and results.
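
The three loss families mentioned above can be sketched in PyTorch. This is an illustrative sketch only, not the repository's actual implementation: the function names, margin value, and score shapes here are our own assumptions.

```python
import torch
import torch.nn.functional as F


def pointwise_loss(scores: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Pointwise: score each (query, candidate) pair independently
    and apply binary cross-entropy against 0/1 relevance labels."""
    return F.binary_cross_entropy_with_logits(scores, labels)


def pairwise_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor,
                  margin: float = 1.0) -> torch.Tensor:
    """Pairwise: push each positive candidate's score above a negative's
    by at least `margin`, via the standard margin ranking loss."""
    target = torch.ones_like(pos_scores)  # +1 => pos should rank above neg
    return F.margin_ranking_loss(pos_scores, neg_scores, target, margin=margin)


def listwise_loss(scores: torch.Tensor, target_idx: torch.Tensor) -> torch.Tensor:
    """Listwise: softmax over the whole candidate list per query
    (ListNet-style), treating the true entity's index as the class label."""
    return F.cross_entropy(scores, target_idx)
```

Pointwise losses treat candidates independently, pairwise losses compare positive/negative pairs, and listwise losses normalize over the full candidate list per query; the paper in this repository reports how these choices compare.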

Dependencies

Our work was performed with Python 3.8. The dependencies can be installed from requirements.txt.

Data Preparation

  • We conduct our work upon the existing FB15K-237 and CN100K datasets. We additionally developed the FB15K-237-Sparse and SNOMED-CT Core datasets for our work.
  • Running ./scripts/prepare_datasets.sh will unzip the dataset files and process them for use by our models.
  • Because the SNOMED-CT Core dataset was derived from the UMLS, we cannot directly release the dataset files. See here for full instructions for how to recreate the dataset.
  • The BERT embeddings can be downloaded from here. The bert_emb.pt files should be stored in the corresponding dataset directories, e.g. data/CN100K/bert_emb.pt
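
Assuming one directory per dataset (only data/CN100K/bert_emb.pt is stated explicitly above; the other directory names are our guess from the dataset identifiers used by the evaluation scripts), the resulting layout would look like:

```
data/
├── CN100K/
│   └── bert_emb.pt
├── FB15K_237/
│   └── bert_emb.pt
├── FB15K_237_SPARSE/
│   └── bert_emb.pt
└── SNOMED_CT_CORE/
    └── bert_emb.pt
```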

Training Ranking Models

We provide scripts to train our proposed ranking model, denoted as BERT-ResNet in our paper, for all four datasets.

  • FB15K-237: ./scripts/train_resnet_fb15k237.sh
  • FB15K-237-Sparse: ./scripts/train_resnet_fb15k237_sparse.sh
  • CN100K: ./scripts/train_resnet_cn100k.sh
  • SNOMED-CT Core: ./scripts/train_resnet_snomed.sh

Training Re-Ranking Models

The re-ranking models can only be trained after the ranking model for the corresponding dataset has already finished training. First, download the BERT checkpoints used for our training from here. They should be unzipped and stored in reranking/bert_ckpts. A re-ranking model can then be trained with the provided scripts similarly to above.

  • FB15K-237: ./scripts/train_reranking_fb15k237.sh
  • FB15K-237-Sparse: ./scripts/train_reranking_fb15k237_sparse.sh
  • CN100K: ./scripts/train_reranking_cn100k.sh
  • SNOMED-CT Core: ./scripts/train_reranking_snomed.sh

Evaluating Pretrained Ranking Models

Pretrained ranking models can be downloaded from here. After unzipping them in the robust-kg-completion/pretrained_models directory, they can be evaluated by running ./scripts/eval_pretrained_ranking_model.sh {DATASET} where {DATASET} is one of SNOMED_CT_CORE, FB15K_237, FB15K_237_SPARSE, or CN100K.

Evaluating Pretrained Re-Ranking Models

Pretrained re-ranking models can be downloaded from here. After unzipping them in the robust-kg-completion/reranking/pretrained_reranking_models directory, they can be evaluated by running the following commands.

  • FB15K-237: ./scripts/eval_pretrained_reranking_model.sh FB15K_237 0.75
  • FB15K-237-Sparse: ./scripts/eval_pretrained_reranking_model.sh FB15K_237_SPARSE 0.75
  • CN100K: ./scripts/eval_pretrained_reranking_model.sh CN100K 1.0
  • SNOMED-CT Core: ./scripts/eval_pretrained_reranking_model.sh SNOMED_CT_CORE 0.5

Citation

@inproceedings{lovelace-etal-2021-robust,
  title = {Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network},
  author = {Justin Lovelace and Denis Newman-Griffis and Shikhar Vashishth and Jill Fain Lehman and Carolyn Penstein Rosé},
  booktitle = {Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP)},
  month = {August},
  year = {2021},
  eprint = {2106.06555},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG}
}
