Augmentation of Lovelace, et al. (2021) Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network


Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network [Augmented]

Tejas Gupta, Ioana Marinescu, Levi Blinder

This repository contains the implementation and modifications for the paper (link):

Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network
Justin Lovelace, Denis Newman-Griffis, Shikhar Vashishth, Jill Fain Lehman, and Carolyn Penstein Rosé
Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing (ACL-IJCNLP) 2021

Modifications

  • To improve textual entity generation, we modified the existing transformer to use Phrase-BERT.
  • To use recent developments in ranking models for information retrieval, we implemented additional pointwise, pairwise, and listwise loss functions.

See paper in repository for more details and results.
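
The three loss families mentioned above can be sketched in PyTorch. This is an illustrative sketch only, not the repository's actual implementation: the function names, margin value, and score shapes here are our own assumptions.

```python
import torch
import torch.nn.functional as F


def pointwise_loss(scores: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Pointwise: score each (query, candidate) pair independently
    and apply binary cross-entropy against 0/1 relevance labels."""
    return F.binary_cross_entropy_with_logits(scores, labels)


def pairwise_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor,
                  margin: float = 1.0) -> torch.Tensor:
    """Pairwise: push each positive candidate's score above a negative's
    by at least `margin`, via the standard margin ranking loss."""
    target = torch.ones_like(pos_scores)  # +1 => pos should rank above neg
    return F.margin_ranking_loss(pos_scores, neg_scores, target, margin=margin)


def listwise_loss(scores: torch.Tensor, target_idx: torch.Tensor) -> torch.Tensor:
    """Listwise: softmax over the whole candidate list per query
    (ListNet-style), treating the true entity's index as the class label."""
    return F.cross_entropy(scores, target_idx)
```

Pointwise losses treat candidates independently, pairwise losses compare positive/negative pairs, and listwise losses normalize over the full candidate list per query; the paper in this repository reports how these choices compare.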

Dependencies

Our work was performed with Python 3.8. The dependencies can be installed from requirements.txt.

Data Preparation

  • We conduct our work upon the existing FB15K-237 and CN100K datasets. We additionally developed the FB15K-237-Sparse and SNOMED-CT Core datasets for our work.
  • Running ./scripts/prepare_datasets.sh will unzip the dataset files and process them for use by our models.
  • Because the SNOMED-CT Core dataset was derived from the UMLS, we cannot directly release the dataset files. See here for full instructions for how to recreate the dataset.
  • The BERT embeddings can be downloaded from here. The bert_emb.pt files should be stored in the corresponding dataset directories, e.g. data/CN100K/bert_emb.pt
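
Assuming one directory per dataset (only data/CN100K/bert_emb.pt is stated explicitly above; the other directory names are our guess from the dataset identifiers used by the evaluation scripts), the resulting layout would look like:

```
data/
├── CN100K/
│   └── bert_emb.pt
├── FB15K_237/
│   └── bert_emb.pt
├── FB15K_237_SPARSE/
│   └── bert_emb.pt
└── SNOMED_CT_CORE/
    └── bert_emb.pt
```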

Training Ranking Models

We provide scripts to train our proposed ranking model, denoted as BERT-ResNet in our paper, for all four datasets.

  • FB15K-237: ./scripts/train_resnet_fb15k237.sh
  • FB15K-237-Sparse: ./scripts/train_resnet_fb15k237_sparse.sh
  • CN100K: ./scripts/train_resnet_cn100k.sh
  • SNOMED-CT Core: ./scripts/train_resnet_snomed.sh

Training Re-Ranking Models

The re-ranking models can only be trained after the ranking model for the corresponding dataset has already finished training. First, download the BERT checkpoints used for our training from here. They should be unzipped and stored in reranking/bert_ckpts. A re-ranking model can then be trained with the provided scripts similarly to above.

  • FB15K-237: ./scripts/train_reranking_fb15k237.sh
  • FB15K-237-Sparse: ./scripts/train_reranking_fb15k237_sparse.sh
  • CN100K: ./scripts/train_reranking_cn100k.sh
  • SNOMED-CT Core: ./scripts/train_reranking_snomed.sh

Evaluating Pretrained Ranking Models

Pretrained ranking models can be downloaded from here. After unzipping them in the robust-kg-completion/pretrained_models directory, they can be evaluated by running ./scripts/eval_pretrained_ranking_model.sh {DATASET} where {DATASET} is one of SNOMED_CT_CORE, FB15K_237, FB15K_237_SPARSE, or CN100K.

Evaluating Pretrained Re-Ranking Models

Pretrained re-ranking models can be downloaded from here. After unzipping them in the robust-kg-completion/reranking/pretrained_reranking_models directory, they can be evaluated by running the following commands.

  • FB15K-237: ./scripts/eval_pretrained_reranking_model.sh FB15K_237 0.75
  • FB15K-237-Sparse: ./scripts/eval_pretrained_reranking_model.sh FB15K_237_SPARSE 0.75
  • CN100K: ./scripts/eval_pretrained_reranking_model.sh CN100K 1.0
  • SNOMED-CT Core: ./scripts/eval_pretrained_reranking_model.sh SNOMED_CT_CORE 0.5

Citation

@inproceedings{lovelace-etal-2021-robust,
  title = {Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network},
  author = {Justin Lovelace and Denis Newman-Griffis and Shikhar Vashishth and Jill Fain Lehman and Carolyn Penstein Rosé},
  booktitle = {Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP)},
  month = {August},
  year = {2021},
  eprint = {2106.06555},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG}
}
