# CoRT TensorFlow Models
CoRT models were trained on the Rhetorical Tagging Dataset for 8,000 pre-training steps on an A100 GPU and 1 fine-tuning epoch on a V100 GPU.
For the models to run properly, the exact hyperparameter setups below are required.
Model | Macro F1-score | Accuracy | `model_name` | `repr_size` | `repr_classifier` | `repr_act` | `concat_hidden_states`
---|---|---|---|---|---|---|---
CoRT-KorSciBERT | 90.42 | 90.25 | korscibert | 1,024 | seq_cls | tanh | 2
CoRT-RoBERTa | 90.50 | 90.17 | klue/roberta-base | 1,024 | bi_lstm | tanh | 2
## Usage Examples

### Inference
There are two modes for inference: Inference mode, which produces results for the whole dataset at once, and Interactive mode, which lets you inspect results one by one.
For example, the following command runs inference mode:
```shell
# Add --interactive=True to the flags below to activate interactive mode instead
python run_inference.py \
  --checkpoint_path=./CoRT-KorSciBERT/ckpt-0 \
  --model_name=korscibert \
  --tfrecord_path=./data/tfrecords/{model_name}/valid.fold-1-of-10.tfrecord \
  --concat_hidden_states=2 \
  --repr_act=tanh \
  --repr_classifier=seq_cls \
  --repr_size=1024 \
  --batch_size=32
```
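Interactive mode uses the same flags. A sketch for the CoRT-RoBERTa model, taking its hyperparameters from the table above; the checkpoint path `./CoRT-RoBERTa/ckpt-0` is an assumed example, not a path the repository guarantees:

```shell
# Interactive mode: inspect predictions one by one
# Hyperparameters follow the CoRT-RoBERTa row of the table above;
# adjust --checkpoint_path to wherever your checkpoint actually lives.
python run_inference.py \
  --interactive=True \
  --checkpoint_path=./CoRT-RoBERTa/ckpt-0 \
  --model_name=klue/roberta-base \
  --tfrecord_path=./data/tfrecords/{model_name}/valid.fold-1-of-10.tfrecord \
  --concat_hidden_states=2 \
  --repr_act=tanh \
  --repr_classifier=bi_lstm \
  --repr_size=1024 \
  --batch_size=32
```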