Training data and scripts used for wmt22-cometkiwi-da #217
Comments
To train, your configs should be something like this:

```yaml
unified_metric:
  class_path: comet.models.UnifiedMetric
  init_args:
    nr_frozen_epochs: 0.3
    keep_embeddings_frozen: True
    optimizer: AdamW
    encoder_learning_rate: 1.0e-06
    learning_rate: 1.5e-05
    layerwise_decay: 0.95
    encoder_model: XLM-RoBERTa
    pretrained_model: microsoft/infoxlm-large
    sent_layer: mix
    layer_transformation: sparsemax
    word_layer: 24
    loss: mse
    dropout: 0.1
    batch_size: 16
    train_data:
      - TRAIN_DATA.csv
    validation_data:
      - VALIDATION_DATA.csv
    hidden_sizes:
      - 3072
      - 1024
    activations: Tanh
    input_segments:
      - mt
      - src
    word_level_training: False

trainer: ../trainer.yaml
early_stopping: ../early_stopping.yaml
model_checkpoint: ../model_checkpoint.yaml
```
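The `train_data` and `validation_data` entries above point to CSV files. The thread does not spell out the exact column layout, so the sketch below is only an illustration of preparing such a file, assuming the usual reference-free quality-estimation columns `src`, `mt`, and `score` (an assumption, not confirmed here; check the COMET repository before training):

```python
# Hypothetical sketch of building TRAIN_DATA.csv for a reference-free (QE) setup.
# Assumes one row per segment with "src", "mt" and a numeric "score" column;
# verify the expected column names against the COMET repository.
import pandas as pd

rows = [
    {"src": "Das ist ein Test.", "mt": "This is a test.", "score": 0.95},
    {"src": "Guten Morgen!", "mt": "Good morning!", "score": 0.90},
]

pd.DataFrame(rows).to_csv("TRAIN_DATA.csv", index=False)
```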
Hi @ricardorei, thanks for the update. Can I use the same training parameters mentioned in the master branch trainer.yaml file?
Hmm, maybe you should change them a bit. For example, to train on a single GPU (which is usually faster) and with precision 16, use this:

```yaml
accelerator: gpu
devices: 1
# strategy: ddp # Comment this line for distributed training
precision: 16
```

You might also want to consider reducing `accumulate_grad_batches: 2`.
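For intuition, here is a minimal sketch of what these settings correspond to, assuming trainer.yaml is consumed as init arguments for PyTorch Lightning's `Trainer` (which is how COMET's training CLI uses it); the values simply mirror the single-GPU suggestion above:

```python
# Minimal sketch, assuming trainer.yaml maps onto pytorch_lightning.Trainer
# arguments; values mirror the single-GPU suggestion above.
import pytorch_lightning as pl

trainer = pl.Trainer(
    accelerator="gpu",
    devices=1,
    precision=16,               # mixed-precision training
    accumulate_grad_batches=2,  # consider lowering this, as noted above
)
# trainer.fit(model) would then run training with these settings.
```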
What format should the data be in?
Hi Team,
Can you share the training data and training scripts used for wmt22-cometkiwi-da? We want to use them as a reference for training with our own sample reference data.