[QUESTION] Train Your Own Metric #176

ecroxford · 2023-10-30T19:29:08Z

What is your question?

I am trying to train my own metric, but I keep running into some version of the same problem (sometimes with a different validation key). I do not know whether this is a bug, a dependency issue, or something else. I followed the instructions exactly as stated, in a brand-new virtual environment created only for this. My input YAML file is shown below. No other files were changed.

Code

Yaml file:
regression_metric:
  class_path: comet.models.RegressionMetric
  init_args:
    nr_frozen_epochs: 0.3
    keep_embeddings_frozen: True
    optimizer: AdamW
    encoder_learning_rate: 1.0e-06
    learning_rate: 1.5e-05
    layerwise_decay: 0.95
    encoder_model: XLM-RoBERTa
    pretrained_model: xlm-roberta-large
    pool: avg
    layer: mix
    layer_transformation: sparsemax
    layer_norm: False
    loss: mse
    dropout: 0.1
    batch_size: 16
    train_data:
      - data/train_mimic.csv
    validation_data:
      - data/validate_mimic.csv
    hidden_sizes:
      - 2048
      - 1024
    activations: Tanh

trainer: ../trainer.yaml
early_stopping: ../early_stopping.yaml
model_checkpoint: ../model_checkpoint.yaml

Terminal Line: comet-train --cfg configs/models/MIMIC_summ.yaml

Terminal Output:
usage: comet-train [-h] [--seed_everything SEED_EVERYTHING] [--cfg CFG]
[--print_config[=flags]]
[--regression_metric.help CLASS_PATH_OR_NAME]
[--regression_metric CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--referenceless_regression_metric.help CLASS_PATH_OR_NAME]
[--referenceless_regression_metric CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--ranking_metric.help CLASS_PATH_OR_NAME]
[--ranking_metric CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--unified_metric.help CLASS_PATH_OR_NAME]
[--unified_metric CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--early_stopping.help CLASS_PATH_OR_NAME]
[--early_stopping CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--model_checkpoint.help CLASS_PATH_OR_NAME]
[--model_checkpoint CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--trainer.help CLASS_PATH_OR_NAME]
[--trainer CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--load_from_checkpoint LOAD_FROM_CHECKPOINT]
[--strict_load]
error: Parser key "trainer":
Problem with given class_path 'pytorch_lightning.Trainer':
Validation failed: No action for key "use_distributed_sampler" to check its value.

What's your environment?

macOS (M1 chip)
pip 23.3.1
python 3.11

ricardorei · 2023-11-01T19:01:15Z

This seems like a problem with your Trainer YAML. It sometimes happens when you are using a different pytorch-lightning version in which the Trainer class has different init args. Please check which version of Lightning you have installed, and make sure the YAML does not contain any argument that the Trainer class does not accept.
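The check suggested here can be sketched in a few lines of Python. This is a minimal sketch, not part of COMET or Lightning: `unknown_init_args` is a hypothetical helper, `OldTrainer` is a stand-in class used so the example runs without Lightning installed, and the `trainer_init_args` dictionary is made up for illustration rather than taken from the actual `trainer.yaml`.

```python
import inspect


def unknown_init_args(cls, config):
    """Return the keys of `config` that `cls.__init__` does not accept."""
    params = inspect.signature(cls.__init__).parameters
    # If __init__ takes **kwargs, every key is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    accepted = set(params) - {"self"}
    return sorted(key for key in config if key not in accepted)


# Hypothetical stand-in for a Trainer version whose __init__ lacks a newer argument.
class OldTrainer:
    def __init__(self, accelerator="auto", max_epochs=None):
        self.accelerator = accelerator
        self.max_epochs = max_epochs


# Hypothetical init_args, as they might appear in a trainer YAML.
trainer_init_args = {
    "accelerator": "cpu",
    "max_epochs": 5,
    "use_distributed_sampler": True,
}

# Lists the keys the class would reject.
print(unknown_init_args(OldTrainer, trainer_init_args))
```

With Lightning installed, the same helper could be called with `pytorch_lightning.Trainer` and the `init_args` mapping loaded from the trainer YAML. Any key it returns is a candidate for the "No action for key" validation error.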

ecroxford · 2023-11-01T19:43:17Z

Hi @ricardorei,

I was using pytorch-lightning 1.9.5, so I updated it to 2.1.0.

After upgrading to 2.1.0, I get this error instead:

AttributeError: Can't pickle local object 'CometModel.val_dataloader.<locals>.<lambda>'

I saw a similar conversation about this error in issue #159, so I also checked my torchmetrics package. It is at 0.10.3, as recommended there.
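The versions mentioned in this thread can be read from inside the virtual environment with the standard library. This is a minimal sketch: `installed_version` is a hypothetical helper, and the distribution names in the loop (in particular `unbabel-comet` for COMET) are assumptions about how the packages were installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust them to match `pip list`.
for name in ("pytorch-lightning", "torchmetrics", "unbabel-comet"):
    print(name, installed_version(name))
```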

The problem seems to come from the fact that I am using a different accelerator. I have tried both cpu and gpu (which is actually mps, since I am on an M1 Mac) and got the same error each time.
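One possible explanation, offered as an assumption rather than a confirmed diagnosis: on macOS, Python's multiprocessing uses the "spawn" start method by default, so data loader worker processes receive their objects via pickle, and a lambda defined inside a method cannot be pickled. The sketch below reproduces that behaviour with the standard library only. All names in it are hypothetical and none of it is COMET code.

```python
import pickle


def make_local_collate():
    """Return a lambda defined inside a function, like one built inside a method."""
    return lambda batch: list(batch)


def module_level_collate(batch):
    """Same behaviour, but defined at module level so pickle can find it by name."""
    return list(batch)


def is_picklable(obj):
    """Return True if pickle can serialize `obj`, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    # The local lambda fails to pickle; the module-level function succeeds.
    print(is_picklable(make_local_collate()))
    print(is_picklable(module_level_collate))
```

If this is indeed the cause, the accelerator setting would not matter, which matches the same error appearing with both cpu and mps. Loading data in the main process, without worker processes, would avoid the pickling step entirely.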

ecroxford added the "question" label on Oct 30, 2023
