I am trying to train my own metric. However, I keep running into some version of the same problem (sometimes just with a different validation key). I do not know if this is a bug, a dependency issue, or something else. I followed the instructions exactly as stated, in a brand-new virtual environment created only for this. My input yaml file is included below as well. No other files were changed.
This seems like a problem with your Trainer yaml. This sometimes happens when you are using a different pytorch-lightning version where the Trainer class has new init args. Please check which version of lightning you have installed and whether the yaml contains any argument that is not in the Trainer class.
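The check suggested above can be sketched in a few lines of Python. This is only an illustration: `installed_version` and `unknown_init_args` are hypothetical helper names, and `StandInTrainer` is a made-up class with a 1.x-style signature so that the snippet runs without lightning installed. The key `use_distributed_sampler` is taken from the error message in this issue.

```python
import inspect
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def unknown_init_args(cls, config):
    """Return the config keys that cls.__init__ does not accept, sorted."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter would accept any key, so nothing can be flagged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(set(config) - set(params))


class StandInTrainer:
    """Hypothetical stand-in with a 1.x-style signature; not the real Trainer."""

    def __init__(self, accelerator="auto", devices="auto", replace_sampler_ddp=True):
        pass


# Assumed example of init_args as they might appear in a trainer yaml.
trainer_args = {"accelerator": "cpu", "devices": 1, "use_distributed_sampler": True}

print(installed_version("pytorch-lightning"))  # None if the package is not installed
print(unknown_init_args(StandInTrainer, trainer_args))  # ['use_distributed_sampler']
```

To run this against a real setup, pass `pytorch_lightning.Trainer` in place of `StandInTrainer` and the `init_args` mapping loaded from the trainer yaml in place of `trainer_args`. Any key it reports is a candidate for the "No action for key" validation failure.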
I was using pytorch-lightning 1.9.5, so I updated it to 2.1.0.
After upgrading to 2.1.0, I get this error instead:
AttributeError: Can't pickle local object 'CometModel.val_dataloader.<locals>.<lambda>'
I saw a similar conversation around this error in issue #159 and checked my torchmetrics package as well; it is 0.10.3, as recommended.
The problem seems to come from the accelerator I am using. I have tried both cpu and gpu (which is mps, since I am on an M1 Mac) and got the same error each time.
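For context on the pickling error above, here is a minimal, self-contained sketch of the general Python behaviour behind it. It does not use COMET's code. Judging only by the name in the error message, a lambda defined inside `val_dataloader` is being pickled, which typically happens when worker processes are started with the `spawn` method (the default on macOS). The functions `make_local_lambda`, `collate`, and `can_pickle` are hypothetical names for illustration.

```python
import pickle
from functools import partial


def make_local_lambda():
    # A lambda created inside a function is a "local object": pickle cannot
    # look it up by name, so it cannot be sent to a spawned worker process.
    return lambda batch: batch


def collate(batch, stage):
    # A module-level function is pickled by reference to its name.
    return (stage, batch)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


local_ok = can_pickle(make_local_lambda())
partial_ok = can_pickle(partial(collate, stage="validate"))
print(local_ok, partial_ok)
```

When run as a script, the first value should be False and the second True. This is why replacing a local lambda with a module-level function (optionally wrapped in `functools.partial`), or loading data in the main process only, is a common way around this class of error. Whether either option applies here depends on how the dataloader is built in the installed COMET version.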
Code
Yaml file:
regression_metric:
  class_path: comet.models.RegressionMetric
  init_args:
    nr_frozen_epochs: 0.3
    keep_embeddings_frozen: True
    optimizer: AdamW
    encoder_learning_rate: 1.0e-06
    learning_rate: 1.5e-05
    layerwise_decay: 0.95
    encoder_model: XLM-RoBERTa
    pretrained_model: xlm-roberta-large
    pool: avg
    layer: mix
    layer_transformation: sparsemax
    layer_norm: False
    loss: mse
    dropout: 0.1
    batch_size: 16
    train_data:
      - data/train_mimic.csv
    validation_data:
      - data/validate_mimic.csv
    hidden_sizes:
      - 2048
      - 1024
    activations: Tanh

trainer: ../trainer.yaml
early_stopping: ../early_stopping.yaml
model_checkpoint: ../model_checkpoint.yaml
Terminal Line: comet-train --cfg configs/models/MIMIC_summ.yaml
Terminal Output:
usage: comet-train [-h] [--seed_everything SEED_EVERYTHING] [--cfg CFG]
[--print_config[=flags]]
[--regression_metric.help CLASS_PATH_OR_NAME]
[--regression_metric CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--referenceless_regression_metric.help CLASS_PATH_OR_NAME]
[--referenceless_regression_metric CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--ranking_metric.help CLASS_PATH_OR_NAME]
[--ranking_metric CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--unified_metric.help CLASS_PATH_OR_NAME]
[--unified_metric CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--early_stopping.help CLASS_PATH_OR_NAME]
[--early_stopping CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--model_checkpoint.help CLASS_PATH_OR_NAME]
[--model_checkpoint CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--trainer.help CLASS_PATH_OR_NAME]
[--trainer CONFIG | CLASS_PATH_OR_NAME | .INIT_ARG_NAME VALUE]
[--load_from_checkpoint LOAD_FROM_CHECKPOINT]
[--strict_load]
error: Parser key "trainer":
Problem with given class_path 'pytorch_lightning.Trainer':
Validation failed: No action for key "use_distributed_sampler" to check its value.
What's your environment?
macOS
pip 23.3.1
python 3.11