❓ Questions and Help
What is your question?
Hello, I ran an experiment in which the reference (ref) and the target were set to exactly the same text, and scored that text separately with COMET and with XCOMET-XXL. As the target text got longer, the COMET score stayed relatively stable at around 0.9, but the XCOMET-XXL score dropped sharply, even falling below 0.1.
I find this puzzling, since the target is identical to the reference in every case.
Code
from comet import load_from_checkpoint

self.model = load_from_checkpoint(self.comet_config[model_name]['model_path'])
model_output = self.model.predict(data, batch_size=self.batch_size, gpus=1)
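For anyone who wants to reproduce this, here is a minimal sketch of the experiment as I understand it, not the exact script behind the numbers above. The helper names (`build_identical_samples`, `score_samples`), the repeated-sentence way of growing the text, reusing the same text as `src`, and the checkpoint names in the usage note are all assumptions made for illustration.

```python
from typing import Dict, List


def build_identical_samples(sentence: str, repeats: List[int]) -> List[Dict[str, str]]:
    """Build COMET-style inputs whose hypothesis ("mt") equals the reference ("ref").

    Each sample repeats `sentence` n times, so the text grows with n.
    Reusing the same text as "src" is an assumption; the report above
    does not say what the source side was.
    """
    samples = []
    for n in repeats:
        text = " ".join([sentence] * n)
        samples.append({"src": text, "mt": text, "ref": text})
    return samples


def score_samples(model_name: str, samples: List[Dict[str, str]],
                  batch_size: int = 8) -> List[float]:
    """Score the samples with one COMET checkpoint and return segment scores."""
    # Imported lazily so the sample builder can be used without unbabel-comet.
    from comet import download_model, load_from_checkpoint

    # download_model fetches the checkpoint and returns its local path.
    model = load_from_checkpoint(download_model(model_name))
    output = model.predict(samples, batch_size=batch_size, gpus=1)
    return list(output.scores)
```

Calling `score_samples("Unbabel/wmt22-comet-da", samples)` and `score_samples("Unbabel/XCOMET-XXL", samples)` on `samples = build_identical_samples("The cat sat on the mat.", [1, 4, 16, 64])` should give one score per length for each model, which makes it easy to see where the two start to diverge.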
What have you tried?
What's your environment?