
[Confirmation] Optimal Hyperparameters and Reproducibility #21

Closed
chao1224 opened this issue Nov 23, 2022 · 19 comments

Labels: question (Further information is requested)

Comments

@chao1224

chao1224 commented Nov 23, 2022

Hi there,

Thanks for providing the nice codebase. I'm trying to reproduce the results for downstream tasks, and I have the following questions.

  • Are the scripts under this folder only samples? For the optimal OntoProtein hyperparameters, should we follow Table 6 in the paper?
  • For ProtBert, do you use the same optimal hyperparameters for each downstream task?
  • Table 6 doesn't cover the optimal values for gradient_accumulation_steps and eval_step. Can you help clarify this?

Any help is appreciated.

@cheng-siyuan
Contributor

Hi, thanks for your questions:

When running the downstream tasks, please keep the hyperparameters consistent with Table 6 in the paper. Note that the batch_size in Table 6 equals per_device_batch_size * gradient_accumulation_steps: on some GPUs the memory is not enough, so we adjust per_device_batch_size and gradient_accumulation_steps to reach the same effective batch size. As for eval_step, it does not affect the training process, so just follow the defaults in the scripts. ProtBert uses the same hyperparameters as our model.
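A minimal sketch of that relationship, assuming the HuggingFace-style argument names used in the training scripts (the concrete numbers below are only illustrative):

```python
# Effective batch size on a single GPU:
# per-device batch size * gradient accumulation steps.
per_device_batch_size = 2          # what fits in GPU memory
gradient_accumulation_steps = 16   # gradients accumulated before each optimizer step

effective_batch_size = per_device_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 32 -> should match the batch_size column of Table 6
```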

@chao1224
Author

Hi, thank you for the prompt reply. It's very helpful!

Meanwhile, I carefully checked per_device_batch_size and gradient_accumulation_steps (listed below), and it seems that contact and ss3 have minor mismatches when using the default hyperparameters under the script folder (see the quick check after the table). Can you take a look when you get a chance?

task          per_device_batch_size   gradient_accumulation_steps   batch_size in Table 6
contact       1                       1                             8
fluorescence  4                       16                            64
homology      1                       64                            64
ss3           2                       8                             32
ss8           2                       16                            32
stability     2                       16                            32
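For reference, here is the quick check I mean, a throwaway sketch with the numbers copied from the table above:

```python
# Compare per_device_batch_size * gradient_accumulation_steps with Table 6.
configs = {
    # task: (per_device_batch_size, gradient_accumulation_steps, batch_size in Table 6)
    "contact":      (1, 1,  8),
    "fluorescence": (4, 16, 64),
    "homology":     (1, 64, 64),
    "ss3":          (2, 8,  32),
    "ss8":          (2, 16, 32),
    "stability":    (2, 16, 32),
}

for task, (per_device, grad_accum, expected) in configs.items():
    product = per_device * grad_accum
    status = "ok" if product == expected else f"mismatch: {product} != {expected}"
    print(f"{task:12s} {status}")
```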

@cheng-siyuan
Contributor

Hello,
Thank you for the correction. First of all, I apologize for my carelessness yesterday: I forgot to mention that the batch size in Table 6 also depends on the number of GPUs we use. I have corrected and updated our scripts.
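In other words, the effective batch size presumably works out as in the sketch below (num_gpus here is illustrative; the thread does not state the GPU count used per task):

```python
# Effective batch size with data-parallel training:
#   per_device_batch_size * gradient_accumulation_steps * num_gpus
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_gpus):
    return per_device_batch_size * gradient_accumulation_steps * num_gpus

# Example: under this reading, the contact row (1 * 1) would need 8 GPUs
# to reach the batch size of 8 listed in Table 6.
print(effective_batch_size(1, 1, 8))  # 8
```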

@zxlzr added the "question (Further information is requested)" label on Nov 25, 2022
@zxlzr closed this as completed on Nov 27, 2022
@chao1224
Author

Thanks. I can roughly reproduce the results using the latest scripts (except contact, which is still running).

@chao1224
Author

chao1224 commented Dec 7, 2022

Hi @zxlzr , I got the following result for OntoProtein on contact, which seems to be too high: the paper reports 0.40 for l2. I'm not sure if I missed something; can you help double-check it?

'accuracy_l5': 0.6621209383010864, 'accuracy_l2': 0.5874732732772827, 'accuracy_l': 0.4826046824455261

@QQQrookie

Hi, can you please provide more detailed information? For example, your hyperparameters and the sequence length between amino acids.

@chao1224
Author

chao1224 commented Dec 7, 2022

Hi there,
I just followed the hyperparameters in this link.
Can you help explain what the sequence length between amino acids refers to?

@cheng-siyuan
Contributor

The sequence length between amino acids refers to the short-, medium-, or long-range setting selected for the contact evaluation. Specific details can be found in TAPE.
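For anyone reproducing this, here is a rough sketch of how the medium/long-range top-L/2 precision is typically computed (TAPE-style; the separation threshold and function name are illustrative and not taken from this repo's evaluation code):

```python
import numpy as np

def precision_at_l2(contact_probs, true_contacts, min_separation=12):
    """Top L/2 precision over pairs with sequence separation >= min_separation.

    contact_probs:  (L, L) predicted contact probabilities.
    true_contacts:  (L, L) binary ground-truth contact map.
    min_separation: 12 keeps medium- and long-range pairs (|i - j| >= 12).
    """
    L = contact_probs.shape[0]
    i, j = np.triu_indices(L, k=min_separation)         # pairs far enough apart in sequence
    top = np.argsort(-contact_probs[i, j])[: L // 2]    # the L/2 most confident pairs
    return float(true_contacts[i[top], j[top]].mean())
```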

@chao1224
Author

chao1224 commented Dec 7, 2022

Thanks. I'm reporting the one for medium&long range prediction, just as in the paper and #8.

@chao1224
Author

Hi, may I ask if there are any follow-ups?

@cheng-siyuan
Contributor

I don't know precisely what caused the discrepancy, but we can provide a model for the contact task fine-tuned with our hyperparameters. This model was also retrained by us, so there may be a fluctuation of 2~4 points, but the difference will not be large.

@chao1224
Author

Sounds good! That would be very helpful!

@cheng-siyuan
Contributor

You can download the checkpoint here
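(For anyone else reproducing this: loading the downloaded checkpoint would look roughly like the sketch below. The path is a placeholder and the exact head class depends on the repo's contact-prediction code, so treat this as a generic transformers-style load rather than the repo's actual evaluation entry point.)

```python
from transformers import AutoConfig, AutoModel, AutoTokenizer

# Placeholder path: point this at the directory containing the downloaded checkpoint.
checkpoint_dir = "path/to/ontoprotein-contact-checkpoint"

config = AutoConfig.from_pretrained(checkpoint_dir)
tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
model = AutoModel.from_pretrained(checkpoint_dir, config=config)
model.eval()
```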

@chao1224
Author

@cheng-siyuan
Thanks. So using the model you provided, I got the following output:

'accuracy_l5': 0.6149854063987732, 'accuracy_l2': 0.5102487802505493, 'accuracy_l': 0.4060076177120209

Not sure if I missed something.

@cheng-siyuan
Contributor

Yes, this checkpoint was obtained later, after we had updated the hyperparameters, so its results are 3 to 4 points higher than our original ones. The result in the paper is not the best result, and I apologize for the trouble this caused your experiments.

@chao1224
Author

No problem at all, and you have already been very helpful in replying to the messages :)

So just to double-check, you mean that in Table 1 of your paper, it should be updated to 0.51 (with your checkpoint) for contact with OntoProtein, right?

@cheng-siyuan
Contributor

Yes, we recommend that you use the results of this checkpoint as a reference.

@chao1224
Author

Sounds good! Appreciate your help and being responsible for your work!

@cheng-siyuan
Contributor

You're welcome:)
