Verification of the E-Branchformer model #8
Recently I ran 31 epochs of training on reazonspeech medium (~500 h) using the ESPnet2 LibriSpeech recipe. The log is shown below (it does not seem to have saturated yet):

For reference, this is what the current conformer-transformer model looks like (the parameter count differs, though).

We have no plans to run at larger scale for now, but if there is any progress on the Branchformer experiments, I will post it here again.
@pyf98, maybe you can help them. I think their learning rate is too low in this scenario, or there is something wrong with the actual batch size (with multiple GPUs or gradient accumulation).
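As a rough illustration of how multi-GPU training and gradient accumulation interact with the learning rate, here is a minimal sketch that computes the effective batch size and a linearly scaled peak learning rate. The numbers and the `scale_lr` helper are hypothetical examples, not values or APIs from the configs discussed in this issue.

```python
# Sketch: effective batch size and a linear LR scaling heuristic.
# All values below are hypothetical, not the actual recipe settings.

def effective_batch_size(batch_size: int, n_gpus: int, accum_grad: int) -> int:
    """Number of samples contributing to one optimizer step."""
    return batch_size * n_gpus * accum_grad

def scale_lr(base_lr: float, base_batch: int, eff_batch: int) -> float:
    """Linear LR scaling rule (a common heuristic, not an ESPnet API)."""
    return base_lr * eff_batch / base_batch

if __name__ == "__main__":
    eff = effective_batch_size(batch_size=32, n_gpus=1, accum_grad=4)
    print("effective batch size:", eff)                 # 128
    print("scaled peak lr:", scale_lr(2e-3, 256, eff))  # 1e-3
```

If the effective batch size changes (e.g., fewer GPUs or no gradient accumulation), the warmup schedule and peak learning rate usually need to be adjusted accordingly.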
I'm not sure exactly which Conformer and E-Branchformer configs are being used, and I feel some of them might have issues. The Conformer config provided above has 12 layers without Macaron FFN, and the input layer downsamples by a factor of 6. These differ from the configs in other recipes (e.g., LibriSpeech). If you simply reuse the E-Branchformer config from LibriSpeech, there can be some issues; for example, the model can be much larger. In our experiments, we scale Conformer and E-Branchformer to have similar parameter counts, and in that case we usually do not need to re-tune the training hyper-parameters. We have added E-Branchformer configs and results to many other ESPnet2 recipes covering various types of speech.
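One simple way to check that two encoders are scaled to comparable sizes is to count trainable parameters directly. The sketch below assumes the encoder has already been built as a PyTorch `nn.Module` (the tiny `demo` module is only a stand-in; in practice you would pass the Conformer or E-Branchformer encoder built from its YAML config).

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    """Total number of trainable parameters in a module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

if __name__ == "__main__":
    # Stand-in module just to show the call; replace with the actual
    # Conformer / E-Branchformer encoder instance to compare sizes.
    demo = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 256))
    print(f"demo module: {count_params(demo) / 1e6:.3f} M params")
```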
@pyf98 @sw005320 We will check the lr/accum_grads/multi-GPU/downsampling configurations, as well as the other recipes, when we run more experiments on a larger dataset!
Thanks for the information. When comparing these models (E-Branchformer vs. Conformer), we typically just replaced the encoder config (at a similar model size) but kept the other training configs the same. This worked well in general.
Goal of this ticket
Reference links