
register_model_architecture not working in version 0.10.1 #3097

Closed
voidmagic opened this issue Jan 5, 2021 · 3 comments

@voidmagic
Contributor

🐛 Bug

Using register_model_architecture to register a new hyperparameter set fails: the registered hyperparameters are overridden by the default settings.

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

  1. Run: fairseq-train --task language_modeling data-bin --arch transformer_lm_big --batch-size 1 --optimizer adam
  2. Observe that the model is built with 6 decoder layers, although transformer_lm_big defines 12: https://github.com/pytorch/fairseq/blob/v0.10.1/fairseq/models/transformer_lm.py#L311

Code sample
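For reference, the architecture function being selected is roughly the following (paraphrased from the transformer_lm.py file linked above; exact values may differ from the source):

@register_model_architecture("transformer_lm", "transformer_lm_big")
def transformer_lm_big(args):
    args.decoder_layers = getattr(args, "decoder_layers", 12)
    args.decoder_embed_dim = getattr(args, "decoder_embed_dim", 1024)
    args.decoder_ffn_embed_dim = getattr(args, "decoder_ffn_embed_dim", 4096)
    args.decoder_attention_heads = getattr(args, "decoder_attention_heads", 16)
    base_lm_architecture(args)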

Expected behavior

The model is built with 12 decoder layers, as defined by the transformer_lm_big architecture.

Environment

  • fairseq Version: 0.10.1


@myleott

myleott commented Jan 5, 2021

Good catch. It's only broken for TransformerLanguageModel in 0.10.1, because that model was migrated to Hydra (dataclass-based config) before the migration was completed.

The fix is to patch this commit: b7d8b9d. I'll release 0.10.2 with the fix shortly.

myleott self-assigned this Jan 5, 2021
myleott added the 0.10.2 label Jan 5, 2021
@myleott

myleott commented Jan 5, 2021

Fixed in 0.10.2

myleott closed this as completed Jan 5, 2021
@xuuHuang

The hyperparameters of the transformer model are still overridden by the default settings in TransformerConfig in 0.10.2.
If I set arch=transformer_tiny, there are 6 layers in the model when there should be 2.
I think that's because add_args() in TransformerModelBase sets the default hyperparameters:

@classmethod
def add_args(cls, parser):
    """Add model-specific arguments to the parser."""
    # delete_default=False keeps the TransformerConfig defaults in the parser
    gen_parser_from_dataclass(
        parser, TransformerConfig(), delete_default=False, with_prefix=""
    )
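A minimal standalone sketch of that interaction (a toy argparse example, not fairseq code): because delete_default=False leaves the TransformerConfig defaults on the parser, args already carries a value for every option, so the getattr(args, ..., fallback) pattern used by the registered architecture functions never applies the registered value.

import argparse

# Stand-in for gen_parser_from_dataclass(..., delete_default=False), which leaves
# the dataclass default (6 layers) on the option.
parser = argparse.ArgumentParser()
parser.add_argument("--encoder-layers", type=int, default=6)
args = parser.parse_args([])

# What an architecture function such as transformer_tiny effectively does:
args.encoder_layers = getattr(args, "encoder_layers", 2)

print(args.encoder_layers)  # prints 6: the parser default wins, the registered 2 is never applied

If the default were suppressed on the parser instead (argparse.SUPPRESS), args would not carry the attribute and the getattr fallback would take effect.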
