Add support for max_length in run_generation #472

Open

ankurneog opened this issue Oct 18, 2023 · 3 comments
Labels: bug (Something isn't working), good first issue (Good for newcomers)

Comments

@ankurneog (Contributor)

System Info

This was caught while running the transformers unit tests for optimum-habana:
https://github.com/huggingface/optimum-habana/blob/main/tests/transformers/tests/models/gpt2/test_modeling_gpt2.py

Several test cases fail because the test config sets max_length for text generation rather than max_new_tokens.

As a result, text generation fails for decoder-only models at this check:
            if not self.config.is_encoder_decoder:
                # only pad if bucket_size < -1. If we are bucketing (bucket_size > 0), then that is taken care in greedy_search()
                if not is_greedy_and_bucket:
                    # token_idx is the current index in the generation process, it is incremented each time a new token is generated
                    model_kwargs["token_idx"] = torch.tensor(inputs_tensor.shape[-1], device=inputs_tensor.device)
>                   inputs_tensor = torch.nn.functional.pad(
                        inputs_tensor, (0, generation_config.max_new_tokens), value=generation_config.pad_token_id
                    )
E                   TypeError: pad(): argument 'pad' must be tuple of ints, but found element of type NoneType at pos 2

generation_config.max_new_tokens is None at this point, so the pad width passed to torch.nn.functional.pad is None.


FAILED test_modeling_gpt2.py::GPT2ModelTest::test_beam_search_generate - TypeError: pad(): argument 'pad' must be tuple of ints, but found element of type NoneType at pos 2
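For context, the failure can be reproduced outside the test harness. The following is a minimal standalone sketch (not the actual test code; the tensor and values are placeholders) showing the same TypeError when the pad width is None:

```python
# Minimal sketch: torch.nn.functional.pad rejects a None pad width,
# which is what happens when only max_length is set and max_new_tokens stays None.
import torch
import torch.nn.functional as F

inputs_tensor = torch.ones((1, 8), dtype=torch.long)  # stand-in for the tokenized prompt
max_new_tokens = None  # what generation_config.max_new_tokens is when only max_length was given

try:
    F.pad(inputs_tensor, (0, max_new_tokens), value=0)
except TypeError as e:
    # pad(): argument 'pad' must be tuple of ints, but found element of type NoneType at pos 2
    print(e)
```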

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

python -m pytest -vs test_modeling_gpt2.py::GPT2ModelTest::test_beam_search_generate

Expected behavior

The test should pass.

@ankurneog added the bug (Something isn't working) label on Oct 18, 2023
@ankurneog (Contributor, Author)

FYI: @regisss @ssarkar2

@ssarkar2 (Collaborator)

A preliminary look shows that max_new_tokens is None. run_generation.py was tested with max_new_tokens but not with max_length, which are two mutually exclusive ways of specifying the generation length, as mentioned here. The test is called from here, which uses max_length instead of the better-tested max_new_tokens.
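One possible direction, sketched here only as an illustration (the resolve_new_tokens helper is hypothetical and this is not necessarily what the actual fix does): when only max_length is provided, derive the number of new tokens from the prompt length before the padding step, following the usual transformers convention that max_length counts the prompt tokens.

```python
# Hedged sketch: derive the pad width from max_length when max_new_tokens is not set.
# Assumes max_length = prompt_length + new_tokens; illustration only, not the actual patch.
import torch

def resolve_new_tokens(generation_config, inputs_tensor: torch.Tensor) -> int:
    if generation_config.max_new_tokens is not None:
        return generation_config.max_new_tokens
    if generation_config.max_length is not None:
        # max_length includes the prompt, so subtract the current input length
        return max(generation_config.max_length - inputs_tensor.shape[-1], 0)
    raise ValueError("Either max_new_tokens or max_length must be set")

# The padding step could then use the resolved value instead of max_new_tokens directly:
# inputs_tensor = torch.nn.functional.pad(
#     inputs_tensor,
#     (0, resolve_new_tokens(generation_config, inputs_tensor)),
#     value=generation_config.pad_token_id,
# )
```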

@ssarkar2 (Collaborator)

#476
