can't load tokenizer #603

Open
Guo-Chenxu opened this issue Nov 11, 2023 · 2 comments

@Guo-Chenxu

I ran the code with the following command:

python finetune.py \
    --base_model='/home/guochenxu/pythonProjects/alpaca-lora/alpaca-lora-7b' \
    --num_epochs=10 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./lora-alpaca-512-qkvo' \
    --lora_target_modules='[q_proj,k_proj,v_proj,o_proj]' \
    --lora_r=16 \
    --micro_batch_size=8

I downloaded the model files from https://huggingface.co/tloen/alpaca-lora-7b, and my directory looks like this:

image

But I get the following error:

image

It seems I don't have the tokenizer files. How can I get them, or is there another way to solve this problem?

I'm a beginner, so this question may seem a little stupid, but I have tried searching the web and still have the problem. I would appreciate it if anyone could answer.

@hychaochao

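I'll just reply in Chinese; I don't know whether you have solved this yet. I'm also fairly new to this. From your code, it looks like you want to fine-tune further on top of alpaca-lora? You can do it like this: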

python finetune_copy.py \
    --base_model 'llama1' \
    --data_path 'XXX.json' \
    --output_dir './lora-alpaca' \
    --resume_from_checkpoint 'tloen/alpaca-lora'

@Guo-Chenxu
Author

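Thank you for your answer. My problem was that I had downloaded the wrong model; I should have used alpaca-7b (I have to admit it really was pretty stupid 😂).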
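
For anyone hitting the same error: it appears the directory passed to `--base_model` held LoRA adapter weights rather than a full base model, which would explain the missing tokenizer files. Below is a minimal sketch for checking a local directory before launching training. It assumes the usual Hugging Face / PEFT file names (`adapter_config.json` for an adapter; `config.json` plus a tokenizer file such as `tokenizer.model` for a base model). `classify_model_dir` is a hypothetical helper, not part of alpaca-lora.

```python
from pathlib import Path

# Assumed file names, based on common Hugging Face / PEFT conventions.
ADAPTER_MARKER = "adapter_config.json"
MODEL_CONFIG = "config.json"
TOKENIZER_FILES = {"tokenizer.model", "tokenizer.json", "tokenizer_config.json"}


def classify_model_dir(path):
    """Guess whether a local directory holds a base model or a LoRA adapter.

    Returns "base_model", "adapter", or "unknown". This only inspects
    file names; it does not load or validate any weights.
    """
    names = {p.name for p in Path(path).iterdir() if p.is_file()}
    has_tokenizer = bool(names & TOKENIZER_FILES)
    if MODEL_CONFIG in names and has_tokenizer:
        return "base_model"
    if ADAPTER_MARKER in names:
        return "adapter"
    return "unknown"
```

If this returns "adapter" for the path given to `--base_model`, the tokenizer will most likely not be found there, and the path should point to a base model directory instead.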
