Closed as not planned
Description
Current Behavior
While converting the 7B model, I got the following error:
{'dim': 4096, 'multiple_of': 256, 'n_heads': 32, 'n_layers': 32, 'norm_eps': 1e-06, 'vocab_size': -1}
Traceback (most recent call last):
  File "/content/llama.cpp/llama.cpp/llama.cpp/llama.cpp/convert-pth-to-ggml.py", line 274, in <module>
    main()
  File "/content/llama.cpp/llama.cpp/llama.cpp/llama.cpp/convert-pth-to-ggml.py", line 239, in main
    hparams, tokenizer = load_hparams_and_tokenizer(dir_model)
  File "/content/llama.cpp/llama.cpp/llama.cpp/llama.cpp/convert-pth-to-ggml.py", line 105, in load_hparams_and_tokenizer
    tokenizer = SentencePieceProcessor(fname_tokenizer)
  File "/usr/local/lib/python3.9/dist-packages/sentencepiece/__init__.py", line 447, in Init
    self.Load(model_file=model_file, model_proto=model_proto)
  File "/usr/local/lib/python3.9/dist-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "/usr/local/lib/python3.9/dist-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
Information about my environment is below. Let me know if you have any hints, thanks!
Luigi
Environment and Context
The environment is Google Colab. The weights have been verified via md5sum:
# according to: https://github.com/ggerganov/llama.cpp/issues/238
md5sum ./models/*/*.pth | sort -k 2,2
6efc8dab194ab59e49cd24be5574d85e ./models/7B/consolidated.00.pth
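Note that the md5sum check above only covers the .pth weights, while the traceback fails inside SentencePiece when it parses the tokenizer file. A failed `ParseFromArray` usually suggests that the file is not a valid serialized model (for example empty, truncated, or an HTML page or Git LFS pointer saved in its place). The following is a minimal sketch for inspecting that file with the standard library only. The function name `inspect_tokenizer_file` and the path `./models/tokenizer.model` are assumptions made for illustration, not something taken from the conversion script.

```python
import hashlib
from pathlib import Path


def inspect_tokenizer_file(path):
    """Collect basic facts about a file that should be a SentencePiece model.

    Hypothetical helper: it does not parse the model, it only flags
    common signs of a broken or substituted download.
    """
    data = Path(path).read_bytes()
    # Look only at the first bytes, ignoring leading whitespace and case.
    head = data[:64].lstrip().lower()
    return {
        "size_bytes": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "is_empty": len(data) == 0,
        # An HTML error or login page saved under the model's name.
        "looks_like_html": head.startswith((b"<!doctype", b"<html")),
        # A Git LFS pointer file checked out instead of the real binary.
        "looks_like_git_lfs_pointer": head.startswith(b"version https://git-lfs"),
    }
```

Calling `inspect_tokenizer_file("./models/tokenizer.model")` (with the path adjusted to the actual location) and comparing the reported md5 against a known-good checksum would show whether the tokenizer file, rather than the weights, is the problem.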
$ python3 --version
$ make --version
$ g++ --version
Python 3.9.16
GNU Make 4.2.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Environment info:
git log | head -1
commit 58c438cf7dfbbef710b1905a453a38a8a9ced07d
pip list | egrep "torch|numpy|sentencepiece"
numpy 1.22.4
sentencepiece 0.1.97
torch 2.0.0+cu118
torchaudio 2.0.1+cu118
torchdata 0.6.0
torchsummary 1.5.1
torchtext 0.15.1
torchvision 0.15.1+cu118