

Bugfix: missing hparam type_vocab_size #32

Open · wants to merge 4 commits into master
Conversation


@FFengIll FFengIll commented Sep 14, 2023

  • type_vocab_size is also an hparam (it cannot be hard-coded as the constant 2).
  • The converter needs the same change.

@FFengIll

FFengIll commented Sep 14, 2023

Some models built on BERT do NOT use type_vocab_size=2 but type_vocab_size=1 (e.g. e5):

https://huggingface.co/intfloat/multilingual-e5-base/blob/main/config.json#L25
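For reference, a minimal sketch (with a hypothetical `read_type_vocab_size` helper, not part of the actual converter) of how a converter can take type_vocab_size from the Hugging Face config.json instead of assuming 2:

```python
import json

# Hypothetical helper: read type_vocab_size from a Hugging Face config.json,
# falling back to 2 (the classic BERT default) when the field is absent.
def read_type_vocab_size(config_text: str) -> int:
    config = json.loads(config_text)
    return config.get("type_vocab_size", 2)

# An e5-style config declares type_vocab_size=1, so a hard-coded 2 would be wrong.
e5_like = '{"hidden_size": 768, "type_vocab_size": 1}'
bert_like = '{"hidden_size": 768, "type_vocab_size": 2}'

print(read_type_vocab_size(e5_like))    # 1
print(read_type_vocab_size(bert_like))  # 2
```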

@skeskinen (Owner)

Hi, this seems like a good change.

But surely bert.cpp:358 also needs to be changed, where the hparams are read from the model file?

@FFengIll

FFengIll commented Sep 19, 2023

> Hi, this seems like a good change.
>
> But surely bert.cpp:358 also needs to be changed, where the hparams are read from the model file?

Sure, I will add it.


model.layers.resize(n_layer);

model.word_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, n_vocab);
- model.token_type_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, 2);
+ model.token_type_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, n_vocab_size);

@skeskinen
Here is the change: the tensor shape depends on n_vocab_size.
In many cases it is 2, but it is not a constant.


@@ -23,6 +23,7 @@ struct bert_hparams
     int32_t n_intermediate = 1536;
     int32_t n_head = 12;
     int32_t n_layer = 6;
+    int32_t n_vocab_size = 2;

This is a breaking change, since it adds a new field to the model file format.
I expect more such changes in the future, so we may want to migrate to GGUF:

https://github.com/philpax/ggml/blob/gguf-spec/docs/gguf.md

@@ -364,6 +365,7 @@ struct bert_ctx * bert_load_from_file(const char *fname)
     fin.read((char *)&hparams.n_intermediate, sizeof(hparams.n_intermediate));
     fin.read((char *)&hparams.n_head, sizeof(hparams.n_head));
     fin.read((char *)&hparams.n_layer, sizeof(hparams.n_layer));
+    fin.read((char *)&hparams.n_vocab_size, sizeof(hparams.n_vocab_size));
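To illustrate why the converter and bert_load_from_file must change together, here is a hedged Python sketch of a flat int32 hparams layout (the field names and order below are assumptions for illustration, not the exact bert.cpp format): if the writer emits n_vocab_size but the reader does not consume it, every later field is read out of position.

```python
import struct

# Assumed field list for illustration; writer (converter) and reader
# (bert_load_from_file) must agree on exactly these fields in this order.
FIELDS = ["n_vocab", "n_max_tokens", "n_embd", "n_intermediate",
          "n_head", "n_layer", "n_vocab_size", "f16"]

def write_hparams(hp: dict) -> bytes:
    # Pack each field as a little-endian int32, in FIELDS order.
    return struct.pack(f"<{len(FIELDS)}i", *(hp[f] for f in FIELDS))

def read_hparams(buf: bytes) -> dict:
    # Unpack the same layout; any mismatch in field count or order
    # silently shifts every subsequent value.
    values = struct.unpack(f"<{len(FIELDS)}i", buf)
    return dict(zip(FIELDS, values))

hp = {"n_vocab": 30522, "n_max_tokens": 512, "n_embd": 768,
      "n_intermediate": 3072, "n_head": 12, "n_layer": 12,
      "n_vocab_size": 1, "f16": 1}
assert read_hparams(write_hparams(hp)) == hp
```

This round-trip only holds because both sides include n_vocab_size; dropping it from FIELDS on one side alone would misalign every field after it.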

Same change here.
