Draft: feat: Support DBRX model in Llama #462

Open: reneleonhardt wants to merge 3 commits into master from support-llama-model-dbrx

Conversation

@reneleonhardt (Contributor) commented Apr 15, 2024

The new open-source model DBRX sounds amazing. Is this change sufficient and correct to integrate it into Llama?
ggml-org/llama.cpp#6515
https://huggingface.co/collections/phymbert/dbrx-16x12b-instruct-gguf-6619a7a4b7c50831dd33c7c8
https://www.databricks.com/blog/announcing-dbrx-new-standard-efficient-open-source-customizable-llms
https://github.com/databricks/dbrx
https://huggingface.co/collections/databricks/

llama.cpp seems to support split/sharded files, but I suppose I would need to download all of them first... 😅
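llama.cpp's sharded GGUF support names the pieces with a numbered suffix. A minimal sketch of enumerating the files to download, assuming the `-%05d-of-%05d.gguf` naming convention produced by llama.cpp's split tooling (the base name below is taken from the linked phymbert repository and is illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class GgufShards {
    // Build the file names of a sharded GGUF model, e.g.
    // "dbrx-16x12b-instruct-iq3_xxs-00001-of-00010.gguf".
    static List<String> shardNames(String baseName, int totalShards) {
        List<String> names = new ArrayList<>();
        for (int i = 1; i <= totalShards; i++) {
            names.add(String.format("%s-%05d-of-%05d.gguf", baseName, i, totalShards));
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : shardNames("dbrx-16x12b-instruct-iq3_xxs", 10)) {
            System.out.println(name);
        }
    }
}
```

A downloader in the plugin could iterate this list and fetch each shard; llama.cpp then only needs the path of the first shard to load the whole model.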

@reneleonhardt reneleonhardt force-pushed the support-llama-model-dbrx branch 3 times, most recently from f852b16 to 27b9d62 Compare April 15, 2024 11:41
+ "Generation speed is significantly faster than LLaMA2-70B, while at the same time "
+ "beating other open-source models such as LLaMA2-70B, Mixtral, and Grok-1 on "
+ "language understanding, programming, math, and logic.",
PromptTemplate.LLAMA,
@carlrobertoh (Owner) commented on the diff:

I think it uses ChatML prompt template - PromptTemplate.CHAT_ML
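For reference, this is roughly what a ChatML-style template is expected to produce. A hypothetical sketch, not the plugin's actual `PromptTemplate.CHAT_ML` implementation:

```java
public class ChatMlPrompt {
    // Build a ChatML-formatted prompt: each turn is wrapped in
    // <|im_start|>{role} ... <|im_end|> markers, and the prompt ends
    // with an opened assistant turn for the model to complete.
    static String build(String systemPrompt, String userMessage) {
        return "<|im_start|>system\n" + systemPrompt + "<|im_end|>\n"
                + "<|im_start|>user\n" + userMessage + "<|im_end|>\n"
                + "<|im_start|>assistant\n";
    }

    public static void main(String[] args) {
        System.out.print(build("You are a helpful assistant.", "Hello!"));
    }
}
```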

@reneleonhardt (Contributor, Author):

Thx, done

@carlrobertoh (Owner)

Since the change was recent, we need to update the llama.cpp submodule as well

@reneleonhardt reneleonhardt force-pushed the support-llama-model-dbrx branch from 27b9d62 to 3bc8480 Compare April 15, 2024 11:54
@reneleonhardt (Contributor, Author)

> Since the change was recent, we need to update the llama.cpp submodule as well

Done

@carlrobertoh (Owner)

I'll try running the model locally soon and see if any other changes are necessary

@reneleonhardt (Contributor, Author)

> I'll try running the model locally soon and see if any other changes are necessary

Great! But I guess I'll first have to implement downloading all 10 shard files in this PR... 😅

@reneleonhardt reneleonhardt force-pushed the support-llama-model-dbrx branch 2 times, most recently from 4fb52b2 to 7aa08d9 Compare April 16, 2024 05:54
@reneleonhardt reneleonhardt force-pushed the support-llama-model-dbrx branch from 7aa08d9 to 05cdeed Compare April 21, 2024 07:00
@reneleonhardt reneleonhardt force-pushed the support-llama-model-dbrx branch from 05cdeed to c87c1b1 Compare April 21, 2024 07:04
@reneleonhardt (Contributor, Author)

@phymbert I can download https://huggingface.co/phymbert/dbrx-16x12b-instruct-iq3_xxs-gguf in the browser without logging in, but inside the plugin I get 403 Forbidden. Is this expected with the databricks-open-model-license (other) license?
Also, do you think DBRX is not particularly well suited as a coding assistant? The smallest variant is a huge 53 GB 😅

@phymbert

DBRX is a gated model, so I believe you have to pass a read token. There is an open issue on llama.cpp to support this.
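Passing a read token usually means sending it as a Bearer `Authorization` header. A minimal sketch using `java.net.http` (the URL and token value are placeholders; this is an illustration, not the plugin's download code):

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class GatedDownload {
    // Build a request for a file in a gated Hugging Face repo; gated
    // repos return 403 unless a read token is sent as a Bearer header.
    static HttpRequest withToken(String fileUrl, String hfToken) {
        return HttpRequest.newBuilder()
                .uri(URI.create(fileUrl))
                .header("Authorization", "Bearer " + hfToken)
                .GET()
                .build();
    }
}
```

The request would then be sent with an `HttpClient`, streaming the response body to disk given the file sizes involved.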

@carlrobertoh carlrobertoh force-pushed the master branch 11 times, most recently from da70c82 to ad16f5c Compare March 4, 2025 00:26
@carlrobertoh carlrobertoh force-pushed the master branch 3 times, most recently from 91e6831 to 4a62471 Compare March 28, 2025 11:44

3 participants