-
Looking at the results from https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/ I am not sure that these models are any better than GPT-J.
-
Can someone link the Python inference code if it is available? I cannot seem to find it.
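There does not appear to be a dedicated inference script linked in this thread, but the checkpoints published at https://huggingface.co/cerebras load with the standard Hugging Face transformers API. Below is a minimal sketch, assuming transformers and torch are installed; the cerebras/Cerebras-GPT-1.3B model id is used as a placeholder and the generation settings are arbitrary examples, not something specified in this discussion.

```python
# Minimal sketch: load a Cerebras-GPT checkpoint from the Hugging Face Hub
# and generate text with the standard transformers API.
from transformers import AutoTokenizer, AutoModelForCausalLM

# Placeholder checkpoint; the Hub hosts sizes from 111M up to 13B.
model_id = "cerebras/Cerebras-GPT-1.3B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Generative AI is"
inputs = tokenizer(prompt, return_tensors="pt")

# Example sampling settings; adjust as needed.
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The larger checkpoints (6.7B, 13B) will need considerably more memory; loading in half precision or on a GPU is a reasonable adjustment if that is an issue.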
-
The new 3B model looks quite capable: https://www.cerebras.net/blog/btlm-3b-8k-7b-performance-in-a-3-billion-parameter-model/
-
The announcement is here:
https://www.cerebras.net/press-release/cerebras-systems-releases-seven-new-gpt-models-trained-on-cs-2-wafer-scale-systems
The models are available here:
https://huggingface.co/cerebras
Excerpts:
"SUNNYVALE, CALIFORNIA – March 28, 2023 – Cerebras Systems, the pioneer in artificial intelligence (AI) compute for generative AI, today announced it has trained and is releasing a series of seven GPT-based large language models (LLMs) for open use by the research community. This is the first time a company has used non-GPU based AI systems to train LLMs up to 13 billion parameters and is sharing the models, weights, and training recipe via the industry standard Apache 2.0 license. All seven models were trained on the 16 CS-2 systems in the Cerebras Andromeda AI supercomputer."
"Cerebras’ release today directly addresses these issues. In a first among AI hardware companies, Cerebras researchers trained, on the Andromeda AI supercomputer, a series of seven GPT models with 111M, 256M, 590M, 1.3B, 2.7B, 6.7B, and 13B parameters."