-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Phudge Ollama version #1
Comments
Hey @rasbt so happy to see that it reached to you somehow and thank you for the kind words. Would definitely add it to ollama. As I haven't worked personally in the past, will try my best to finish it ASAP. |
One question to that: This seems to be a modified PHUDGE model as it uses a Is there any way to download your vanilla PHUDGE model having a I have seen your notebook on Kaggle. I was able to set up a CausalLM, but there is no PEFT model for CausalLM. For
Can you please provide a LoRA model for CausalLM ( |
Yes @d-kleine the PHUDGE model, which gives the best results uses a Classification Layer. If you read the paper you'll know that the Causal head makes it not only slower but has around ~4% worse results. I trained it on Causal also but need to check for weights. In the meantime, you can use the training script to train your Causal model end to end by creating data from this notebook and then training using this script |
@deshwalmahesh Any update on this? |
Hi there,
I recently stumbled upon your paper, and Phudge looks great! I was wondering if you considered adding it to ollama so that it can be used in an efficient manner (ollama is nice because you can run the eval even super conveniently on CPU).
Here'd be an instruction for how to convert it: https://github.com/ollama/ollama/blob/main/docs/import.md#publishing-your-model-optional--early-alpha.
PS: I am not affiliated with ollama in any way, shape, or form, I just like it :). Here's and example on using a Llama 3 model via ollama for model eval. I think doing something similar with Phudge would be cool!
The text was updated successfully, but these errors were encountered: