Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Phudge Ollama version #1

Open
rasbt opened this issue Jun 8, 2024 · 4 comments
Open

Phudge Ollama version #1

rasbt opened this issue Jun 8, 2024 · 4 comments

Comments

@rasbt
Copy link

rasbt commented Jun 8, 2024

Hi there,

I recently stumbled upon your paper, and Phudge looks great! I was wondering if you considered adding it to ollama so that it can be used in an efficient manner (ollama is nice because you can run the eval even super conveniently on CPU).

Here'd be an instruction for how to convert it: https://github.com/ollama/ollama/blob/main/docs/import.md#publishing-your-model-optional--early-alpha.

PS: I am not affiliated with ollama in any way, shape, or form, I just like it :). Here's and example on using a Llama 3 model via ollama for model eval. I think doing something similar with Phudge would be cool!

@deshwalmahesh
Copy link
Owner

Hey @rasbt so happy to see that it reached to you somehow and thank you for the kind words. Would definitely add it to ollama. As I haven't worked personally in the past, will try my best to finish it ASAP.

@d-kleine
Copy link

d-kleine commented Jun 12, 2024

One question to that:
https://huggingface.co/vicgalle/Phudge-3/blob/b66fb03609f66097b098e2ede782170082de56cd/config.json#L4

This seems to be a modified PHUDGE model as it uses a Phi3ForSequenceClassification architecture, so having a have a linear layer head for sequence classification on top, instead of a Phi3ForCausalLM architecture.

Is there any way to download your vanilla PHUDGE model having a Phi3ForCausalLM architecture?

I have seen your notebook on Kaggle. I was able to set up a CausalLM, but there is no PEFT model for CausalLM. For LORA_PATH = "./input/lora_128_REG" it says:

The model 'PeftModelForSequenceClassification' is not supported for text-generation.

Can you please provide a LoRA model for CausalLM (PeftModelForCausalLM) here, like lora_128_CAUSAL?

@deshwalmahesh
Copy link
Owner

Yes @d-kleine the PHUDGE model, which gives the best results uses a Classification Layer. If you read the paper you'll know that the Causal head makes it not only slower but has around ~4% worse results. I trained it on Causal also but need to check for weights.

In the meantime, you can use the training script to train your Causal model end to end by creating data from this notebook and then training using this script

@d-kleine
Copy link

d-kleine commented Aug 7, 2024

@deshwalmahesh Any update on this?

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants