Phudge Ollama version #1

rasbt · 2024-06-08T13:17:20Z

Hi there,

I recently stumbled upon your paper, and Phudge looks great! I was wondering whether you have considered adding it to ollama so that it can be used efficiently (ollama is nice because it lets you run the eval conveniently, even on CPU).

Here are the instructions for converting it: https://github.com/ollama/ollama/blob/main/docs/import.md#publishing-your-model-optional--early-alpha.

PS: I am not affiliated with ollama in any way, shape, or form; I just like it :). Here's an example of using a Llama 3 model via ollama for model eval. I think doing something similar with Phudge would be cool!
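For readers who have not tried this kind of setup, here is a minimal sketch of scoring an answer with a model served by a local Ollama instance. It is not the example linked above: the prompt wording, the 1 to 5 scale, and the helper names `build_payload`, `parse_score`, and `grade` are assumptions made for illustration. It relies only on Ollama's default local endpoint `/api/generate` with `stream` set to `False`.

```python
import json
import re
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(question, answer, model="llama3"):
    """Build a non-streaming /api/generate request body with a grading prompt.

    The prompt text and scoring scale are illustrative assumptions.
    """
    prompt = (
        "Score the answer to the question on a scale from 1 to 5. "
        "Reply with the integer only.\n"
        f"Question: {question}\nAnswer: {answer}\nScore:"
    )
    return {"model": model, "prompt": prompt, "stream": False}


def parse_score(text):
    """Return the first standalone digit 1-5 in the reply, or None if absent."""
    match = re.search(r"\b[1-5]\b", text)
    return int(match.group()) if match else None


def grade(question, answer, model="llama3"):
    """Send the prompt to a running Ollama server and return the parsed score."""
    body = json.dumps(build_payload(question, answer, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        reply = json.loads(response.read())["response"]
    return parse_score(reply)
```

Calling `grade("What is 2 + 2?", "4")` needs a running server (`ollama serve`) and a pulled model (`ollama pull llama3`). A Phudge build would slot in via the `model` argument once one is published.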

deshwalmahesh · 2024-06-10T08:08:36Z

Hey @rasbt, I'm so happy to see that it somehow reached you, and thank you for the kind words. I would definitely like to add it to ollama. As I haven't worked with it personally before, I'll try my best to finish it ASAP.

d-kleine · 2024-06-12T14:38:22Z

One question about that:
https://huggingface.co/vicgalle/Phudge-3/blob/b66fb03609f66097b098e2ede782170082de56cd/config.json#L4

This seems to be a modified PHUDGE model, as it uses a Phi3ForSequenceClassification architecture, i.e., it has a linear head for sequence classification on top, rather than a Phi3ForCausalLM architecture.
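As a quick way to check this for any checkpoint, the `architectures` field of a Hugging Face `config.json` can be inspected directly. The following is a minimal sketch; the helper name `head_type` and the trimmed-down sample config are hypothetical and only mirror the architecture name mentioned above.

```python
import json


def head_type(config):
    """Classify a Hugging Face config dict by its `architectures` entry."""
    names = config.get("architectures") or []
    if any(name.endswith("ForSequenceClassification") for name in names):
        return "sequence-classification"
    if any(name.endswith("ForCausalLM") for name in names):
        return "causal-lm"
    return "unknown"


# Hypothetical, trimmed-down config contents for illustration only.
sample = json.loads('{"architectures": ["Phi3ForSequenceClassification"]}')
print(head_type(sample))  # prints: sequence-classification
```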

Is there any way to download your vanilla PHUDGE model with a Phi3ForCausalLM architecture?

I have seen your notebook on Kaggle. I was able to set up a CausalLM, but there is no PEFT model for CausalLM. For LORA_PATH = "./input/lora_128_REG" it says:

The model 'PeftModelForSequenceClassification' is not supported for text-generation.

Can you please provide a LoRA model for CausalLM (PeftModelForCausalLM) here, like lora_128_CAUSAL?

deshwalmahesh · 2024-06-13T15:06:25Z

Yes @d-kleine, the PHUDGE model that gives the best results uses a classification layer. If you read the paper, you'll see that the causal head not only makes it slower but also gives around 4% worse results. I also trained a causal version, but I need to look for the weights.

In the meantime, you can train your own causal model end to end with the training script: create the data from this notebook and then train using this script.

d-kleine · 2024-08-07T09:13:20Z

@deshwalmahesh Any update on this?
