
Adds guidance extension #2554

Closed
wants to merge 5 commits into from

Conversation

paolorechia

What is this
Adds a small API wrapper around the guidance library (https://github.com/microsoft/guidance), using the model loaded by oobabooga's UI.

Use cases in mind
Implementation of chain-of-thought flows (and more complex flows) with guidance, using oobabooga as the model loader, so people can easily load GPTQ and other model types.
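As an illustration of the chain-of-thought use case mentioned above, a guidance prompt template might look like the following. This is a hypothetical template written for this description, not code from the PR; `question` is an input variable and `reasoning`/`answer` are generated fields.

```python
# Hypothetical chain-of-thought prompt in guidance's handlebars-style syntax.
# {{question}} would be filled in by the caller; the two {{gen ...}} slots
# are produced by the model.
cot_template = """Question: {{question}}
Let's think step by step.
Reasoning: {{gen 'reasoning' temperature=0 max_tokens=200}}
Final answer: {{gen 'answer' temperature=0 max_tokens=30}}"""

# A second prompt could then consume the generated 'answer' as its input,
# chaining several guidance calls into a larger flow.
```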

Why

I tried adding support for oobabooga's existing API to the guidance library (https://github.com/paolorechia/local-guidance); however, not all features could be supported, since guidance depends on logprobs and other data fields from the Hugging Face API for certain features.

Limitations
Only works for Hugging Face and GPTQ models so far. Someone has already opened a PR in guidance to add support for the llama-cpp Python bindings: guidance-ai/guidance#70. Once that lands, we'll be able to support LLaMA GGML models as well.

How to use
The exposed API endpoint can easily be used through the thin wrapper andromeda-chain:

pip install andromeda-chain

Repository: https://github.com/ChuloAI/andromeda-chain
Example code:

from andromeda_chain import AndromedaChain, AndromedaPrompt, AndromedaResponse

chain = AndromedaChain("http://0.0.0.0:9000/guidance_api/v1/generate")

prompt = AndromedaPrompt(
    name="hello",
    prompt_template="""Howdy: {{gen 'expert_names' temperature=0 max_tokens=300}}""",
    input_vars=[],
    output_vars=["expert_names"]
)

response: AndromedaResponse = chain.run_guidance_prompt(prompt)

Alternatively, the extension can be used by implementing a simple HTTP client directly.
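As a sketch of such a client using only the standard library: the JSON field names below mirror the AndromedaPrompt fields from the example above, but the exact request schema the endpoint expects is an assumption, not documented in this PR.

```python
import json
import urllib.request

def build_payload(prompt_template, input_vars, output_vars):
    """Assemble a request body mirroring the AndromedaPrompt fields.
    The field names the endpoint actually expects are an assumption."""
    return {
        "prompt_template": prompt_template,
        "input_vars": input_vars,
        "output_vars": output_vars,
    }

def call_guidance_api(url, payload):
    """POST the JSON payload to the extension's endpoint and decode the reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_payload(
    "Howdy: {{gen 'expert_names' temperature=0 max_tokens=300}}",
    {},
    ["expert_names"],
)
# With the extension running locally, the call would then be:
# result = call_guidance_api("http://0.0.0.0:9000/guidance_api/v1/generate", payload)
```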

@bilwis

bilwis commented Jun 10, 2023

EDIT: Okay, nvm, it's late here. Turns out I just wasn't loading the guidance extension 🤦. I'm going to leave this up for posterity.

One small suggestion, though: the help text in parser.add_argument('--guidance-device', type=str, default='cuda', help='The listening port for the blocking guidance API.') should probably say something else. I'm definitely blaming it for my thinking it'd be the network device...
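For reference, a corrected flag definition might read as follows; the help wording here is a suggestion, not text from the PR:

```python
import argparse

parser = argparse.ArgumentParser()
# Suggested help string: the flag selects the torch device guidance
# loads the model on, not a network setting.
parser.add_argument(
    '--guidance-device', type=str, default='cuda',
    help='Device on which guidance runs the model (e.g. cuda, cuda:0, cpu).',
)

args = parser.parse_args(['--guidance-device', 'cuda:0'])
print(args.guidance_device)  # cuda:0
```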

Thanks again for your work! Can't wait to try it out when I've had some sleep 😄


When running your fork with python server.py --listen-port 7500 --xformers --api --guidance --guidance-port 8000 --guidance-device 0.0.0.0 --listen, trying to connect to the API with

from andromeda_chain import AndromedaChain, AndromedaPrompt, AndromedaResponse

chain = AndromedaChain("http://localhost:8000/guidance_api/v1/generate")

prompt = AndromedaPrompt(
    name="hello",
    prompt_template="""Howdy: {{gen 'expert_names' temperature=0 max_tokens=300}}""",
    input_vars=[],
    output_vars=["expert_names"]
)

response: AndromedaResponse = chain.run_guidance_prompt(prompt)

results in the following error:

ConnectionError: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /guidance_api/v1/generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f53cc622800>: Failed to establish a new connection: [Errno 111] Connection refused'))

Reinstalling dependencies, loading the model before the extensions, accessing from different ports, etc., all didn't solve the issue.

Not sure what the problem is here, and sadly my coding knowledge doesn't reach far enough to tackle it. Hopefully you can shine a light on it. Thanks for the work on the repo, by the way! I'd love to get guidance running without having to (re)load the model in my notebooks all the time.
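For anyone hitting the same "Connection refused" error: one generic way to check whether anything is actually listening on the guidance port is a small stdlib probe (a diagnostic sketch, not specific to this extension):

```python
import socket

def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# If this prints False, nothing is listening on the guidance port, which
# would match the 'Connection refused' error above (e.g. the extension
# never started).
print(is_port_open("localhost", 8000))
```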

I'm running on WSL2 (Ubuntu 22.04.2 LTS), installed packages are as follows:

accelerate 0.20.3
aiofiles 23.1.0
aiohttp 3.8.4
aiosignal 1.3.1
altair 4.2.2
andromeda-chain 0.2
annoy 1.17.2
anyio 3.6.2
asciitree 0.3.3
asttokens 2.2.1
async-timeout 4.0.2
attrs 22.2.0
auto-gptq 0.2.2+cu117
backcall 0.2.0
backoff 2.2.1
beautifulsoup4 4.12.2
bitsandbytes 0.39.0
blinker 1.6.2
blis 0.7.9
boto3 1.26.137
botocore 1.29.137
brotlipy 0.7.0
cachetools 5.3.1
catalogue 2.0.8
certifi 2022.12.7
cffi 1.15.1
charset-normalizer 2.0.4
chromadb 0.3.18
click 8.1.3
clickhouse-connect 0.5.24
cmake 3.26.3
colorama 0.4.6
confection 0.0.4
contourpy 1.0.7
cryptography 39.0.1
cycler 0.11.0
cymem 2.0.7
datasets 2.10.1
decorator 5.1.1
deepspeed 0.8.2
dill 0.3.6
diskcache 5.6.1
docopt 0.6.2
duckdb 0.8.0
einops 0.6.1
en-core-web-sm 3.5.0
encodec 0.1.1
entrypoints 0.4
exceptiongroup 1.1.1
executing 1.2.0
fastapi 0.95.0
fasteners 0.18
ffmpy 0.3.0
filelock 3.9.0
Flask 2.3.2
flask-cloudflared 0.0.12
flexgen 0.1.7
flit_core 3.6.0
fonttools 4.39.2
frozenlist 1.3.3
fsspec 2023.3.0
funcy 2.0
gmpy2 2.1.2
gptcache 0.1.30
gptq-llama 0.2.2
gradio 3.33.1
gradio_client 0.2.5
guidance 0.0.61
h11 0.14.0
hjson 3.1.0
hnswlib 0.7.0
httpcore 0.16.3
httptools 0.5.0
httpx 0.23.3
huggingface-hub 0.14.1
idna 3.4
iniconfig 2.0.0
ipython 8.13.2
itsdangerous 2.1.2
jedi 0.18.2
Jinja2 3.1.2
jmespath 1.0.1
joblib 1.2.0
jsonschema 4.17.3
kiwisolver 1.4.4
langcodes 3.3.0
linkify-it-py 2.0.0
lit 16.0.5
llama-cpp-python 0.1.57
lz4 4.3.2
Markdown 3.4.3
markdown-it-py 2.2.0
MarkupSafe 2.1.1
matplotlib 3.7.1
matplotlib-inline 0.1.6
mdit-py-plugins 0.3.3
mdurl 0.1.2
mkl-fft 1.3.1
mkl-random 1.2.2
mkl-service 2.4.0
monotonic 1.6
mpmath 1.2.1
msal 1.22.0
multidict 6.0.4
multiprocess 0.70.14
murmurhash 1.0.9
mypy-extensions 1.0.0
nest-asyncio 1.5.6
networkx 2.8.4
ninja 1.11.1
nltk 3.8.1
num2words 0.5.12
numcodecs 0.11.0
numpy 1.24.2
openai 0.27.8
orjson 3.8.7
packaging 23.0
pandas 1.5.3
parsimonious 0.10.0
parso 0.8.3
pathy 0.10.1
peft 0.4.0.dev0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.5.0
pip 23.1.2
platformdirs 3.5.3
pluggy 1.0.0
posthog 2.4.2
preshed 3.0.8
prompt-toolkit 3.0.38
protobuf 3.20.2
psutil 5.9.4
ptyprocess 0.7.0
PuLP 2.7.0
pure-eval 0.2.2
py-cpuinfo 9.0.0
pyarrow 11.0.0
pycparser 2.21
pycryptodome 3.17
pydantic 1.10.6
pydub 0.25.1
Pygments 2.15.1
pygtrie 2.5.0
PyJWT 2.7.0
pyOpenSSL 23.0.0
pyparsing 3.0.9
pyre-extensions 0.0.29
pyrsistent 0.19.3
PySocks 1.7.1
pytest 7.2.2
python-dateutil 2.8.2
python-dotenv 1.0.0
python-multipart 0.0.6
pytz 2022.7.1
PyYAML 6.0
regex 2022.10.31
requests 2.28.2
responses 0.18.0
rfc3986 1.5.0
rouge 1.0.1
rwkv 0.7.3
s3transfer 0.6.1
safetensors 0.3.1
scikit-learn 1.2.2
scipy 1.10.1
semantic-version 2.10.0
sentence-transformers 2.2.2
sentencepiece 0.1.97
setuptools 67.8.0
six 1.16.0
smart-open 6.3.0
sniffio 1.3.0
soupsieve 2.4.1
spacy 3.5.3
spacy-legacy 3.0.12
spacy-loggers 1.0.4
srsly 2.4.6
stack-data 0.6.2
starlette 0.26.1
suno-bark 0.0.1a0
sympy 1.11.1
texttable 1.6.7
thinc 8.1.10
threadpoolctl 3.1.0
tiktoken 0.4.0
tokenizers 0.13.3
toml 0.10.2
tomli 2.0.1
toolz 0.12.0
torch 2.0.0
torchaudio 2.0.0
torchvision 0.15.0
tqdm 4.65.0
traitlets 5.9.0
transformers 4.30.0
triton 2.0.0
typer 0.7.0
typing_extensions 4.5.0
typing-inspect 0.8.0
uc-micro-py 1.0.1
urllib3 1.26.14
uvicorn 0.21.1
uvloop 0.17.0
wasabi 1.1.1
watchfiles 0.19.0
wcwidth 0.2.6
websockets 11.0.2
Werkzeug 2.3.6
wheel 0.40.0
xformers 0.0.19
xxhash 3.2.0
yarl 1.8.2
zarr 2.14.2
zstandard 0.21.0

@paolorechia
Author

Hi @bilwis, thanks for trying it out and finding this issue. I was going to point out that the argument looked weird, but you found it yourself first :)

I used the API extension code as a base, so I forgot to update the description; sorry about the confusion. I'll fix the description hopefully tomorrow or during the week.

Let me know how it goes for you.

@bilwis

bilwis commented Jun 11, 2023

Here we are again, well rested but still stupid. I've got a question about passing input variables. You've got the input_vars field in AndromedaPrompt, but does this actually do anything? From what I've figured out, you have to pass the variables to the run_guidance_prompt command.

input_vars = {'word': 'Howdy'}

prompt = AndromedaPrompt(
    name = 'test',
    prompt_template = """{{word}}: {{gen 'response'}}""",
    input_vars = [], #What do I put here?
    output_vars = ['response']
)

response: AndromedaResponse = chain.run_guidance_prompt(prompt, input_vars)

Again, thanks for your work, I hope it'll be merged soon, and that more people get to play around with guidance.

@paolorechia
Author

> I've got a question about passing input variables. You've got the input_vars field in AndromedaPrompt, but does this actually do anything? From what I've figured out, you have to pass the variables to the run_guidance_prompt command.

My fault that the documentation is not clear. You should pass a dictionary of values for the variables that are not generated. You already have it defined as {"word": "Howdy"} in the example.
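To illustrate what that dictionary does, here is a toy sketch of the substitution step the server performs on plain (non-generated) variables. This is an illustration written for this thread, not the extension's actual code, and declaring the variable name in input_vars as shown in the comment is an assumption about the intended usage:

```python
import re

def fill_input_vars(template, values):
    """Toy illustration: substitute plain {{name}} variables from a dict.
    {{gen ...}} slots are left untouched for the model to generate."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(values[m.group(1)]), template)

template = "{{word}}: {{gen 'response'}}"
filled = fill_input_vars(template, {"word": "Howdy"})
print(filled)  # Howdy: {{gen 'response'}}

# With andromeda-chain, the equivalent would presumably be:
#   prompt = AndromedaPrompt(..., input_vars=["word"], output_vars=["response"])
#   response = chain.run_guidance_prompt(prompt, {"word": "Howdy"})
```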

@oobabooga
Owner

Could you please submit the extension to https://github.com/oobabooga/text-generation-webui-extensions? I am not familiar enough with guidance to properly maintain the extension in the future, and would prefer to have something integrated with the UI rather than an additional API.
