
Use conversation template for api proxy, fix eventsource format #2383

Open · zeyugao wants to merge 6 commits into master
Conversation

zeyugao

@zeyugao zeyugao commented Jul 25, 2023

This PR adds a --chat-prompt-model parameter that enables the use of a conversation template registered in fastchat/conversation.py. As model prompt templates, such as Llama 2's, become more intricate, handling them exclusively with flags like --chat-prompt and --user-name becomes unmanageable. The community-maintained conversation templates offer a more user-friendly solution.
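To illustrate what a conversation template encapsulates, here is a hand-rolled sketch of a Llama-2-style prompt builder. This is an illustrative assumption, not FastChat's actual implementation; the real, registered templates live in fastchat/conversation.py and handle many more cases.

```python
def build_llama2_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    """Assemble a Llama-2-style chat prompt (simplified sketch only).

    `turns` is a list of (user_message, assistant_reply) pairs; the last
    reply may be "" to prompt the model for its next answer. Shows why a
    maintained template beats stitching strings with --chat-prompt and
    --user-name: the system block, [INST] markers, and </s> terminators
    all interleave in a model-specific way.
    """
    prompt = ""
    for i, (user, assistant) in enumerate(turns):
        if i == 0:
            # The system message is folded into the first user turn.
            user = f"<<SYS>>\n{system}\n<</SYS>>\n\n{user}"
        prompt += f"<s>[INST] {user} [/INST] {assistant}"
        if assistant:
            # Completed assistant turns are closed with an end-of-sequence token.
            prompt += " </s>"
    return prompt
```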

Customized system messages are pending the merge of lm-sys/FastChat#2069; still, the current fschat version should work without exceptions.

Furthermore, there is an issue with how data is emitted in the event-source (SSE) format: each data payload must end with two \n characters rather than one, i.e. each event must be terminated by a blank line. This matches what OpenAI's API does.
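The required framing can be sketched as a one-line helper (a minimal sketch; the function name is hypothetical, not from the PR):

```python
import json

def sse_chunk(payload: dict) -> str:
    """Format one server-sent event.

    Per the SSE wire format, an event is terminated by a blank line, so
    the serialized data line must end with "\n\n", not a single "\n".
    """
    return f"data: {json.dumps(payload)}\n\n"
```

A client parsing the stream splits on the blank line between events; with only one trailing \n, consecutive chunks run together and many OpenAI-compatible clients fail to parse them.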

zeyugao added 2 commits July 26, 2023 15:26
Make fschat and flask-cors optional
@Azeirah
Contributor

Azeirah commented Jul 26, 2023

Thank you! The PHPStorm plugin I was using, CodeGPT, didn't work with the main branch api_like_OAI.py.

With yours it works smoothly! With the new llama-2 based wizard-13b I finally have a usable local-only assistant that integrates seamlessly in my existing workflows.

:D

@zeyugao
Author

zeyugao commented Aug 2, 2023

The PR has been merged upstream, and due to a GitHub limitation (https://github.com/orgs/community/discussions/5634), it seems I cannot enable "Allow edits by maintainers".

@thomasbergersen

thomasbergersen commented Aug 5, 2023

Thank you! The result generated by llama-cpp-python is missing some keywords.

@vmajor

vmajor commented Aug 7, 2023

I am observing this error with a 70B Llama 2 model when running the guidance tutorial notebook (https://github.com/microsoft/guidance/blob/main/notebooks/tutorial.ipynb) after setting openai.api_base = "http://127.0.0.1:8081/":

[2023-08-07 09:31:43,210] ERROR in app: Exception on /completions [POST]
Traceback (most recent call last):
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/vmajor/llama.cpp/examples/server/api_like_OAI_fixed.py", line 225, in completion
    postData = make_postData(body, chat=False, stream=stream)
  File "/home/vmajor/llama.cpp/examples/server/api_like_OAI_fixed.py", line 106, in make_postData
    if(is_present(body, "stop")): postData["stop"] += body["stop"]
TypeError: 'NoneType' object is not iterable
127.0.0.1 - - [07/Aug/2023 09:31:43] "POST //completions HTTP/1.1" 500 -
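The traceback points at `postData["stop"] += body["stop"]` failing when `postData["stop"]` is None. A guarded merge along these lines would avoid it (a hypothetical helper sketching the fix, not the PR's actual code; OpenAI's API allows "stop" to be either a string or a list):

```python
def merge_stop(post_data: dict, body: dict) -> None:
    """Merge client-supplied stop sequences into post_data in place.

    Guards against post_data["stop"] being None before extending it,
    which is what raised the TypeError in the traceback above.
    """
    if "stop" in body and body["stop"] is not None:
        existing = post_data.get("stop") or []  # None -> []
        if isinstance(body["stop"], str):
            existing = existing + [body["stop"]]  # single stop string
        else:
            existing = existing + list(body["stop"])  # list of stops
        post_data["stop"] = existing
```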

4 participants