This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

💡 [REQUEST] - How can the CPU-based qwen-cpp be wrapped as an HTTP service? #65

Open
micronetboy opened this issue Dec 14, 2023 · 4 comments
Labels: question (Further information is requested)

Comments

@micronetboy

Start Date

No response

Implementation PR

How can the CPU-based qwen-cpp be wrapped as an HTTP service?

Reference Issues

Summary

Basic Example

Drawbacks

Unresolved Questions

No response

@micronetboy added the "question (Further information is requested)" label on Dec 14, 2023
@jklj077
Collaborator

jklj077 commented Dec 14, 2023

If you want an HTTP API service: qwen-cpp has a Python binding, so swapping the model in openai_api.py might work.
If you want an HTTP web service: web_demo.py would likewise need its model-creation part replaced.

If you need a C/C++ implementation of the model, we suggest following llama.cpp, which now also supports Qwen and has a richer ecosystem.
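To illustrate the first suggestion, here is a minimal sketch of an HTTP wrapper using only the Python standard library. The generation function is injected so the HTTP layer stays model-agnostic; the `qwen_cpp.Pipeline` call shown in the comments is an assumption about the binding's API (check the qwen.cpp README for the actual names and paths), and the model/tokenizer file names are placeholders.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# To serve the real model, the qwen-cpp Python binding would be plugged in
# roughly like this (API names are assumptions, verify against the binding):
#   import qwen_cpp
#   pipeline = qwen_cpp.Pipeline("qwen7b-ggml.bin", "qwen.tiktoken")
#   generate = lambda prompt: pipeline.chat([prompt])

def make_handler(generate):
    """Build a request handler that answers POST {"prompt": ...} with JSON."""
    class ChatHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = json.loads(self.rfile.read(length) or b"{}")
            reply = generate(body.get("prompt", ""))
            payload = json.dumps({"response": reply}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)
    return ChatHandler

def serve(generate, host="0.0.0.0", port=8000):
    """Block forever serving the given generate(prompt) -> str function."""
    HTTPServer((host, port), make_handler(generate)).serve_forever()

# Usage (blocks): serve(lambda prompt: pipeline.chat([prompt]))
```

Note this handles one request at a time, which matches the single-request limitation discussed below in this thread.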

@sheiy

sheiy commented Dec 19, 2023

@jklj077 May I ask how to make openai_api.py support concurrent requests?

@jklj077
Collaborator

jklj077 commented Dec 20, 2023

@sheiy The openai_api.py in this repo cannot handle concurrent requests. If you need concurrency, we recommend FastChat + vLLM, which can also expose an OpenAI-compatible API.
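For reference, a FastChat + vLLM deployment typically runs three processes: a controller, a vLLM worker, and the OpenAI-compatible API server. The commands below are a sketch assuming `fschat` and `vllm` are installed; the model path and flags (e.g. `--trust-remote-code`, needed for Qwen's custom modeling code) should be verified against the FastChat documentation for your installed version.

```shell
# 1. Controller: coordinates workers.
python -m fastchat.serve.controller &

# 2. vLLM worker: loads the model with vLLM's batched, concurrent engine.
python -m fastchat.serve.vllm_worker \
    --model-path Qwen/Qwen-7B-Chat \
    --trust-remote-code &

# 3. OpenAI-compatible API server: exposes /v1/chat/completions etc.
python -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000
```

Concurrency comes from the vLLM worker's continuous batching, so multiple simultaneous requests to port 8000 are served without the one-at-a-time limitation of openai_api.py.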

@jklj077 jklj077 transferred this issue from QwenLM/Qwen Dec 20, 2023
@sheiy

sheiy commented Dec 22, 2023

@jklj077 Thanks!


4 participants