lmi cpu container with vLLM #2009

Open · wants to merge 3 commits into base: master
Conversation

lanking520 (Contributor) commented Jun 1, 2024

Description

Support a CPU container build for vLLM-based LLM inference. Tested with Llama-3-8B; it worked, but was extremely slow.

# serving.properties
engine=Python
option.rolling_batch=vllm
option.model_id=NousResearch/Hermes-2-Pro-Llama-3-8B
option.tensor_parallel_degree=1
@lanking520 lanking520 requested review from zachgk, frankfliu and a team as code owners June 1, 2024 18:24
@lanking520 lanking520 changed the title [WIP] lmi cpu container with vLLM lmi cpu container with vLLM Jun 3, 2024
VLLM_TARGET_DEVICE=cpu python3 setup.py bdist_wheel
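For context, the quoted line above builds a CPU-only vLLM wheel by setting `VLLM_TARGET_DEVICE=cpu` at build time. A minimal sketch of how such a build stage might look in the Dockerfile (the base image, package names, and requirements file here are illustrative assumptions, not the PR's actual contents):

```dockerfile
# Sketch of a hypothetical CPU wheel build stage
FROM ubuntu:22.04 AS vllm-cpu-build
RUN apt-get update && apt-get install -y python3 python3-pip git g++
RUN git clone https://github.com/vllm-project/vllm.git /vllm
WORKDIR /vllm
# Assumption: install vLLM's CPU-specific Python dependencies first
RUN pip3 install -r requirements-cpu.txt
# Target the CPU backend instead of CUDA when building the wheel
RUN VLLM_TARGET_DEVICE=cpu python3 setup.py bdist_wheel
```

The resulting wheel in `dist/` can then be copied into the final serving image so the runtime stage does not need the build toolchain.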


FROM base AS lmi-cpu

I thought there could only be one FROM in each Dockerfile. I may be wrong, but I just want to check.
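For reference: multiple FROM statements are valid in a multi-stage Dockerfile (supported since Docker 17.05). Each FROM begins a new build stage, later stages can copy artifacts from earlier ones, and `docker build --target` selects which stage to produce. A minimal sketch (stage and image names are illustrative, not the PR's actual Dockerfile):

```dockerfile
# Each FROM starts a new stage; AS names the stage
FROM ubuntu:22.04 AS base
RUN apt-get update && apt-get install -y python3

# A later stage can extend an earlier named stage
FROM base AS lmi-cpu
# CPU-specific layers would go here

# Artifacts can also be copied across stages:
#   COPY --from=base /some/path /some/path
```

Building only the CPU variant would then be `docker build --target lmi-cpu -t lmi:cpu .`, which is the usual pattern for shipping several container flavors from one Dockerfile.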
