forked from vllm-project/vllm
Pull requests: HabanaAI/vllm-fork
Open pull requests:

- #902 Automatic Prefix Caching - ux (habana: Issues or PRs submitted by Habana Labs), opened Mar 10, 2025 by adobrzyn
- #897 Synchronize vLLM flags to support cross-node inference, opened Mar 7, 2025 by IT-Forrest
- #892 [SW-221458] Synchronization between HPU and CPU for more precise TTFT measurement, opened Mar 6, 2025 by yuwenzho
- #891 Bump jinja2 from 3.1.4 to 3.1.6 (dependencies: Pull requests that update a dependency file; python: Pull requests that update python code), opened Mar 6, 2025 by dependabot[bot]
- #880 Added the logic to fix the warmup phase for spec decoding when enforce_eager is not used, opened Feb 28, 2025 by pallavijaini0525
- #870 [Gaudi][Model] Qwen2.5-vl (New Model: Issue or PR to enable a new model), opened Feb 26, 2025 by malkomes
- #857 Update requirements-hpu.txt for open telemetry tracing support, opened Feb 21, 2025 by louie-tsai
- #854 Enable multi-modal embedding for TIGER-Lab/VLM2Vec-Full T+I on HPU, opened Feb 20, 2025 by libinta
- #831 Draft: Another attempt at v1 HPU integration (Draft, 22 of 24 tasks), opened Feb 14, 2025 by kzawora-intel
- #824 Extend accuracy tests for models that we support, opened Feb 13, 2025 by AnetaKaczynska
- #817 Update documentation to reflect current bucket defaults, opened Feb 12, 2025 by nngokhale
- #793 Support qwenvl model for HPU (New Model: Issue or PR to enable a new model), opened Feb 7, 2025 by yingjie-han
- #792 [DEEPSEEK_V3/R1] includes features of fp8 dequant, MLA, Expert parallelism, opened Feb 6, 2025 by xuechendi
- #755 [DO NOT MERGE][PoC] Mark dynamic shapes in torch.compile mode (Draft), opened Jan 29, 2025 by kzawora-intel