Disaggregated serving #365

Merged
merged 15 commits into from
Apr 18, 2025

quic-amitraj
Contributor

@quic-amitraj quic-amitraj commented Apr 16, 2025

Adds support for:

  1. prefill_only
  2. mdp_ts_json_path

quic-rishinr and others added 9 commits April 16, 2025 05:30
Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
@@ -300,7 +308,7 @@ def _compile(
command.append(f"-custom-IO-list-file={custom_io_yaml}")

# Write mdp_config.json file
if mdp_ts_num_devices > 1:
if not mdp_ts_json_path:

Suggested condition: `if not mdp_ts_json_path and mdp_ts_num_devices > 1:`

not addressed yet?
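
The condition suggested in this review thread can be sketched as a small standalone helper. This is only an illustration, not the code merged in this PR: the function name `resolve_mdp_config`, the generated file name, and the placeholder JSON contents are all assumptions made for the example. Only the names `mdp_ts_json_path` and `mdp_ts_num_devices` and the combined condition come from the thread above.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def resolve_mdp_config(
    compile_dir: Path,
    mdp_ts_num_devices: int,
    mdp_ts_json_path: Optional[str] = None,
) -> Optional[Path]:
    """Hypothetical helper: pick the partition config file to use, if any."""
    # Condition from the review comment: generate a config only when the
    # caller supplied no path AND more than one device is requested.
    if not mdp_ts_json_path and mdp_ts_num_devices > 1:
        generated = compile_dir / f"mdp_ts_{mdp_ts_num_devices}.json"  # assumed file name
        # Placeholder contents; the real config schema is not shown in this thread.
        placeholder = {"devices": list(range(mdp_ts_num_devices))}
        generated.write_text(json.dumps(placeholder, indent=4))
        return generated
    # Otherwise use the caller's file when one was given, else no config at all.
    return Path(mdp_ts_json_path) if mdp_ts_json_path else None


# Usage: exercise the three cases in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    single_device = resolve_mdp_config(out_dir, 1)
    multi_device = resolve_mdp_config(out_dir, 4)
    user_supplied = resolve_mdp_config(out_dir, 4, "user_mdp.json")
    multi_device_contents = json.loads(multi_device.read_text())
```

With a single combined condition, a user-supplied path always takes precedence, and a single-device compile never writes a config file.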

ochougul and others added 4 commits April 18, 2025 14:25
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <168134249+ochougul@users.noreply.github.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
@ochougul ochougul marked this pull request as ready for review April 18, 2025 11:02
@ochougul ochougul requested a review from quic-rishinr as a code owner April 18, 2025 11:02
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
@quic-amitraj quic-amitraj merged commit 3de4072 into main Apr 18, 2025
5 checks passed
@quic-xiyushi

Is the prefill_only flag available only for QEFFAutoModelForCausalLM? Why don't we support it for other classes as well, such as multimodal models?

eplatero97 pushed a commit to eplatero97/efficient-transformers that referenced this pull request Apr 29, 2025
Adding support of-
1. `prefill_only`
2. `compile_for` for VLM
3. `mdp_ts_json_path`

---------

Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <168134249+ochougul@users.noreply.github.com>
Co-authored-by: Rishin Raj <quic_rishinr@quicinc.com>
Co-authored-by: Onkar Chougule <quic_ochougul@quicinc.com>
Co-authored-by: Onkar Chougule <168134249+ochougul@users.noreply.github.com>
@quic-rishinr quic-rishinr deleted the dist_serve branch June 13, 2025 08:39
4 participants