Disaggregated serving #365
Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
QEfficient/base/modeling_qeff.py
Outdated
@@ -300,7 +308,7 @@ def _compile(
command.append(f"-custom-IO-list-file={custom_io_yaml}")

# Write mdp_config.json file
if mdp_ts_num_devices > 1:
if not mdp_ts_json_path:
Suggested condition: `if not mdp_ts_json_path and mdp_ts_num_devices > 1:`
not addressed yet?
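The condition suggested in this thread can be sketched as a small standalone helper. This is only an illustration, not the code in `_compile`: the function name `resolve_mdp_config`, the `compile_dir` parameter, and the JSON payload are hypothetical, and the sketch assumes a user-supplied `mdp_ts_json_path` should be used as-is regardless of device count.

```python
import json
from pathlib import Path
from typing import Optional


def resolve_mdp_config(
    compile_dir: Path,
    mdp_ts_num_devices: int,
    mdp_ts_json_path: Optional[str] = None,
) -> Optional[Path]:
    """Return the MDP config file to hand to the compiler, or None.

    Hypothetical helper for illustration; not part of QEfficient.
    """
    # Condition from the review comment: generate a default file only when
    # the caller supplied no path AND more than one device is requested.
    if not mdp_ts_json_path and mdp_ts_num_devices > 1:
        default_path = compile_dir / "mdp_config.json"
        # Placeholder payload: the real schema is not shown in this thread.
        payload = {"devices": list(range(mdp_ts_num_devices))}
        default_path.write_text(json.dumps(payload, indent=4))
        return default_path

    # Assumption: a user-supplied file is passed through untouched.
    if mdp_ts_json_path:
        return Path(mdp_ts_json_path)

    # Single device and no user file: nothing to write.
    return None
```

With a single combined check, the single-device case without a user file writes nothing, while a caller who passes their own JSON never has it overwritten by a generated default.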
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <168134249+ochougul@users.noreply.github.com>
Is the `prefill_only` flag available only for `QEFFAutoModelForCausalLM`? Why don't we support it for other classes as well, such as multimodal models?
Adding support for:
1. `prefill_only`
2. `compile_for` for VLM
3. `mdp_ts_json_path`

---------

Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <168134249+ochougul@users.noreply.github.com>
Co-authored-by: Rishin Raj <quic_rishinr@quicinc.com>
Co-authored-by: Onkar Chougule <quic_ochougul@quicinc.com>
Co-authored-by: Onkar Chougule <168134249+ochougul@users.noreply.github.com>
Adding support for:
- `prefill_only`
- `mdp_ts_json_path`