[Text Generation][V2] LinearRouter to accept SPLIT/JOIN #1434

Draft
wants to merge 50 commits into
base: feature/damian/no_kv_cache
from

Conversation

dbogunowicz
Contributor

At the Pipeline level, there seems to be a fundamental assumption that ops is a list, not a dictionary.

To reproduce:

from deepsparse.v2.text_generation import TextGenerationPipelineNoCache

prompt = ["Some funny prompt"]

pipeline = TextGenerationPipelineNoCache(model_path="hf:mgoin/TinyStories-1M-ds",
                                         onnx_model_name="model-orig.onnx",
                                         sequence_length=20)

out = pipeline(prompt=prompt)

Traceback (most recent call last):
  File "/home/ubuntu/.cache/JetBrains/RemoteDev/dist/67886da002816_pycharm-professional-231.9225.5/plugins/python/helpers/pydev/pydevd.py", line 1496, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/ubuntu/.cache/JetBrains/RemoteDev/dist/67886da002816_pycharm-professional-231.9225.5/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/ubuntu/damian/deepsparse/hehe2.py", line 10, in <module>
    out = pipeline(prompt=prompt,
  File "/home/ubuntu/damian/deepsparse/src/deepsparse/v2/pipeline.py", line 265, in __call__
    return self.run(*args, **kwargs)
  File "/home/ubuntu/damian/deepsparse/src/deepsparse/v2/text_generation/pipeline_no_kv_cache.py", line 123, in run
    return super().run(*args, **kwargs)
  File "/home/ubuntu/damian/deepsparse/src/deepsparse/v2/pipeline.py", line 217, in run
    operator=self.ops[next_step],
KeyError: 0
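
The failure mode in the traceback can be reproduced in isolation. What follows is a minimal sketch, not the deepsparse implementation: `run_linear`, `run_linear_any`, `list_ops` and `dict_ops` are hypothetical names, and the operators are plain string functions standing in for real pipeline operators. It assumes only what the traceback shows, namely that the step counter starts at the integer 0 and is used directly as an index into ops.

```python
from typing import Callable, Dict, List, Union

Op = Callable[[str], str]


def run_linear(ops: Union[List[Op], Dict[str, Op]], value: str) -> str:
    """Walk the ops with integer steps, as a linear router would (hypothetical)."""
    next_step = 0
    while next_step < len(ops):
        # Works when ops is a list; raises KeyError when ops is a str-keyed dict.
        value = ops[next_step](value)
        next_step += 1
    return value


def run_linear_any(ops: Union[List[Op], Dict[str, Op]], value: str) -> str:
    """One possible workaround (an assumption, not the fix chosen in this PR):
    flatten a dict into its insertion-ordered operators before stepping."""
    ordered = list(ops.values()) if isinstance(ops, dict) else list(ops)
    for op in ordered:
        value = op(value)
    return value


# Stand-in operators: strip whitespace, then upper-case.
list_ops = [str.strip, str.upper]
dict_ops = {"strip": str.strip, "upper": str.upper}

result_list = run_linear(list_ops, "  hi ")

try:
    run_linear(dict_ops, "  hi ")
    error = None
except KeyError as exc:
    # Same symptom as the traceback: the integer step 0 is not a key of the dict.
    error = exc

result_dict = run_linear_any(dict_ops, "  hi ")
```

Flattening the dictionary only makes sense for a strictly linear graph. A graph with SPLIT/JOIN routes needs the keys preserved, which is presumably why ops is a dictionary here in the first place.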

bfineran and others added 30 commits October 26, 2023 13:22
… router and image classification pipeline/operators/example (#1325)

* initial functionality and working example with image classification

* remove testing image

* update args

* initial functionality and working example with image classification

* remove testing image

* pr comments

* defines schemas for operators and test

* add image classification test, PR comments

* fix input/output handling in pipeline and operator base classes to be more generic; remove context

* add additional operator input message

* typo fix
* [v2] EngineOperator updates to make continuous batching easier

* test fixes
…ity (#1348)

* initial functionality and working example with image classification

* remove testing image

* rebase fixes

* initial functionality and working example with image classification

* text gen

* updates func

* prompt inference, initial functionality

* remove image; update state docstring

* Fix typo

* add todo for split/join

* remove context, clean-up args, remove prefill_preprocess_operator

* fix docstrings
…generation functionality (#1356)

* initial functionality and working example with image classification

* remove testing image

* rebase fixes

* initial functionality and working example with image classification

* text gen

* updates func

* prompt inference, initial functionality

* remove image; update state docstring

* Fix typo

* add todo for split/join

* remove context, clean-up args, remove prefill_preprocess_operator

* fix docstrings

* initial functionality and working example with image classification

* updates func

* prompt inference, initial functionality

* finish generation operators and update routes

* further breakdown operators

* add operators

* fix can_operate condition

* update can_operate to not rely on the inference_state

* rebase + update

* fix condition

* fix capacity setting again

* typo fixes
…e code to remove repeat code, update map function
…eature/damian/v2/factor_out_transformation_utils
)

* add split/join functionality

* update router to include split/join in parent class, refactor pipeline code to remove repeat code, update map function

* process multiple generations

* move map to base class
…eature/damian/v2/factor_out_transformation_utils
* unit testing for text generation operators

* additional changes

* unit testing completion

* remove debug

* fix

* add todo

* more clean-up

* fix test

* add docstrings/comments

* break out tests to individual unit test files; add conftest and make scope of fixtures module to help with speed

* fix name
…ng and prioritization (#1373)

* [Continuous Batching] Queue Implementation to support batching grouping and prioritization

* has_key method

* thread safety

* add blocking option for pop_batch

* update docstring

* allow mutex to be shared across continuous batching objects

* revert last commit
…#1374)

* [Continuous Batching] Executor thread for running continuous batching

* quality

* ensure that executor stops when main thread does - clean up test hack
* [ContinuousBatching] ContinuousBatchingScheduler Implementation

* cleanup unnecessary stop condition
* [continuous batching] singleton pattern for scheduler

* catch from review
dbogunowicz and others added 20 commits November 14, 2023 10:46
…ating engine_inputs (#1364)

* rebasing off my initial commit

* cleanups

* unit testing for text generation operators

* additional changes

* unit testing completion

* remove debug

* fix

* add todo

* more clean-up

* fix test

* add docstrings/comments

* break out tests to individual unit test files; add conftest and make scope of fixtures module to help with speed

* Delete tests/deepsparse/v2/unit/text_generation/test_msic.py

---------

Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
…functions (#1394)

* add split/join functionality

* update router to include split/join in parent class, refactor pipeline code to remove repeat code, update map function

* process multiple generations

* initial commit

* fix error

* unit testing for text generation operators

* additional changes

* unit testing completion

* remove debug

* fix

* add todo

* more clean-up

* fix test

* add docstrings/comments

* break out tests to individual unit test files; add conftest and make scope of fixtures module to help with speed

* Delete tests/deepsparse/v2/unit/text_generation/test_msic.py

* pipeline runs, but incorrectly

* Revert "pipeline runs, but incorrectly"

This reverts commit 51c4ee6.

* PR review comments

---------

Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
* initial commit

* initial commit

* it's working now

* beautification

* thank you Dipika <3

* ready to review
…#1409)

* update split/join

* use map

* update

* run end-to-end

* clean-up

* fix bug with batch size, introduce SplitRoute dataclass

* update tests to use new inputs/outputs

* use the normal scheduler for internal kv_cache

* add pipeline inputs

* clean-up

* change engine type, update docstrings, update override function to be more generic

* move subgraph functionality to its own function; clean-up cont batching in text gen pipeline

* update linear pathway to also use subgraph execution

* rebase fix

* fix tests
* initial registry functionality

* use sparsezoo mixin
@dbogunowicz dbogunowicz changed the base branch from main to feature/damian/no_kv_cache November 28, 2023 08:05
@dbogunowicz dbogunowicz force-pushed the feature/damian/no_kv_cache branch from e0a9dee to 7f3eb12 Compare December 18, 2023 16:10
3 participants