
Optimizations #2


Merged

quic-sanising merged 34 commits into on-device-sampling from optimizations on Apr 24, 2025

Conversation

quic-sanising
Owner

No description provided.

@quic-sanising quic-sanising merged commit 50953e2 into on-device-sampling on Apr 24, 2025
quic-sanising added a commit that referenced this pull request Apr 24, 2025
* Initial commit

* Reformat code

* Fix bug

* Add Gumbel-Max trick based random sampling

* Bring up to date

* Use Gumbel-Max Trick based Random Sampling as default

* Clip k to max value

* Add docstring for sampling parameters

* Fix bug

* Add support for continuous batching

* Fix ONNX error for batch_size 1 treated as a Constant

* Undo docstring deletion

* Remove device and unnecessary reshapes

* Revert batch_size to 1

* Remove vocab_size from dynamic axes

* Change condition

* Change size of each sampling parameter to (batch_size, 1)

* Reformat code

* Add optimizations

* Identify optimizations

* Fix bug

* Fix merge issue

* Optimizations:
  - Perform random sampling only on topk_values_asc
  - Only need logits for probs when self.return_pdfs is True

* Remove where clause for temperature

* Remove boolean type casting for retain state

* Always return next_tokens

* Fix bug

* Reformat code

* Initialize retain states

* Optimize imports

* Remove torch.index_select()

* Change dtype of penalty buffers to bool

---------

Signed-off-by: quic-sanising <quic_sanising@quicinc.com>
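Several commits above refer to Gumbel-Max trick based random sampling, clipping k to its maximum value, and performing the random sampling only over the top-k values. As a minimal, self-contained sketch of that technique (pure-Python stdlib; `gumbel_max_sample` is a hypothetical helper for illustration, not the PR's actual torch/ONNX implementation):

```python
import math
import random


def gumbel_max_sample(logits, k=None, rng=random):
    """Sample an index from softmax(logits) via the Gumbel-Max trick.

    If k is given, sampling is restricted to the k largest logits,
    with k clipped to the vocabulary size. Illustrative sketch only.
    """
    indices = list(range(len(logits)))
    if k is not None:
        # Clip k to the number of logits, then keep only the top-k indices.
        k = min(k, len(logits))
        indices = sorted(indices, key=lambda i: logits[i], reverse=True)[:k]
    best_idx, best_val = indices[0], float("-inf")
    for i in indices:
        # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1).
        # argmax over (logit + g) is an exact draw from softmax(logits),
        # with no explicit normalization or cumulative-sum search.
        g = -math.log(-math.log(rng.random()))
        if logits[i] + g > best_val:
            best_idx, best_val = i, logits[i] + g
    return best_idx
```

Because the argmax of perturbed logits replaces an explicit softmax-and-sample step, the same trick works on the ascending top-k buffer alone, which is the optimization the commit messages describe.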
@quic-sanising quic-sanising deleted the optimizations branch April 24, 2025 18:21