A high-throughput and memory-efficient inference and serving engine for LLMs
Package for writing high-level code for parallel high-performance stencil computations that can be deployed on both GPUs and CPUs
Purplecoin/XPU Core integration/staging tree
A cloud-native bare-metal networking orchestration tool