MachineLearningSystem
Popular repositories
- 25ASPLOS-Medusa Public Forked from thustorage/Medusa
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
- 24MLSYS-prompt-cache Public Forked from yale-sys/prompt-cache
Modular and structured prompt caching for low-latency LLM inference
Python 6
- Optimus-CC Public
[ASPLOS'23] Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression
- Awesome-DL-Scheduling-Papers Public Forked from S-Lab-System-Group/Awesome-DL-Scheduling-Papers
Repositories
- OSDI25-PipeANN Public Forked from thustorage/PipeANN
[OSDI'25] Achieving Low-Latency Graph-Based Vector Search via Aligning Best-First Search Algorithm with SSD
- 25ASPLOS-Hetu-Galvatron Public Forked from PKU-DAIR/Hetu-Galvatron
Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs).
- specreason Public Forked from ruipeterpan/specreason
PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [arXiv '25]
- Triton-distributed Public Forked from ByteDance-Seed/Triton-distributed
Distributed Triton for Parallel Systems
- 25NSDI-ByteCheckpoint Public Forked from ByteDance-Seed/ByteCheckpoint
ByteCheckpoint: A Unified Checkpointing Library for LFMs
- 25ASPLOS-Ayo Public Forked from NetX-lab/Ayo
[ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo
- async_rlhf Public Forked from mnoukhov/async_rlhf
Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models