A high-throughput and memory-efficient inference and serving engine for LLMs
Topics: amd, cuda, inference, pytorch, transformer, llama, gpt, rocm, model-serving, tpu, hpu, mlops, xpu, llm, inferentia, llmops, llm-serving, trainium
Updated Dec 25, 2024 - Python