This repository was archived by the owner on Oct 23, 2024. It is now read-only.

Proof-of-concept gRPC stream record-and-replay for Triton Inference Server


# triton-mock

## Overview

A proof-of-concept mock server for the NVIDIA Triton Inference Server. It operates in two modes:

  1. Replay mode: the mock server serves previously recorded responses to incoming requests.

  2. Record mode: the mock server forwards requests to a real Triton Inference Server and records the request/response streams to disk.

The mock server uses the gRPC definitions from the Triton Inference Server.
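The record-and-replay idea can be sketched in plain Rust. This is an illustrative model only, not the crate's actual API: a recording maps each serialized request to the response captured from the real server, and replay mode answers by looking requests up in that map. The real server records whole gRPC streams; this sketch uses single messages.

```rust
use std::collections::HashMap;

/// A minimal record-and-replay store: serialized request bytes are the
/// key, the captured response bytes are the value.
struct Store {
    recordings: HashMap<Vec<u8>, Vec<u8>>,
}

impl Store {
    fn new() -> Self {
        Store { recordings: HashMap::new() }
    }

    /// Record mode: remember the response the upstream server gave.
    fn record(&mut self, request: &[u8], response: &[u8]) {
        self.recordings.insert(request.to_vec(), response.to_vec());
    }

    /// Replay mode: answer from the recording, if one exists.
    fn replay(&self, request: &[u8]) -> Option<&[u8]> {
        self.recordings.get(request).map(|r| r.as_slice())
    }
}

fn main() {
    let mut store = Store::new();
    // Record a request/response pair as if proxied to a real Triton server.
    store.record(b"ModelInferRequest:model_a", b"ModelInferResponse:ok");
    // Replay returns the captured response for a matching request...
    assert_eq!(
        store.replay(b"ModelInferRequest:model_a"),
        Some(&b"ModelInferResponse:ok"[..])
    );
    // ...and nothing for a request that was never recorded.
    assert!(store.replay(b"ModelInferRequest:unknown").is_none());
}
```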

To run in recording mode:

```shell
RUST_LOG=debug cargo run --release -- --remote-host 0.0.0.0 --record
```

This requires a real Triton Inference Server listening on ports 8302-8307. The mapping from model names to ports is currently hard-coded in `src/main.rs`.
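The hard-coded mapping might look something like the sketch below. The model names and per-model port assignments here are placeholders invented for illustration; the actual table lives in `src/main.rs`.

```rust
/// Illustrative model-name-to-port table covering the six upstream
/// Triton ports (8302-8307). The model names are placeholders, not
/// the ones hard-coded in the real src/main.rs.
fn port_for_model(model: &str) -> Option<u16> {
    match model {
        "model_a" => Some(8302),
        "model_b" => Some(8303),
        "model_c" => Some(8304),
        "model_d" => Some(8305),
        "model_e" => Some(8306),
        "model_f" => Some(8307),
        _ => None, // unknown models have no upstream server
    }
}

fn main() {
    // Known models resolve to one of the six upstream ports.
    assert_eq!(port_for_model("model_a"), Some(8302));
    assert_eq!(port_for_model("model_f"), Some(8307));
    // Unknown models resolve to nothing.
    assert_eq!(port_for_model("something_else"), None);
}
```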

These recordings can be replayed using:

```shell
RUST_LOG=debug cargo run --release -- --remote-host 0.0.0.0
```

## Releasing

See `PUBLISHING.md`.