triton-mock

Overview

A proof-of-concept mock server for the NVIDIA Triton Inference Server. It operates in two modes:

Replay mode: the mock server replays requests it has previously recorded.
Record mode: the mock server records the requests it sees and saves them to disk.
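
The two modes can be pictured with a minimal sketch. It assumes each recorded request is stored together with the response the real server returned, and it keeps everything in memory rather than on disk. `Mode`, `Recorder`, and `handle` are hypothetical names for illustration, not the types used in src/main.rs.

```rust
use std::collections::HashMap;

/// The two modes described above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Record,
    Replay,
}

/// Hypothetical in-memory store mapping raw request bytes to response bytes.
/// Assumption: the real mock persists recordings to disk instead.
struct Recorder {
    mode: Mode,
    seen: HashMap<Vec<u8>, Vec<u8>>,
}

impl Recorder {
    fn new(mode: Mode) -> Self {
        Recorder { mode, seen: HashMap::new() }
    }

    /// Handle one request. `upstream` stands in for the call to a real
    /// Triton server and is only invoked in record mode.
    fn handle<F>(&mut self, request: &[u8], upstream: F) -> Option<Vec<u8>>
    where
        F: FnOnce(&[u8]) -> Vec<u8>,
    {
        match self.mode {
            Mode::Record => {
                // Forward to the real server and remember the answer.
                let response = upstream(request);
                self.seen.insert(request.to_vec(), response.clone());
                Some(response)
            }
            // Answer only from what was recorded earlier.
            Mode::Replay => self.seen.get(request).cloned(),
        }
    }
}
```

In this sketch a request that was never recorded yields `None` in replay mode; how the actual server treats unknown requests is not stated here.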

The mock server uses the gRPC definitions from the Triton Inference Server.

To run in record mode:

RUST_LOG=debug cargo run --release -- --remote-host 0.0.0.0 --record

Record mode requires a real Triton Inference Server listening on ports 8302-8307. The mapping of model names to ports is currently hard-coded in src/main.rs.
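
A hard-coded mapping of this kind might look roughly like the sketch below. Only the port range 8302-8307 comes from this README; the model names, the function names, and the `http://` scheme are placeholders, so check src/main.rs for the real values.

```rust
/// Hypothetical model-to-port table. The model names are placeholders;
/// only the port range 8302-8307 is taken from the README.
fn port_for_model(model: &str) -> Option<u16> {
    match model {
        "model_a" => Some(8302),
        "model_b" => Some(8303),
        "model_c" => Some(8304),
        "model_d" => Some(8305),
        "model_e" => Some(8306),
        "model_f" => Some(8307),
        _ => None,
    }
}

/// Build the upstream address for a model, given the value passed to
/// `--remote-host`. The `http://` scheme is an assumption.
fn upstream_addr(remote_host: &str, model: &str) -> Option<String> {
    port_for_model(model).map(|port| format!("http://{}:{}", remote_host, port))
}
```

Returning `Option` keeps an unknown model name from silently routing to the wrong port.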

To replay these recordings, run the same command without --record:

RUST_LOG=debug cargo run --release -- --remote-host 0.0.0.0
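
The two invocations differ only in the `--record` flag. As a rough illustration of how those two flags could be interpreted, here is a standard-library-only sketch; `Options` and `parse_args` are hypothetical, and the project may well use an argument-parsing crate instead.

```rust
/// Hypothetical view of the two flags used in the commands above.
#[derive(Debug, PartialEq)]
struct Options {
    remote_host: Option<String>,
    record: bool,
}

/// Parse the arguments that follow `--` on the command line.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Options, String> {
    let mut opts = Options { remote_host: None, record: false };
    let mut iter = args.into_iter();
    while let Some(arg) = iter.next() {
        match arg.as_str() {
            // Presence of the flag selects record mode; absence means replay.
            "--record" => opts.record = true,
            "--remote-host" => {
                let host = iter.next().ok_or("--remote-host needs a value")?;
                opts.remote_host = Some(host);
            }
            other => return Err(format!("unknown argument: {}", other)),
        }
    }
    Ok(opts)
}
```

In a real binary the input would typically be `std::env::args().skip(1)`.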

Releasing

See PUBLISHING.md for release instructions.

YurtsAI/triton-mock

Folders and files

Latest commit

History

Repository files navigation

triton-mock

Overview

Releasing

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 11

Packages 0

Languages

Packages