Enabled configurable auto Tensor Parallelism (TP) for the inference of diverse models #11456
