Enabled high-performance Automatic Tensor Parallelism (auto TP) for the Qwen2-MoE and DeepSeek-V2 models on multiple GPUs/HPUs #4018
Run time
Job | Run time |
---|---|
6h 0m 14s | |
6h 0m 14s |