sglang / GB200 perf

dsr1-fp4-1k1k-mid-curve

concurrency 8,192 · 5 days ago

cron
passed
Commit

[AMD] Document Mori XGMI for Single-Node PD Disaggregation (#25094)

clintg6 · 5 days ago
Run
GitHub Actions: 27803336065-1
Slurm job: 5322
GPUs: 48 (prefill 16 / decode 32)
ISL / OSL: 1024 / 1024

Metrics

28 captured

best_of

1.00

burstiness

1.00

completed

81,920

duration

716s

max_concurrency

8,192

mean_e2el_ms

69,315ms

mean_itl_ms

1,181ms

mean_tpot_ms

24.10ms

mean_ttft_ms

47,128ms

median_e2el_ms

72,858ms

median_itl_ms

996ms

median_tpot_ms

21.51ms

median_ttft_ms

53,915ms

num_prompts

81,920

output_throughput

105,478tok/s

p99_e2el_ms

84,795ms

p99_itl_ms

3,496ms

p99_tpot_ms

42.42ms

p99_ttft_ms

59,734ms

peak_output_tokens_per_s

149,689tok/s

request_throughput

114req/s

std_e2el_ms

9,261ms

std_itl_ms

657ms

std_tpot_ms

7.28ms

std_ttft_ms

12,245ms

total_input_tokens

75,511,905

total_output_tokens

75,496,082

total_token_throughput

210,977tok/s
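
Cross-check

The throughput figures above can be roughly reproduced from the token totals and the run duration. The sketch below is a sanity check, not the benchmark's own code. It assumes that duration is in seconds and that each throughput is a simple total divided by duration; the function names and the 1% tolerance are illustrative. The displayed duration is rounded, so small gaps are expected.

```python
# Values copied from the metrics above.
COMPLETED = 81_920
DURATION_S = 716            # assumption: seconds, rounded on the page
TOTAL_INPUT_TOKENS = 75_511_905
TOTAL_OUTPUT_TOKENS = 75_496_082
GPUS = 48

# Throughputs as reported on the page.
REPORTED = {
    "request_throughput": 114,          # requests per second
    "output_throughput": 105_478,       # output tokens per second
    "total_token_throughput": 210_977,  # input + output tokens per second
}


def derive_throughputs(completed, duration_s, input_tokens, output_tokens):
    """Recompute throughputs as plain totals divided by wall-clock duration."""
    return {
        "request_throughput": completed / duration_s,
        "output_throughput": output_tokens / duration_s,
        "total_token_throughput": (input_tokens + output_tokens) / duration_s,
    }


def relative_gap(reported, derived):
    """Absolute difference between the two values, relative to the reported one."""
    return abs(reported - derived) / reported


derived = derive_throughputs(
    COMPLETED, DURATION_S, TOTAL_INPUT_TOKENS, TOTAL_OUTPUT_TOKENS
)
gaps = {name: relative_gap(REPORTED[name], derived[name]) for name in REPORTED}

# Per-GPU output throughput, a common way to compare runs of different sizes.
output_tok_per_s_per_gpu = REPORTED["output_throughput"] / GPUS

for name in REPORTED:
    print(f"{name}: reported={REPORTED[name]:,} derived={derived[name]:,.1f} "
          f"gap={gaps[name]:.3%}")
print(f"output tok/s per GPU: {output_tok_per_s_per_gpu:,.1f}")
```

If any gap came out well above 1%, the likely causes would be a different duration definition (for example, excluding warm-up) or a unit mismatch in one of the fields.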