sglang / GB200 perf

dsr1-fp8-1k1k-max-tpt

concurrency 1,024 · 4 days ago

manual
passed
Commit

fix dependencies

csahithi·4 days ago
Run
GitHub Actions: 24591826053-2
Slurm job: 4667
GPUs: 48 (prefill 16 / decode 32)
ISL / OSL: 1024 / 1024

Metrics

27 captured

best_of: 1.00
burstiness: 1.00
completed: 10,240
duration: 205 s
max_concurrency: 1,024
mean_e2el_ms: 19,385 ms
mean_itl_ms: 873 ms
mean_tpot_ms: 17.91 ms
mean_ttft_ms: 2,889 ms
median_e2el_ms: 18,585 ms
median_itl_ms: 881 ms
median_tpot_ms: 17.74 ms
median_ttft_ms: 2,023 ms
num_prompts: 10,240
output_throughput: 46,123 tok/s
p99_e2el_ms: 35,311 ms
p99_itl_ms: 1,404 ms
p99_tpot_ms: 18.90 ms
p99_ttft_ms: 18,039 ms
request_throughput: 50.03 req/s
std_e2el_ms: 3,576 ms
std_itl_ms: 153 ms
std_tpot_ms: 0.39 ms
std_ttft_ms: 3,227 ms
total_input_tokens: 9,440,068
total_output_tokens: 9,440,624
total_token_throughput: 92,244 tok/s
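
The throughput figures above can be cross-checked against the raw counts. The sketch below is a rough consistency check, not part of the benchmark harness. It assumes that `duration` is in seconds, that `request_throughput` is requests per second, and that each throughput is the corresponding total divided by the duration. The helper name `derive_throughputs` and the per-GPU figure are illustrative additions; the dashboard does not report them. Because the displayed duration is rounded, the recomputed values are only expected to agree to within a fraction of a percent.

```python
# Values copied from the run summary above.
completed = 10_240
duration_s = 205              # displayed (rounded) value; assumed to be seconds
total_input_tokens = 9_440_068
total_output_tokens = 9_440_624
total_gpus = 48               # prefill 16 + decode 32

# Throughputs as reported by the dashboard.
reported = {
    "request_throughput": 50.03,         # assumed unit: requests/s
    "output_throughput": 46_123.0,       # tok/s
    "total_token_throughput": 92_244.0,  # tok/s
}


def derive_throughputs(completed, duration_s, tokens_in, tokens_out):
    """Recompute the three throughput metrics from counts and duration."""
    return {
        "request_throughput": completed / duration_s,
        "output_throughput": tokens_out / duration_s,
        "total_token_throughput": (tokens_in + tokens_out) / duration_s,
    }


derived = derive_throughputs(
    completed, duration_s, total_input_tokens, total_output_tokens
)

# Relative difference between recomputed and reported values.
rel_diff = {
    name: abs(derived[name] - value) / value for name, value in reported.items()
}

# Illustrative normalisations (not reported by the dashboard).
output_tok_per_s_per_gpu = reported["output_throughput"] / total_gpus
mean_output_tokens = total_output_tokens / completed

for name, value in reported.items():
    print(f"{name}: reported={value:,.2f} derived={derived[name]:,.2f} "
          f"rel_diff={rel_diff[name]:.3%}")
print(f"output tok/s per GPU: {output_tok_per_s_per_gpu:,.1f}")
print(f"mean output tokens per request: {mean_output_tokens:,.1f}")
```

The mean output length comes out below the nominal OSL of 1024, which would be consistent with sampled rather than fixed request lengths; that reading is an inference from the token totals, not something the run summary states.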