sglang/ GB200 perf

dsr1-fp4-1k1k-mid-curve

concurrency 4,096 · 19 days ago

cron
passed
Commit

docs: sync legacy docs/-only updates into docs_new (Mintlify) (#27308)

zijiexia · 19 days ago
Run
GitHub Actions: 26993287488-1
Slurm job: 5119
GPUs: 48 · prefill 16 / decode 32
ISL / OSL: 1024 / 1024

Metrics

28 captured

best_of

1.00

burstiness

1.00

completed

40,960

duration

312s

max_concurrency

4,096

mean_e2el_ms

30,300ms

mean_itl_ms

1,112ms

mean_tpot_ms

22.70ms

mean_ttft_ms

9,407ms

median_e2el_ms

30,286ms

median_itl_ms

1,102ms

median_tpot_ms

22.83ms

median_ttft_ms

10,432ms

num_prompts

40,960

output_throughput

121,003tok/s

p99_e2el_ms

46,569ms

p99_itl_ms

2,205ms

p99_tpot_ms

28.02ms

p99_ttft_ms

22,300ms

peak_output_tokens_per_s

158,549tok/s

request_throughput

131req/s

std_e2el_ms

4,137ms

std_itl_ms

322ms

std_tpot_ms

3.11ms

std_ttft_ms

4,021ms

total_input_tokens

37,769,666

total_output_tokens

37,742,239

total_token_throughput

242,093tok/s
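
The throughput figures above can be cross-checked against the raw counts. The sketch below is only a sanity check, not the benchmark's own code: it assumes `duration` is in seconds and that each throughput is simply a count divided by that duration. The variable names and the per-GPU figure are illustrative and do not appear on the page. Because the page shows the duration rounded to a whole second, the results agree with the reported values only approximately.

```python
# Values copied from the run summary above.
completed = 40_960
duration_s = 312                # assumed to be seconds; shown rounded on the page
total_input_tokens = 37_769_666
total_output_tokens = 37_742_239
num_gpus = 48                   # prefill 16 + decode 32

# Assumed definitions: each throughput is a count divided by wall-clock duration.
request_throughput = completed / duration_s
output_throughput = total_output_tokens / duration_s
total_token_throughput = (total_input_tokens + total_output_tokens) / duration_s

# Derived figures that the page does not report (illustrative only).
output_per_gpu = output_throughput / num_gpus
mean_output_len = total_output_tokens / completed

print(f"request_throughput     ~ {request_throughput:.1f} req/s")
print(f"output_throughput      ~ {output_throughput:,.0f} tok/s")
print(f"total_token_throughput ~ {total_token_throughput:,.0f} tok/s")
print(f"output per GPU         ~ {output_per_gpu:,.0f} tok/s")
print(f"mean output length     ~ {mean_output_len:.0f} tokens")
```

The recomputed request rate comes out in requests per second, and the two token rates land within about 0.1% of the reported 121,003 and 242,093 tok/s. That is consistent with the rounded duration being slightly longer than the measured one.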