sglang/ GB200 perf

dsr1-fp8-1k1k-max-tpt

concurrency 6,144 · 7 days ago

cron
passed
Commit

update codeowners (#28478)

mickqian·7 days ago
PR #28478 · update codeowners
Run
GitHub Actions: 27663513953-1
Slurm job: 5315
GPUs: 48 · prefill 16 / decode 32
ISL / OSL: 1024 / 1024

Metrics

28 captured

best_of

1.00

burstiness

1.00

completed

61,440

duration

471s

max_concurrency

6,144

mean_e2el_ms

45,846ms

mean_itl_ms

1,798ms

mean_tpot_ms

36.83ms

mean_ttft_ms

11,947ms

median_e2el_ms

45,232ms

median_itl_ms

1,746ms

median_tpot_ms

37.25ms

median_ttft_ms

12,575ms

num_prompts

61,440

output_throughput

120,181tok/s

p99_e2el_ms

72,890ms

p99_itl_ms

3,390ms

p99_tpot_ms

43.13ms

p99_ttft_ms

34,516ms

peak_output_tokens_per_s

173,527tok/s

request_throughput

130req/s

std_e2el_ms

6,732ms

std_itl_ms

547ms

std_tpot_ms

4.75ms

std_ttft_ms

6,366ms

total_input_tokens

56,636,934

total_output_tokens

56,621,450

total_token_throughput

240,395tok/s
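
The throughput figures above can be cross-checked against the raw totals. The following is a minimal sketch, not the benchmark's own code: it assumes that `duration` is in seconds and that each throughput is simply a total divided by that duration. The helper name `rates` and the per-GPU figure are illustrative and do not appear in the run output.

```python
# Values copied from the run summary above.
completed = 61_440
duration_s = 471  # assumption: the duration is reported in seconds
total_input_tokens = 56_636_934
total_output_tokens = 56_621_450
gpus = 48  # prefill 16 + decode 32


def rates(completed, duration_s, input_tokens, output_tokens):
    """Recompute throughput figures from raw totals (hypothetical helper)."""
    return {
        "request_throughput": completed / duration_s,  # req/s
        "output_throughput": output_tokens / duration_s,  # tok/s
        "total_token_throughput": (input_tokens + output_tokens) / duration_s,
    }


derived = rates(completed, duration_s, total_input_tokens, total_output_tokens)
for name, value in derived.items():
    print(f"{name}: {value:,.0f}")

# Illustrative normalization: output tokens per second per GPU.
per_gpu_output = derived["output_throughput"] / gpus
print(f"output tok/s per GPU: {per_gpu_output:,.0f}")
```

Under these assumptions the recomputed values land within about 1% of the reported `request_throughput`, `output_throughput`, and `total_token_throughput`; the small gap is expected because the displayed duration is rounded.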