[AsyncGRPO] aiohttp limits to 100 reqs when max_inflight_tasks > 100 #5847

By default, `max_inflight_tasks = max_staleness * per_device_train_batch_size * gradient_accumulation_steps * num_processes`.
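To make the default concrete, here is a small worked example of that product. The function name and the specific values are hypothetical, picked only so the result lands above 100; they are not taken from an actual config.

```python
def default_max_inflight_tasks(
    max_staleness: int,
    per_device_train_batch_size: int,
    gradient_accumulation_steps: int,
    num_processes: int,
) -> int:
    """Product described in the issue; this helper is illustrative, not library code."""
    return (
        max_staleness
        * per_device_train_batch_size
        * gradient_accumulation_steps
        * num_processes
    )


# Hypothetical values: 4 * 8 * 2 * 4 = 256, well above a 100-connection cap.
inflight = default_max_inflight_tasks(
    max_staleness=4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    num_processes=4,
)
print(inflight)
```

With moderate settings like these the default already exceeds 100, so the cap is easy to hit.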

The `aiohttp` TCP connector used by `AsyncRolloutWorker` allows at most 100 simultaneous connections by default (`aiohttp.TCPConnector(limit=100)`). As a result, vLLM never has more than 100 requests running, even when `max_inflight_tasks > 100`:

```
# My max_inflight_tasks = 256 and yet I only have 100 vLLM reqs

Engine 000: Avg prompt throughput: 255.0 tokens/s, Avg generation throughput: 4757.7 tokens/s
Running: 100 reqs, Waiting: 0 reqs, GPU KV cache usage: 12.7%, Prefix cache hit rate: 0.0%
         ^^^^^^^^           ^^^^^^
``` 

Is this behavior intended, e.g. for stability? If not, would it make sense to set the connector limit to `max(100, max_inflight_tasks)` so that vLLM receives all `max_inflight_tasks` requests? Users could then control the number of running requests on the vLLM side with `--max-num-seqs`.
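A minimal sketch of what I have in mind, assuming the worker builds its own `aiohttp.ClientSession`. The helper names `connector_limit` and `open_session` are hypothetical and this is not the actual `AsyncRolloutWorker` code; only `aiohttp.TCPConnector(limit=...)` and `aiohttp.ClientSession(connector=...)` are real aiohttp API.

```python
import asyncio

import aiohttp


def connector_limit(max_inflight_tasks: int, floor: int = 100) -> int:
    """Proposed limit: never below aiohttp's default of 100 (hypothetical helper)."""
    return max(floor, max_inflight_tasks)


async def open_session(max_inflight_tasks: int) -> aiohttp.ClientSession:
    """Create a session whose connection pool can hold every in-flight task."""
    # The connector is created inside a coroutine so that an event loop is running.
    connector = aiohttp.TCPConnector(limit=connector_limit(max_inflight_tasks))
    return aiohttp.ClientSession(connector=connector)


async def main() -> int:
    session = await open_session(max_inflight_tasks=256)
    try:
        # The connector exposes the configured pool size as `limit`.
        return session.connector.limit
    finally:
        await session.close()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Keeping 100 as a floor means configurations below the current default behave exactly as they do today; only larger `max_inflight_tasks` values would open more connections.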

I can move this to Discussions if needed. Thanks in advance!
