Describe the bug
When running batched generation with `VLLMModel._greedy_until`, context-length checks were based only on the first prompt in the batch (`len(inputs[0])`) instead of the longest prompt.
If the first prompt was short but another prompt in the same batch was longer, truncation could be skipped incorrectly, causing some samples to exceed `max_length`.
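A minimal sketch of the problematic pattern (function and variable names are illustrative, not the actual lighteval source):

```python
def should_truncate_buggy(inputs: list[list[int]], max_new_tokens: int, max_length: int) -> bool:
    # BUG: only inspects the first prompt in the batch; longer prompts
    # later in the same batch are never considered.
    context_size = len(inputs[0])
    return context_size + max_new_tokens > max_length
```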
To Reproduce
- Use lighteval with the vLLM backend and configure a finite `max_length`.
- Create a batch with prompts of different lengths, where:
  - the first prompt is short,
  - at least one later prompt is long enough that `prompt_len + max_new_tokens > max_length`.
- Run a generation call that reaches `_greedy_until` (e.g. a normal evaluation batch with `max_new_tokens` set).
- Observe that truncation/logging decisions are made from the first prompt's length, so longer prompts in the same batch may not be truncated as required.
Minimal example logic (conceptual):
- `max_length = 1024`
- `max_new_tokens = 200`
- prompt lengths in the same batch: `[100, 950]`
- old behavior checks `100 + 200`, decides no truncation, but the second sample actually needs truncation (`950 + 200 > 1024`).
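The same arithmetic as a runnable check (illustrative values from above):

```python
max_length = 1024
max_new_tokens = 200
prompt_lengths = [100, 950]  # token counts of the prompts in one batch

# Old behavior: decide from the first prompt only.
first_fits = prompt_lengths[0] + max_new_tokens <= max_length
print(first_fits)  # True -> truncation skipped for the whole batch

# Per-sample reality: the second prompt overflows (950 + 200 = 1150 > 1024).
overflows = [p + max_new_tokens > max_length for p in prompt_lengths]
print(overflows)  # [False, True]
```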
Expected behavior
Truncation decisions should use the worst-case prompt length in the batch (the maximum prompt length), so all samples remain within `max_length`.
Warnings should clearly indicate batch-aware length handling.
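One possible batch-aware fix, as a sketch rather than the exact patch (helper name and warning text are hypothetical):

```python
import logging

logger = logging.getLogger(__name__)

def truncate_batch(inputs: list[list[int]], max_new_tokens: int, max_length: int) -> list[list[int]]:
    """Hypothetical helper: decide truncation from the longest prompt in the
    batch so every sample stays within max_length after generation."""
    longest = max(len(tokens) for tokens in inputs)
    if longest + max_new_tokens > max_length:
        budget = max_length - max_new_tokens
        logger.warning(
            "Longest prompt in batch (%d tokens) + max_new_tokens (%d) exceeds "
            "max_length (%d); truncating over-long prompts to %d tokens.",
            longest, max_new_tokens, max_length, budget,
        )
        # Keep the last `budget` tokens of each over-long prompt.
        inputs = [tokens[-budget:] if len(tokens) > budget else tokens for tokens in inputs]
    return inputs
```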
Version info
- lighteval main/33acf35f02c41d234c7df5cbdf1fd3e9d33ecd76