fix: align correct-sample rewards with DP-local lengths #1900

Open
miamia0 wants to merge 2 commits into THUDM:main from miamia0:fix/correct-sample-raw-reward-alignment

miamia0 commented May 10, 2026

No description provided.

1 participant