
How to build the local retrieval corpus #2

@qiaodan-cuhk

Hi,

Thanks for your impressive work on DeepResearch-9K and the R1 framework! It provides a very solid baseline for agentic RL research. The paper mentions a low-cost autonomous pipeline for synthesizing the "Wikipedia-style summaries" used in the local retriever.

However, I couldn't find the scripts for this synthesis process in the repository or on Hugging Face. Could you confirm whether the current retrieval scripts in the repo are designed for this summary-based corpus or for the original Wiki-18 dataset used in Search-R1? If it's the former, are there any scripts available for building such a corpus?

Thanks for the great contribution!

Best,
