
Operator backfill for L2 dataset decoders (PoX-4 / sBTC / BNS) #37

@ryanwaits

Context

Phase 2 datasets (PoX-4, sBTC, BNS; tracked in C1-C3) each get a new L2 decoder plus dedicated tables. The default behavior is forward-only: decoders run from their checkpoint onward, and each dataset shows "data from cursor X forward."

Per the May 6 planning conversation, this is acceptable for v0 launch — it lets us ship the datasets faster without a heavy historical backfill pass for each.

What's deferred

A one-shot operator job that runs each new decoder over history, from block 0 up to the decoder's current checkpoint, and populates the historical rows in its dedicated table. This mirrors the existing genesis-backfill pattern documented in docker/docs/OPERATIONS.md §8 for the parquet publishers.

For each new decoder:

  • A --from-block / --to-block CLI mode (or a flag on the existing decoder consume loop)
  • A wrapper script that iterates 10K-block windows from genesis up to the decoder's current checkpoint
  • Idempotency: re-running an already-processed window is either a no-op or, by default, fails without writing duplicate rows
  • Documentation alongside the publisher backfill in OPERATIONS.md §8
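
The windowing and idempotency requirements above could be sketched roughly as follows. This is only an illustration, not the planned implementation: `iter_windows`, `run_backfill`, `decode_window`, and the in-memory `done` ledger are hypothetical names, and treating window bounds as inclusive is an assumption about how --from-block / --to-block would behave. It takes the "no-op" option for already-processed windows.

```python
from typing import Callable, Iterator, Set, Tuple

WINDOW_SIZE = 10_000  # from the issue: 10K-block windows

# (from_block, to_block); inclusive bounds are an assumption of this sketch
Window = Tuple[int, int]


def iter_windows(checkpoint: int, size: int = WINDOW_SIZE) -> Iterator[Window]:
    """Yield inclusive block windows from genesis (block 0) up to checkpoint."""
    start = 0
    while start <= checkpoint:
        end = min(start + size - 1, checkpoint)
        yield (start, end)
        start = end + 1


def run_backfill(
    checkpoint: int,
    decode_window: Callable[[int, int], None],  # hypothetical decoder entry point
    done: Set[Window],  # hypothetical ledger of finished windows
    size: int = WINDOW_SIZE,
) -> int:
    """Run windows sequentially, skipping any already recorded in `done`.

    Returns the number of windows actually processed in this call.
    """
    processed = 0
    for window in iter_windows(checkpoint, size):
        if window in done:
            continue  # re-running a finished window is a no-op
        decode_window(*window)
        done.add(window)  # record only after the decoder call returns
        processed += 1
    return processed


if __name__ == "__main__":
    ledger: Set[Window] = set()
    calls = []
    run_backfill(25_000, lambda a, b: calls.append((a, b)), ledger)
    print(calls)  # [(0, 9999), (10000, 19999), (20000, 25000)]
    # A second run over the same range does no work.
    print(run_backfill(25_000, lambda a, b: calls.append((a, b)), ledger))  # 0
```

A real job would presumably keep the ledger in the indexer DB rather than in memory, and would need a rule for the final partial window, since its upper bound moves as the checkpoint advances.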

When to run

After a Foundation grant plus customer demand make "the full historical archive in dataset form" a real ask. Until then, the forward-only path is fine: customers who need historical data hit /v1/streams/events directly.

Out of scope

  • Automatic backfill on first run (deliberate — the operator decides cost).
  • Parallelization across windows (sequential is fine; the indexer DB is the bottleneck, not the decoder).
