Why
I am building a multi-agent orchestrator/ADE on top of the Cloudflare stack (@cloudflare/sandbox + Agents SDK). Right now lifecycle transitions are inferred by polling and side effects, which adds latency and creates race conditions around start/stop/destroy transitions.
Sandbox already has internal lifecycle hook points (onStart, onStop, destroy in packages/sandbox/src/sandbox.ts), so this proposal makes those transitions available as a stable public event contract.
Proposal
Phase 1 (MVP): add a first-class lifecycle event API in SDK/DO.
- Event types: sandbox.started, sandbox.stopped, sandbox.destroyed.
- Each event includes eventId, eventType, sandboxId, timestamp, optional traceId, and optional event-specific data.
- Add sandbox.listLifecycleEvents(options) with cursor + limit + optional type filter, so orchestrators can consume events without polling internal state.
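To make the Phase 1 shape concrete, here is a minimal TypeScript sketch of what the event contract and the paginated list call could look like. Only the event type names and field names come from this proposal. The interface names, the `types` and `nextCursor` fields, the default limit, the index-based cursor encoding, and the `InMemoryLifecycleJournal` class are assumptions for illustration, not existing @cloudflare/sandbox APIs.

```typescript
// Hypothetical contract: field names follow the proposal, everything else is assumed.
type LifecycleEventType =
  | "sandbox.started"
  | "sandbox.stopped"
  | "sandbox.destroyed";

interface LifecycleEvent {
  eventId: string; // unique per event, used by consumers for dedupe
  eventType: LifecycleEventType;
  sandboxId: string;
  timestamp: string; // ISO 8601 is an assumption
  traceId?: string;
  data?: Record<string, unknown>; // event-specific payload
}

interface ListLifecycleEventsOptions {
  cursor?: string; // opaque to callers; here it encodes a journal index
  limit?: number;
  types?: LifecycleEventType[];
}

interface ListLifecycleEventsResult {
  events: LifecycleEvent[];
  nextCursor: string | null; // null once the journal is exhausted
}

// In-memory stand-in for the journal that would back listLifecycleEvents.
class InMemoryLifecycleJournal {
  private readonly events: LifecycleEvent[] = [];

  append(event: LifecycleEvent): void {
    this.events.push(event);
  }

  list(options: ListLifecycleEventsOptions = {}): ListLifecycleEventsResult {
    const limit = Math.max(1, options.limit ?? 50);
    const start =
      options.cursor === undefined ? 0 : Number.parseInt(options.cursor, 10);
    if (!Number.isInteger(start) || start < 0) {
      throw new Error("invalid cursor");
    }

    const page: LifecycleEvent[] = [];
    let next = start; // index of the first event not yet examined
    for (const event of this.events.slice(start)) {
      if (page.length >= limit) break;
      next += 1;
      if (!options.types || options.types.includes(event.eventType)) {
        page.push(event);
      }
    }
    return {
      events: page,
      nextCursor: next < this.events.length ? String(next) : null,
    };
  }
}
```

In this sketch the cursor advances past filtered-out events as well, so a consumer that filters by type still makes progress through the journal on every call.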
Phase 2: add outbound webhook delivery on top of the same event source.
Semantics and scope
- Delivery semantics: at-least-once, best-effort ordering
- Consumers dedupe using eventId
- Non-goals: exactly-once guarantees and global strict ordering
- Backward compatibility: additive only
- Out of scope for initial implementation: webhook management UI and a large integrations framework.
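With at-least-once delivery, a consumer can see the same event more than once, so the handler side needs to be idempotent. The sketch below shows one way a consumer might dedupe on eventId. The `DedupingConsumer` class and the `LifecycleEventLike` type are hypothetical names for illustration, and the in-memory Set is an assumption; a real orchestrator would likely persist and bound the set of seen IDs.

```typescript
// Hypothetical minimal event shape; only eventId matters for dedupe.
interface LifecycleEventLike {
  eventId: string;
  eventType: string;
  sandboxId: string;
}

class DedupingConsumer {
  private readonly seen = new Set<string>();
  private readonly handler: (event: LifecycleEventLike) => void;

  constructor(handler: (event: LifecycleEventLike) => void) {
    this.handler = handler;
  }

  // Returns true if the event was processed, false if it was a duplicate.
  handle(event: LifecycleEventLike): boolean {
    if (this.seen.has(event.eventId)) {
      return false;
    }
    // Record the ID only after the handler succeeds, so a redelivery
    // retries an event whose handler threw.
    this.handler(event);
    this.seen.add(event.eventId);
    return true;
  }
}

// Usage: the second delivery of the same eventId is ignored.
const processed: string[] = [];
const consumer = new DedupingConsumer((event) => {
  processed.push(`${event.eventType}:${event.sandboxId}`);
});
const started = { eventId: "evt-1", eventType: "sandbox.started", sandboxId: "sb-1" };
consumer.handle(started);
consumer.handle(started);
```

Because ordering is only best-effort, a consumer built this way should also tolerate seeing sandbox.stopped before a delayed sandbox.started for the same sandbox.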
Maintainer feedback requested
- Should sandbox.stopped represent only DO stop, or any container sleep transition?
- Is an event journal + list API the preferred shape for the first release?
- Should webhook delivery live in the core repo (phase 2) or in an example/integration package first?
Proposed delivery plan
Planned sequence: PR1 establishes the lifecycle event contract. PR2 adds the paginated listing API. PR3+ (including webhook delivery) builds on that contract after maintainer sign-off on PR1 semantics.
If this direction looks good, I’ll start with PR1 (event contract + emission + tests) and follow with PR2+.