Skip to content

Fix viewer join broadcasting device registry DELETE changesets#366

Closed
cursor[bot] wants to merge 1 commit into
mainfrom
cursor/critical-bug-detection-85aa
Closed

Fix viewer join broadcasting device registry DELETE changesets#366
cursor[bot] wants to merge 1 commit into
mainfrom
cursor/critical-bug-detection-85aa

Conversation

@cursor
Copy link
Copy Markdown
Contributor

@cursor cursor Bot commented May 26, 2026

Bug and impact

Cluster-wide device registry corruption (data loss)
When a desktop connects to a remote brain via sync.connectToBrain, clearClusterRegistryForViewerJoin deletes all rows in the CRR-backed devices and sync_cluster_state tables locally, then flushLocalChanges pushes those DELETE changesets to the brain. The brain applies them via applyChanges, which can tombstone every paired device for the project cluster-wide. The same pending DELETEs can reach iOS peers when the machine becomes local brain again after disconnectFromBrain.

Concrete trigger: Machine A is the sync brain with phone + desktop B registered. Desktop B connects to A as viewer → local registry clear → successful connect → flushLocalChanges → brain merges mass DELETEs → paired devices disappear from sync status.

Root cause

  • devices / sync_cluster_state are CRR tables; SQL DELETE generates crsql_changes rows for the local site.
  • Viewer join intentionally clears stale local registry state but immediately flushed those unpublished deletes upstream.
  • disconnectFromBrain uses the same clear path without isolating changes from relay.

Fix

  • Add AdeDb.sync.discardUnpublishedChangesForTables() and call it from clearClusterRegistryForViewerJoin for devices and sync_cluster_state so local clears are not relayed.
  • Reset the peer outbound cursor via syncPeerService.acknowledgeLocalDbVersion() before connecting so any stray version gap cannot replay discarded batches.

Validation

  • npm run test -- src/main/services/sync/deviceRegistryService.test.ts -t "viewer join clear" (runs when cr-sqlite is available; skipped on Linux CI without the extension)
  • Updated AdeDb sync mocks in affected unit tests

Does not overlap open PRs #363 (CRR alter / remote switch binding), #364 (orchestration approval bypass), or #365 (orchestration stale manifest).

Open in Web View Automation 

When a desktop connects to a remote brain as viewer, clearClusterRegistryForViewerJoin
wiped local devices/sync_cluster_state and flushLocalChanges relayed those DELETEs to
the brain, which could tombstone the entire paired device registry cluster-wide.

Discard unpublished crsql_changes for those tables after the local clear and reset the
peer outbound cursor before connecting so only post-handshake updates (e.g. touchLocalDevice)
are synced.

Co-authored-by: Arul Sharma <arul28@users.noreply.github.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented May 26, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
ade Ignored Ignored May 26, 2026 9:23am

Copy link
Copy Markdown
Owner

arul28 commented May 28, 2026

Folded this finding into the active files-fix work. I kept the viewer-join protection, but changed the implementation away from directly deleting from crsql_changes because the current cr-sqlite runtime rejects DELETEs against that table. The fix now persists a local-only suppression watermark for devices and sync_cluster_state, filters those local clears out of exports, acknowledges the peer cursor before reconnecting, and adds a regression test covering the viewer clear path.

@arul28 arul28 closed this May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants