(feat): per-IOC locking #165
Open
anderslindho wants to merge 5 commits into
Previously abort() returned the error on CancelledError and raised a new one on other failures. Either way the chain was left errored, so any queued commit (notably the disconnect transaction) was never executed, and IOCs could stay Active in CF after an upload failure.

Co-authored-by: Sky Brewer <sky.brewer@ess.eu>
The single global DeferredLock serialised all IOC commits behind one queue. Under load, the lock depth determined how long IOCs waited to be marked active or inactive, and one stuck commit blocked every other. Per-IOC locks let different IOCs commit in parallel while keeping same-IOC transactions serialised.

_state_lock (threading.Lock) guards the shared iocs/channel_ioc_ids dicts against concurrent thread writes. The CF push receives a targeted snapshot of the channels relevant to this commit (records being deleted plus the IOC's current channel set) so snapshot cost scales with the commit rather than total channel count. _ioc_channels tracks which channels belong to each IOC for O(1) disconnect cost; registration deduplicates to prevent double-counting.

stopService drains all in-flight per-IOC locks before running the clean_on_stop sweep, preventing Active pushes from racing the sweep.

The commit path moves to @inlineCallbacks so retry waits use task.deferLater, with no sleeping worker threads between attempts.

Also introduces CFUpdateAbortedError to distinguish exhausted CF retries from cancellation, and fixes chain_error to surface unexpected exceptions rather than swallowing them.

Co-authored-by: Sky Brewer <sky.brewer@ess.eu>
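The per-IOC locking idea above can be sketched roughly as follows. This is a minimal illustration only, not the code in this PR: it uses asyncio.Lock from the standard library as a stand-in for Twisted's DeferredLock, and the names PerIocLocks, run, active, and demo are hypothetical. It shows same-IOC commits running one after another, a different IOC proceeding in parallel, and lock entries being pruned once nothing holds or waits on them.

```python
import asyncio


class PerIocLocks:
    """Hypothetical registry holding one lock per IOC id (assumption, not the PR's class)."""

    def __init__(self):
        # iocid -> [lock, number of commits currently holding or waiting on it]
        self._entries = {}

    async def run(self, iocid, commit):
        entry = self._entries.get(iocid)
        if entry is None:
            entry = self._entries[iocid] = [asyncio.Lock(), 0]
        entry[1] += 1
        try:
            # Commits for the same IOC queue here; other IOCs use other locks.
            async with entry[0]:
                return await commit()
        finally:
            entry[1] -= 1
            if entry[1] == 0:
                # Prune the entry so the registry does not grow without bound.
                del self._entries[iocid]

    def active(self):
        """Return the IOC ids that still have a commit in flight or queued."""
        return set(self._entries)


async def demo():
    locks = PerIocLocks()
    log = []

    def make_commit(label, delay):
        async def commit():
            log.append("start " + label)
            await asyncio.sleep(delay)  # stands in for the CF push
            log.append("end " + label)

        return commit

    # Awaiting every in-flight commit is also the shape of a "drain" step
    # that would run before a shutdown sweep.
    await asyncio.gather(
        locks.run("ioc-a", make_commit("a1", 0.05)),
        locks.run("ioc-a", make_commit("a2", 0.01)),
        locks.run("ioc-b", make_commit("b1", 0.01)),
    )
    return log, locks.active()
```

Calling `asyncio.run(demo())` returns the event log and the set of IOC ids still registered. In the log, "a2" only starts after "a1" ends, while "b1" starts without waiting for "a1".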
deferToThread is monkeypatched to return synchronously-resolved Deferreds so the full _commit_with_lock callback chain can be exercised without a reactor. The prepare_result tuple matches the (ioc_info, record_info_by_name, records_to_delete, iocs_snap, ciids_snap) signature of _prepare_commit; the push phase is controlled via _push_to_cf_async. Covers lock identity, iocid routing, prune lifecycle, and all four chain_error paths.

Co-authored-by: Sky Brewer <sky.brewer@ess.eu>
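The testing trick described above, replacing the thread hand-off with something that resolves immediately, might look like this in outline. The sketch deliberately avoids Twisted and uses concurrent.futures from the standard library as a stand-in. The names run_in_thread, prepare, commit, resolved, and commit_synchronously are all hypothetical and do not appear in the PR. With Twisted, the analogous test double would return an already-fired Deferred, for example via defer.succeed.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from unittest import mock

_pool = ThreadPoolExecutor(max_workers=2)
inline_calls = []  # records every call handled by the test double


def run_in_thread(fn, *args):
    """Production path (hypothetical): hand work to a worker thread."""
    return _pool.submit(fn, *args)


def prepare(record_count):
    """Stand-in for the blocking 'prepare' phase of a commit."""
    return {"records": record_count}


def commit(record_count):
    """Hypothetical commit: prepare off-thread, then act on the result."""
    prepared = run_in_thread(prepare, record_count)
    return prepared.result()["records"] * 2


def resolved(fn, *args):
    """Test double: run fn inline and return an already-completed Future."""
    inline_calls.append(fn.__name__)
    fut = Future()
    try:
        fut.set_result(fn(*args))
    except Exception as exc:
        fut.set_exception(exc)
    return fut


def commit_synchronously(record_count):
    """Run commit() with the thread hand-off swapped for the inline double."""
    with mock.patch.dict(globals(), {"run_in_thread": resolved}):
        return commit(record_count)
```

Because the double completes before returning, the whole chain after the hand-off runs in the calling thread, so a test can assert on its outcome directly without a worker pool or an event loop.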
The global lock is gone; update demo.conf and recast.py to reflect that pushAlwaysRetry and maxActive now operate per-IOC, and fix an incorrect default value and a long-standing typo in the same pass. Also rename single-letter CastFactory identifiers (P/P2 → proto/waiting, addr → _addr) to satisfy the linter, and correct the _stop_service_with_lock docstring, which still claimed the lifecycle lock prevents concurrent commits (commits now use per-IOC locks and are drained explicitly before the clean_on_stop sweep).
application.py unconditionally overwrote session.trlimit with the config value, which defaulted to 0 (no limit), silently negating the trlimit=5000 class default added in 89be20c. Reference CollectionSession.trlimit directly so the default cannot silently drift if the class default ever changes, and add a regression test.
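A rough sketch of the fix pattern described above: use the class attribute itself as the config fallback instead of a hard-coded literal. This is illustrative only. The real application.py may read its configuration differently, and the `[recceiver]` section name, the `configured_trlimit` helper, and the stand-in CollectionSession class below are assumptions made for the example. Only the trlimit=5000 class default comes from the commit message.

```python
from configparser import ConfigParser


class CollectionSession:
    """Stand-in for the real class; only the default matters here."""

    trlimit = 5000  # class default mentioned in the commit message


def configured_trlimit(conf_text):
    """Return trlimit from config text, falling back to the class default.

    The "recceiver" section name is an assumption for this sketch.
    """
    parser = ConfigParser()
    parser.read_string(conf_text)
    # Referencing the class attribute means the fallback follows the class
    # default if it ever changes, instead of drifting from a copied literal.
    return parser.getint("recceiver", "trlimit", fallback=CollectionSession.trlimit)
```

With this shape, a config that omits trlimit yields the class default, while an explicit `trlimit = 0` in the config still takes effect as a deliberate choice.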
No description provided.