Fix JVM <clinit> deadlock by removing static final accessor fields #48689
jeet1995 merged 6 commits into Azure:main
Closing: bridge classes don't allow adding new methods. Proceeding with #48667 (Class.forName with explicit classloader).
e57066d to 66afd43
Pull request overview
This PR addresses a JVM <clinit> deadlock in the Cosmos Java SDK by removing many static final ...Accessor caches in consuming classes and switching call sites to resolve accessors lazily via ImplementationBridgeHelpers.*Helper.get*Accessor() on demand, reducing class-initialization-time cross-dependencies.
Changes:
- Replaced numerous `private static final XxxAccessor ... = getXxxAccessor()` fields with inline (lazy) getter calls at usage sites.
- Added/adjusted `static { initialize(); }` blocks and `<clinit>` ordering to ensure accessors are registered safely during class initialization where required.
- Added forked-JVM regression/enforcement tests around concurrent `<clinit>` behavior and accessor registration.
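The registration pattern these bullets refer to can be sketched roughly as follows. This is a simplified illustration, not the SDK's code: `WidgetHelper`, `WidgetAccessor`, `Widget`, and `RegistrationDemo` are hypothetical stand-ins for an `ImplementationBridgeHelpers.*Helper` class, its accessor interface, a public model class, and a caller.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for one ImplementationBridgeHelpers.*Helper class.
final class WidgetHelper {
    interface WidgetAccessor {
        String internalName(Widget widget);
    }

    private static final AtomicReference<WidgetAccessor> ACCESSOR = new AtomicReference<>();

    private WidgetHelper() { }

    // Called by Widget while it initializes; the first registration wins.
    static void setWidgetAccessor(WidgetAccessor accessor) {
        ACCESSOR.compareAndSet(null, accessor);
    }

    // Resolved on demand by callers instead of being cached in a static final field.
    static WidgetAccessor getWidgetAccessor() {
        WidgetAccessor accessor = ACCESSOR.get();
        if (accessor == null) {
            // Fallback: calling a static method on Widget forces its <clinit>.
            Widget.initialize();
            accessor = ACCESSOR.get();
        }
        return accessor;
    }
}

// Hypothetical public model class that owns the accessor implementation.
final class Widget {
    // Registration block placed first, so it runs before any later static fields.
    static { initialize(); }

    private final String name;

    Widget(String name) {
        this.name = name;
    }

    static void initialize() {
        WidgetHelper.setWidgetAccessor(widget -> widget.name);
    }
}

final class RegistrationDemo {
    public static void main(String[] args) {
        // No Widget has been touched yet, so the helper falls back to Widget.initialize().
        WidgetHelper.WidgetAccessor accessor = WidgetHelper.getWidgetAccessor();
        System.out.println(accessor.internalName(new Widget("orders")));
    }
}
```

In this sketch the accessor is available whether `Widget` is initialized first (its static block registers it) or the helper is asked first (the fallback forces `Widget` to initialize).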
Reviewed changes
Copilot reviewed 54 out of 54 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| sdk/spring/azure-spring-data-cosmos/README.md | Trailing whitespace/newline adjustment. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/util/CosmosPagedFluxStaticListImpl.java | Removed static accessor cache; inline FeedResponse accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/util/CosmosPagedFluxDefaultImpl.java | Removed static accessor cache; inline diagnostics context accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/models/FeedResponse.java | Removed static diagnostics accessor cache; inline accessor calls. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/models/CosmosOperationDetails.java | Added static { initialize(); } registration block. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/models/CosmosItemRequestOptions.java | Removed static thresholds accessor cache; inline thresholds accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/StaleResourceRetryPolicy.java | Removed static exception accessor cache. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/SessionTokenMismatchRetryPolicy.java | Removed static accessor cache; inline session retry options accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentClientImpl.java | Removed multiple static accessor caches; inline accessor usage across implementation. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/QueryPlanRetriever.java | Removed static accessor caches; inline accessors for options/exception handling. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/PipelinedQueryExecutionContext.java | Removed static accessor cache; inline accessor usage when cloning options. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/PipelinedDocumentQueryExecutionContext.java | Removed static accessor caches; inline options and serializer accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/ParallelDocumentQueryExecutionContext.java | Removed static accessor caches; inline options/diagnostics accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/OrderByUtils.java | Removed static diagnostics accessor cache; inline diagnostics accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/OrderByDocumentQueryExecutionContext.java | Removed static accessor caches; inline feed/diagnostics accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/OrderByDocumentProducer.java | Removed static feed accessor cache; inline feed accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/NonStreamingOrderByUtils.java | Removed static diagnostics accessor cache; inline diagnostics accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/NonStreamingOrderByDocumentQueryExecutionContext.java | Removed static accessor caches; inline feed/diagnostics accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/HybridSearchDocumentQueryExecutionContext.java | Removed static accessor caches; inline feed/diagnostics accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/GroupByDocumentQueryExecutionContext.java | Removed static diagnostics accessor cache; inline diagnostics accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/Fetcher.java | Removed static diagnostics accessor cache; inline diagnostics accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/DocumentQueryExecutionContextFactory.java | Inline options accessor usage in creation flow. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/DocumentQueryExecutionContextBase.java | Removed static accessor caches; inline accessor usage for request creation and cloning. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/DocumentProducer.java | Removed static accessor cache; inline accessor usage when cloning options. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/DefaultDocumentQueryExecutionContext.java | Removed static accessor cache; inline accessor usage for partition key definition/properties. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/DCountDocumentQueryExecutionContext.java | Removed static diagnostics accessor cache; inline diagnostics accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/ChangeFeedFetcher.java | Removed static feed accessor cache; inline feed accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/AggregateDocumentQueryExecutionContext.java | Removed static accessor caches; inline feed/diagnostics accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/JsonSerializable.java | Removed static serializer accessor cache; inline serializer accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ImplementationBridgeHelpers.java | Renamed thresholds accessor getter (getCosmosAsyncClientAccessor → getCosmosDiagnosticsThresholdsAccessor). |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/HttpClientConfig.java | Removed static HTTP2 config accessor cache; inline accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/Document.java | Removed static serializer accessor cache; inline serializer accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/directconnectivity/GoneAndRetryWithRetryPolicy.java | Removed static exception accessor cache; inline exception accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/DiagnosticsProvider.java | Removed multiple static accessor caches; inline accessors throughout tracing/metrics paths. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/CosmosQueryRequestOptionsImpl.java | Updated thresholds accessor getter name usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/CosmosQueryRequestOptionsBase.java | Updated thresholds accessor getter name usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ConnectionPolicy.java | Removed static HTTP2 config accessor cache; inline accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/clienttelemetry/ClientTelemetryMetrics.java | Removed static accessor caches; inline accessors for telemetry metrics recording. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/clienttelemetry/ClientMetricsDiagnosticsHandler.java | Removed static telemetry config accessor cache. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ChangeFeedQueryImpl.java | Removed static accessor caches; inline accessors for change feed request/response. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/caches/RxCollectionCache.java | Removed static exception accessor cache; inline exception accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/batch/TransactionalBulkExecutor.java | Removed static batch request options accessor cache; inline accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/batch/BulkExecutor.java | Removed static accessor caches; inline accessors for batch response and diagnostics provider. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosRequestContext.java | Added static { initialize(); } registration block. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosItemSerializer.java | Reordered <clinit> to register accessor before static fields. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosDiagnosticsContext.java | Added static { initialize(); } registration block. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosContainerProactiveInitConfig.java | Removed static container identity accessor cache; inline accessor usage. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosAsyncUser.java | Removed static accessor caches; inline accessors for query naming/feed response creation. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosAsyncScripts.java | Removed static accessor caches; inline accessors for query naming/feed response creation. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosAsyncDatabase.java | Removed static accessor caches; inline accessors for query naming/feed response creation. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosAsyncContainer.java | Removed many static accessor caches; inline accessors across request/response, policies, and telemetry. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosAsyncClient.java | Removed static accessor caches; inline telemetry/options/feed response accessors. |
| sdk/cosmos/azure-cosmos/CHANGELOG.md | Added changelog bullet for <clinit> deadlock fix. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/ImplementationBridgeHelpersTest.java | Added forked-JVM deadlock regression test and accessor registration enforcement test. |
… stale docs
- Removed remaining static final accessor fields in DocumentQueryExecutionContextFactory, CosmosQueryRequestOptionsBase, CosmosQueryRequestOptionsImpl
- Extracted local variables for long inline accessor chains in SessionTokenMismatchRetryPolicy, RxDocumentClientImpl, CosmosPagedFluxDefaultImpl
- Updated test Javadoc to reflect lazy accessor approach (not Class.forName)
- Reduced child JVM runs from 3 to 1 (invocationCount=5 provides repetition)
- Fixed CHANGELOG PR link to Azure#48689
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
4e41bc9 to 101efda
d8edfe4 to 7332533
✅ Review complete (36:01). Posted 2 inline comment(s). Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage
… null
Fixes Azure#48622, Azure#48585
Replace all static final accessor fields and inline ImplementationBridgeHelpers calls with uniform private static getter methods. This eliminates <clinit>-time class loading that caused permanent deadlocks under concurrent class initialization (JLS 12.4.2).
Fix CosmosItemSerializer.DEFAULT_SERIALIZER circular <clinit>: create instance directly and move INTERNAL_DEFAULT_SERIALIZER to parent class to prevent concurrent <clinit> between parent and child.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- ModelBridgeInternal.java: Remove 26 duplicate ImplementationBridgeHelpers imports
- ItemBulkOperation.java: Remove 2 duplicate ImplementationBridgeHelpers imports
- SqlQuerySpecWithEncryption.java: Add private static internalDefaultSerializer() getter (matching uniform pattern), replace inline accessor calls, remove unused DefaultCosmosItemSerializer import
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
/azp run java - cosmos - ci
/azp run java - cosmos - spark
/azp run java - cosmos - tests
/azp run java - cosmos - kafka
Azure Pipelines successfully started running 1 pipeline(s).
…Block test
Clarify that the test verifies accessor resolvability (via <clinit> or initializeAllAccessors fallback), not that each class independently registers its accessor. Structural enforcement is done by the companion noStaticOrInstanceAccessorFieldsInConsumingClasses test.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
/azp run java - cosmos - ci
/azp run java - cosmos - tests
/azp run java - cosmos - kafka
/azp run java - cosmos - spark
/azp run java - spring - ci
Azure Pipelines successfully started running 1 pipeline(s).
Thin-client test failures are due to service side config updates.
/check-enforcer override
Summary
Fixes a JVM `<clinit>` deadlock (#48622, #48585) that permanently hangs threads when multiple threads concurrently trigger Cosmos SDK class loading. Also fixes a latent `CosmosItemSerializer.DEFAULT_SERIALIZER` null bug.
Fixes: #48622, #48585
Root Cause
Deadlock: Consuming classes cached accessors in `private static final` fields. During `<clinit>`, the getter calls `initializeAllAccessors()`, eagerly loading 40+ classes. Concurrent `<clinit>` of different classes creates circular init-lock waits, which is a permanent deadlock per JLS §12.4.2.
DEFAULT_SERIALIZER null: `CosmosItemSerializer.DEFAULT_SERIALIZER` cross-referenced `DefaultCosmosItemSerializer.DEFAULT_SERIALIZER`. When `DefaultCosmosItemSerializer.<clinit>` ran first, recursive same-thread `<clinit>` of the parent read the child's field before it was set.
Parent-child `<clinit>` deadlock: `DefaultCosmosItemSerializer.INTERNAL_DEFAULT_SERIALIZER` was accessed independently by implementation code, triggering child `<clinit>` on a different thread than the parent and creating an AB/BA init-lock deadlock between parent and child.
Fix
1. Uniform static getter pattern
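A minimal sketch of the difference between the cached-field style and the getter style, under stated assumptions: `ThingAccessor`, `ThingHelper`, `EagerConsumer`, `LazyConsumer`, and `InitProbe` are hypothetical names introduced for illustration and are not SDK types. The lookup counter exists only to make visible when the helper is consulted.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical accessor type, standing in for an SDK accessor interface.
interface ThingAccessor {
    int size(String thing);
}

// Hypothetical helper, standing in for a *Helper.get*Accessor() method.
final class ThingHelper {
    static final AtomicInteger LOOKUPS = new AtomicInteger();

    static ThingAccessor getThingAccessor() {
        LOOKUPS.incrementAndGet(); // counts how often the helper is consulted
        return String::length;
    }
}

// Before: the lookup runs inside EagerConsumer's <clinit>.
final class EagerConsumer {
    private static final ThingAccessor THING_ACCESSOR = ThingHelper.getThingAccessor();

    static int measure(String thing) {
        return THING_ACCESSOR.size(thing);
    }
}

// After: no accessor field; the lookup runs at each call site.
final class LazyConsumer {
    private static ThingAccessor thingAccessor() {
        return ThingHelper.getThingAccessor();
    }

    static int measure(String thing) {
        return thingAccessor().size(thing);
    }
}

final class InitProbe {
    // Forces class initialization and reports how many helper lookups it caused.
    static int lookupsDuringInit(Class<?> type) {
        int before = ThingHelper.LOOKUPS.get();
        try {
            Class.forName(type.getName(), true, type.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return ThingHelper.LOOKUPS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("eager init lookups: " + lookupsDuringInit(EagerConsumer.class));
        System.out.println("lazy init lookups: " + lookupsDuringInit(LazyConsumer.class));
        System.out.println("lazy measure: " + LazyConsumer.measure("abc"));
    }
}
```

In a fresh JVM, initializing `EagerConsumer` consults the helper while initializing `LazyConsumer` does not, so in the getter style no other class is reached from `<clinit>`. The trade-off is one extra static call per use.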
2. Break CosmosItemSerializer ↔ DefaultCosmosItemSerializer cycle
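The null read behind this cycle can be reproduced in isolation. The sketch below uses hypothetical classes (`ParentSerializer`, `ChildSerializer`, `FixedParent`, `FixedChild`, `CycleDemo`) that only mirror the shape of the parent/child relationship; it is not the SDK's code, and it covers the single-threaded null case only, not the cross-thread deadlock.

```java
// Hypothetical parent whose static field reads a static field of its own subclass.
class ParentSerializer {
    static final ParentSerializer DEFAULT = ChildSerializer.CHILD_DEFAULT;
}

// Hypothetical child. Initializing it first forces the parent's <clinit> to run
// before CHILD_DEFAULT has been assigned.
final class ChildSerializer extends ParentSerializer {
    static final ChildSerializer CHILD_DEFAULT = new ChildSerializer();
}

// One way to break the cycle: the parent constructs the instance directly
// and does not read any static state of the child.
class FixedParent {
    static final FixedParent DEFAULT = new FixedChild();
}

final class FixedChild extends FixedParent {
}

final class CycleDemo {
    // Touches the child first. The parent's <clinit> then re-enters the child's
    // initialization on the same thread and sees the field's default value, null.
    static boolean brokenParentDefaultIsNull() {
        ChildSerializer child = ChildSerializer.CHILD_DEFAULT;
        return child != null && ParentSerializer.DEFAULT == null;
    }

    // Same order (child first), but the parent no longer depends on child statics.
    static boolean fixedParentDefaultIsSet() {
        new FixedChild();
        return FixedParent.DEFAULT != null;
    }

    public static void main(String[] args) {
        System.out.println("broken parent DEFAULT is null: " + brokenParentDefaultIsNull());
        System.out.println("fixed parent DEFAULT is set: " + fixedParentDefaultIsSet());
    }
}
```

The outcome depends on which class is initialized first: if `ParentSerializer` were touched before `ChildSerializer`, the broken pair would also produce a non-null value, which is why such a bug can stay latent.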
- `CosmosItemSerializer.DEFAULT_SERIALIZER` creates instance directly via `new DefaultCosmosItemSerializer(...)`: no cross-class `<clinit>` dependency
- `INTERNAL_DEFAULT_SERIALIZER` moved from `DefaultCosmosItemSerializer` to `CosmosItemSerializer` (private, exposed via `CosmosItemSerializerAccessor.getInternalDefaultSerializer()`), so that accessing it no longer triggers child `<clinit>` from a different thread, eliminating the AB/BA init-lock between parent and child
- `static { initialize(); }` placed before `DEFAULT_SERIALIZER` so the accessor is registered before construction; eliminates `initializeAllAccessors()` fallback during `<clinit>`
- `DefaultCosmosItemSerializer.DEFAULT_SERIALIZER` and its `serializationInclusionModeAwareObjectMapper` removed (dead code)
Scope
- `private static XxxAccessor xxx()` methods
- `static { initialize(); }` added: `CosmosRequestContext`, `CosmosOperationDetails`, `CosmosDiagnosticsContext`
- `getCosmosAsyncClientAccessor()` → `getCosmosDiagnosticsThresholdsAccessor()` in `CosmosDiagnosticsThresholdsHelper`
- `checkNotNull` bug fix: `DefaultCosmosItemSerializer` constructor passed string literal instead of parameter
Exceptions (not converted to static getters)
- `HttpClient.java`: Java 8 interface, no `private static` methods
Tests
- `concurrentAccessorInitializationShouldNotDeadlock` (×5 invocations): `<clinit>`, catches deadlock via 30s timeout
- `allAccessorClassesMustHaveStaticInitializerBlock`: `<clinit>`
- `noStaticOrInstanceAccessorFieldsInConsumingClasses`: `static` or `final` Accessor field
- `accessorInitialization`: `initializeAllAccessors()` bootstrap path
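A structural check in the spirit of `noStaticOrInstanceAccessorFieldsInConsumingClasses` could be built on reflection. The sketch below is an assumption about how such a check might look, not the test's actual implementation; `AccessorFieldScan`, `SampleAccessor`, `CachingConsumer`, and `GetterConsumer` are hypothetical names, and matching on a type name ending in `Accessor` is a simplification.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical accessor type used only by the sample classes below.
interface SampleAccessor {
}

// Sample class in the cached-field style that the check should flag.
final class CachingConsumer {
    private static final SampleAccessor SAMPLE_ACCESSOR = null;
}

// Sample class in the getter style that the check should accept.
final class GetterConsumer {
    private static SampleAccessor sampleAccessor() {
        return null;
    }
}

final class AccessorFieldScan {
    // Returns "Class.field" for every declared field (static or instance)
    // whose type's simple name ends with "Accessor".
    static List<String> accessorFields(Class<?> type) {
        List<String> offenders = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.getType().getSimpleName().endsWith("Accessor")) {
                offenders.add(type.getSimpleName() + "." + field.getName());
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        System.out.println(accessorFields(CachingConsumer.class));
        System.out.println(accessorFields(GetterConsumer.class));
    }
}
```

A real enforcement test would also need a way to enumerate the consuming classes to scan; that part is left out here.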