[8.7.0] 19/23: Get the local and remote repo contents cache to work together #29085

Draft
fmeum wants to merge 19 commits into bazelbuild:release-8.7.0 from fmeum:rrcc-8.7.0-19

fmeum (Collaborator) commented Mar 24, 2026

Cherry-pick of b143070 for release 8.7.0 (part 19/23 of the remote repo contents cache feature). Depends on #29084.

  • Also upload to the remote cache when the local cache is in use. The fix is simple but subtle: the logic for the two caches in RepositoryDelegatorFunction has to be flipped since the Skyframe restart after adding an entry to the local cache meant that the same code path would not be taken again.
  • Fix a crash when using both by ensuring that the local repo contents cache uses the file system backing the output base, not the workspace directory.

Closes #28002.

fmeum and others added 19 commits March 13, 2026 18:06
…seful

Non-functional changes only: remove Pair indirection in
ExternalFilesHelper, extract getExternalRepoName() and
getExternalDirectory() helpers, move addExternalFilesDependencies
into ExternalFilesHelper, modernize switch expression in
DirtinessCheckerUtils, formatting fixes.

Does not include the functional behavior change of refetching repos
on external modifications.

(cherry picked from commit 5e3f0c8)
… inputs

Ports the essential API changes from 41ccfef needed by later feature
commits:
- Add RepoRecordedInput.WithValue record with parse/toString/escape/unescape
- Add overloaded isAnyValueOutdated(Environment, BlazeDirectories, List<WithValue>)
- Remove Comparable<RepoRecordedInput> and COMPARATOR (replaced by order preservation)
- Change TreeMap to LinkedHashMap in RepositoryDelegatorFunction for order preservation
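
The switch from `TreeMap` to `LinkedHashMap` can be illustrated with a minimal sketch. The class name and the string keys below are hypothetical stand-ins, not the serialized form Bazel uses for recorded inputs; the point is only that a `TreeMap` iterates in sorted key order while a `LinkedHashMap` iterates in insertion order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: keys and values are made up for this sketch.
final class RecordedInputOrderSketch {
  // Records three inputs in a fixed order into the given map and returns it.
  static Map<String, String> record(Map<String, String> target) {
    target.put("ENV:PATH", "/usr/bin");
    target.put("FILE:@@//:BUILD", "abc123");
    target.put("DIRENTS:@@//pkg", "def456");
    return target;
  }

  // Returns the keys in the order the map iterates over them.
  static List<String> keysInIterationOrder(Map<String, String> map) {
    return new ArrayList<>(map.keySet());
  }

  public static void main(String[] args) {
    // TreeMap: sorted by key, so the recording order is lost.
    System.out.println(keysInIterationOrder(record(new TreeMap<>())));
    // LinkedHashMap: iteration follows the order in which inputs were recorded.
    System.out.println(keysInIterationOrder(record(new LinkedHashMap<>())));
  }
}
```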

(cherry picked from commit 41ccfef)
…nv handling

Ports the essential API changes from 01407ce needed by later feature
commits:
- Add EnvironmentVariableValue record type
- Add RepoEnvironmentFunction with REPO_ENV + client env fallback
- Register REPOSITORY_ENVIRONMENT_VARIABLE in SkyFunctions and SkyframeExecutor
- Update EnvVar.getSkyKey() to use RepoEnvironmentFunction
- Update EnvVar.isOutdated() to use EnvironmentVariableValue

On 8.7.0, RepoEnvironmentFunction checks --repo_env first, then falls
back to the client environment via ClientEnvironmentFunction, since the
consolidated repo env computation from CommandEnvironment is not present.
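
The fallback order described above can be sketched roughly as follows. This is not the code of `RepoEnvironmentFunction`: the class name, the `lookup` method, and the use of plain maps in place of Skyframe values are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in: plain maps replace the --repo_env settings and the
// client environment that the real function reads through Skyframe.
final class RepoEnvLookupSketch {
  // Prefers the value from repoEnv and falls back to clientEnv otherwise.
  static Optional<String> lookup(
      String name, Map<String, String> repoEnv, Map<String, String> clientEnv) {
    if (repoEnv.containsKey(name)) {
      return Optional.ofNullable(repoEnv.get(name));
    }
    return Optional.ofNullable(clientEnv.get(name));
  }

  public static void main(String[] args) {
    Map<String, String> repoEnv = Map.of("CC", "clang");
    Map<String, String> clientEnv = Map.of("CC", "gcc", "HOME", "/home/user");
    System.out.println(lookup("CC", repoEnv, clientEnv)); // value from repoEnv
    System.out.println(lookup("HOME", repoEnv, clientEnv)); // falls back to clientEnv
    System.out.println(lookup("UNSET", repoEnv, clientEnv)); // absent in both
  }
}
```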

(cherry picked from commit 01407ce)
Ports the essential changes from fe040a3:
- Rename DigestWriter.ruleKey to predeclaredInputHash and make it
  package-private (needed by later feature commits)
- Switch RepoRecordedInput.File, Dirents, DirTree, EnvVar types to
  implement Comparable and use ImmutableSortedMap
- Add ImmutableSortedMap Gson type adapter
- Update LockFileModuleExtension, RunnableExtension, and related types
  to use ImmutableSortedMap for recorded inputs

Does NOT include the change to fold environ values into the
predeclared input hash computation itself; that requires
CommandEnvironment changes not present on 8.7.0.

(cherry picked from commit fe040a3)
* Rename `RepoContentsCache` to `LocalRepoContentsCache`
* Generalize `RemoteRepositoryRemoteExecutorFactory` to `RemoteRepositoryHelperFactory`

Work towards bazelbuild#6359

Closes bazelbuild#27311.

PiperOrigin-RevId: 822553693
Change-Id: I1bad204340c06621cea806368d6bec99ca450a0f
(cherry picked from commit 32be423)
…test

(cherry picked from commit 0336a868183ebcf27e3d4f7fdfac8c9f8b5b3ad3)
I haven't been able to reproduce this in a test, but this should fix the following crash observed while running `bazel info`:
```
FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.NullPointerException: Cannot invoke "java.util.concurrent.ExecutorService.shutdownNow()" because "this.materializationExecutor" is null
	at com.google.devtools.build.lib.remote.RemoteExternalOverlayFileSystem.afterCommand(RemoteExternalOverlayFileSystem.java:145)
	at com.google.devtools.build.lib.remote.RemoteModule.afterCommand(RemoteModule.java:1034)
	at com.google.devtools.build.lib.runtime.BlazeRuntime.afterCommand(BlazeRuntime.java:787)
	at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.execExclusively(BlazeCommandDispatcher.java:807)
	at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.exec(BlazeCommandDispatcher.java:266)
	at com.google.devtools.build.lib.server.GrpcServerImpl.executeCommand(GrpcServerImpl.java:608)
	at com.google.devtools.build.lib.server.GrpcServerImpl.lambda$run$0(GrpcServerImpl.java:679)
	at io.grpc.Context$1.run(Context.java:566)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
```
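
The commit message does not show the fix itself. One plausible shape, sketched here under that caveat, is a null guard around the shutdown so that `afterCommand` is safe even when the executor was never created. Only the field name `materializationExecutor` comes from the stack trace; the class and the `beforeCommand` method are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of a null-safe shutdown, not the actual Bazel change.
final class NullSafeShutdownSketch {
  // Stays null until beforeCommand() has run.
  private ExecutorService materializationExecutor;

  void beforeCommand() {
    materializationExecutor = Executors.newSingleThreadExecutor();
  }

  // Returns true if an executor existed and was shut down.
  boolean afterCommand() {
    ExecutorService executor = materializationExecutor;
    if (executor == null) {
      // Nothing was started for this command, so there is nothing to stop.
      return false;
    }
    executor.shutdownNow();
    materializationExecutor = null;
    return true;
  }
}
```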

Closes bazelbuild#27690.

PiperOrigin-RevId: 833722608
Change-Id: I88c485a01e5967657ec3b5529a47639b743b18e6
(cherry picked from commit a7d0e91)
Don't print a message when it's successful. Users can always look under `external` to verify which repo came from the cache.

Closes bazelbuild#27699.

PiperOrigin-RevId: 834096735
Change-Id: I3916fb240218a6b68ecf48417142b998ca281598
(cherry picked from commit 3ca9ce1)
Fixes the creation of empty directories and also contains a speculative fix for the following issue observed during a sequence of real builds:

```
Error in path: Failed to materialize remote repo @@protoc-gen-validate+: [unix_jni.cc:302] /home/ubuntu/.cache/bazel/_bazel_ubuntu/123/external/protoc-gen-validate+/example-workspace/.bazelrc (File exists)
ERROR: //:foo :: Error loading option //:foo: error evaluating module extension @@gazelle+//:extensions.bzl%go_deps
```

The mentioned file is a symlink.

Closes bazelbuild#27711.

PiperOrigin-RevId: 836122472
Change-Id: I8becd8c3640a659d28dc433340db962c18563d9f
(cherry picked from commit b27ea05)
Ensures that the returned `Path` is still in the overlay file system.

Also make the error message emitted by `Path#checkSameFileSystem` more informative. The improved message was motivated by the following crash, observed when using the remote repo contents cache with an explicit `--sandbox_base`, and helped identify the change above as its fix:

```
Caused by: java.lang.IllegalArgumentException: Files are on different filesystems: /dev/shm/bazel-sandbox.b10976335efa519b0184f3091ac8e21f7beefb92142303f9ab2c3341f45a2f28/linux-sandbox/18/execroot/_main/external/c-ares+/configs/ares_build.h (on com.google.devtools.build.lib.unix.UnixFileSystem@5e0a8154), /home/ubuntu/.cache/bazel/_bazel_ubuntu/123/execroot/_main/external/c-ares+/configs/ares_build.h (on com.google.devtools.build.lib.remote.RemoteExternalOverlayFileSystem@6cd9bfda)
        at com.google.devtools.build.lib.vfs.Path.checkSameFileSystem(Path.java:964)
        at com.google.devtools.build.lib.vfs.Path.createSymbolicLink(Path.java:523)
        at com.google.devtools.build.lib.vfs.Path.createSymbolicLink(Path.java:535)
        at com.google.devtools.build.lib.sandbox.SymlinkedSandboxedSpawn.copyFile(SymlinkedSandboxedSpawn.java:129)
```
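
A check of this kind, with a message that names both paths and both file systems, might look like the sketch below. It uses `java.nio.file` as a stand-in for Bazel's own `vfs` classes, so the class name and types are assumptions; only the message format mirrors the trace above.

```java
import java.nio.file.FileSystem;
import java.nio.file.Path;

// Hypothetical sketch using java.nio.file instead of Bazel's vfs.Path.
final class SameFileSystemCheckSketch {
  // Throws if the two paths belong to different FileSystem instances and
  // reports both paths together with the file system each one lives on.
  static void checkSameFileSystem(Path a, Path b) {
    FileSystem fsA = a.getFileSystem();
    FileSystem fsB = b.getFileSystem();
    if (!fsA.equals(fsB)) {
      throw new IllegalArgumentException(
          String.format(
              "Files are on different filesystems: %s (on %s), %s (on %s)", a, fsA, b, fsB));
    }
  }
}
```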

Alternative to bazelbuild#27721

Closes bazelbuild#27802.

PiperOrigin-RevId: 837832265
Change-Id: I3b73167496b011aef66954d59ca3804b4b64996f
(cherry picked from commit 8eaf6a9)
Fixes bazelbuild#27981

Fixes the following type of crash and, incidentally, a remote repo contents cache test that resulted in a related crash:
```
    FATAL: bazel crashed due to an internal error. Printing stack trace:
    java.lang.IllegalStateException: Unknown error during configuration creation evaluation
            at com.google.devtools.build.lib.skyframe.SkyframeExecutor.getConfiguration(SkyframeExecutor.java:2143)
            at com.google.devtools.build.lib.skyframe.SkyframeExecutor.createConfiguration(SkyframeExecutor.java:1876)
            at com.google.devtools.build.lib.analysis.BuildView.update(BuildView.java:281)
            at com.google.devtools.build.lib.buildtool.AnalysisPhaseRunner.runAnalysisPhase(AnalysisPhaseRunner.java:399)
            at com.google.devtools.build.lib.buildtool.AnalysisPhaseRunner.execute(AnalysisPhaseRunner.java:144)
            at com.google.devtools.build.lib.buildtool.BuildTool.buildTargetsWithoutMergedAnalysisExecution(BuildTool.java:512)
            at com.google.devtools.build.lib.buildtool.BuildTool.buildTargets(BuildTool.java:414)
            at com.google.devtools.build.lib.buildtool.BuildTool.processRequest(BuildTool.java:907)
            at com.google.devtools.build.lib.runtime.commands.CqueryCommand.exec(CqueryCommand.java:197)
            at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.execExclusively(BlazeCommandDispatcher.java:783)
            at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.exec(BlazeCommandDispatcher.java:266)
            at com.google.devtools.build.lib.server.GrpcServerImpl.executeCommand(GrpcServerImpl.java:608)
            at com.google.devtools.build.lib.server.GrpcServerImpl.lambda$run$0(GrpcServerImpl.java:679)
            at io.grpc.Context$1.run(Context.java:566)
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
            at java.base/java.lang.Thread.run(Unknown Source)
    Caused by: com.google.devtools.build.lib.skyframe.toolchains.PlatformLookupUtil$InvalidPlatformException: com.google.devtools.build.lib.packages.BuildFileNotFoundException: no such package '@@[unknown repo 'toolchains_llvm_boostrapped' requested from @@ (did you mean 'toolchains_llvm_bootstrapped'?)]//platforms': The repository '@@[unknown repo 'toolchains_llvm_boostrapped' requested from @@ (did you mean 'toolchains_llvm_bootstrapped'?)]' could not be resolved: No repository visible as '@toolchains_llvm_boostrapped' from main repository
            at com.google.devtools.build.lib.analysis.platform.PlatformFunction.compute(PlatformFunction.java:75)
            at com.google.devtools.build.lib.analysis.platform.PlatformFunction.compute(PlatformFunction.java:43)
            at com.google.devtools.build.skyframe.ParallelEvaluator.bubbleErrorUp(ParallelEvaluator.java:414)
            at com.google.devtools.build.skyframe.ParallelEvaluator.waitForCompletionAndConstructResult(ParallelEvaluator.java:207)
            at com.google.devtools.build.skyframe.ParallelEvaluator.doMutatingEvaluation(ParallelEvaluator.java:173)
            at com.google.devtools.build.skyframe.ParallelEvaluator.eval(ParallelEvaluator.java:672)
            at com.google.devtools.build.skyframe.AbstractInMemoryMemoizingEvaluator.evaluate(AbstractInMemoryMemoizingEvaluator.java:182)
            at com.google.devtools.build.lib.skyframe.SkyframeExecutor.evaluate(SkyframeExecutor.java:4279)
            at com.google.devtools.build.lib.skyframe.SkyframeExecutor.lambda$evaluateSkyKeys$0(SkyframeExecutor.java:2278)
            at com.google.devtools.build.lib.concurrent.Uninterruptibles.callUninterruptibly(Uninterruptibles.java:35)
            at com.google.devtools.build.lib.skyframe.SkyframeExecutor.evaluateSkyKeys(SkyframeExecutor.java:2274)
            at com.google.devtools.build.lib.skyframe.SkyframeExecutor.getConfiguration(SkyframeExecutor.java:2126)
            ... 16 more
```

Closes bazelbuild#28004.

PiperOrigin-RevId: 845941915
Change-Id: I6ead8dd1662efe90f529a6e21041a225882415dc
(cherry picked from commit d6dc631)
`.bzl` files are typically small, but can form deep DAGs that require a large number of sequential cache requests to fetch lazily. By prefetching them (as well as `REPO.bazel` files) eagerly, the wall time of one particular fully cached cold `--nobuild` build of Bazel itself decreased by a factor of 5.

Along the way, make remote repo contents cache failures non-fatal, matching the behavior of the remote cache.
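
The idea of eager prefetching with non-fatal failures can be sketched as follows. This is not the implementation from the commit: the class, the `prefetch` method, and the `fetch` callback are hypothetical, and the sketch only shows how issuing all requests concurrently avoids one round trip per level of a deep load graph while dropping failed fetches instead of propagating them.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: 'fetch' stands in for a remote cache request.
final class EagerPrefetchSketch {
  // Starts all fetches at once and waits for them to finish. A failed fetch
  // is ignored, so the caller can still fall back to fetching lazily later.
  static Map<String, String> prefetch(List<String> paths, Function<String, String> fetch) {
    Map<String, String> results = new ConcurrentHashMap<>();
    List<CompletableFuture<Void>> futures =
        paths.stream()
            .map(
                path ->
                    CompletableFuture.supplyAsync(() -> fetch.apply(path))
                        .thenAccept(content -> results.put(path, content))
                        // Non-fatal: swallow the failure for this one path.
                        .exceptionally(e -> null))
            .collect(Collectors.toList());
    CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    return results;
  }
}
```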

Closes bazelbuild#27910.

PiperOrigin-RevId: 853153815
Change-Id: I368a14a845a8d9fb543f473d8c0c2178a4590c78
(cherry picked from commit 361c420)
…erbose_failures`

Makes it easier to debug issues with this experimental feature and also matches the behavior of remote execution/caching.

Work towards bazelbuild#27965

Closes bazelbuild#27970.

PiperOrigin-RevId: 853238791
Change-Id: Id46ccbb105d93fd17114fab13b086d0b46139fb4
(cherry picked from commit fc5f160)
Ensures that files under repo contents cache entries are not reported as missing after the cache has been deleted while the Bazel server is running. See the long comment in `RepositoryFetchFunction` for why this happens and how it is fixed.

Fixes bazelbuild#26450

Closes bazelbuild#28147.

PiperOrigin-RevId: 853622194
Change-Id: Ifba953b72258030e0a640ac49947ac5c5fc7620a
(cherry picked from commit 7019132)
* Also upload to the remote cache when the local cache is in use. The fix is simple but subtle: the logic for the two caches in `RepositoryFetchFunction` has to be flipped since the Skyframe restart after adding an entry to the local cache meant that the same code path would not be taken again.
* Fix a crash when using both by ensuring that the local repo contents cache uses the file system backing the output base, not the workspace directory:
```
FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@@rules_python+' (requested by nodes 'REPO_FILE:@@rules_python+')
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:552)
	at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:435)
	at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Caused by: java.lang.IllegalArgumentException: Files are on different filesystems: C:/users/runneradmin/_bazel_runneradmin/ebfu7cpi/external/@rules_python+.marker (on com.google.devtools.build.lib.remote.RemoteExternalOverlayFileSystem@79583b9), C:/Users/runneradmin/.cache/bazel-repo/contents/_trash/26a5feef-bf8c-4326-bf3d-500997c7362e (on com.google.devtools.build.lib.windows.WindowsFileSystem@24180f0f)
	at com.google.devtools.build.lib.vfs.Path.checkSameFileSystem(Path.java:964)
	at com.google.devtools.build.lib.vfs.Path.renameTo(Path.java:630)
	at com.google.devtools.build.lib.vfs.FileSystemUtils.moveFile(FileSystemUtils.java:456)
	at com.google.devtools.build.lib.bazel.repository.cache.LocalRepoContentsCache.moveToCache(LocalRepoContentsCache.java:172)
	at com.google.devtools.build.lib.bazel.repository.RepositoryFetchFunction.compute(RepositoryFetchFunction.java:297)
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:471)
```
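
Why the order of the two caches matters can be shown with a toy model. Everything in it is an assumption made for illustration: returning `null` stands in for a Skyframe restart, and the sets and lists stand in for the local and remote caches. It is not the code of the actual function.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a restartable compute function, not Bazel's implementation.
final class CacheOrderSketch {
  final Set<String> localCache = new HashSet<>();
  final List<String> remoteUploads = new ArrayList<>();

  // Returns null to request a restart, mimicking a Skyframe restart.
  String compute(String repo, boolean remoteFirst) {
    if (localCache.contains(repo)) {
      // After the restart this early exit is taken, so nothing below runs again.
      return "from-local-cache";
    }
    if (remoteFirst) {
      remoteUploads.add(repo); // flipped order: upload before the restart point
    }
    boolean restart = localCache.add(repo); // adding the entry triggers a restart
    if (restart) {
      return null;
    }
    if (!remoteFirst) {
      remoteUploads.add(repo); // original order: never reached in this model
    }
    return "fetched";
  }

  // Re-invokes compute until it produces a value, like the evaluator would.
  String evaluate(String repo, boolean remoteFirst) {
    String result;
    do {
      result = compute(repo, remoteFirst);
    } while (result == null);
    return result;
  }
}
```

In this model, `evaluate("repo", false)` leaves `remoteUploads` empty, while `evaluate("repo", true)` records exactly one upload.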

Closes bazelbuild#28002.

PiperOrigin-RevId: 855211557
Change-Id: I2f3c40a6aef594682fba989853f7ee982f30c294
(cherry picked from commit b143070)
@iancha1992 iancha1992 added this to the 8.7.0 release blockers milestone Mar 27, 2026
@iancha1992 iancha1992 added the soft-release-blocker Soft release blockers that are nice to have, but shouldn't block the release if it's the last one. label Apr 24, 2026
