You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Cache Priming restores build parallelism in BuildStream by speculatively executing compiler invocations ahead of time, warming the Remote Execution ActionCache before actual element builds run. This dramatically reduces end-to-end build latency without compromising correctness.
Key Insight: After building an element once, we know all the subactions (e.g. recc compiler invocations) that occurred. We can store these with overlay instructions, then reuse them in future builds by adapting inputs for source/dependency changes.
BuildStream and Remote Execution: BuildStream is built on Remote Execution API (REAPI) paradigms, using a local buildbox-casd as a CAS proxy that connects to remote Execution, ActionCache, and CAS services. Cache Priming extends this architecture by adding speculative action submission to warm the ActionCache before element builds execute.
2. Problem Statement
BuildStream orchestrates repository-level builds across many elements, allowing each element to use its native builds system, but not achieving the fine-grained parallelism available within individual compilation units. C/C++ projects are particularly affected; translation units could compile in parallel but instead wait sequentially for their element's turn in the build graph.
Why Cache Priming is Particularly Effective
Code churn patterns favor cache reuse: Research consistently shows that code changes are concentrated in a small subset of files, with most code remaining stable between builds. This stability pattern makes speculative execution with cached results highly effective.
Key Research Findings:
Adams et al. (2016) conducted a comprehensive case study on GLib, PostgreSQL, Qt, and Ruby systems analyzing C/C++ header file "hotspots"—headers that change frequently and trigger expensive rebuild processes. They empirically demonstrated that most header files are relatively stable compared to implementation files, with only a small subset accounting for the majority of build-time impact.
Misu et al. (2024) analyzed 383 GitHub projects using Bazel and found that incremental builds with caching achieve median speedups of 4.22x (build system tool-independent cache) to 4.71x (build system tool-specific cache) for long-build duration projects. This demonstrates the substantial real-world value of build caching when most inputs haven't changed.
Matev (2020) studied large C++ projects using ccache for distributed compilation, demonstrating that with a hot cache, compilation can be 4-5 times faster than with cache misses. The study showed that "incremental builds often do not significantly speed up compilation because even just a change of the modification time of a widely used header leads to many compiler and linker invocations"—highlighting both the cost of header changes and the benefit when they remain stable.
Chowdhury et al. (2024) analyzed 774,051 source code methods from 49 prominent open-source Java projects and found that "approximately 80% of changes are concentrated in just 20% of the methods, demonstrating the Pareto 80/20 principle. Moreover, this subset of methods is responsible for the majority of the identified bugs in these projects." This concentration of changes means the vast majority of code remains unchanged between builds.
Supporting Evidence on Change Concentration:
Shin et al. (2011): In Mozilla Firefox and Red Hat Enterprise Linux, 70.8% of known vulnerabilities were found in only 10.9% of files, demonstrating extreme concentration of churn in a small subset of the codebase.
Nagappan and Ball (2007): In Windows Server 2003, software dependencies and churn measures were efficient predictors of post-release failures, further confirming that changes cluster in predictable locations.
Munson and Elbaum (1998): Established that code churn, the amount of code change over time, is measurable and correlates with fault-proneness, confirming that change patterns are non-uniform and predictable.
In C/C++ projects specifically:
Implementation files (.cpp) churn more frequently than header files (.h)
Header files are more stable and change less often (Adams et al., 2016)
Header file changes are expensive: Even just changing the modification time of a widely-used header triggers many recompilations (Matev, 2020)
Build caching is highly effective: Studies show 4-5x speedups with warm caches (Matev, 2020; Misu et al., 2024)
Changes tend to cluster in specific modules while the broader codebase remains stable
Example: In a typical incremental build of a C++ project with 1000 compilation units:
~200 files changed (20% following Pareto)
~800 files unchanged (80%)
With effective build caching, the 800 unchanged compilations can be served from cache
Only the 200 changed files need actual compilation
Result: Build time dominated by the 200 changed files, achieving 4-5x speedup
This is why Cache Priming is particularly powerful for C/C++ codebases: the stability of header files and the concentration of changes in a minority of implementation files means that speculative compilation actions can frequently reuse cached results from previous builds.
Priming Phase (before build): Instantiate speculative actions by applying overlays, submit for execution
Build Phase: Element builds hit warmed ActionCache, reducing latency
Generation Phase (after build): Record subactions with overlay instructions describing how to adapt inputs
4. Data Model
4.1 Protocol Buffer Extensions
Artifact.speculative_actions (new field)
messageArtifact {
Digestspeculative_actions=19; // Points to SpeculativeActions in CAS
}
messageSpeculativeActions {
repeatedSpeculativeActionactions=1;
messageReferencedSpeculativeAction {
stringelement=1; // Element that speculative actions are referenced fromDigestspeculative_actions=2; // The speculative actions of the element at the time of artifact creation
}
repeatedReferencedSpeculativeActionreferenced_speculative_actions=2; // All speculative actions from all elements referenced in ACTION overlays
}
messageSpeculativeAction {
Digestbase_action_digest=1; // Original Action from previous buildrepeatedOverlayoverlays=2; // How to adapt inputsmessageOverlay {
enumOverlayType {
SOURCE=0; // From element's source treeARTIFACT=1; // From dependency element's artifact outputACTION=2; // From another speculative action output
}
OverlayTypetype=1;
stringsource_element=2; // Element referenceDigestsource_action=3; // For ACTION type: which actionstringsource_path=4; // Path within sourceDigesttarget_digest=5; // Digest to replace in input tree
}
}
SpeculativeAction {
base_action_digest: "bbb222..."
overlays: [
Overlay {
type: SOURCE
source_element: "libB"
source_path: "src/vector_math.cpp"
target_digest: "cpp_bbb222..." # Replace this digest
},
Overlay {
type: SOURCE # Prefers SOURCE over ARTIFACT!
source_element: "libA" # Found in libA's source tree
source_path: "src/math_utils.h" # Original source path
target_digest: "h_aaa111..." # Same digest as in /usr/include/
}
]
}
Key Insight: The overlay generator finds that math_utils.h (digest h_aaa111...) appears in libA's ARTIFACT output at /usr/include/math_utils.h, but also discovers it exists in libA's SOURCE at src/math_utils.h with the same digest. Following the SOURCE > ACTION > ARTIFACT priority, it records a SOURCE overlay pointing to libA's source tree. This creates a tighter dependency on the original source rather than the built artifact.
graph TD
Start[Input File Digest] --> Q1{Found in current or<br/>dependency element's<br/>source tree?}
Q1 -->|YES| S[①<br/>SOURCE overlay]
Q1 -->|NO| Q2{Found in current or<br/>dependency element's<br/>sub-action result?}
Q2 -->|YES| A[②<br/>ACTION overlay]
Q2 -->|NO| Q3{Found in dependency<br/>element's artifact?}
Q3 -->|YES| AR[③<br/>ARTIFACT overlay]
style S fill:#e6f3ff
style A fill:#e6ffe6
style AR fill:#ffe6cc
Loading
Overlay Type Summary
Type
Source
Example
Use Case
SOURCE
Element's source tree
.cpp files, .h headers
Tracks source code changes; preferred over ARTIFACT when digest matches
ARTIFACT
Dependency element's artifact output
.a libraries, built artifacts
Tracks dependency artifacts that don't exist in source
ACTION
Previous speculative action output
.o object files
Links compilation steps within an element
Priority: When the same digest appears in multiple locations, overlays prefer SOURCE > ACTION > ARTIFACT. This creates tighter dependencies on original sources rather than intermediate artifacts.
How This Exploits Code Stability
The overlay mechanism directly exploits the research findings from Section 2:
SOURCE overlays for .cpp files: Will require updates when implementation files change (~20% of files per build)
SOURCE overlays for .h files: Often remain valid across multiple builds due to header stability (Adams et al., 2016)
ARTIFACT overlays for dependency artifacts: Very stable, as dependency libraries rarely change between builds
Given that research shows 80% of files typically remain unchanged, we expect:
~80% of SOURCE overlays to resolve successfully with unchanged digests
Cache hits from previous speculative executions provide the 4-5x speedup observed in caching studies (Matev, 2020; Misu et al., 2024)
Identify elements to build: Determine which elements need building based on cache keys
Retrieve element artifacts and SpeculativeActions: Query Artifact Cache using element weak cache keys to get artifacts and their associated SpeculativeActions
Load Referenced SpeculativeActions: For elements referenced in overlays that we don't have loaded yet, load them from ReferencedSpeculativeActions.
As not all code in an element will churn, keeping the dependency chain of SpeculativeActions intact is likely more beneficial than not.
TODO: decide whether to do this for elements referenced in all overlay types or to limit to the ACTION type
Aggregate, deduplicate, and topologically sort: Combine all SpeculativeActions, remove duplicates, and sort respecting dependencies
Instantiate speculative actions: Apply overlays to base_action_digest input trees
If not all overlays are resolved (e.g., src_path not found in overlay source), skip the speculative action and all its transitive dependents
Store and record mapping: Store new speculative actions in CAS and record mapping { base_action_digest → speculative_action_digest } for dependency resolution in following overlays.
Submit for speculative execution: Submit instantiated actions to Remote Execution
Interleave with element builds: BuildStream scheduler continues instantiating additional speculative actions as dependencies become available during ongoing element builds
Cancel remaining executions: After build completion, cancel any speculative executions still in progress
6.2 Data Flow Diagram
flowchart TB
subgraph BuildN["Build #N"]
Start[Build #N Starts]
subgraph Priming["Cache Priming Phase"]
Start --> Sched[BuildStream Scheduler]
Sched -->|query via Asset API| AC1[(Artifact Cache)]
AC1 -->|return SpeculativeActions<br/>from Build #N-1| SAList[SpeculativeActions]
SAList --> Inst[Instantiator]
Inst --> Over{Overlay<br/>Type?}
Over -->|SOURCE| Src[Source Tree]
Over -->|ARTIFACT| Art[Artifact Output]
Over -->|ACTION| Act[Action Output]
Src --> New[New Action]
Art --> New
Act --> New
CASD1[buildbox-casd<br/>local CAS proxy]
New -->|store via CAS API| CASD1
CASD1 -->|proxy to| CAS1[(BuildGrid CAS)]
New -->|Execute API| BuildGrid1[BuildGrid<br/>Execution Service]
BuildGrid1 -->|check| ACache[ActionCache]
ACache -->|miss| BuildGrid1
BuildGrid1 -->|dispatch via Bots API| Workers1[buildbox-worker]
Workers1 -->|fetch via| CASD_W1[buildbox-casd<br/>on worker]
CASD_W1 -->|proxy to| CAS1
Workers1 -->|execute| Runner1[buildbox-run]
Runner1 -->|result| Workers1
Workers1 -->|store result| ACache
end
subgraph Execution["Element Builds"]
Sched -->|trigger| BSElement[BuildStream<br/>Element Build]
BSElement -->|create Action| CASD2[buildbox-casd<br/>local CAS proxy]
CASD2 -->|store via CAS API| CAS1
BSElement -->|Execute API| BuildGrid2[BuildGrid<br/>Execution Service]
BuildGrid2 -->|check| ACache
ACache -.->|cache hit for sub-actions!| BuildGrid2
BuildGrid2 -->|dispatch via Bots API| Workers2[buildbox-worker]
Workers2 -->|fetch via| CASD_W2[buildbox-casd<br/>on worker]
CASD_W2 -->|proxy to| CAS1
Workers2 -->|execute| Runner2[buildbox-run]
Runner2 -->|spawns| SubActions[Sub-Actions:<br/>recc invocations]
SubActions -.->|benefit from warmed ActionCache| ACache
SubActions -->|record digests| Runner2
Runner2 -->|result + subactions| Workers2
Workers2 -->|ActionResult| BuildGrid2
BuildGrid2 -->|result| BSElement
BSElement -->|store via| CASD2
end
subgraph Generation["Overlay Generation (Async)"]
Gen[Overlay Generator<br/>ASYNC]
BSElement -.->|triggers async| Gen
Gen -->|fetch Actions via CAS API| CASD2
Gen -->|create| SA[SpeculativeActions]
SA -->|attach to| Art1[Artifact]
Art1 -->|store via Asset API| AC2[(Artifact Cache)]
end
end
AC2 -.->|↓ Data for Build #N+1 ↓| Boundary
Boundary[/"═══════════════════════════════════════════════════════════════════"\]
subgraph BuildN1["Build #N+1 (Uses SpeculativeActions from Build #N)"]
NextBuild[Build #N+1 will use<br/>SpeculativeActions<br/>generated above]
end
style Start fill:#e6f3ff
style SAList fill:#ffeb99
style Gen fill:#ffe6cc
style SA fill:#ffe6cc
style Art1 fill:#ffe6cc
style AC2 fill:#ffe6cc
style Workers1 fill:#fff4e1
style Workers2 fill:#fff4e1
style BSElement fill:#e1f0ff
style ACache fill:#fff9e1
style Boundary fill:#f0f0f0,stroke:#333,stroke-width:3px,stroke-dasharray: 5 5
style BuildN1 fill:#f8f8ff
style NextBuild fill:#ffeb99
style CASD1 fill:#e8f4f8
style CASD2 fill:#e8f4f8
style CASD_W1 fill:#e8f4f8
style CASD_W2 fill:#e8f4f8
Loading
Reading the Diagram:
This is an example setup with BuildGrid for Remote Execution services.
Note: this diagram could do with some additional cleanup and scrutiny
REAPI Architecture:
BuildStream runs with local buildbox-casd as CAS proxy
buildbox-casd proxies to BuildGrid's CAS, ActionCache, and Execution services
Workers also run buildbox-casd locally for efficient blob access
All communication uses REAPI (Execute, CAS, Asset, Bots APIs)
Cache Priming Flow:
Priming Phase: Retrieves SpeculativeActions → instantiates → submits via Execute API → BuildGrid dispatches to workers → results stored in ActionCache
Element Builds: BuildStream creates Actions → Execute API → BuildGrid checks ActionCache → dispatches to workers → buildbox-run spawns sub-actions (recc) → sub-actions benefit from warmed ActionCache
Overlay Generation: Asynchronously fetches sub-action digests and generates SpeculativeActions for Build #N+1
Key Flow: Build #N uses speculative actions from Build #N-1 → builds → generates speculative actions for Build #N+1
7. Detailed Algorithms
Note: The following pseudocode is illustrative and non-optimal. Production implementations should optimize for performance, especially around CAS operations and dependency resolution.
7.1 Overlay Generation (after build)
Note: This process runs asynchronously and does not block the element build completion or downstream element builds.
# Triggered asynchronously after element build completesasyncdefgenerate_overlays(element, element_subactions):
speculative_actions= []
forsubaction_digestinelement_subactions:
subaction=fetch_cas(subaction_digest)
overlays= []
forinput_fileinsubaction.input_root.files:
# Match digest to source: SOURCE > ACTION > ARTIFACT priority# This means if digest "h_aaa111..." appears in:# - libA's source at "src/math_utils.h" (SOURCE)# - libA's artifact at "/usr/include/math_utils.h" (ARTIFACT)# We prefer the SOURCE matchsrc_element, src_path, src_type=find_source_element(input_file.digest)
overlay=Overlay(
type=src_type,
source_element=src_element,
source_path=src_path,
target_digest=input_file.digest# The digest we'll replace during instantiation
)
overlays.append(overlay)
speculative_actions.append(SpeculativeAction(
base_action_digest=subaction_digest,
overlays=overlays
))
# Attach to Artifact (update artifact in cache)artifact.speculative_actions=store_cas(SpeculativeActions(actions=speculative_actions))
update_artifact_cache(element, artifact)
# Element's downstream dependencies can already be building# This overlay data will be available for the NEXT build session
Note on find_source_element: This function searches through all elements in the dependency graph that were built before the current element. It looks for files matching the target digest in:
SOURCE trees (element source files)
ACTION outputs (from other speculative actions, including those from earlier elements in the dependency graph)
ARTIFACT outputs (built artifacts from dependencies)
When the same digest appears in multiple locations, it returns the SOURCE match, creating a direct dependency on the original source rather than intermediate artifacts.
Use Cases for ACTION Overlays Beyond Current Element:
ACTION overlays are particularly valuable for intermediate build artifacts that are generated deterministically:
Code generation: protoc, flex/bison output files can be referenced across elements
Update instantiation logic to resolve ARTIFACT overlays
Measure: Additional performance gains, complexity vs. benefit
Phase 4: ACTION Overlays (potentially combined with rear and other re- wrappers)
Implement ACTION overlay support in generator and instantiator
Evaluate whether ACTION overlays provide value for code generation (protoc, flex/bison) and other deterministic intermediate steps
If beneficial for archive creation, implement rear (remote execution ar) to complete the speculative action chain (see Section 12 for details)
Consider other re-wrapper candidates: ranlib, strip, resource compilers (see Section 12)
Measure: End-to-end performance improvement
Phase 5: Optimization
Optimize digest resolution using Apache Arrow for columnar digest storage and vectorized lookups
Skip priming for elements with strong cache key hits and their transitive dependencies up the dependency graph (unchanged subtrees)
Optimize topological sorting and deduplication
Implement adaptive throttling for speculative action submission
Add comprehensive monitoring and metrics
9. Benefits
Restores Parallelism: Compilation units execute in parallel across the entire build graph
Transparent: No changes to element definitions or build scripts
Safe: Incorrect/stale overlays simply become redundant; correctness never affected
Efficient: Minimal storage overhead (overlays are small metadata)
Incremental: Works with partial cache hits and source changes
Extensible: Can be enhanced with local execution optimization (rear) and other re-wrapped commands to further improve performance
10. Key Design Decisions
Store speculative actions with artifacts: Keeps priming data co-located with build results
Three speculative action overlay types: SOURCE/ARTIFACT/ACTION cover all input sources with appropriate granularity
Topological sorting: Ensures ACTION overlays can resolve dependencies
Skip on unresolved: Gracefully handles missing/changed inputs without blocking
Early submission: Submit speculative actions as soon as dependencies resolve, regardless of element depth in graph
Asynchronous overlay generation: Overlay generation runs in the background and does not block element build completion or downstream element builds. SpeculativeActions become available for future builds without impacting the current build's critical path.
11. Open Questions and Key Uncertainties
Implementation Questions
Resource limits: How many concurrent speculative actions should we allow?
Metrics: What telemetry do we need to measure priming effectiveness?
Critical Success Factors
1. Phased Rollout Strategy
Question: Should we implement all overlay types at once, or phase them?
Proposal: Start with SOURCE overlays only (Phase 2), measure impact, then add ARTIFACT (Phase 3), then ACTION (Phase 4).
Rationale:
SOURCE overlays are simplest and likely provide the most benefit
ARTIFACT overlays add cross-element dependencies but may have lower resolution rates
ACTION overlays require careful dependency management and may only be valuable combined with rear
Required: Define success criteria for each phase before proceeding to the next.
2. Overlay Resolution Success Rate
Question: How often will overlays successfully resolve in practice?
Context: Speculative actions are skipped if any overlay cannot be resolved. Success rate depends on:
How frequently source files change between builds
Whether the same digests appear in both SOURCE and ARTIFACT locations
For ACTION overlays: whether dependency chains remain intact
Research Evidence: Studies of C/C++ build caching and code churn patterns provide encouraging data:
Adams et al. (2016): Most C/C++ header files are relatively stable compared to implementation files
Misu et al. (2024): Incremental builds with caching achieve 4.22x to 4.71x speedups, demonstrating high cache hit rates in practice
Matev (2020): Hot ccache provides 4-5x speedup over cold cache in large C++ projects
Chowdhury et al. (2024): Approximately 80% of code changes concentrate in 20% of methods (Pareto principle)
Shin et al. (2011): In Mozilla and Red Hat Linux, 89.1% of files were in the low-churn, stable category
Macho et al. (2021): Analysis of 144 Maven projects and build changes found that while dependency version updates are among the top-10 most frequent build changes, structural changes to build configuration (adding/removing dependencies, changing build order) occur much less frequently. Build maintenance is primarily driven by accommodating changes to production code rather than restructuring the build itself.
McIntosh et al. (2012): Study of Java build systems showed that build complexity evolves slowly over time, with build structure remaining relatively stable between major refactorings.
Implication for Build Structure Stability: Research shows that while source code changes frequently, build structure (which libraries are linked, link order, build rules) changes much less often:
~80% of SOURCE overlays for .cpp files to resolve successfully (unchanged sources)
90% of SOURCE overlays for .h files to resolve (headers more stable per Adams)
95% of ARTIFACT overlays to resolve (dependency artifacts rarely change)
~85-90% of ACTION overlays to resolve (link lines and build structure change infrequently per Macho/McIntosh)
Real-world cache effectiveness: The Misu and Matev studies showing 4-5x speedups from caching suggest that cache hit rates of 75-80% are achievable in practice, directly supporting the viability of Cache Priming.
Required:
Instrument prototype to measure actual resolution rates on real projects
Identify common failure patterns
Validate against freedesktop-sdk and GNOME builds
Determine if partial resolution strategies would help
Risk: If <50% of speculative actions resolve successfully, the system may not provide sufficient benefit to justify the complexity. However, research suggests success rates should be substantially higher (70-80%+).
3. Non-Determinism Handling
Question: How do non-deterministic builds affect speculative action correctness?
Context: If a build produces different outputs for the same inputs, the recorded base_action_digest may not be reusable.
Current Assumption: Non-deterministic actions will still return cached results when the ActionCache entry hasn't been evicted. If a new execution produces different outputs, it generates a new subaction, and SpeculativeActions are updated for the next build.
Required: Validate this assumption and document expected behavior for non-deterministic builds.
4. Digest Resolution Performance Impact
Question: Can we make find_source_element fast enough to avoid becoming a bottleneck?
Context: During overlay generation, every input file of every subaction needs its digest resolved against all source trees, artifact outputs, and prior actions in the dependency graph. For large builds, this could be thousands of files × thousands of potential sources.
Critical Insight: Since overlay generation runs asynchronously and doesn't block element builds or speculative action execution, performance requirements are relaxed:
Overlay generation for element N can run while elements N+1, N+2, etc. are building
Generated SpeculativeActions are used in the next build session, not the current one
Slow overlay generation only delays availability of speculative actions for future builds
Optimization Strategies:
Apache Arrow for digest lookups: Use Apache Arrow's columnar format to store digest tables with built-in vectorized operations
Arrow provides O(1) random access with cache-efficient columnar layout
Built-in SIMD optimizations for hash lookups and comparisons
Cross-language support (C++, Python, Rust) for BuildStream integration
Set lookup kernels with hash table support already implemented
No need to implement custom indices, bloom filters, or vectorization
Columnar digest storage: Store digests in Arrow tables: [(digest: FixedSizeBinary(32), element: string, path: string, type: int8)]
Arrow compute kernels: Use pc.is_in() for individual digest checks and pc.join(..., join_type="semi") for bulk digest matching against large sets
Memory efficiency: Arrow's columnar format provides 64-byte alignment for optimal SIMD performance
Why Apache Arrow:
Arrow's columnar layout enables vectorization using SIMD operations, with 64-byte alignment matching AVX-512 register width
Arrow provides built-in set lookup kernels with hash table support for exactly this use case
Arrow's pipeline and SIMD algorithms deliver 10-100x performance improvements through cache locality and parallel operations
Arrow enables efficient data processing without serialization overhead, making cross-module data sharing trivial
Proven at scale: PySpark achieved 10-100x performance improvements using Arrow
Required:
Profile and optimize on real-world projects (freedesktop-sdk, GNOME)
Measure: time to generate overlays vs. element build time
Implement Apache Arrow-based digest lookup tables as primary strategy
Leverage Arrow's built-in set lookup kernels and hash table support
Acceptable Performance: Overlay generation should complete within 1-2x the element's build time. Since it runs asynchronously, it won't impact the current build's critical path.
Risk: If overlay generation is extremely slow (>10x element build time), it might not complete before the next build session starts, reducing cache priming effectiveness.
5. Speculative Action Scale and Remote Execution Capacity
Question: Won't submitting thousands of speculative actions overwhelm the remote execution cluster?
Context: A large C/C++ BuildStream project might generate thousands of speculative actions (one per compilation unit). For example, a project with 100 elements averaging 100 compilation units each would produce 10,000 speculative actions.
Comparison to Existing Systems: This scale is actually modest compared to build systems already using Remote Execution:
Bazel builds of TensorFlow generate over 8,191 actions executing across hundreds of remote workers
Google runs millions of builds executing millions of test cases every day, with builds depending on tens of thousands of targets
A large TypeScript project with 10M lines of code generates 1,100 Bazel targets, with benchmarks using 100 remote executors
Bazel builds often include thousands or tens of thousands of small actions such as compiling, linking, or running tests
Key Insight: Remote Execution systems like BuildGrid are designed to handle tens of thousands of concurrent actions. Cache Priming's speculative actions are within the normal operating range of these systems.
Mitigation Strategies:
ActionCache hits: Many speculative actions will hit immediately, reducing actual execution load
Priority scheduling: Actual element builds can be prioritized over speculative actions
Adaptive throttling: Submit speculative actions at a controlled rate based on cluster capacity
Gradual rollout: Start with Phase 2 (SOURCE overlays only) which generates fewer actions
Required:
Monitor remote execution cluster utilization during prototype testing
Implement priority queuing: actual builds > speculative actions
Consider time-windowing: only prime for elements building soon
Acceptable: Speculative actions should consume <30% of remote execution capacity, leaving headroom for actual builds.
6. Storage and Network Overhead
Question: What is the actual storage overhead of SpeculativeActions?
Context:
Each artifact now includes SpeculativeActions metadata
For compilation-heavy elements (1000+ compilation units), this could be significant
ReferencedSpeculativeAction stores only element name + digest (minimal duplication)
Required:
Measure actual storage increase on real projects
Consider compression strategies if overhead is significant
Provide opt-out mechanism for elements that don't benefit
7. Value of ACTION Overlays Without rear
Question: Do ACTION overlays provide benefit on their own, or only when combined with rear?
Context: ACTION overlays create dependencies between speculative actions within an element. Without rear (see Section 12), the primary use case would be linking, which requires archives that may not exist as separate subactions.
Proposal: Evaluate ACTION overlays and rear together in Phase 4, as they may be interdependent.
Required: Identify other potential uses for ACTION overlays beyond archive creation.
Success Metrics
To validate the proposal, we need to measure:
Performance: End-to-end build time reduction (target: >20% for C/C++ heavy projects)
Resolution Rate: Percentage of speculative actions successfully instantiated (target: >70%)
Storage Overhead: Size increase of artifact cache (acceptable: <20%)
Overlay Generation Time: Time to generate overlays per element (acceptable: <2x element build time, since it's async)
Worker Utilization: Speculative actions should not starve actual builds
Validation Plan
Implement Phase 1-2: SOURCE overlays only on a prototype
Test on freedesktop-sdk: Large, real-world C/C++ heavy BuildStream project
Measure all success metrics: Gather data before proceeding
Iterate or pivot: If metrics don't meet targets, adjust design or abandon
Proceed to Phase 3+: Only if Phase 2 shows clear benefit
12. Future Enhancement: Local Execution with rear
Motivation
In the base Cache Priming design, speculative actions follow this pattern:
Compile .cpp → .o (via recc, speculative)
Upload .o to CAS
Download .o files for archiving
Create .a archive
Upload .a to CAS
However, since BuildStream submits entire element builds as Actions to workers, and recc returns .o files during the element build, those .o files are already local on the worker. We can optimize archive creation by keeping it local.
The rear Command
Similar to recc (remote execution caching compiler), rear (remote execution ar) wraps the ar archiver:
# Traditional ar usage
ar rcs libmath_utils.a math_utils.o helpers.o
# With rear wrapper
rear rcs libmath_utils.a math_utils.o helpers.o
rear behavior:
Generates an Action describing the archive operation
Checks ActionCache for existing result
If miss:
During main element build: Executes arlocally on the worker (via buildbox-casd) since .o files are already present
During speculative execution: Executes remotely (fetches .o files from CAS as needed)
Stores result in ActionCache
Records the Action digest in ActionResult.subactions
Extended Speculative Action Chain
With rear, we get a complete chain of speculative actions:
libA element build (main build):
recc g++ -c src/math_utils.cpp -o math_utils.o [remote or cached]
↓ (local .o file on worker)
rear rcs /usr/lib/libmath_utils.a math_utils.o [LOCAL execution]
↓ (local .a file on worker)
g++ app.o -L/usr/lib -lmath_utils -o final_binary [remote or cached]
Speculative execution (cache priming):
recc compile action [remote, warms ActionCache]
↓ (.o uploaded to CAS)
rear archive action [remote, fetches .o from CAS]
↓ (.a uploaded to CAS)
link action [remote, fetches .a from CAS]
Faster Main Builds: During the actual element build, .o files are local on the worker, so archive creation executes immediately without network overhead
Earlier Link Actions: Even though speculative rear actions run remotely, they still populate the ActionCache, enabling link actions to start as soon as archives are cached
ActionCache Hits: When the main build runs, rear checks ActionCache first and gets an immediate hit from the speculative execution
Reduced Critical Path: The main build's archive step becomes nearly instantaneous (cache hit), shortening the overall build time
Note: Future optimizations (out of scope for this proposal) could include locality hints to prefer scheduling speculative rear actions on workers that already have the .o files, further reducing network traffic.
Execution Flow
Cache Priming Phase (speculative):
Submit compile actions (recc) → executes remotely, warms ActionCache
↓ [.o files uploaded to CAS]
Submit archive action (rear) → executes remotely, fetches .o from CAS
↓ [.a file uploaded to CAS, ActionCache warmed]
Submit link actions → executes remotely, warms ActionCache
↓
Main Build Phase:
Element build starts
↓
recc compile → ActionCache HIT (instant)
↓ [.o files now local on worker]
rear archive → ActionCache HIT, but if miss, executes locally (instant)
↓ [.a file now local on worker]
link → ActionCache HIT (instant)
↓
Element build completes in minimal time
The key insight: rear enables ActionCache hits during the main build, and when those hits miss, local execution is very fast because the input .o files are already on the worker.
Beyond rear: Instrumentation Strategy
The rear optimization illustrates a general principle: identify commands that operate on local intermediate data and wrap them for speculative execution.
Candidates for "re-" wrappers:
ranlib: Index generation for static libraries (fast, local)
strip: Symbol stripping (fast, local, self-contained)
Monitor build logs to identify frequent intermediate commands
Measure: execution time, input/output sizes, determinism
Prioritize commands that are:
Fast enough for local execution (< 1 second)
Self-contained (few dependencies)
Operate on data already local to worker
Appear frequently across builds
Implement re-wrappers and measure impact
This creates a virtuous cycle: more instrumentation → more optimization opportunities → better build performance.
References
Adams, B., McIntosh, S., Nagappan, M., and Hassan, A.E. (2016). "Identifying and understanding header file hotspots in C/C++ build processes." Automated Software Engineering, Vol. 23, pp. 61-90. https://dl.acm.org/doi/10.1007/s10515-015-0183-5
Chowdhury, S., Uddin, G., Hemmati, H., and Holmes, R. (2024). "The Good, the Bad, and the Monstrous: Predicting Highly Change-Prone Source Code Methods at Their Inception." ACM Transactions on Software Engineering and Methodology, Vol. 33, No. 4. https://dl.acm.org/doi/10.1145/3715006
McIntosh, S., Adams, B., and Hassan, A.E. (2012). "The Evolution of Java Build Systems." Empirical Software Engineering, Vol. 17, No. 4-5, pp. 578-608.
Misu, S., Lin, B., and Monden, A. (2024). "Does Using Bazel Help Speed Up Continuous Integration Builds?" arXiv preprint. https://arxiv.org/html/2405.00796v1
Nagappan, N. and Ball, T. (2007). "Using Software Dependencies and Churn Metrics to Predict Field Failures: An Empirical Case Study." First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007). https://ieeexplore.ieee.org/document/4343764/
Shin, Y., Meneely, A., Williams, L., and Osborne, J. (2011). "Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities." IEEE Transactions on Software Engineering, Vol. 37, No. 6, pp. 772-787. https://ieeexplore.ieee.org/document/5560680/
Munson, J.C. and Elbaum, S. (1998). "Code Churn: A Measure for Estimating the Impact of Code Change." Proceedings of the International Conference on Software Maintenance, pp. 24-31. https://ieeexplore.ieee.org/document/738486/
Speculative Actions: Predictive Cache Priming for BuildStream
Table of Contents
rearrear1. Executive Summary
Cache Priming restores build parallelism in BuildStream by speculatively executing compiler invocations ahead of time, warming the Remote Execution ActionCache before actual element builds run. This dramatically reduces end-to-end build latency without compromising correctness.
Key Insight: After building an element once, we know all the subactions (e.g.
recccompiler invocations) that occurred. We can store these with overlay instructions, then reuse them in future builds by adapting inputs for source/dependency changes.BuildStream and Remote Execution: BuildStream is built on Remote Execution API (REAPI) paradigms, using a local buildbox-casd as a CAS proxy that connects to remote Execution, ActionCache, and CAS services. Cache Priming extends this architecture by adding speculative action submission to warm the ActionCache before element builds execute.
2. Problem Statement
BuildStream orchestrates repository-level builds across many elements, allowing each element to use its native builds system, but not achieving the fine-grained parallelism available within individual compilation units. C/C++ projects are particularly affected; translation units could compile in parallel but instead wait sequentially for their element's turn in the build graph.
Why Cache Priming is Particularly Effective
Code churn patterns favor cache reuse: Research consistently shows that code changes are concentrated in a small subset of files, with most code remaining stable between builds. This stability pattern makes speculative execution with cached results highly effective.
Key Research Findings:
Adams et al. (2016) conducted a comprehensive case study on GLib, PostgreSQL, Qt, and Ruby systems analyzing C/C++ header file "hotspots"—headers that change frequently and trigger expensive rebuild processes. They empirically demonstrated that most header files are relatively stable compared to implementation files, with only a small subset accounting for the majority of build-time impact.
Misu et al. (2024) analyzed 383 GitHub projects using Bazel and found that incremental builds with caching achieve median speedups of 4.22x (build system tool-independent cache) to 4.71x (build system tool-specific cache) for long-build duration projects. This demonstrates the substantial real-world value of build caching when most inputs haven't changed.
Matev (2020) studied large C++ projects using ccache for distributed compilation, demonstrating that with a hot cache, compilation can be 4-5 times faster than with cache misses. The study showed that "incremental builds often do not significantly speed up compilation because even just a change of the modification time of a widely used header leads to many compiler and linker invocations"—highlighting both the cost of header changes and the benefit when they remain stable.
Chowdhury et al. (2024) analyzed 774,051 source code methods from 49 prominent open-source Java projects and found that "approximately 80% of changes are concentrated in just 20% of the methods, demonstrating the Pareto 80/20 principle. Moreover, this subset of methods is responsible for the majority of the identified bugs in these projects." This concentration of changes means the vast majority of code remains unchanged between builds.
Supporting Evidence on Change Concentration:
Shin et al. (2011): In Mozilla Firefox and Red Hat Enterprise Linux, 70.8% of known vulnerabilities were found in only 10.9% of files, demonstrating extreme concentration of churn in a small subset of the codebase.
Nagappan and Ball (2007): In Windows Server 2003, software dependencies and churn measures were efficient predictors of post-release failures, further confirming that changes cluster in predictable locations.
Munson and Elbaum (1998): Established that code churn, the amount of code change over time, is measurable and correlates with fault-proneness, confirming that change patterns are non-uniform and predictable.
In C/C++ projects specifically:
Example: In a typical incremental build of a C++ project with 1000 compilation units:
This is why Cache Priming is particularly powerful for C/C++ codebases: the stability of header files and the concentration of changes in a minority of implementation files means that speculative compilation actions can frequently reuse cached results from previous builds.
3. Solution Overview
High-Level Flow:
flowchart LR Build["Build #N"] -->|instantiates| SpecActions["Speculative Actions<br/>(generated by Build #N-1)"] SpecActions -->|warm| ActionCache[(ActionCache)] ActionCache -->|serves| ElementBuilds[Element Builds] ElementBuilds -->|produce| SubActions["(Sub)Actions"] SubActions -->|transformed into| SpecActions2[Speculative Actions] SpecActions2 -->|stored in| ArtifactCache[(Artifact Cache)] SpecActions2 -.-> SpecActions style ActionCache fill:#fff4e1 style ArtifactCache fill:#fff4e1 style ElementBuilds fill:#e1f0ff style SubActions fill:#e1f0ffProcess:
4. Data Model
4.1 Protocol Buffer Extensions
Artifact.speculative_actions (new field)
ActionResult.subactions (new field in REAPI)
5. Concrete Example: libA → libB → appC
Dependency Structure
5.1 Example: libA Compilation
libA Source Structure
libA Artifact Structure (Build Output)
Visual Flow:
graph TB subgraph Step1["① libA Source"] S1["src/math_utils.cpp<br/><b>digest: cpp_aaa111...</b>"] S2["src/math_utils.h<br/><b>digest: h_aaa111...</b>"] end subgraph Step2["② (Sub)Action Execution"] A1["<b>recc g++ -c</b><br/>Action digest: aaa111..."] end S1 -->|input| Step2 S2 -->|input| Step2 subgraph Step3["③ (Sub)Action Result"] O1["math_utils.o"] end Step2 -->|produces| O1 subgraph Step4["④ libA Artifact (/usr/)"] AR1["/usr/include/math_utils.h<br/><b>digest: h_aaa111...</b>"] AR2["/usr/lib/libmath_utils.a"] end O1 -.->|archived into| AR2 S2 -.->|installed as| AR1 subgraph Step5["⑤ Generated SpeculativeAction"] SA["<b>base_action_digest: aaa111...</b>"] OV1["<b>Overlay 1:</b><br/>type: SOURCE<br/>element: libA<br/>path: src/math_utils.cpp<br/>target_digest: cpp_aaa111..."] OV2["<b>Overlay 2:</b><br/>type: SOURCE<br/>element: libA<br/>path: src/math_utils.h<br/>target_digest: h_aaa111..."] end S1 -.-|digest match| OV1 S2 -.-|digest match| OV2 OV1 --> SA OV2 --> SA SA --> A1 style Step1 fill:#e6f3ff style Step4 fill:#ffe6cc style Step5 fill:#e6ffe6 style Step2 fill:#fff4e1 style Step3 fill:#f0f0f0Recorded Subaction (compile math_utils.cpp)
Action { command_digest: "cmd_aaa111..." input_root_digest: "dir_aaa111..." } Command { arguments: ["g++", "-c", "-fPIC", "src/math_utils.cpp", "-o", "math_utils.o"] } Directory (dir_aaa111...) { files: [ File { name: "src/math_utils.cpp", digest: "cpp_aaa111..." }, File { name: "src/math_utils.h", digest: "h_aaa111..." } ] }Generated SpeculativeAction
SpeculativeAction { base_action_digest: "aaa111..." overlays: [ Overlay { type: SOURCE source_element: "libA" source_path: "src/math_utils.cpp" target_digest: "cpp_aaa111..." # Replace this digest in input tree }, Overlay { type: SOURCE source_element: "libA" source_path: "src/math_utils.h" target_digest: "h_aaa111..." # Replace this digest in input tree } ] }5.2 Example: libB Compilation
libB Source Structure
libB Artifact Structure (Build Output)
Visual Flow:
graph TB subgraph Step1A["① libB Source"] SB1["src/vector_math.cpp<br/><b>digest: cpp_bbb222...</b>"] end subgraph Step1B["① libA Artifact (dependency)"] AA1["/usr/include/math_utils.h<br/><b>digest: h_aaa111...</b><br/>(from libA artifact)"] end subgraph Step1C["① libA Source"] SA2["src/math_utils.h<br/><b>digest: h_aaa111...</b>"] end subgraph Step2["② (Sub)Action Execution"] A1["<b>recc g++ -c -I/usr/include</b><br/>Action digest: bbb222..."] end SB1 -->|input| Step2 AA1 -.->|used by compile| Step2 subgraph Step3["③ (Sub)Action Result"] O1["vector_math.o"] end Step2 -->|produces| O1 subgraph Step4["④ Generated SpeculativeAction"] SA["<b>base_action_digest: bbb222...</b>"] OV1["<b>Overlay 1:</b><br/>type: SOURCE<br/>element: libB<br/>path: src/vector_math.cpp<br/>target_digest: cpp_bbb222..."] OV2["<b>Overlay 2:</b><br/>type: SOURCE<br/>element: libA<br/>path: src/math_utils.h<br/>target_digest: h_aaa111..."] end SB1 -.-|digest match| OV1 SA2 -.-|"digest match<br/>(libA Source not Artifact!)"| OV2 OV1 --> SA OV2 --> SA SA --> A1 style Step1A fill:#e6f3ff style Step1C fill:#e6f3ff style Step1B fill:#ffe6cc style Step4 fill:#e6ffe6 style Step2 fill:#fff4e1 style Step3 fill:#f0f0f0Recorded Subaction (compile vector_math.cpp)
Action { command_digest: "cmd_bbb222..." input_root_digest: "dir_bbb222..." } Command { arguments: [ "g++", "-c", "-fPIC", "-I/usr/include", # libA headers from dependency artifact "src/vector_math.cpp", "-o", "vector_math.o" ] } Directory (dir_bbb222...) { files: [ File { name: "src/vector_math.cpp", digest: "cpp_bbb222..." }, File { name: "/usr/include/math_utils.h", digest: "h_aaa111..." } ] }Generated SpeculativeAction
SpeculativeAction { base_action_digest: "bbb222..." overlays: [ Overlay { type: SOURCE source_element: "libB" source_path: "src/vector_math.cpp" target_digest: "cpp_bbb222..." # Replace this digest }, Overlay { type: SOURCE # Prefers SOURCE over ARTIFACT! source_element: "libA" # Found in libA's source tree source_path: "src/math_utils.h" # Original source path target_digest: "h_aaa111..." # Same digest as in /usr/include/ } ] }Key Insight: The overlay generator finds that
math_utils.h(digesth_aaa111...) appears in libA's ARTIFACT output at/usr/include/math_utils.h, but also discovers it exists in libA's SOURCE atsrc/math_utils.hwith the same digest. Following the SOURCE > ACTION > ARTIFACT priority, it records a SOURCE overlay pointing to libA's source tree. This creates a tighter dependency on the original source rather than the built artifact.5.3 Example: appC Linking
appC Source Structure
appC Artifact Structure (Build Output)
Visual Flow:
graph TB subgraph Step1A["① Previous Speculative Action Output"] PA1["main.o<br/><b>digest: obj_ccc333...</b><br/>(from compile action ccc333...)"] end subgraph Step1B["① libB Artifact"] AB1["/usr/lib/libvector_math.a<br/><b>digest: lib_bbb222...</b>"] end subgraph Step1C["① libA Artifact"] AA1["/usr/lib/libmath_utils.a<br/><b>digest: lib_aaa111...</b>"] end subgraph Step2["② (Sub)Action Execution"] A1["<b>g++ link</b><br/>Action digest: ccc444..."] end PA1 -->|input| Step2 AB1 -->|input| Step2 AA1 -->|input| Step2 subgraph Step3["③ (Sub)Action Result"] O1["appC executable"] end Step2 -->|produces| O1 subgraph Step4["④ Generated SpeculativeAction"] SA["<b>base_action_digest: ccc444...</b>"] OV1["<b>Overlay 1:</b><br/>type: ACTION<br/>element: appC<br/>source_action: ccc333...<br/>path: main.o<br/>target_digest: obj_ccc333..."] OV2["<b>Overlay 2:</b><br/>type: ARTIFACT<br/>element: libB<br/>path: /usr/lib/libvector_math.a<br/>target_digest: lib_bbb222..."] OV3["<b>Overlay 3:</b><br/>type: ARTIFACT<br/>element: libA<br/>path: /usr/lib/libmath_utils.a<br/>target_digest: lib_aaa111..."] end OV1 --> SA OV2 --> SA OV3 --> SA PA1 -.-|digest match| OV1 AB1 -.-|digest match| OV2 AA1 -.-|digest match| OV3 SA --> A1 style Step1A fill:#e6ffe6 style Step1B fill:#ffe6cc style Step1C fill:#ffe6cc style Step4 fill:#e6ffe6 style Step2 fill:#fff4e1 style Step3 fill:#f0f0f0Generated SpeculativeAction (link step)
SpeculativeAction { base_action_digest: "ccc444..." overlays: [ Overlay { type: ACTION # From previous speculative action source_element: "appC" source_action: "ccc333..." # The compile action source_path: "main.o" target_digest: "obj_ccc333..." }, Overlay { type: ARTIFACT # From libB artifact output source_element: "libB" source_path: "/usr/lib/libvector_math.a" target_digest: "lib_bbb222..." }, Overlay { type: ARTIFACT # From libA artifact output source_element: "libA" source_path: "/usr/lib/libmath_utils.a" target_digest: "lib_aaa111..." } ] }5.4 Overlay Type Summary
Visual Overview:
graph TD Start[Input File Digest] --> Q1{Found in current or<br/>dependency element's<br/>source tree?} Q1 -->|YES| S[①<br/>SOURCE overlay] Q1 -->|NO| Q2{Found in current or<br/>dependency element's<br/>sub-action result?} Q2 -->|YES| A[②<br/>ACTION overlay] Q2 -->|NO| Q3{Found in dependency<br/>element's artifact?} Q3 -->|YES| AR[③<br/>ARTIFACT overlay] style S fill:#e6f3ff style A fill:#e6ffe6 style AR fill:#ffe6ccOverlay Type Summary
.cppfiles,.hheaders.alibraries, built artifacts.oobject filesPriority: When the same digest appears in multiple locations, overlays prefer SOURCE > ACTION > ARTIFACT. This creates tighter dependencies on original sources rather than intermediate artifacts.
How This Exploits Code Stability
The overlay mechanism directly exploits the research findings from Section 2:
Given that research shows 80% of files typically remain unchanged, we expect:
6. System Flow
6.1 High-Level Process Flow
flowchart TD Start([Build Session Start]) --> Identify[Identify Elements to Build] Identify --> Retrieve[Retrieve Artifacts &<br/>SpeculativeActions<br/>by element weak-ref] Retrieve --> LoadRef[Load Referenced<br/>SpeculativeActions] LoadRef --> Aggregate[Aggregate & Deduplicate<br/>SpeculativeActions] Aggregate --> TopoSort[Topological Sort<br/>by Dependencies] TopoSort --> Instantiate{For Each<br/>SpeculativeAction} Instantiate --> Resolve{All Overlays<br/>Resolvable?} Resolve -->|No| Skip[Skip Action &<br/>Transitive Dependents] Resolve -->|Yes| Apply[Apply Overlays] Apply --> Store[Store in CAS] Store --> RecordMap[Record Mapping:<br/>base_action_digest →<br/>speculative_action_digest] RecordMap --> Submit[Submit to Remote Execution] Skip --> More{More?} Submit --> More More -->|Yes| Instantiate More -->|No| Parallel[Speculative Execution<br/>Warms ActionCache] Parallel --> Interleave[Scheduler Interleaves:<br/>Instantiate remaining speculative actions<br/>as dependencies become available] Interleave --> Build[Element Builds<br/>Hit Cache] Build --> Complete{Build<br/>Complete?} Complete -->|No| Interleave Complete -->|Yes| Cancel[Cancel Remaining<br/>Speculative Executions] Cancel --> End([Build Complete]) style Submit fill:#fff4e1 style Parallel fill:#fff4e1 style Interleave fill:#e1f0ff style Build fill:#e1f0ff style Skip fill:#ffe1e1Detailed Flow Steps:
{ base_action_digest → speculative_action_digest }for dependency resolution in following overlays.6.2 Data Flow Diagram
flowchart TB subgraph BuildN["Build #N"] Start[Build #N Starts] subgraph Priming["Cache Priming Phase"] Start --> Sched[BuildStream Scheduler] Sched -->|query via Asset API| AC1[(Artifact Cache)] AC1 -->|return SpeculativeActions<br/>from Build #N-1| SAList[SpeculativeActions] SAList --> Inst[Instantiator] Inst --> Over{Overlay<br/>Type?} Over -->|SOURCE| Src[Source Tree] Over -->|ARTIFACT| Art[Artifact Output] Over -->|ACTION| Act[Action Output] Src --> New[New Action] Art --> New Act --> New CASD1[buildbox-casd<br/>local CAS proxy] New -->|store via CAS API| CASD1 CASD1 -->|proxy to| CAS1[(BuildGrid CAS)] New -->|Execute API| BuildGrid1[BuildGrid<br/>Execution Service] BuildGrid1 -->|check| ACache[ActionCache] ACache -->|miss| BuildGrid1 BuildGrid1 -->|dispatch via Bots API| Workers1[buildbox-worker] Workers1 -->|fetch via| CASD_W1[buildbox-casd<br/>on worker] CASD_W1 -->|proxy to| CAS1 Workers1 -->|execute| Runner1[buildbox-run] Runner1 -->|result| Workers1 Workers1 -->|store result| ACache end subgraph Execution["Element Builds"] Sched -->|trigger| BSElement[BuildStream<br/>Element Build] BSElement -->|create Action| CASD2[buildbox-casd<br/>local CAS proxy] CASD2 -->|store via CAS API| CAS1 BSElement -->|Execute API| BuildGrid2[BuildGrid<br/>Execution Service] BuildGrid2 -->|check| ACache ACache -.->|cache hit for sub-actions!| BuildGrid2 BuildGrid2 -->|dispatch via Bots API| Workers2[buildbox-worker] Workers2 -->|fetch via| CASD_W2[buildbox-casd<br/>on worker] CASD_W2 -->|proxy to| CAS1 Workers2 -->|execute| Runner2[buildbox-run] Runner2 -->|spawns| SubActions[Sub-Actions:<br/>recc invocations] SubActions -.->|benefit from warmed ActionCache| ACache SubActions -->|record digests| Runner2 Runner2 -->|result + subactions| Workers2 Workers2 -->|ActionResult| BuildGrid2 BuildGrid2 -->|result| BSElement BSElement -->|store via| CASD2 end subgraph Generation["Overlay Generation (Async)"] Gen[Overlay Generator<br/>ASYNC] BSElement -.->|triggers async| Gen Gen -->|fetch Actions via CAS API| CASD2 Gen -->|create| SA[SpeculativeActions] SA -->|attach to| Art1[Artifact] Art1 -->|store via Asset API| AC2[(Artifact Cache)] end end AC2 -.->|↓ Data for Build #N+1 ↓| Boundary Boundary[/"═══════════════════════════════════════════════════════════════════"\] subgraph BuildN1["Build #N+1 (Uses SpeculativeActions from Build #N)"] NextBuild[Build #N+1 will use<br/>SpeculativeActions<br/>generated above] end style Start fill:#e6f3ff style SAList fill:#ffeb99 style Gen fill:#ffe6cc style SA fill:#ffe6cc style Art1 fill:#ffe6cc style AC2 fill:#ffe6cc style Workers1 fill:#fff4e1 style Workers2 fill:#fff4e1 style BSElement fill:#e1f0ff style ACache fill:#fff9e1 style Boundary fill:#f0f0f0,stroke:#333,stroke-width:3px,stroke-dasharray: 5 5 style BuildN1 fill:#f8f8ff style NextBuild fill:#ffeb99 style CASD1 fill:#e8f4f8 style CASD2 fill:#e8f4f8 style CASD_W1 fill:#e8f4f8 style CASD_W2 fill:#e8f4f8Reading the Diagram:
This is an example setup with BuildGrid for Remote Execution services.
Note: this diagram could do with some additional cleanup and scrutiny
REAPI Architecture:
buildbox-casdas CAS proxybuildbox-casdproxies to BuildGrid's CAS, ActionCache, and Execution servicesbuildbox-casdlocally for efficient blob accessCache Priming Flow:
buildbox-runspawns sub-actions (recc) → sub-actions benefit from warmed ActionCacheKey Flow: Build #N uses speculative actions from Build #N-1 → builds → generates speculative actions for Build #N+1
7. Detailed Algorithms
Note: The following pseudocode is illustrative and non-optimal. Production implementations should optimize for performance, especially around CAS operations and dependency resolution.
7.1 Overlay Generation (after build)
Note: This process runs asynchronously and does not block the element build completion or downstream element builds.
Note on find_source_element: This function searches through all elements in the dependency graph that were built before the current element. It looks for files matching the target digest in:
When the same digest appears in multiple locations, it returns the SOURCE match, creating a direct dependency on the original source rather than intermediate artifacts.
Use Cases for ACTION Overlays Beyond Current Element:
ACTION overlays are particularly valuable for intermediate build artifacts that are generated deterministically:
.o→.aviarear(see Section 12)7.2 Overlay Instantiation (before build)
8. Implementation Plan
Phase 1: REAPI Extension
ActionResult.subactionsfield to REAPIPhase 2: BuildStream Integration - SOURCE Overlays Only
Phase 3: ARTIFACT Overlays
Phase 4: ACTION Overlays (potentially combined with
rearand other re- wrappers)rear(remote execution ar) to complete the speculative action chain (see Section 12 for details)Phase 5: Optimization
9. Benefits
rear) and other re-wrapped commands to further improve performance10. Key Design Decisions
11. Open Questions and Key Uncertainties
Implementation Questions
Critical Success Factors
1. Phased Rollout Strategy
Question: Should we implement all overlay types at once, or phase them?
Proposal: Start with SOURCE overlays only (Phase 2), measure impact, then add ARTIFACT (Phase 3), then ACTION (Phase 4).
Rationale:
rearRequired: Define success criteria for each phase before proceeding to the next.
2. Overlay Resolution Success Rate
Question: How often will overlays successfully resolve in practice?
Context: Speculative actions are skipped if any overlay cannot be resolved. Success rate depends on:
Research Evidence: Studies of C/C++ build caching and code churn patterns provide encouraging data:
Implication for Build Structure Stability: Research shows that while source code changes frequently, build structure (which libraries are linked, link order, build rules) changes much less often:
Real-world cache effectiveness: The Misu and Matev studies showing 4-5x speedups from caching suggest that cache hit rates of 75-80% are achievable in practice, directly supporting the viability of Cache Priming.
Required:
Risk: If <50% of speculative actions resolve successfully, the system may not provide sufficient benefit to justify the complexity. However, research suggests success rates should be substantially higher (70-80%+).
3. Non-Determinism Handling
Question: How do non-deterministic builds affect speculative action correctness?
Context: If a build produces different outputs for the same inputs, the recorded
base_action_digestmay not be reusable.Current Assumption: Non-deterministic actions will still return cached results when the ActionCache entry hasn't been evicted. If a new execution produces different outputs, it generates a new subaction, and SpeculativeActions are updated for the next build.
Required: Validate this assumption and document expected behavior for non-deterministic builds.
4. Digest Resolution Performance Impact
Question: Can we make
find_source_elementfast enough to avoid becoming a bottleneck?Context: During overlay generation, every input file of every subaction needs its digest resolved against all source trees, artifact outputs, and prior actions in the dependency graph. For large builds, this could be thousands of files × thousands of potential sources.
Critical Insight: Since overlay generation runs asynchronously and doesn't block element builds or speculative action execution, performance requirements are relaxed:
Optimization Strategies:
[(digest: FixedSizeBinary(32), element: string, path: string, type: int8)]pc.is_in()for individual digest checks andpc.join(..., join_type="semi")for bulk digest matching against large setsWhy Apache Arrow:
Required:
Acceptable Performance: Overlay generation should complete within 1-2x the element's build time. Since it runs asynchronously, it won't impact the current build's critical path.
Risk: If overlay generation is extremely slow (>10x element build time), it might not complete before the next build session starts, reducing cache priming effectiveness.
5. Speculative Action Scale and Remote Execution Capacity
Question: Won't submitting thousands of speculative actions overwhelm the remote execution cluster?
Context: A large C/C++ BuildStream project might generate thousands of speculative actions (one per compilation unit). For example, a project with 100 elements averaging 100 compilation units each would produce 10,000 speculative actions.
Comparison to Existing Systems: This scale is actually modest compared to build systems already using Remote Execution:
Key Insight: Remote Execution systems like BuildGrid are designed to handle tens of thousands of concurrent actions. Cache Priming's speculative actions are within the normal operating range of these systems.
Mitigation Strategies:
Required:
Acceptable: Speculative actions should consume <30% of remote execution capacity, leaving headroom for actual builds.
6. Storage and Network Overhead
Question: What is the actual storage overhead of SpeculativeActions?
Context:
Required:
7. Value of ACTION Overlays Without rear
Question: Do ACTION overlays provide benefit on their own, or only when combined with
rear?Context: ACTION overlays create dependencies between speculative actions within an element. Without
rear(see Section 12), the primary use case would be linking, which requires archives that may not exist as separate subactions.Proposal: Evaluate ACTION overlays and
reartogether in Phase 4, as they may be interdependent.Required: Identify other potential uses for ACTION overlays beyond archive creation.
Success Metrics
To validate the proposal, we need to measure:
Validation Plan
12. Future Enhancement: Local Execution with rear
Motivation
In the base Cache Priming design, speculative actions follow this pattern:
.cpp→.o(viarecc, speculative).oto CAS.ofiles for archiving.aarchive.ato CASHowever, since BuildStream submits entire element builds as Actions to workers, and
reccreturns.ofiles during the element build, those.ofiles are already local on the worker. We can optimize archive creation by keeping it local.The
rearCommandSimilar to
recc(remote execution caching compiler), rear (remote execution ar) wraps theararchiver:rear behavior:
arlocally on the worker (via buildbox-casd) since.ofiles are already present.ofiles from CAS as needed)ActionResult.subactionsExtended Speculative Action Chain
With
rear, we get a complete chain of speculative actions:Example: libA with
rearRecorded Subactions during libA build:
ActionResult { subactions: [ "recc_compile_math_utils...", // .cpp → .o "recc_compile_helpers...", // .cpp → .o "rear_archive..." // .o → .a ] }Generated SpeculativeAction for rear:
SpeculativeAction { base_action_digest: "rear_archive..." overlays: [ Overlay { type: ACTION source_element: "libA.bst" source_action: "recc_compile_math_utils..." source_path: "math_utils.o" target_digest: "obj_math_utils..." }, Overlay { type: ACTION source_element: "libA.bst" source_action: "recc_compile_helpers..." source_path: "helpers.o" target_digest: "obj_helpers..." } ] }Key Advantages
.ofiles are local on the worker, so archive creation executes immediately without network overheadrearactions run remotely, they still populate the ActionCache, enabling link actions to start as soon as archives are cachedrearchecks ActionCache first and gets an immediate hit from the speculative executionNote: Future optimizations (out of scope for this proposal) could include locality hints to prefer scheduling speculative
rearactions on workers that already have the.ofiles, further reducing network traffic.Execution Flow
The key insight: rear enables ActionCache hits during the main build, and when those hits miss, local execution is very fast because the input
.ofiles are already on the worker.Beyond
rear: Instrumentation StrategyThe
rearoptimization illustrates a general principle: identify commands that operate on local intermediate data and wrap them for speculative execution.Candidates for "re-" wrappers:
ranlib: Index generation for static libraries (fast, local)strip: Symbol stripping (fast, local, self-contained)Instrumentation approach:
This creates a virtuous cycle: more instrumentation → more optimization opportunities → better build performance.
References
Adams, B., McIntosh, S., Nagappan, M., and Hassan, A.E. (2016). "Identifying and understanding header file hotspots in C/C++ build processes." Automated Software Engineering, Vol. 23, pp. 61-90. https://dl.acm.org/doi/10.1007/s10515-015-0183-5
Chowdhury, S., Uddin, G., Hemmati, H., and Holmes, R. (2024). "The Good, the Bad, and the Monstrous: Predicting Highly Change-Prone Source Code Methods at Their Inception." ACM Transactions on Software Engineering and Methodology, Vol. 33, No. 4. https://dl.acm.org/doi/10.1145/3715006
Macho, C., McIntosh, S., and Pinzger, M. (2021). "The nature of build changes." Empirical Software Engineering, Vol. 26, Article 52. https://link.springer.com/article/10.1007/s10664-020-09926-4
Matev, R. (2020). "Fast distributed compilation and testing of large C++ projects." EPJ Web of Conferences, Vol. 245. https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_05001.pdf
McIntosh, S., Adams, B., and Hassan, A.E. (2012). "The Evolution of Java Build Systems." Empirical Software Engineering, Vol. 17, No. 4-5, pp. 578-608.
Misu, S., Lin, B., and Monden, A. (2024). "Does Using Bazel Help Speed Up Continuous Integration Builds?" arXiv preprint. https://arxiv.org/html/2405.00796v1
Nagappan, N. and Ball, T. (2007). "Using Software Dependencies and Churn Metrics to Predict Field Failures: An Empirical Case Study." First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007). https://ieeexplore.ieee.org/document/4343764/
Shin, Y., Meneely, A., Williams, L., and Osborne, J. (2011). "Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities." IEEE Transactions on Software Engineering, Vol. 37, No. 6, pp. 772-787. https://ieeexplore.ieee.org/document/5560680/
Munson, J.C. and Elbaum, S. (1998). "Code Churn: A Measure for Estimating the Impact of Code Change." Proceedings of the International Conference on Software Maintenance, pp. 24-31. https://ieeexplore.ieee.org/document/738486/