Warm page cache before save to eliminate cold-NVMe IOPS bottleneck #4

Merged: joshfriend merged 1 commit into main from warm-page-cache on Apr 2, 2026
@joshfriend
Collaborator

Parallel readers fault all cache files into the OS page cache before tar starts, so tar reads at memory speed (~1300 MB/s) instead of the ~80 MB/s that cold, IOPS-limited NVMe delivers. This yields a 2-5x archive speedup, depending on instance generation.
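The warming step described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `workers` parameter, the returned byte count, and the `demoWarm` helper are assumptions added for the example. The idea is that a pool of goroutines opens each regular file and discards its contents, so the reads populate the page cache as a side effect.

```go
package main

import (
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"sync"
	"sync/atomic"
)

// warmPageCache reads every regular file under roots with a pool of worker
// goroutines and discards the bytes, which pulls the files into the OS page
// cache. It returns the number of bytes read. The workers parameter and the
// return value are illustrative assumptions, not taken from the PR.
func warmPageCache(roots []string, workers int) int64 {
	paths := make(chan string, 256)
	var total atomic.Int64
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range paths {
				f, err := os.Open(p)
				if err != nil {
					continue // best effort: a failed warm read is not fatal
				}
				n, _ := io.Copy(io.Discard, f)
				f.Close()
				total.Add(n)
			}
		}()
	}

	for _, root := range roots {
		_ = filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
			if err != nil || !d.Type().IsRegular() {
				return nil // skip unreadable entries, directories, symlinks
			}
			paths <- path
			return nil
		})
	}
	close(paths)
	wg.Wait()
	return total.Load()
}

// demoWarm (hypothetical helper) builds a throwaway directory with two small
// files and warms it, returning the number of bytes read.
func demoWarm() (int64, error) {
	dir, err := os.MkdirTemp("", "warm-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "a.bin"), []byte("hello"), 0o644); err != nil {
		return 0, err
	}
	if err := os.WriteFile(filepath.Join(dir, "b.bin"), []byte("world!"), 0o644); err != nil {
		return 0, err
	}
	return warmPageCache([]string{dir}, 4), nil
}
```

Many concurrent readers keep the device queue full, which is what lets a cold NVMe volume get past the throughput of a single sequential reader. Errors are swallowed on purpose in this sketch, since a failed warm read only costs speed, not correctness.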

@joshfriend enabled auto-merge (squash) on April 2, 2026 01:10

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4283e47d7c

Comment on lines +200 to +204
    if !cfg.SkipWarm {
        log.Debug("warming page cache")
        warmStart := time.Now()
        warmPageCache(sources)
        log.Debug("page cache warm", "duration", time.Since(warmStart).Round(time.Millisecond))

P1: Honor context cancellation before warming files

Save now performs a full warmPageCache(sources) pass before starting CreateTarZstd, but this warm step does not observe ctx.Done(). In timeout/cancel scenarios (common in CI), Save will continue scanning and reading the entire cache tree before returning, which can add minutes of uninterruptible work after cancellation and delay job teardown. Please gate or short-circuit warming when the context is canceled.
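One way the requested short-circuit could look is sketched below. It is a sequential illustration under assumptions: `warmPageCacheCtx`, `readFileCtx`, and `demoWarmCtx` are hypothetical names, and the real `warmPageCache` signature in the PR takes only `sources`. The point is that the context is checked on every walk callback and between read chunks, so a canceled `Save` stops within one buffer's worth of I/O.

```go
package main

import (
	"context"
	"io/fs"
	"os"
	"path/filepath"
)

// warmPageCacheCtx is a hypothetical context-aware variant of the warm step.
// It stops walking and reading as soon as ctx is canceled and returns the
// context error together with the number of bytes read so far.
func warmPageCacheCtx(ctx context.Context, roots []string) (int64, error) {
	var total int64
	buf := make([]byte, 1<<20) // 1 MiB chunks bound the delay after cancel
	for _, root := range roots {
		err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
			if cerr := ctx.Err(); cerr != nil {
				return cerr // a non-nil return aborts the whole walk
			}
			if err != nil || !d.Type().IsRegular() {
				return nil
			}
			n, rerr := readFileCtx(ctx, path, buf)
			total += n
			return rerr
		})
		if err != nil {
			return total, err
		}
	}
	return total, nil
}

// readFileCtx reads one file into buf repeatedly, checking ctx between chunks.
// Open and read failures are ignored (best effort); only cancellation is
// reported to the caller.
func readFileCtx(ctx context.Context, path string, buf []byte) (int64, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, nil
	}
	defer f.Close()
	var n int64
	for {
		if cerr := ctx.Err(); cerr != nil {
			return n, cerr
		}
		m, rerr := f.Read(buf)
		n += int64(m)
		if rerr != nil { // io.EOF or a read error: either way, done with this file
			return n, nil
		}
	}
}

// demoWarmCtx (hypothetical helper) warms a one-file temp directory, optionally
// canceling the context before the warm starts.
func demoWarmCtx(cancelFirst bool) (int64, error) {
	dir, err := os.MkdirTemp("", "warm-ctx-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "a.bin"), []byte("hello"), 0o644); err != nil {
		return 0, err
	}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelFirst {
		cancel()
	}
	return warmPageCacheCtx(ctx, []string{dir})
}
```

A parallel version would apply the same checks inside each worker and have the walker select on `ctx.Done()` when sending paths, so that a blocked send cannot outlive the cancellation.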


Comment on lines +537 to +539
    _ = filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
        if err != nil || !d.Type().IsRegular() {
            return nil

P2: Skip warming files excluded from the archive

warmPageCache reads every regular file under each source, but CreateTarZstd later excludes many paths (including wrapper/dists/*/*/*.zip and CacheExclusions). This means we spend I/O warming data that will never be archived, and large excluded files can evict useful cache pages and reduce the intended speedup. The warm walk should apply the same exclusion rules as tar input selection.
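A rough sketch of sharing one exclusion predicate between the warm walk and tar input selection follows. The actual matching rules used by `CreateTarZstd` and the shape of `CacheExclusions` are not shown in this PR, so `isExcluded`, `filesToWarm`, and the use of slash-separated glob patterns relative to the source root are all assumptions made for illustration.

```go
package main

import (
	"io/fs"
	"path"
	"path/filepath"
)

// isExcluded reports whether rel (slash-separated, relative to the source
// root) matches any glob pattern. Using path.Match means "*" never crosses a
// "/" boundary. Hypothetical helper: the real exclusion logic may differ.
func isExcluded(rel string, patterns []string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, rel); err == nil && ok {
			return true
		}
	}
	return false
}

// filesToWarm walks root and returns the regular files that are not excluded.
// If the tar step calls the same isExcluded predicate, the set of warmed files
// and the set of archived files cannot drift apart.
func filesToWarm(root string, patterns []string) []string {
	var out []string
	_ = filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil || !d.Type().IsRegular() {
			return nil
		}
		rel, rerr := filepath.Rel(root, p)
		if rerr != nil || isExcluded(filepath.ToSlash(rel), patterns) {
			return nil // never read files the archive will skip
		}
		out = append(out, p)
		return nil
	})
	return out
}
```

With the pattern `wrapper/dists/*/*/*.zip` quoted in the review, `isExcluded("wrapper/dists/x/y/z.zip", ...)` is true, while a `.jar` at the same depth or a `.zip` one level higher is not matched.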


@joshfriend merged commit 7d17723 into main on Apr 2, 2026 (9 checks passed)
@joshfriend deleted the warm-page-cache branch on April 2, 2026 01:14