
Build agent image in cmd_post_process #96

Merged
moodmosaic merged 3 commits into protocol-security:master from BowTiedRadone:fix/pp-docker-build
May 6, 2026

Conversation

@BowTiedRadone
Contributor

@BowTiedRadone BowTiedRadone commented May 5, 2026

Summary

Fixes an edge case that showed up when using a separate claude-code post-processor after the image had been built for a codex-only config. The post-process path never called docker build at all, so the image was never rebuilt for the new driver set: the post container inherited a claude-less image, the harness exited 127 on the first session, and recovery required a manual docker build.

cmd_post_process now builds the image from the active config before launching, mirroring cmd_start. Build args are derived from the same config the container will use, so when the driver set changes the layer cache invalidates correctly and the right install layer re-runs; when it hasn't changed, every layer is a cache hit and the build is effectively a no-op.
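
The cache behaviour described above can be sketched as follows. This is a minimal illustration, not the actual launch.sh: the function name build_image_cmd, the swarm-agent tag, and the dry-run style (printing the command instead of running it) are assumptions; only the three build-arg names come from this PR. Docker treats a changed --build-arg value as a cache miss at the first instruction that uses it, so a different SWARM_AGENTS value re-runs the install layer while an identical value reuses it.

```shell
# Hypothetical sketch: print the docker build command a config would produce.
# build_image_cmd and the "swarm-agent" tag are illustrative names only.
build_image_cmd() {
    local agents="$1" tag="${2:-swarm-agent}"
    # Each build-arg value is part of the cache key for the layers that use it.
    printf '%s ' docker build \
        --build-arg "SWARM_AGENTS=${agents}" \
        --build-arg "CLAUDE_CODE_VERSION=${CLAUDE_CODE_VERSION:-}" \
        --build-arg "CODEX_CLI_VERSION=${CODEX_CLI_VERSION:-}" \
        -t "$tag" .
    printf '\n'
}

# Usage (dry run):
#   build_image_cmd codex               # codex-only image
#   build_image_cmd claude-code,codex   # different SWARM_AGENTS, so the install layer re-runs
```

Printing the command keeps the sketch safe to run without a Docker daemon; a real helper would execute it instead.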

Changes

  • launch.sh: factor compute_swarm_agents (driver set from a config) and build_image (build with derived args) out of cmd_start, and call build_image from cmd_post_process.
  • launch.sh: drive-by: switch the submodule mirror cleanup in cmd_start to rm_docker_dir so it works when a prior container left UID-mismatched files behind.
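
As a rough sketch of the driver-set derivation that the first change factors out: the helper below takes driver names that have already been read from a config and returns their deduplicated union. The name swarm_agents_union, the argument-based interface, and the claude-code default are assumptions made for illustration; the real compute_swarm_agents reads a config file whose schema is not shown in this PR.

```shell
# Hypothetical sketch of the driver-set union; not the real compute_swarm_agents.
# Arguments: one driver name per agent group or post-processor (may be empty).
swarm_agents_union() {
    local default="claude-code" d   # assumed default driver
    for d in "$@"; do
        # An empty per-group driver falls back to the default.
        printf '%s\n' "${d:-$default}"
    done | sort -u | paste -sd, -
}

# Usage:
#   swarm_agents_union codex codex claude-code   # codex agents + claude-code post-processor
#   swarm_agents_union codex ""                  # empty driver uses the default
```

Sorting before joining keeps the value stable across group order, so reordering groups in a config would not change the build arg and would not invalidate the cache.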

Self-review checklist

  • This PR addresses a single concern (why?). If it covers multiple independent changes, I've split them into separate PRs.

Test plan

Using two configs with identical prompts, having the agent drivers as the only difference (swarm-codex.json and swarm-claude-code.json):

  • Build with swarm-codex.json, then run post-process against swarm-claude-code.json -- image rebuilds with claude-code and the post container reaches a real session.
  • Re-run post-process with no config change -- build is a cache hit, install layer does not re-run.
  • launch.sh start against an unchanged config -- behavior identical to before.

@BowTiedRadone BowTiedRadone requested a review from moodmosaic as a code owner May 5, 2026 10:18
Pin the PR's contract structurally so a future refactor cannot
silently drop one of the call sites or change the SWARM_AGENTS
union.

- Section 38 sources compute_swarm_agents from launch.sh and
  exercises it on eight configs: default driver, codex-only,
  mixed groups, dedup across groups, the codex-agents +
  claude-code-pp scenario from the PR description, pp matching
  the agent driver, pp inheriting the top-level driver, and an
  empty per-group driver falling back to the default.
- Section 39 asserts build_image has exactly two call sites
  (one each in cmd_start and cmd_post_process) and that its
  body still threads compute_swarm_agents, SWARM_AGENTS,
  CLAUDE_CODE_VERSION, and CODEX_CLI_VERSION as build-args.
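
A structural assertion like the one in Section 39 could look roughly like this. The helper name count_call_sites and the indentation-based matching are assumptions for illustration; the project's actual test may count call sites differently.

```shell
# Hypothetical sketch: count invocations of a function inside a shell script.
count_call_sites() {
    local fn="$1" file="$2"
    # Match indented invocations only. The definition line "fn() {" starts
    # at column 0 and is followed by "(", so it is not counted.
    grep -cE "^[[:space:]]+${fn}([[:space:]]|\$)" "$file"
}

# Usage:
#   [ "$(count_call_sites build_image launch.sh)" = "2" ] || echo "unexpected call sites"
```

Pinning the count rather than just the presence of a call is what lets the test catch a refactor that silently drops one of the two call sites.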
@moodmosaic
Member

One ask going forward: PRs must ship with tests. In the AI era there's no excuse -- (trustworthy) tests can be cheap to author, and untested changes only slow review (the maintainer has to reproduce the bug by hand and build the mental model from scratch).

A trustworthy test-suite is what keeps this project maintainable long-term and the maintainer's life easier. I added tests on top of this branch this round so we don't block the fix; please land tests-included next time. 🙏

Member

@moodmosaic moodmosaic left a comment

Thanks @BowTiedRadone! 👍

cmd_post_process never called docker build, so a driver-set switch silently left the post container on a stale image and exited 127; factoring out compute_swarm_agents / build_image is the right shape and gets cache invalidation for free.

LGTM!

@moodmosaic moodmosaic merged commit 6ab8ead into protocol-security:master May 6, 2026
5 checks passed
@BowTiedRadone
Contributor Author

@moodmosaic Thanks! Sure thing 🙏

@BowTiedRadone BowTiedRadone deleted the fix/pp-docker-build branch May 6, 2026 14:10