
add --reuse flag for container reuse in local dev #6562

Draft
p-datadog wants to merge 11 commits into main from feat/container-reuse


p-datadog (Member) commented Mar 20, 2026

Summary

When iterating on test code locally, the full container lifecycle adds 30-60+ seconds per run — even when the containers haven't changed at all. This PR adds a --reuse flag that attaches to already-running containers instead of restarting them.

How it works

Just always pass +u during local development:

./run.sh DEFAULT +u   # first time: no containers -> starts them, keeps alive
./run.sh DEFAULT +u   # containers exist -> attaches instantly, runs tests
./run.sh DEFAULT +u   # iterate on tests...

# rebuild tracer
./build.sh ruby --binary-path ~/code/dd-trace-rb

./run.sh DEFAULT +u   # detects stale image -> restarts containers, keeps alive
./run.sh DEFAULT +u   # back to fast iteration

The flag is always safe to use — it falls back to a normal start when containers are missing, stopped, or stale (image was rebuilt). No need to remember when to use it and when not to.

Staleness detection

When --reuse finds existing containers, it compares each container's image ID against the current Docker image. If the image was rebuilt since the container started, the containers are torn down and started fresh automatically. This prevents silently testing against an old container after a rebuild.
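The comparison described above can be sketched as follows. This is a minimal illustration, not the PR's code: `RunningContainer` and the two-argument `image_is_stale` are hypothetical stand-ins for `TestedContainer` and its method. With the Docker SDK for Python, the two IDs would presumably come from `container.image.id` and `client.images.get(tag).id`.

```python
from dataclasses import dataclass


@dataclass
class RunningContainer:
    """Hypothetical stand-in for the metadata of a running container."""

    name: str
    image_id: str  # ID of the image the container was created from


def image_is_stale(container: RunningContainer, current_image_id: str) -> bool:
    """Return True when the container was created from a different image
    than the one the tag resolves to now, i.e. the image was rebuilt."""
    return container.image_id != current_image_id


weblog = RunningContainer(name="weblog", image_id="sha256:aaa")
print(image_is_stale(weblog, "sha256:aaa"))  # same image: safe to reuse
print(image_is_stale(weblog, "sha256:bbb"))  # image rebuilt: restart needed
```

Comparing IDs rather than tags matters here because a rebuild usually keeps the tag and only changes the ID behind it.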

What changed

conftest.py — registers --reuse / -u via parser.addoption, following the established pattern alongside --replay, --sleep, etc.
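For readers unfamiliar with the pattern, option registration might look roughly like the sketch below. The help text and the `read_reuse` helper are assumptions for illustration. Only the long form is registered here, since a later commit in this PR notes that pytest reserves lowercase short options.

```python
def pytest_addoption(parser):
    """conftest.py hook: register the flag with pytest's option parser."""
    parser.addoption(
        "--reuse",
        action="store_true",
        default=False,
        # Help text is illustrative, not copied from the PR.
        help="Attach to already-running containers instead of restarting them",
    )


def read_reuse(config) -> bool:
    """Hypothetical helper showing how pytest_configure could read the flag."""
    return bool(config.option.reuse)
```

With `action="store_true"`, the option is `False` unless `--reuse` is passed, so runs without the flag keep the existing behavior.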

run.sh — new +u/++reuse flag that passes --reuse to pytest. Added to help text.

utils/_context/_scenarios/core.py: _reuse flag read from config.option.reuse in pytest_configure (alongside self.replay); skip log directory wipe when reusing so healthcheck files from the prior run survive.

utils/_context/_scenarios/endtoend.py:

  • _attach_or_start_containers(): validates containers are running with the correct image. Attaches if possible (one Docker API call per container); falls back to a normal fresh start if any container is missing, stopped, or stale. After fallback, re-enables reuse on containers so they're kept alive for the next run.
  • Refactors _wait_and_stop_containers() into _drain_interfaces() + _stop_containers() — this separation lets reuse drain interface data without stopping containers.
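The attach-or-fall-back decision can be sketched as below. `FakeContainer` and `attach_or_start` are hypothetical names, and the real method works on `TestedContainer` objects through the Docker API. The sketch only models the all-or-nothing check and the re-enabling of reuse after a fallback.

```python
from dataclasses import dataclass


@dataclass
class FakeContainer:
    """Hypothetical stand-in; the real class is TestedContainer."""

    name: str
    running: bool
    stale: bool
    reuse: bool = True
    started_fresh: bool = False

    def can_attach(self) -> bool:
        # Reusable only if it is running and built from the current image.
        return self.running and not self.stale

    def start(self) -> None:
        self.running, self.stale, self.started_fresh = True, False, True


def attach_or_start(containers: list) -> str:
    if all(c.can_attach() for c in containers):
        return "attached"
    # One missing, stopped, or stale container invalidates the whole set.
    for c in containers:
        c.reuse = False  # let the normal start path replace old containers
    for c in containers:
        c.start()
    for c in containers:
        c.reuse = True  # keep them alive for the next run
    return "started"
```

Usage: with two healthy containers the function returns `"attached"` and starts nothing. If either one is stale, both are restarted and end up with `reuse` set again.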

utils/_context/containers.py:

  • image_is_stale(): compares running container's image ID against current Docker image.
  • configure() gains a reuse parameter — skips killing old containers and loads image info from existing logs instead of Docker.
  • stop() and remove() are no-ops in reuse mode, guarded at the leaf level so all callers (including error paths) are covered.
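Guarding at the leaf level means that callers do not need to know about reuse at all. A minimal sketch of the idea, using a hypothetical `Container` class and an `events` list in place of real Docker calls:

```python
class Container:
    """Hypothetical container wrapper; events stands in for Docker calls."""

    def __init__(self, name: str, reuse: bool = False):
        self.name = name
        self.reuse = reuse
        self.events = []

    def stop(self) -> None:
        if self.reuse:
            return  # no-op in reuse mode
        self.events.append("stopped")

    def remove(self) -> None:
        if self.reuse:
            return  # no-op in reuse mode
        self.events.append("removed")


def teardown_after_error(container: Container) -> None:
    """A caller that knows nothing about reuse, such as an error path."""
    container.stop()
    container.remove()


normal = Container("weblog")
reused = Container("weblog", reuse=True)
teardown_after_error(normal)
teardown_after_error(reused)
print(normal.events)  # ['stopped', 'removed']
print(reused.events)  # []
```

The alternative, checking the flag at every call site, would make it easy to miss an error path and tear down a container that was meant to stay up.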

Behavior change in the drain/stop refactor

The original code interleaved draining and stopping: drain library interface, stop weblog, drain buddy interfaces, stop buddies, drain agent interface, stop agent. The refactored version drains all interfaces first, then stops all containers.

This means the weblog and buddy containers keep running (and potentially sending data) while the later interfaces are drained, where before each was stopped right after the corresponding drain. The agent is unaffected: in both orders it is stopped only after its own drain. This is arguably better, since more data gets captured, but it's a change in the non-reuse path too. The refactor is a prerequisite for reuse (which needs to drain without stopping), and the behavior difference should be harmless in practice since the drain timeouts are the same either way.
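The two orderings can be compared by listing which containers are still up at each drain step. The step list below is transcribed from the description above. The helper names are hypothetical, and the relative order of the stops in the refactored version is an assumption.

```python
OLD_ORDER = [
    ("drain", "library"), ("stop", "weblog"),
    ("drain", "buddies"), ("stop", "buddies"),
    ("drain", "agent"), ("stop", "agent"),
]


def drain_then_stop(steps):
    """Refactored order: every drain first, then every stop."""
    drains = [s for s in steps if s[0] == "drain"]
    stops = [s for s in steps if s[0] == "stop"]
    return drains + stops


def running_at_each_drain(steps, containers=("weblog", "buddies", "agent")):
    """Map each drained interface to the containers still running then."""
    running = list(containers)
    result = {}
    for action, target in steps:
        if action == "stop":
            running.remove(target)
        else:
            result[target] = tuple(running)
    return result


print(running_at_each_drain(OLD_ORDER))
print(running_at_each_drain(drain_then_stop(OLD_ORDER)))
```

In the interleaved order only the agent is still up during the agent drain. In the refactored order all three containers are up during every drain.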

Lifecycle comparison

Normal:        configure -> kill_old -> start -> health -> post_start -> readiness -> [tests] -> drain all -> stop all -> remove
Reuse (hit):   configure(keep) -> attach -> post_start(from logs) -> [tests] -> drain all -> (keep alive)
Reuse (miss):  configure(keep) -> not found -> reconfigure -> start -> ... -> [tests] -> drain all -> (keep alive)
Reuse (stale): configure(keep) -> stale image -> reconfigure -> start -> ... -> [tests] -> drain all -> (keep alive)

Test plan

  • Normal run without --reuse — full lifecycle, containers start and stop as before
  • +u with no containers running — falls back to normal start, keeps containers alive
  • +u with containers running — attaches, runs tests fast, keeps alive
  • Rebuild image, then +u — detects stale image, restarts fresh, keeps alive
  • +u with stopped containers — falls back to normal start
  • Verify interface data is still drained on reuse (test assertions pass)
  • Verify ./run.sh +h shows ++reuse in the help output

Generated with Claude Code

When iterating on test code locally, the full container lifecycle
(start, health check, readiness wait, tests, interface drain, stop,
remove) adds 30-60+ seconds per run even when containers haven't
changed.

The --reuse flag (or ST_REUSE=1 env var) attaches to already-running
containers instead of starting new ones, skips readiness waits, and
keeps containers alive after tests finish. This reduces iteration
time to just the test execution itself.

Implementation:
- run.sh: new +r/++reuse flag sets ST_REUSE=1
- core.py: skip log directory wipe when reusing (preserves healthcheck
  metadata from prior run)
- endtoend.py: _attach_to_existing_containers() validates containers
  are running and attaches; refactors _wait_and_stop_containers into
  _drain_interfaces() + _stop_containers() for separation of concerns
- containers.py: configure() gains reuse param to skip killing old
  containers and load image info from existing logs; stop()/remove()
  are no-ops in reuse mode

Usage:
  ./run.sh DEFAULT          # first run: starts containers normally
  ./run.sh DEFAULT ++reuse  # subsequent runs: reuses existing containers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions bot (Contributor) commented Mar 20, 2026

CODEOWNERS have been resolved as:

conftest.py                                                             @DataDog/system-tests-core
run.sh                                                                  @DataDog/system-tests-core
tests/test_the_test/test_docker_scenario.py                             @DataDog/system-tests-core
utils/_context/_scenarios/core.py                                       @DataDog/system-tests-core
utils/_context/_scenarios/endtoend.py                                   @DataDog/system-tests-core
utils/_context/containers.py                                            @DataDog/system-tests-core

Unicorn Enterprises and others added 10 commits March 20, 2026 18:17
When --reuse finds no containers, stopped containers, or containers
built from a stale image (image ID mismatch after rebuild), it falls
back to a normal start instead of failing. This means --reuse is
always safe to use — no need for a separate workflow after rebuilds.

Also adds image_is_stale() to TestedContainer which compares the
running container's image ID against the current Docker image.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Follow the established ST pattern: register --reuse via
parser.addoption in conftest.py, read it from config.option.reuse
in pytest_configure, and pass it as a pytest arg from run.sh.

Also fix: after fallback-to-fresh-start, re-enable _reuse on
containers so they're kept alive for the next --reuse run.
Move _needs_readiness_wait init to DockerScenario.configure where
it's set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
One stdout line on happy path ("Reusing existing containers"),
one on fallback ("Reuse not possible, starting containers...").
Per-container details demoted to debug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The absence of the existing "Stopping container" / "Removing container"
debug lines is sufficient signal. No need to log what we're not doing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The fallback stdout line is sufficient. Per-container reasons
are unnecessary even at debug level.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
-r is taken by pytest (extra test summary). -u is free.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Cache get_existing_container() results from the check loop and
  reuse them in the attach loop (saves 3 Docker API round-trips)
- Add +u/++reuse to run.sh help text
- Add comment noting _needs_readiness_wait crosses DockerScenario →
  EndToEndScenario boundary

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All subclasses of TestedContainer that override configure() need the
new reuse parameter to match the superclass signature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pytest reserves lowercase short options for built-in use.
Plugin options must use uppercase.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>