add --reuse flag for container reuse in local dev by p-datadog · Pull Request #6562 · DataDog/system-tests

p-datadog · 2026-03-20T22:06:28Z

Summary

When iterating on test code locally, the full container lifecycle adds 30-60+ seconds per run — even when the containers haven't changed at all. This PR adds a --reuse flag that attaches to already-running containers instead of restarting them.

How it works

Just always pass +u during local development:

./run.sh DEFAULT +u   # first time: no containers -> starts them, keeps alive
./run.sh DEFAULT +u   # containers exist -> attaches instantly, runs tests
./run.sh DEFAULT +u   # iterate on tests...

# rebuild tracer
./build.sh ruby --binary-path ~/code/dd-trace-rb

./run.sh DEFAULT +u   # detects stale image -> restarts containers, keeps alive
./run.sh DEFAULT +u   # back to fast iteration

The flag is always safe to use — it falls back to a normal start when containers are missing, stopped, or stale (image was rebuilt). No need to remember when to use it and when not to.

Staleness detection

When --reuse finds existing containers, it compares each container's image ID against the current Docker image. If the image was rebuilt since the container started, the containers are torn down and started fresh automatically. This prevents silently testing against an old container after a rebuild.
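The comparison described above can be sketched as follows. This is a minimal illustration, not the PR's code: `RunningContainer`, `find_stale`, the image names, and the sample IDs are all hypothetical, and treating a missing image as stale is an assumption. The real check is `image_is_stale()` on `TestedContainer`, which queries Docker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningContainer:
    """Hypothetical snapshot of a running container (not a system-tests type)."""

    name: str
    image_name: str
    image_id: str  # ID of the image the container was created from


def image_is_stale(container: RunningContainer, current_image_ids: dict[str, str]) -> bool:
    """Return True when the image name now resolves to a different image ID."""
    current_id = current_image_ids.get(container.image_name)
    # Assumption: a missing image counts as stale, so the caller starts fresh.
    return current_id is None or current_id != container.image_id


def find_stale(containers: list[RunningContainer], current_image_ids: dict[str, str]) -> list[str]:
    """Names of containers started from an image that has since been rebuilt."""
    return [c.name for c in containers if image_is_stale(c, current_image_ids)]


containers = [
    RunningContainer("weblog", "example/weblog", "sha256:aaa"),
    RunningContainer("agent", "example/agent", "sha256:bbb"),
]
# Pretend the weblog image was rebuilt after its container started.
current = {"example/weblog": "sha256:ccc", "example/agent": "sha256:bbb"}
print(find_stale(containers, current))
```

Comparing IDs rather than tags is what makes a rebuild visible: the tag stays the same after `./build.sh`, but the ID it points to changes.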

What changed

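conftest.py: registers --reuse via parser.addoption, following the established pattern alongside --replay, --sleep, etc. The pytest short option is uppercase -U, because pytest reserves lowercase short options for built-in use.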
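As a rough sketch of that registration pattern (the help text, the `Scenario` class, and the `RecordingParser` stand-in are hypothetical and exist only so the example runs without pytest installed): the option is added in `pytest_addoption` and read back from `config.option.reuse`. A later commit in this PR uses uppercase `-U` as the short form, since pytest reserves lowercase short options.

```python
from types import SimpleNamespace


def pytest_addoption(parser) -> None:
    """Register the reuse option; in the real repo this lives in conftest.py."""
    parser.addoption(
        "--reuse",
        "-U",
        action="store_true",
        default=False,
        help="Attach to already-running containers instead of restarting them",
    )


class Scenario:
    """Hypothetical stand-in for the scenario class in core.py."""

    def __init__(self) -> None:
        self.replay = False
        self._reuse = False

    def pytest_configure(self, config) -> None:
        # Both flags are read from the same parsed-options object.
        self.replay = config.option.replay
        self._reuse = config.option.reuse


class RecordingParser:
    """Minimal stand-in for pytest's parser that records registered options."""

    def __init__(self) -> None:
        self.options: dict[str, dict] = {}

    def addoption(self, *names: str, **attrs) -> None:
        for name in names:
            self.options[name] = attrs


parser = RecordingParser()
pytest_addoption(parser)

scenario = Scenario()
scenario.pytest_configure(SimpleNamespace(option=SimpleNamespace(replay=False, reuse=True)))
print(sorted(parser.options), scenario._reuse)
```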

run.sh — new +u/++reuse flag that passes --reuse to pytest. Added to help text.

utils/_context/_scenarios/core.py — _reuse flag read from config.option.reuse in pytest_configure (alongside self.replay); skip log directory wipe when reusing so healthcheck files from the prior run survive.
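The log-directory behavior can be illustrated with a small sketch. `prepare_log_dir` and the `healthcheck.json` file name are hypothetical; the point is only that the wipe is skipped when reuse is on, so files written by the previous run are still there to read.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_log_dir(log_dir: Path, *, reuse: bool) -> None:
    """Wipe the log directory on a normal run; leave it intact when reusing."""
    if log_dir.exists() and not reuse:
        shutil.rmtree(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp) / "logs"
    log_dir.mkdir()
    healthcheck = log_dir / "healthcheck.json"  # hypothetical file name
    healthcheck.write_text("{}")

    # Reuse run: the file from the prior run must survive.
    prepare_log_dir(log_dir, reuse=True)
    survived_reuse = healthcheck.exists()

    # Normal run: the directory is wiped and recreated empty.
    prepare_log_dir(log_dir, reuse=False)
    survived_normal = healthcheck.exists()

print(survived_reuse, survived_normal)
```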

utils/_context/_scenarios/endtoend.py:

_attach_or_start_containers(): validates containers are running with the correct image. Attaches if possible (one Docker API call per container); falls back to a normal fresh start if any container is missing, stopped, or stale. After fallback, re-enables reuse on containers so they're kept alive for the next run.
Refactors _wait_and_stop_containers() into _drain_interfaces() + _stop_containers() — this separation lets reuse drain interface data without stopping containers.
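The attach-or-fallback decision might look roughly like the sketch below. `Existing`, `can_reuse`, `attach_or_start`, and the `lookup` callable are hypothetical names, not the PR's API; the two printed messages are the ones quoted in the PR's commit messages. Each container is looked up once and the result is kept, and a single failing container is enough to fall back to a fresh start.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Existing:
    """Hypothetical result of looking a container up in Docker."""

    status: str  # Docker reports e.g. "running" or "exited"
    image_id: str


def can_reuse(wanted: dict[str, str], found: dict[str, Optional[Existing]]) -> bool:
    """wanted maps container name -> current image ID; found is the cached lookup."""
    for name, current_image_id in wanted.items():
        existing = found.get(name)
        if existing is None:  # missing
            return False
        if existing.status != "running":  # stopped
            return False
        if existing.image_id != current_image_id:  # stale
            return False
    return True


def attach_or_start(wanted: dict[str, str], lookup: Callable[[str], Optional[Existing]]) -> str:
    # One lookup per container; the cached results serve both check and attach.
    found = {name: lookup(name) for name in wanted}
    if can_reuse(wanted, found):
        print("Reusing existing containers")
        return "attached"
    print("Reuse not possible, starting containers...")
    return "started"


wanted = {"weblog": "sha256:aaa", "agent": "sha256:bbb"}
docker_state = {
    "weblog": Existing("running", "sha256:aaa"),
    "agent": Existing("exited", "sha256:bbb"),
}
result_stopped = attach_or_start(wanted, docker_state.get)

docker_state["agent"] = Existing("running", "sha256:bbb")
result_running = attach_or_start(wanted, docker_state.get)
```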

utils/_context/containers.py:

image_is_stale(): compares running container's image ID against current Docker image.
configure() gains a reuse parameter — skips killing old containers and loads image info from existing logs instead of Docker.
stop() and remove() are no-ops in reuse mode, guarded at the leaf level so all callers (including error paths) are covered.

Behavior change in the drain/stop refactor

The original code interleaved draining and stopping: drain library interface, stop weblog, drain buddy interfaces, stop buddies, drain agent interface, stop agent. The refactored version drains all interfaces first, then stops all containers.

This means the weblog container keeps running (and potentially emitting data) while the buddy and agent interfaces drain, and the buddies keep running during the agent drain, where before each was stopped right after its own interface was drained. The agent was already stopped last, so its timing is unchanged. This is arguably better, since more data gets captured, but it's a change in the non-reuse path too. The refactor is a prerequisite for reuse (which needs to drain without stopping), and the behavior difference should be harmless in practice since the drain timeouts are the same either way.
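To make the ordering difference concrete, here is a small sketch that only produces the sequence of steps. The function names are hypothetical, and the order in which containers are stopped in the refactored version is an assumption; the PR only states that all drains happen before all stops.

```python
# (interface to drain, container stopped after it in the original code)
PAIRS = [("library", "weblog"), ("buddies", "buddies"), ("agent", "agent")]


def interleaved_teardown() -> list[str]:
    """Original behavior: each container stops right after its interface drains."""
    steps: list[str] = []
    for interface, container in PAIRS:
        steps.append(f"drain {interface}")
        steps.append(f"stop {container}")
    return steps


def drain_then_stop(*, reuse: bool = False) -> list[str]:
    """Refactored behavior: drain everything first, then stop (unless reusing)."""
    steps = [f"drain {interface}" for interface, _ in PAIRS]
    if not reuse:
        # Assumed stop order; the PR does not specify it.
        steps += [f"stop {container}" for _, container in PAIRS]
    return steps


print(interleaved_teardown())
print(drain_then_stop())
print(drain_then_stop(reuse=True))
```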

Lifecycle comparison

Normal:        configure -> kill_old -> start -> health -> post_start -> readiness -> [tests] -> drain all -> stop all -> remove
Reuse (hit):   configure(keep) -> attach -> post_start(from logs) -> [tests] -> drain all -> (keep alive)
Reuse (miss):  configure(keep) -> not found -> reconfigure -> start -> ... -> [tests] -> drain all -> (keep alive)
Reuse (stale): configure(keep) -> stale image -> reconfigure -> start -> ... -> [tests] -> drain all -> (keep alive)

Test plan

Normal run without --reuse — full lifecycle, containers start and stop as before
+u with no containers running — falls back to normal start, keeps containers alive
+u with containers running — attaches, runs tests fast, keeps alive
Rebuild image, then +u — detects stale image, restarts fresh, keeps alive
+u with stopped containers — falls back to normal start
Verify interface data is still drained on reuse (test assertions pass)
Verify ./run.sh +h shows ++reuse in the help output

Generated with Claude Code

When iterating on test code locally, the full container lifecycle (start, health check, readiness wait, tests, interface drain, stop, remove) adds 30-60+ seconds per run even when containers haven't changed.

The --reuse flag (or ST_REUSE=1 env var) attaches to already-running containers instead of starting new ones, skips readiness waits, and keeps containers alive after tests finish. This reduces iteration time to just the test execution itself.

Implementation:
- run.sh: new +r/++reuse flag sets ST_REUSE=1
- core.py: skip log directory wipe when reusing (preserves healthcheck metadata from prior run)
- endtoend.py: _attach_to_existing_containers() validates containers are running and attaches; refactors _wait_and_stop_containers into _drain_interfaces() + _stop_containers() for separation of concerns
- containers.py: configure() gains reuse param to skip killing old containers and loading image from existing logs; stop()/remove() are no-ops in reuse mode

Usage:
  ./run.sh DEFAULT           # first run: starts containers normally
  ./run.sh DEFAULT ++reuse   # subsequent runs: reuses existing containers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-03-20T22:06:55Z

CODEOWNERS have been resolved as:

conftest.py                                                             @DataDog/system-tests-core
run.sh                                                                  @DataDog/system-tests-core
tests/test_the_test/test_docker_scenario.py                             @DataDog/system-tests-core
utils/_context/_scenarios/core.py                                       @DataDog/system-tests-core
utils/_context/_scenarios/endtoend.py                                   @DataDog/system-tests-core
utils/_context/containers.py                                            @DataDog/system-tests-core

When --reuse finds no containers, stopped containers, or containers built from a stale image (image ID mismatch after rebuild), it falls back to a normal start instead of failing. This means --reuse is always safe to use — no need for a separate workflow after rebuilds. Also adds image_is_stale() to TestedContainer which compares the running container's image ID against the current Docker image. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Follow the established ST pattern: register --reuse via parser.addoption in conftest.py, read it from config.option.reuse in pytest_configure, and pass it as a pytest arg from run.sh. Also fix: after fallback-to-fresh-start, re-enable _reuse on containers so they're kept alive for the next --reuse run. Move _needs_readiness_wait init to DockerScenario.configure where it's set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Cache get_existing_container() results from the check loop and reuse them in the attach loop (saves 3 Docker API round-trips)
- Add +u/++reuse to run.sh help text
- Add comment noting _needs_readiness_wait crosses DockerScenario → EndToEndScenario boundary

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Unicorn Enterprises and others added 10 commits March 20, 2026 18:17

trim reuse logging to match codebase terseness

b554f61

One stdout line on happy path ("Reusing existing containers"), one on fallback ("Reuse not possible, starting containers..."). Per-container details demoted to debug. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

remove skip-announcement logs from stop() and remove()

54a1110

The absence of the existing "Stopping container" / "Removing container" debug lines is sufficient signal. No need to log what we're not doing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

collapse reuse check into single condition, remove debug logs

bdea8e1

The fallback stdout line is sufficient. Per-container reasons are unnecessary even at debug level. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

use -u/+u as short option for --reuse

06df223

-r is taken by pytest (extra test summary). -u is free. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix mypy: add reuse param to all configure() overrides

aa457d5

All subclasses of TestedContainer that override configure() need the new reuse parameter to match the superclass signature. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix ruff formatting in _drain_interfaces

c8effa3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix: use uppercase -U for pytest short option

222d7e1

pytest reserves lowercase short options for built-in use. Plugin options must use uppercase. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

p-datadog wants to merge 11 commits into main from feat/container-reuse
