Skip to content

Add safeguard to check Ansible facts freshness before apply#2165

Open
berendt wants to merge 1 commit intomainfrom
check-facts
Open

Add safeguard to check Ansible facts freshness before apply#2165
berendt wants to merge 1 commit intomainfrom
check-facts

Conversation

@berendt
Copy link
Copy Markdown
Member

@berendt berendt commented Mar 26, 2026

Check Redis for cached Ansible facts before executing a role via the apply command. Warns if no facts exist or if facts are stale (older than 43200 seconds by default, configurable via FACTS_MAX_AGE). Skipped for gather-facts/facts roles and --show-tree.

AI-assisted: Claude Code

Copy link
Copy Markdown

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • The check_ansible_facts function scans the entire Redis keyspace matching ansible_facts* on every apply invocation, which could become expensive in large environments; consider narrowing the scan (e.g., by host pattern or role context) or adding a fast early-exit once a representative sample has been checked.
  • When stripping the hostname from the Redis key (using key.decode().replace("ansible_facts", "", 1)), consider making the expected key format explicit (e.g., prefix with a separator such as ansible_facts:) to avoid accidentally altering hostnames that might legitimately contain the string ansible_facts.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `check_ansible_facts` function scans the entire Redis keyspace matching `ansible_facts*` on every `apply` invocation, which could become expensive in large environments; consider narrowing the scan (e.g., by host pattern or role context) or adding a fast early-exit once a representative sample has been checked.
- When stripping the hostname from the Redis key (using `key.decode().replace("ansible_facts", "", 1)`), consider making the expected key format explicit (e.g., prefix with a separator such as `ansible_facts:`) to avoid accidentally altering hostnames that might legitimately contain the string `ansible_facts`.

## Individual Comments

### Comment 1
<location path="osism/utils/__init__.py" line_range="609" />
<code_context>
+            date_time = facts.get("ansible_date_time", {})
+            epoch = date_time.get("epoch")
+            if epoch is None:
+                hostname = key.decode().replace("ansible_facts", "", 1)
+                logger.debug(
+                    f"Host '{hostname}': facts missing ansible_date_time.epoch"
</code_context>
<issue_to_address>
**issue (bug_risk):** Make hostname extraction robust to both bytes and str Redis keys.

With some `_init_redis()` configurations (`decode_responses=True`), `scan` may return `str` keys instead of `bytes`, so `key.decode()` would raise `AttributeError`. Handle both types, e.g.:

```python
if isinstance(key, bytes):
    hostname = key.decode()
else:
    hostname = key
hostname = hostname.replace("ansible_facts", "", 1)
```
</issue_to_address>

### Comment 2
<location path="osism/utils/__init__.py" line_range="618-619" />
<code_context>
+            if age > max_age:
+                hostname = key.decode().replace("ansible_facts", "", 1)
+                stale_hosts.append((hostname, int(age)))
+        except (json.JSONDecodeError, ValueError, TypeError):
+            continue
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider logging malformed/stale fact entries instead of silently skipping them.

Swallowing JSON/parse errors here can hide corrupt or unexpected Redis data and make debugging difficult. Please add at least a debug-level log including the key (and maybe a truncated value) so issues with facts not updating can be diagnosed while still skipping them in the main logic.

```suggestion
        except (json.JSONDecodeError, ValueError, TypeError):
            truncated_value = value
            # Avoid logging excessively large values
            if isinstance(truncated_value, (bytes, str)):
                truncated_value = truncated_value[:200]
            logger.debug(
                "Skipping malformed ansible_facts entry for key %r: %r",
                key,
                truncated_value,
                exc_info=True,
            )
            continue
```
</issue_to_address>

### Comment 3
<location path="osism/commands/apply.py" line_range="438-439" />
<code_context>

+        # Check if Ansible facts in Redis are available and fresh.
+        # Skip when gathering facts or just showing the tree.
+        if role and role not in ("gather-facts", "facts") and not show_tree:
+            utils.check_ansible_facts()
+
         rc = 0
</code_context>
<issue_to_address>
**suggestion (performance):** Reconsider running a full Redis scan for facts on every apply invocation.

`check_ansible_facts()` scans all `ansible_facts*` keys in Redis, which can be costly in large environments when `apply` is invoked frequently. Consider gating this behind a config/CLI flag or adding caching/backoff so the scan doesn’t always run.

Suggested implementation:

```python
        # Check if Ansible facts in Redis are available and fresh.
        # Skip when gathering facts or just showing the tree.
        # To avoid a costly scan on every apply, only perform this check
        # if it hasn't been run recently (simple time-based backoff).
        if role and role not in ("gather-facts", "facts") and not show_tree:
            now = time.time()
            last_check = getattr(utils, "_last_ansible_facts_check", 0)
            # 300 seconds (5 minutes) backoff between full Redis scans
            if now - last_check > 300:
                utils.check_ansible_facts()
                setattr(utils, "_last_ansible_facts_check", now)

```

1. At the top of `osism/commands/apply.py`, ensure `time` is imported:
   - Add `import time` near the other standard library imports if it is not already present.
2. If you want the backoff interval to be configurable (e.g. via config or CLI), replace the hard-coded `300` with a variable read from your configuration/argument parsing, and pass that value into this block instead of the literal.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Check Redis for cached Ansible facts before executing a role via
the apply command. Warns if no facts exist or if facts are stale
(older than 43200 seconds by default, configurable via FACTS_MAX_AGE).
Skipped for gather-facts/facts roles and --show-tree.

AI-assisted: Claude Code

Signed-off-by: Christian Berendt <berendt@osism.tech>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant