Add safeguard to check Ansible facts freshness before apply#2165
Open
Add safeguard to check Ansible facts freshness before apply#2165
Conversation
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- The
check_ansible_factsfunction scans the entire Redis keyspace matchingansible_facts*on everyapplyinvocation, which could become expensive in large environments; consider narrowing the scan (e.g., by host pattern or role context) or adding a fast early-exit once a representative sample has been checked. - When stripping the hostname from the Redis key (using
key.decode().replace("ansible_facts", "", 1)), consider making the expected key format explicit (e.g., prefix with a separator such asansible_facts:) to avoid accidentally altering hostnames that might legitimately contain the stringansible_facts.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `check_ansible_facts` function scans the entire Redis keyspace matching `ansible_facts*` on every `apply` invocation, which could become expensive in large environments; consider narrowing the scan (e.g., by host pattern or role context) or adding a fast early-exit once a representative sample has been checked.
- When stripping the hostname from the Redis key (using `key.decode().replace("ansible_facts", "", 1)`), consider making the expected key format explicit (e.g., prefix with a separator such as `ansible_facts:`) to avoid accidentally altering hostnames that might legitimately contain the string `ansible_facts`.
## Individual Comments
### Comment 1
<location path="osism/utils/__init__.py" line_range="609" />
<code_context>
+ date_time = facts.get("ansible_date_time", {})
+ epoch = date_time.get("epoch")
+ if epoch is None:
+ hostname = key.decode().replace("ansible_facts", "", 1)
+ logger.debug(
+ f"Host '{hostname}': facts missing ansible_date_time.epoch"
</code_context>
<issue_to_address>
**issue (bug_risk):** Make hostname extraction robust to both bytes and str Redis keys.
With some `_init_redis()` configurations (`decode_responses=True`), `scan` may return `str` keys instead of `bytes`, so `key.decode()` would raise `AttributeError`. Handle both types, e.g.:
```python
if isinstance(key, bytes):
hostname = key.decode()
else:
hostname = key
hostname = hostname.replace("ansible_facts", "", 1)
```
</issue_to_address>
### Comment 2
<location path="osism/utils/__init__.py" line_range="618-619" />
<code_context>
+ if age > max_age:
+ hostname = key.decode().replace("ansible_facts", "", 1)
+ stale_hosts.append((hostname, int(age)))
+ except (json.JSONDecodeError, ValueError, TypeError):
+ continue
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider logging malformed/stale fact entries instead of silently skipping them.
Swallowing JSON/parse errors here can hide corrupt or unexpected Redis data and make debugging difficult. Please add at least a debug-level log including the key (and maybe a truncated value) so issues with facts not updating can be diagnosed while still skipping them in the main logic.
```suggestion
except (json.JSONDecodeError, ValueError, TypeError):
truncated_value = value
# Avoid logging excessively large values
if isinstance(truncated_value, (bytes, str)):
truncated_value = truncated_value[:200]
logger.debug(
"Skipping malformed ansible_facts entry for key %r: %r",
key,
truncated_value,
exc_info=True,
)
continue
```
</issue_to_address>
### Comment 3
<location path="osism/commands/apply.py" line_range="438-439" />
<code_context>
+ # Check if Ansible facts in Redis are available and fresh.
+ # Skip when gathering facts or just showing the tree.
+ if role and role not in ("gather-facts", "facts") and not show_tree:
+ utils.check_ansible_facts()
+
rc = 0
</code_context>
<issue_to_address>
**suggestion (performance):** Reconsider running a full Redis scan for facts on every apply invocation.
`check_ansible_facts()` scans all `ansible_facts*` keys in Redis, which can be costly in large environments when `apply` is invoked frequently. Consider gating this behind a config/CLI flag or adding caching/backoff so the scan doesn’t always run.
Suggested implementation:
```python
# Check if Ansible facts in Redis are available and fresh.
# Skip when gathering facts or just showing the tree.
# To avoid a costly scan on every apply, only perform this check
# if it hasn't been run recently (simple time-based backoff).
if role and role not in ("gather-facts", "facts") and not show_tree:
now = time.time()
last_check = getattr(utils, "_last_ansible_facts_check", 0)
# 300 seconds (5 minutes) backoff between full Redis scans
if now - last_check > 300:
utils.check_ansible_facts()
setattr(utils, "_last_ansible_facts_check", now)
```
1. At the top of `osism/commands/apply.py`, ensure `time` is imported:
- Add `import time` near the other standard library imports if it is not already present.
2. If you want the backoff interval to be configurable (e.g. via config or CLI), replace the hard-coded `300` with a variable read from your configuration/argument parsing, and pass that value into this block instead of the literal.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Check Redis for cached Ansible facts before executing a role via the apply command. Warns if no facts exist or if facts are stale (older than 43200 seconds by default, configurable via FACTS_MAX_AGE). Skipped for gather-facts/facts roles and --show-tree. AI-assisted: Claude Code Signed-off-by: Christian Berendt <berendt@osism.tech>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Check Redis for cached Ansible facts before executing a role via the apply command. Warns if no facts exist or if facts are stale (older than 43200 seconds by default, configurable via FACTS_MAX_AGE). Skipped for gather-facts/facts roles and --show-tree.
AI-assisted: Claude Code