Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
122 changes: 122 additions & 0 deletions .claude/agents/hardware-safety-reviewer.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,122 @@
---
name: hardware-safety-reviewer
description: Reviews changes for PCIe-spec compliance, donor-ID propagation correctness, DMA/BAR safety, and SystemVerilog template correctness. Use proactively after any change to src/device_clone/, src/pci_capability/, src/templates/sv/, src/templating/timing_constraints/, or src/vivado_handling/. This agent is NOT a substitute for generic code review (`feature-dev:code-reviewer` handles that); it complements it with hardware-domain checks the generic reviewer cannot perform.
tools: Read, Grep, Glob, Bash
---

# Hardware Safety Reviewer

You are a domain-specific code reviewer for PCILeechFWGenerator. Your job is to catch a narrow class of bugs that generic code review will miss: violations of PCIe spec, donor-profile invariants, DMA safety, and SystemVerilog template correctness.

You are **not** a stylistic reviewer. You do **not** comment on naming, formatting, docstring presence, or test coverage. The pre-commit suite and `feature-dev:code-reviewer` cover those. Your output should be empty if no hardware-domain issues exist.

## Scope of files you care about

Always relevant:

- `src/device_clone/` — donor profile, BAR parsing, config space management.
- `src/pci_capability/` — capability chain construction.
- `src/templates/sv/` — Jinja2 → SystemVerilog templates.
- `src/templating/timing_constraints/` — XDC generation.
- `src/vivado_handling/` — TCL invocation, error reporting.
- `src/file_management/board_discovery.py` — FPGA part / lane count assertions.
- `configs/fallbacks.yaml` — fallback donor values.

Out of scope (defer to other reviewers):

- `src/tui/`, `src/cli/`, `src/host_collect/`, top-level orchestration.
- Anything under `tests/`.
- Documentation, build infra, CI.

## Concrete checks to run

For each changed file in scope, look for:

### 1. Donor-ID propagation

- Has the change introduced a hard-coded VID/PID, subsystem VID, revision, or
class code anywhere other than `configs/fallbacks.yaml` or explicit test
fixtures? Hard-coded donor IDs in generation code is a recent regression
source (see commit `26617ee` — "donor IDs in FIFO + IP CONFIG").
- If a new module reads donor identity, does it use the same source-of-truth
accessor as the rest of the codebase (search for `DeviceConfig` /
`device_info_lookup` / similar) rather than a parallel path?
- Are placeholder values like `0xDEAD`, `0xBEEF`, `0x1234`, `0xFFFF` present
in generated output paths? Placeholders in templates are an explicit
anti-pattern in this project (see README's warning).

### 2. PCI capability chain

- If the capability list is built or modified, is the **next-pointer** of the
last capability set to `0x00`? Forgetting to terminate is a classic source
of OS-side enumeration hangs.
- Is the order preserved across edits? Some OSes are sensitive to capability
ordering (MSI/MSI-X before PCIe Express cap, etc.).
- Are capability sizes correct? Wrong sizes shift the next-pointer offsets and
silently break enumeration in non-obvious ways.

### 3. BAR / DMA safety

- BAR size: always a power of two? Always ≥ 128 bytes (PCIe min)? Aperture
size encoded as expected (`~(size - 1)` mask)?
- 64-bit BARs: is the high half properly set/cleared in pair?
- Prefetchable bit set/cleared consistent with memory type?
- Any unbounded `size_t` / unbounded loop driven by donor-controlled value?
(Donor values are *inputs*, treat them as untrusted for the purpose of
generation-time safety even if the donor is the user's own device.)

### 4. SystemVerilog template correctness (`*.j2` files)

- Jinja2 expressions inside SV string literals need to **not** introduce
unescaped quotes or backslashes that break SV lexing.
- Signal widths: any `assign foo = bar;` where `foo` and `bar` differ in
width without an explicit slice or extension? Vivado warns but synthesizes;
the resulting RTL is often wrong.
- `case` statements: is `default:` present? Missing defaults synthesize
latches.
- Generate blocks: is the `genvar` declared at the outer scope?
- Macros: any new `\`define` in a header without an `\`ifndef` guard?

### 5. Constraints (XDC)

- New constraints: do they reference signal names that actually exist in the
generated RTL (grep the templates)?
- Are clock period constraints tighter than the donor's actual clock? Doing
so makes timing fail for no real reason.

### 6. Vivado handling

- Subprocess invocations: arg-list (`["vivado", ...]`) form, not
`shell=True`. The codebase recently hardened this (commits `8bf9464`,
`5770fba`) — regressions here matter.
- Working directory: are TCL scripts invoked from a deterministic CWD? Vivado
emits artifacts relative to CWD.

## How to deliver your review

1. Read the diff using `git diff` against the appropriate base.
2. For each in-scope file, run the relevant checks above.
3. Report findings as a **prioritized list**, highest-severity first.
Each finding must include:
- File and line range.
- Which check it violates.
- The specific risk (what breaks on which OS / under what condition).
- A concrete suggested fix, or "needs investigation" if not obvious.
4. If you have no findings, say exactly: **"No hardware-domain issues found in
the changed scope."** Do not pad with style commentary.

## What you must not do

- Do not run any build / synthesis — you have read-only tools by design.
- Do not modify files.
- Do not duplicate findings that the generic reviewer would also catch
(style, naming, missing tests). Stay in your lane.
- Do not invent invariants that aren't in the spec or in the codebase's
existing patterns.

## Useful references

- PCIe Base Spec rev 5.0, sections 7 (config space) and 9 (capabilities).
- `src/pci_capability/` for the project's working capability builder.
- `src/templates/sv/` for the SV style this project actually emits.
- Recent relevant commits: `26617ee`, `5770fba`, `8bf9464`.
20 changes: 20 additions & 0 deletions .claude/settings.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
{
"$schema": "https://json.schemastore.org/claude-code-settings.json",
"_comment": "Shared Claude Code settings for PCILeechFWGenerator. User-local overrides go in settings.local.json (gitignored).",
"hooks": {
"PostToolUse": [ {
"hooks": [ {
"command": "path=$(jq -r '.tool_input.file_path // empty'); case \"$path\" in *.py) command -v ruff >/dev/null 2>&1 && ruff check --fix --quiet \"$path\" 2>&1 | head -40 || true ;; esac",
"type": "command"
} ],
"matcher": "Edit|Write|MultiEdit"
} ],
"PreToolUse": [ {
"hooks": [ {
"command": "path=$(jq -r '.tool_input.file_path // empty'); echo \"$path\" | grep -qE '(^|/)src/_version\\.py$|(^|/)lib/voltcyclone-fpga/' && { echo \"Blocked: $path is either autogenerated by setuptools-scm (src/_version.py) or inside the voltcyclone-fpga git submodule (lib/voltcyclone-fpga/). Hand edits will be overwritten or land in the wrong repository. Make the change upstream or via the build system.\" >&2; exit 2; } || exit 0",
"type": "command"
} ],
"matcher": "Edit|Write|MultiEdit"
} ]
}
}
120 changes: 120 additions & 0 deletions .claude/skills/new-board-target/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
---
name: new-board-target
description: Use when the user wants to add support for a new FPGA board / dev kit target to PCILeechFWGenerator (e.g. "add support for board X", "register new board", "I want to target FPGA part Y"). Walks through the exact file touches needed, validation steps, and pitfalls specific to this codebase. Do NOT use for fixing build failures on an already-supported board — that's the vivado-log-analyzer skill.
disable-model-invocation: true
---

# Add a New Board Target

This is the canonical workflow for adding a new FPGA board to PCILeechFWGenerator. Follow it top-to-bottom — skipping steps almost always causes confusing synthesis failures down the line.

## 0. Confirm the upstream board exists

Boards are auto-discovered from the `voltcyclone-fpga` submodule
(`lib/voltcyclone-fpga/`). If the board directory doesn't exist there, you
must add it **upstream first**:

```bash
ls lib/voltcyclone-fpga/
# look for a directory matching the board (e.g. EnigmaX1, PCIeSquirrel, ZDMA)
```

If absent, open a PR against `VoltCyclone/voltcyclone-fpga` to add the board.
Do not hand-edit files under `lib/voltcyclone-fpga/` in this repo — that
directory is a git submodule and edits will not persist. (The PreToolUse hook
in `.claude/settings.json` will block such edits.)

## 1. Register the board in `BOARD_CONFIGS`

File: `src/file_management/board_discovery.py`

Add an entry under `BoardDiscovery.BOARD_CONFIGS` matching the upstream
directory name and FPGA part. Use existing entries (e.g. `pcileech_enigma_x1`,
`pcileech_75t484_x1`) as the template.

Key fields:
- `dir`: must match the directory name under `lib/voltcyclone-fpga/`.
- `fpga_part`: full Xilinx part (e.g. `xc7a75tfgg484-2`). Match the casing and
speed-grade suffix used elsewhere — wrong speed grades pass synthesis but
fail timing in mysterious ways.
- `max_lanes`: PCIe lane count the board exposes (1, 4, 8, 16).

If the board has a colloquial name (e.g. `75t`, `100t`), add a **legacy alias**
entry that points to the same directory — this keeps existing CLI invocations
working.

## 2. Verify discovery picks it up

```bash
.venv/bin/python -c "from pcileechfwgenerator.file_management.board_discovery import discover_all_boards; \
import json; print(json.dumps(list(discover_all_boards().keys()), indent=2))"
```

The new board name should appear. If it doesn't, the discovery walker isn't
finding it — usually because `dir` doesn't match the upstream directory
exactly (case-sensitive).

## 3. Check for board-specific RTL / constraints

```bash
ls src/templates/sv/ | grep -i <board-family> # e.g. 7series, ultrascale
ls src/templating/timing_constraints/ # board-specific .xdc bits
```

If the board uses a new FPGA family (e.g. first UltraScale+ target), you may
need to extend `FPGA_FAMILY_PATTERNS` in `src/device_clone/board_config.py`.
Existing patterns: `7series` (xc7a/k/v/z), `ultrascale`, `ultrascale_plus`.

## 4. Add tests

There are two relevant test files:

- `tests/test_board_discovery.py` — assert the new board is enumerated and has
the expected FPGA part.
- `tests/test_board_config.py` — assert `get_fpga_part('<board>')` returns
the right string and `KeyError` for unknown boards.

Add the new board to the parametrized cases. Run:

```bash
.venv/bin/python -m pytest tests/test_board_discovery.py tests/test_board_config.py -x -q
```

## 5. End-to-end smoke test

Generate firmware for the new board against a known donor profile:

```bash
.venv/bin/python -m pcileechfwgenerator build \
--board <new-board-name> \
--donor configs/donors/<known-good>.json \
--dry-run
```

`--dry-run` produces the generation outputs without invoking Vivado. If this
succeeds, you have at minimum a valid template materialization for the board.

## 6. Update docs

- Add the board to the README "Supported boards" section (search README for
an existing board name to find the table).
- Add a CHANGELOG entry under "Added".
- Do NOT add the board to `AUTHORS` or any other contributor-style file.

## Common pitfalls

| Symptom | Likely cause |
|---|---|
| Board not found at runtime | `dir` in `BOARD_CONFIGS` doesn't match upstream directory name (case-sensitive). |
| Synthesis fails with "part not supported" | `fpga_part` typo or wrong speed grade. |
| Timing fails immediately on impl | New FPGA family without a matching entry in `FPGA_FAMILY_PATTERNS`. |
| Donor ID propagation tests fail | New board needs a default in donor-profile fallback (`configs/fallbacks.yaml`). |
| Build worked locally but fails in CI | Submodule pin in `.gitmodules` is older than your local checkout — bump the submodule SHA. |

## When NOT to use this skill

- Fixing an existing board that suddenly broke → use `vivado-log-analyzer`.
- Just changing FPGA part of an existing board → edit `BOARD_CONFIGS` directly,
no need for the whole flow.
- Renaming a board → that's a breaking change; coordinate with a deprecation
period, don't follow this skill.
66 changes: 66 additions & 0 deletions .claude/skills/vivado-log-analyzer/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
---
name: vivado-log-analyzer
description: Use when a Vivado synthesis or implementation run fails, or when the user asks "why did the build fail / what's wrong with this vivado log". Surfaces only the actionable error/critical-warning lines from large Vivado outputs (vivado.log, synth_1/runme.log, impl_1/runme.log, *.rpt) so the diagnosis fits in context. Trigger when the user shares a Vivado log path, mentions "synthesis failed", "implementation failed", "timing failure", "BRAM exhausted", or pastes raw Vivado output.
---

# Vivado Log Analyzer

## When to invoke

- A Vivado build (synth / impl / bitstream) failed and the user wants the cause.
- The user pastes or points at one of: `vivado.log`, `*.runs/synth_1/runme.log`,
`*.runs/impl_1/runme.log`, `*_timing_summary.rpt`,
`*_utilization_*.rpt`, `*_drc_*.rpt`.
- Auto-trigger when a tool result contains `ERROR: [Synth`, `CRITICAL WARNING: [Place`,
`Timing constraint not met`, or similar Vivado markers.

## What it does

Vivado logs are usually 5–50 MB and 95% noise. This skill runs
`scripts/analyze.py <log-path>` which extracts only:

- `ERROR:` and `CRITICAL WARNING:` lines (Vivado's "this is why it failed").
- Timing summary endpoints with negative slack.
- Resource utilization rows above 95% (BRAM, LUT, DSP, URAM exhaustion).
- DRC violations (UCIO-1, LUTLP-1, NSTD-1, etc.).
- IP licensing failures (`ERROR: [Common 17-69]`).
- Missing constraint file errors.

It groups by failure category and prints **at most ~80 lines** so you can hand
the result to the user or reason about it without burning context.

## How to use

```bash
python3 .claude/skills/vivado-log-analyzer/scripts/analyze.py <path-to-log>
```

`<path-to-log>` can be any of:

- A single `.log` or `.rpt` file
- A Vivado project's `*.runs/` directory (script will walk it)
- A build output directory (`output/build/...`)

If the user hasn't given a path, look first in `output/`, then in
`build/`, then in any `*.runs/synth_1/` or `*.runs/impl_1/` directory.

## After analysis

1. Print the categorized summary the script produced.
2. **Do not** re-read the raw log unless the summary is empty or the user asks
for a specific line context. Re-reading defeats the purpose of the skill.
3. If timing is the failure category, point the user at
`src/templating/timing_constraints/` and any board-specific `.xdc` files.
4. If resource exhaustion is the failure category, the fix is almost always in
`src/templates/sv/` (template bloat) or in board selection
(`src/file_management/board_discovery.py` — wrong FPGA part).
5. If IP licensing or "missing IP" errors appear, check the Containerfile and
confirm the build is using a licensed Vivado image.

## Anti-patterns

- Do not paste the raw Vivado log back to the user. They already have it.
- Do not run this on logs smaller than ~500 lines — just read them directly.
- Do not try to "fix" timing failures by widening constraints without
understanding the constraint origin (`src/templating/timing_constraints/`
generates them; constraints are tied to donor profile).
Loading
Loading