
ci: add amd-ci-job-monitor for runner fleet reporting #4897

Open: amdfaa wants to merge 1 commit into develop from feat/amd-ci-job-monitor



amdfaa commented May 19, 2026

Adds amd-ci-job-monitor.yml and supporting scripts (ported from ROCm/aiter) to publish runner-fleet-report artifacts for the AI Frameworks Dashboard import pipeline.

MIGraphX-specific: exclude_jobs for image prep jobs; runner-config.yml maps ROCM-Ubuntu.

Made with Cursor

Port monitor workflow and scripts from ROCm/aiter with MIGraphX-specific
exclude_jobs and runner-config for ROCM-Ubuntu.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copilot AI review requested due to automatic review settings May 19, 2026 22:47
amdfaa requested a review from causten as a code owner May 19, 2026 22:47

Copilot AI left a comment


Pull request overview

Adds an automated GitHub Actions “job monitor” workflow and supporting Python scripts to query recent Actions job history and publish a runner-fleet-report artifact (plus an intermediate snapshot) for downstream dashboard ingestion.

Changes:

  • Introduces .github/workflows/amd-ci-job-monitor.yml to (a) discover workflow jobs, (b) fetch a repo-wide Actions snapshot, and (c) generate per-job summaries and a consolidated runner fleet report artifact.
  • Adds .github/scripts/list_jobs.py (workflow/job discovery) and .github/scripts/query_job_status.py (GitHub API querying + report formatting).
  • Adds .github/runner-config.yml to map runner labels to hardware metadata for the runner fleet report.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| .github/workflows/amd-ci-job-monitor.yml | New scheduled/dispatchable workflow to generate Actions snapshot + runner fleet report artifacts. |
| .github/scripts/query_job_status.py | New report generator querying the Actions API (or a snapshot) and emitting markdown summaries. |
| .github/scripts/list_jobs.py | New workflow parser producing a workflow/job matrix and workflow→job map used by the workflow. |
| .github/runner-config.yml | New runner label→GPU metadata mapping consumed by runner report generation. |

Comment on lines +48 to +53
    - name: Checkout code
      uses: actions/checkout@v4

    - name: Set up Python
      uses: actions/setup-python@v5
      with:
Comment on lines +29 to +34
      description: "Comma-separated job names to exclude"
      required: false
      default: "cancel,check_image,build_image,build_SLES_image"
      type: string
    job_filter:
      description: "Single job name filter (optional)"
Comment on lines +27 to +32
    )
    parser.add_argument(
        "--exclude-jobs",
        default="",
        help="Comma-separated job names to skip.",
    )
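
As a rough illustration of how a comma-separated exclusion list like the one above might be applied, here is a minimal sketch. The helpers `parse_exclude_jobs` and `filter_jobs` are hypothetical names and are not taken from the PR's scripts. Exact, case-sensitive matching is an assumption, and the actual `list_jobs.py` may match job names differently.

```python
def parse_exclude_jobs(raw: str) -> set[str]:
    """Split a comma-separated string into a set of job names, ignoring blanks."""
    return {name.strip() for name in raw.split(",") if name.strip()}


def filter_jobs(job_names: list[str], raw_exclude: str) -> list[str]:
    """Return job_names without the excluded names, preserving input order."""
    excluded = parse_exclude_jobs(raw_exclude)
    # Assumption: names are compared exactly (case-sensitive, no globbing).
    return [name for name in job_names if name not in excluded]


if __name__ == "__main__":
    # "cancel" and "build_image" appear in the workflow default quoted above;
    # "unit_tests" is a placeholder, not a job name taken from this PR.
    jobs = ["cancel", "build_image", "unit_tests"]
    print(filter_jobs(jobs, "cancel,check_image,build_image,build_SLES_image"))
```

With an empty string (the script-side default shown above), `parse_exclude_jobs` returns an empty set, so nothing is filtered in this sketch.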

        rate_limit_reset_epoch = exc.reset_epoch
        print("[warn] GitHub API rate limit exceeded during report generation.")
    except requests.HTTPError as exc:
        print(f"[warn] Failed to query workflow runs: {exc}")
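
GitHub reports the rate-limit reset time as UTC epoch seconds (the `x-ratelimit-reset` response header). Assuming `exc.reset_epoch` in the hunk above carries that value, a warning could also say when the window reopens. The function `describe_rate_limit_reset` below is a hypothetical helper for illustration, not code from this PR.

```python
from datetime import datetime, timezone


def describe_rate_limit_reset(reset_epoch: int, now_epoch: float) -> str:
    """Format a warning stating when the rate-limit window resets.

    Both arguments are UTC epoch seconds. The wait is clamped at zero so a
    reset time already in the past does not yield a negative duration.
    """
    wait_seconds = max(0, int(reset_epoch - now_epoch))
    reset_at = datetime.fromtimestamp(reset_epoch, tz=timezone.utc)
    return (
        f"[warn] rate limit resets at {reset_at:%Y-%m-%d %H:%M:%S} UTC "
        f"(in about {wait_seconds}s)"
    )


if __name__ == "__main__":
    import time

    # Hypothetical reset ten minutes from now.
    now = time.time()
    print(describe_rate_limit_reset(int(now) + 600, now))
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the formatting logic easy to test.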
Comment on lines +523 to +545
    def build_queue_distribution(queue_times: list[float]):
        if not queue_times:
            return []
        ranges = [
            ("< 1 min", 0, 60),
            ("1-5 min", 60, 300),
            ("5-15 min", 300, 900),
            ("15-30 min", 900, 1800),
            ("30-60 min", 1800, 3600),
            ("> 60 min", 3600, float("inf")),
        ]
        total = len(queue_times)
        buckets = []
        for label, lower, upper in ranges:
            count = sum(1 for value in queue_times if lower <= value < upper)
            percentage = round(count / total * 100, 1) if total else 0.0
            buckets.append([label, count, f"{percentage}%"])
        return buckets


    def build_runner_report_rows(job_rows: list[dict[str, Any]], report_time: datetime):
        stats = defaultdict(
            lambda: {
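
To make the bucketing in `build_queue_distribution` above easier to check, here is a standalone copy of that function with a small driver. The sample queue times are made up for illustration. Treating the inputs as seconds is an assumption inferred from the bucket bounds (60 corresponds to the "1 min" label).

```python
def build_queue_distribution(queue_times: list[float]):
    """Bucket queue times (assumed to be in seconds) into labelled ranges."""
    if not queue_times:
        return []
    ranges = [
        ("< 1 min", 0, 60),
        ("1-5 min", 60, 300),
        ("5-15 min", 300, 900),
        ("15-30 min", 900, 1800),
        ("30-60 min", 1800, 3600),
        ("> 60 min", 3600, float("inf")),
    ]
    total = len(queue_times)
    buckets = []
    for label, lower, upper in ranges:
        # Half-open interval: a value equal to `lower` counts in this bucket,
        # and a value equal to `upper` counts in the next one.
        count = sum(1 for value in queue_times if lower <= value < upper)
        percentage = round(count / total * 100, 1) if total else 0.0
        buckets.append([label, count, f"{percentage}%"])
    return buckets


if __name__ == "__main__":
    # Illustrative values only, not data from any real runner fleet.
    sample = [30, 45, 120, 600, 4000]
    for label, count, share in build_queue_distribution(sample):
        print(f"{label:>10}  {count}  {share}")
```

Two properties follow from the half-open ranges: a wait of exactly 60 seconds lands in "1-5 min", and a negative value (for example, from clock skew between timestamps) matches no bucket while still counting toward `total`, so the percentages would then sum to less than 100%.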
2 participants