TRT-1989: model changes to prep for partitioning#3532
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
@neisw: This pull request references TRT-1989 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughThis PR adds denormalized partitioning fields ( ChangesProw run and test partitioning fields
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 15 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (15 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/db/models/prow.go`:
- Around line 92-99: ProwJobID was added as a denormalized lookup key but lacks
an index; modify the struct field declaration for ProwJobID (the ProwJobID field
in the prow job run tests model) to add a GORM index tag (e.g., include
`gorm:"index"` or a named index) so queries filtering by ProwJobID use an index
rather than causing full table scans.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: d96f992c-5f51-4c25-a083-71ee5bfd7ce3
📒 Files selected for processing (5)
cmd/sippy/seed_data.gopkg/api/job_runs_test.gopkg/dataloader/prowloader/prow.gopkg/db/models/prow.gopkg/sippyserver/pr_new_tests_worker_test.go
| // used for variants | ||
| // skips joining on ProwJobRunID just to get ProwJobID | ||
| ProwJobID uint | ||
| // used for partitioning | ||
| ProwJobRunTimestamp time.Time `gorm:"expression:DATE(timestamp AT TIME ZONE 'UTC')"` | ||
| // used for partitioning | ||
| ProwJobRunRelease string `gorm:"varchar(10)"` | ||
| TestID uint `gorm:"index;index:idx_prow_job_run_tests_test_id_status"` |
There was a problem hiding this comment.
Index the new denormalized lookup key.
ProwJobID is added to avoid a join, but without an index this can still force full scans on prow_job_run_tests under real data volume.
💡 Suggested fix
- ProwJobID uint
+ ProwJobID uint `gorm:"index"`📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // used for variants | |
| // skips joining on ProwJobRunID just to get ProwJobID | |
| ProwJobID uint | |
| // used for partitioning | |
| ProwJobRunTimestamp time.Time `gorm:"expression:DATE(timestamp AT TIME ZONE 'UTC')"` | |
| // used for partitioning | |
| ProwJobRunRelease string `gorm:"varchar(10)"` | |
| TestID uint `gorm:"index;index:idx_prow_job_run_tests_test_id_status"` | |
| // used for variants | |
| // skips joining on ProwJobRunID just to get ProwJobID | |
| ProwJobID uint `gorm:"index"` | |
| // used for partitioning | |
| ProwJobRunTimestamp time.Time `gorm:"expression:DATE(timestamp AT TIME ZONE 'UTC')"` | |
| // used for partitioning | |
| ProwJobRunRelease string `gorm:"varchar(10)"` | |
| TestID uint `gorm:"index;index:idx_prow_job_run_tests_test_id_status"` |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/db/models/prow.go` around lines 92 - 99, ProwJobID was added as a
denormalized lookup key but lacks an index; modify the struct field declaration
for ProwJobID (the ProwJobID field in the prow job run tests model) to add a
GORM index tag (e.g., include `gorm:"index"` or a named index) so queries
filtering by ProwJobID use an index rather than causing full table scans.
There was a problem hiding this comment.
@coderabbitai, we are only staging the new columns for now and beginning data collection. We will be migrating the tables and backfilling the new columns at a later time. Indexing will be updated then.
There was a problem hiding this comment.
@neisw, understood — thanks for the context! A phased approach (stage columns → collect data → migrate/backfill → add indexes) makes sense here. Would you like me to open a follow-up GitHub issue to track adding the index on ProwJobID (and any other partitioning-related indexes) once the migration and backfill are complete?
✏️ Learnings added
Learnt from: neisw
Repo: openshift/sippy PR: 3532
File: pkg/db/models/prow.go:92-99
Timestamp: 2026-05-15T13:46:44.315Z
Learning: In `pkg/db/models/prow.go`, the `ProwJobID`, `ProwJobRunTimestamp`, and `ProwJobRunRelease` fields added to `ProwJobRunTest`, as well as `ProwJobRelease` on `ProwJobRun` and the timestamp/release fields on `ProwJobRunTestOutput`, are being staged in a phased approach: columns are added first for data collection, with table migration, backfill, and index creation deferred to a later PR. Do not flag missing indexes on these new columns as blocking issues.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
|
Scheduling required tests: |
|
Scheduling required tests: |
|
@coderabbitai, regarding Consider that update along with the one regarding the indexes and update your analysis. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 4816
SQL injection at Index analysis update: I'll update my analysis to reflect that:
✏️ Learnings added
|
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/dataloader/prowloader/prow.go (1)
960-991:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMake
ProwJobRunand join-row inserts atomic.
ProwJobRunis committed before join rows. If a later join insert fails, you keep partial state and retries can fail on duplicate run ID.💡 Suggested fix
- err = pl.dbc.DB.WithContext(ctx).Create(&models.ProwJobRun{ + err = pl.dbc.DB.WithContext(ctx).Transaction(func(tx *gorm.DB) error { + if err := tx.Create(&models.ProwJobRun{ Model: gorm.Model{ ID: uint(id), }, Cluster: pj.Spec.Cluster, Duration: duration, ProwJob: *dbProwJob, ProwJobID: dbProwJob.ID, ProwJobRelease: dbProwJob.Release, URL: pj.Status.URL, GCSBucket: pj.Spec.DecorationConfig.GCSConfiguration.Bucket, Timestamp: pj.Status.StartTime, OverallResult: overallResult, TestFailures: failures, Succeeded: overallResult == sippyprocessingv1.JobSucceeded, Labels: labels, Annotations: annotations, - }).Error - if err != nil { - return err - } - - for _, pull := range pulls { - if err := pl.dbc.DB.WithContext(ctx).Create(&models.ProwJobRunProwPullRequest{ + }).Error; err != nil { + return fmt.Errorf("create prow job run %d: %w", id, err) + } + + for _, pull := range pulls { + if err := tx.Create(&models.ProwJobRunProwPullRequest{ ProwJobRunID: uint(id), ProwPullRequestID: pull.ID, ProwJobRunRelease: dbProwJob.Release, ProwJobRunTimestamp: pj.Status.StartTime, - }).Error; err != nil { - return err + }).Error; err != nil { + return fmt.Errorf("create prow_job_run_prow_pull_request run=%d pr=%d: %w", id, pull.ID, err) + } } - } + return nil + }) + if err != nil { + return err + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/dataloader/prowloader/prow.go` around lines 960 - 991, Wrap the ProwJobRun and subsequent ProwJobRunProwPullRequest inserts in a single DB transaction so they commit or roll back together: use pl.dbc.DB.WithContext(ctx).Transaction(...) and inside use the transaction handle (tx) instead of pl.dbc.DB to call Create for models.ProwJobRun and for each models.ProwJobRunProwPullRequest; return any error from the transaction callback so a failed join insert rolls back the ProwJobRun and avoids partial/duplicate state. Ensure you still set the same fields (ID via uint(id), ProwJob, ProwJobID, etc.) when creating ProwJobRun and reference uint(id) for ProwJobRunID in the join inserts.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/plans/trt-1989-partitioning-prep.md`:
- Line 106: The WHERE clause's operator precedence causes the OR to apply
globally; update the condition in the SQL that uses r.prow_job_id = j.id and
r.prow_job_release so the NULL/empty check is grouped: change the predicate to
require r.prow_job_id = j.id AND (r.prow_job_release IS NULL OR
r.prow_job_release = ''), or equivalently use r.prow_job_release IN (NULL,
'')—apply this change where the WHERE references r.prow_job_id, j.id and
r.prow_job_release.
In `@pkg/db/db.go`:
- Around line 75-79: The call to d.DB.SetupJoinTable(&models.ProwJobRun{},
"PullRequests", &models.ProwJobRunProwPullRequest{}) returns a raw error that is
returned directly; wrap that error with contextual information using fmt.Errorf
and the %w verb before returning so callers get useful migration context (e.g.,
include the operation and involved models/association names when wrapping the
error).
---
Outside diff comments:
In `@pkg/dataloader/prowloader/prow.go`:
- Around line 960-991: Wrap the ProwJobRun and subsequent
ProwJobRunProwPullRequest inserts in a single DB transaction so they commit or
roll back together: use pl.dbc.DB.WithContext(ctx).Transaction(...) and inside
use the transaction handle (tx) instead of pl.dbc.DB to call Create for
models.ProwJobRun and for each models.ProwJobRunProwPullRequest; return any
error from the transaction callback so a failed join insert rolls back the
ProwJobRun and avoids partial/duplicate state. Ensure you still set the same
fields (ID via uint(id), ProwJob, ProwJobID, etc.) when creating ProwJobRun and
reference uint(id) for ProwJobRunID in the join inserts.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 00953495-29c0-48a6-90e0-1ea6c64693f0
📒 Files selected for processing (4)
docs/plans/trt-1989-partitioning-prep.mdpkg/dataloader/prowloader/prow.gopkg/db/db.gopkg/db/models/prow.go
|
Scheduling required tests: |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/plans/trt-1989-partitioning-prep.md`:
- Around line 86-89: Update the wording in the migration constraints so the FK
direction is correct: change the phrase "Tables with FKs from `prow_job_runs` to
non-partitioned tables" to express that these are tables with FKs to
`prow_job_runs` (e.g., annotations, pull request join table) — ensure the
sentence reads something like "Tables with FKs to `prow_job_runs` from
non-partitioned tables must either be co-partitioned or have their FKs dropped,"
and verify the examples (`annotations`, `pull request join table`) read as
tables referencing `prow_job_runs`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: d54c95ca-5526-45c4-9646-8380aa4b1d3e
📒 Files selected for processing (3)
docs/plans/trt-1989-partitioning-prep.mdpkg/dataloader/prowloader/prow.gopkg/db/db.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/db/db.go
- pkg/dataloader/prowloader/prow.go
| 3. Tables with FKs **from** `prow_job_runs` to non-partitioned tables | ||
| (annotations, pull request join table) must either be co-partitioned | ||
| or have their FKs dropped. | ||
|
|
There was a problem hiding this comment.
Fix FK direction wording in migration constraints.
This section says FKs are from prow_job_runs, but the listed examples are tables with FKs to prow_job_runs. Please correct the direction to avoid migration confusion.
Suggested doc fix
-3. Tables with FKs **from** `prow_job_runs` to non-partitioned tables
- (annotations, pull request join table) must either be co-partitioned
- or have their FKs dropped.
+3. Tables with FKs **to** `prow_job_runs`
+ (annotations, pull request join table) must either be co-partitioned
+ or have those FK constraints adjusted/dropped during partition migration.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| 3. Tables with FKs **from** `prow_job_runs` to non-partitioned tables | |
| (annotations, pull request join table) must either be co-partitioned | |
| or have their FKs dropped. | |
| 3. Tables with FKs **to** `prow_job_runs` | |
| (annotations, pull request join table) must either be co-partitioned | |
| or have those FK constraints adjusted/dropped during partition migration. | |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/plans/trt-1989-partitioning-prep.md` around lines 86 - 89, Update the
wording in the migration constraints so the FK direction is correct: change the
phrase "Tables with FKs from `prow_job_runs` to non-partitioned tables" to
express that these are tables with FKs to `prow_job_runs` (e.g., annotations,
pull request join table) — ensure the sentence reads something like "Tables with
FKs to `prow_job_runs` from non-partitioned tables must either be co-partitioned
or have their FKs dropped," and verify the examples (`annotations`, `pull
request join table`) read as tables referencing `prow_job_runs`.
There was a problem hiding this comment.
Claude isn't in agreement with you @coderabbitai: These GORM tags create FK constraints on the child/join tables that reference prow_job_runs.id, but the cascade behavior is defined from the ProwJobRun side. The constraint says "when a prow_job_run is deleted, cascade to these tables." That's a FK from prow_job_runs's perspective in the GORM model.
There was a problem hiding this comment.
@neisw, that's a fair point from the GORM model perspective — the cascade behavior is declared on the ProwJobRun parent struct, so "from prow_job_runs" makes sense when reading the Go code.
For what it's worth, standard SQL/PostgreSQL terminology frames it the other way (the referencing table holds the FK column pointing to the parent), which is why the wording caught my eye — especially since this is a PostgreSQL partitioning guide where that distinction matters during migration. But since your team is familiar with the GORM model and the intent is clear in context, I'll defer to your judgment on the phrasing.
|
Scheduling required tests: |
|
/hold |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: neisw, smg247 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/unhold |
|
@neisw: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Model changes include
Summary by CodeRabbit
Chores
Tests
Documentation