Support CPMS test role in shiftstack-qa automation #9
imatza-rh
left a comment
Good port from the IR-plugin. Nice improvements: OpenStack resource cleanup in always, must-gather on test failure. Nit: PR description says "branch fallback" but prepare_openshift_tests.yml has no fallback - same as all other roles, works fine.
Please run ./gate.sh (ansible-lint + pre-commit inside the shiftstack-client container) - no CI checks are configured on GitHub PRs. yamllint passes locally, but ansible-lint needs the container.
- post
- verification
- day2ops
- cpms_test
Is cpms_test needed here? This is a Jenkins job definition — the Zuul integration job uses osp_verification.yaml. The e2e-periodic rolls all 3 masters (5h ginkgo timeout) and has been a systemic timeout in Jenkins since 4.18.
You're right, this is a Jenkins job definition and the e2e-periodic 5h timeout would be problematic here. Removed cpms_test from stages, cpms_replace_attrs from day2ops_procedures, and the cpms_replacements vars. The CPMS tests will only run via the Zuul integration job using osp_verification.yaml.
- cpms_replacements.sg_name in item.security_groups | json_query('[*].name')
with_items: "{{ master_after }}"

rescue:
The day2ops wrapper run_procedure.yml already runs must-gather + records failure on rescue. This inner rescue duplicates that. See how moving-etcd-to-ephemeral.yml handles it — no inner rescue, relies on the wrapper. Consider removing this rescue block (keep the always — the restore logic is correct and needed).
Agreed. run_procedure.yml already handles must-gather and failure recording in its rescue block. Removed the inner rescue to follow the same pattern as moving-etcd-to-ephemeral.yml. The always block with the CPMS restore logic is kept since that's procedure-specific cleanup that the wrapper can't handle.
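For reference, the resulting shape could look roughly like the sketch below. This is an illustration under assumptions, not the code in this PR: the task names and the variables `cpms_patch` and `cpms_original_patch` are hypothetical, and the `oc patch` calls only stand in for the real patch and restore steps. The point is the structure: a `block` with `always` and no inner `rescue`, so a failure still propagates to the `run_procedure.yml` wrapper after the restore runs.

```yaml
# Hypothetical sketch of a day2ops procedure file; names and variables are
# illustrative and do not come from the PR.
- name: Replace CPMS attributes and verify reconciliation
  block:
    - name: Patch the control plane machine set (placeholder for the real steps)
      ansible.builtin.command: >-
        oc patch controlplanemachineset cluster
        -n openshift-machine-api
        --type merge
        -p {{ cpms_patch | to_json | quote }}
      changed_when: true

  # No inner rescue here: if a task in the block fails, Ansible runs the
  # always section and then re-raises the failure, so the wrapper's rescue
  # (must-gather + failure recording) still fires.
  always:
    - name: Restore the original CPMS spec (procedure-specific cleanup)
      ansible.builtin.command: >-
        oc patch controlplanemachineset cluster
        -n openshift-machine-api
        --type merge
        -p {{ cpms_original_patch | to_json | quote }}
      changed_when: true
```

Keeping the restore in `always` means it runs on both success and failure, which the wrapper cannot do on the procedure's behalf because it has no knowledge of the original CPMS spec.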
…inner rescue Co-authored-by: Cursor <cursoragent@cursor.com>
…n on test failure
f88cbb5 to cce04e6
Summary
- cpms_test stage role that runs the upstream cluster-control-plane-machine-set-operator e2e tests (presubmit + periodic) with branch fallback logic for newer OCP versions
- cpms_replace_attrs day2ops procedure ported from the openshift-ir-plugin, which validates CPMS reconciliation by patching failure domains, networks, and security groups on control plane nodes
- … ocp_testing.yaml and enable it in osp_verification and 4.17_ovnkubernetes_ipi job definitions
Details
The CPMS operator manages the lifecycle of OCP control plane machines. This PR ports the existing Jenkins/IR-based CPMS testing into shiftstack-qa's Ansible automation framework.
New cpms_test stage role:
- … main for versions without a release branch)
- make e2e-presubmit and make e2e-periodic with OPENSHIFT_CI=true for JUnit output
New cpms_replace_attrs day2ops procedure: