Fix deploy-edpm.yml crash on unpreprovisioned baremetal computes #3930

Open
rebtoor wants to merge 1 commit into openstack-k8s-operators:main from rebtoor:deploy-edpm-fix

Conversation

rebtoor (Contributor) commented May 13, 2026

The NFS and Ceph plays in deploy-edpm.yml target the computes group with gather_facts enabled (the default). In architecture deployments with preProvisioned=false, compute nodes are bare libvirt domains with no OS until Ironic provisions them during the kustomize_deploy stages. Ansible's implicit fact gathering tries to SSH into these unreachable hosts, aborting the entire playbook before the architecture deployment can run.

Add gather_facts: false to both the NFS and Ceph plays. For the NFS play, insert an end_play guard that triggers when cifmw_architecture_scenario is defined, and move fact gathering after that guard. Architecture deploys handle NFS via a pre_stage hook instead (see the companion nfs-on-computes.yml hook playbook). The Ceph play already had an end_play guard, but Ansible was crashing on fact gathering before reaching it.
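For reviewers, the NFS play change described above can be sketched roughly as follows. This is a minimal illustration and not the actual diff: the play and task names are hypothetical, and only `gather_facts: false`, the `end_play` guard on `cifmw_architecture_scenario`, and the deferred fact gathering come from the description.

```yaml
# Illustrative sketch only; play and task names are assumptions.
- name: Configure NFS on computes
  hosts: computes
  gather_facts: false  # do not SSH into hosts before the guard runs
  tasks:
    - name: End play early for architecture deployments
      when: cifmw_architecture_scenario is defined
      ansible.builtin.meta: end_play

    # Reached only when the guard did not fire, i.e. the computes
    # are expected to be provisioned and reachable.
    - name: Gather facts explicitly
      ansible.builtin.setup:
```

Because the guard is an ordinary task, it can only run once implicit fact gathering has been disabled at the play level; otherwise Ansible contacts the hosts before any task is evaluated.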

This fixes all architecture-uni03gamma-deploy-bm-* jobs, which have been failing consistently since the deploy-bm variant was introduced.

Related-Issue: ANVIL-109

openshift-ci Bot (Contributor) commented May 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign evallesp for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

The NFS and Ceph plays in deploy-edpm.yml target the computes
group with gather_facts enabled (the default). In architecture
deployments with preProvisioned=false, compute nodes are bare
libvirt domains with no OS until Ironic provisions them during
the kustomize_deploy stages. Ansible's implicit fact gathering
tries to SSH into these unreachable hosts, aborting the entire
playbook before the architecture deployment can run.

Add gather_facts: false to both the NFS and Ceph plays. For the
NFS play, insert an end_play guard when cifmw_architecture_scenario
is defined and move fact gathering after that guard. Architecture
deploys handle NFS via a pre_stage hook instead (see the companion
nfs-on-computes.yml hook playbook). The Ceph play already had an
end_play guard, but Ansible was crashing on fact gathering before
reaching it.

Similarly, the "Fetch network facts" task in deploy_architecture.yml
delegates setup to every host in the inventory, including
unprovisioned computes. Skip computes when
cifmw_edpm_deploy_pre_provisioned is false.
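
One way the skip described above could be expressed is sketched below. It is an assumption-laden illustration and not the actual change: the loop over `groups['all']`, the `gather_subset` value, and the `default(true)` fallback are hypothetical; only the task name, the use of delegated `setup`, and the `cifmw_edpm_deploy_pre_provisioned` condition come from the commit message.

```yaml
# Illustrative sketch only; loop source and defaults are assumptions.
- name: Fetch network facts
  ansible.builtin.setup:
    gather_subset: network
  delegate_to: "{{ item }}"
  delegate_facts: true
  loop: "{{ groups['all'] }}"
  # Skip members of the computes group unless they are pre-provisioned.
  when: >-
    (cifmw_edpm_deploy_pre_provisioned | default(true) | bool)
    or item not in (groups['computes'] | default([]))
```

With a loop, `when` is evaluated once per item, so only the unprovisioned computes are skipped and every other host still has its facts collected.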

This fixes all architecture-uni03gamma-deploy-bm-* jobs, which have
been failing consistently since the deploy-bm variant was introduced.

Related-Issue: ANVIL-109
Co-authored-by: Cursor <cursoragent@cursor.com>

Signed-off-by: Roberto Alfieri <ralfieri@redhat.com>