Decide on must-gather.sh inclusion in Dockerfile after distroless migration #2436

@rajathagasthya

Part of NVIDIA/cloud-native-team#299. Deferred from PR #2434.

hack/must-gather.sh is a 286-line bash script copied into the image
at docker/Dockerfile:103 as /usr/bin/gather, the entrypoint for the
OpenShift oc adm must-gather --image=… plugin. It shells out to
kubectl/oc, neither of which ships in the distroless base today.
The script is therefore already largely unusable from inside the pod,
and once the image base drops -dev (and with it the shell), it cannot
run at all.

Decide and implement one of:

  1. Rewrite the script in Go as a subcommand of nvidia-validator
    or as a new dedicated binary, using a vendored Go Kubernetes
    client. This keeps must-gather inside the gpu-operator
    distroless image.
  2. Split must-gather into its own image — a small RHEL/UBI-based
    image that vendors oc and the bash script. Document a separate
    image tag for the must-gather plugin.
  3. Remove /usr/bin/gather from the gpu-operator image entirely
    and document that customers should run must-gather.sh directly
    from outside the cluster. Customers report this is what they
    already do in practice.
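
To make option 1 easier to weigh, here is a minimal sketch of what the core of a Go-based gather could look like. Everything named here (Lister, Sink, Gather, dirSink, fakeLister, and the resource kinds) is hypothetical and not taken from must-gather.sh or nvidia-validator. The Lister interface stands in for the vendored Kubernetes client so that the sketch runs with the standard library alone; a real implementation would back it with actual API calls.

```go
package main

import (
	"fmt"
	"os"
	"path"
	"path/filepath"
	"sort"
)

// Lister is a hypothetical seam standing in for a vendored Kubernetes client.
// List returns object name -> serialized manifest for one resource kind.
type Lister interface {
	List(kind string) (map[string]string, error)
}

// Sink receives one artifact at a slash-separated relative path.
type Sink func(relPath string, data []byte) error

// Gather collects every requested kind and hands each object to sink.
// A failure on one kind does not stop the others; the kinds that
// failed are returned so the caller can report partial results.
func Gather(l Lister, kinds []string, sink Sink) (failed []string) {
	for _, kind := range kinds {
		objs, err := l.List(kind)
		if err != nil {
			failed = append(failed, kind)
			continue
		}
		// Sort names so the output order is deterministic.
		names := make([]string, 0, len(objs))
		for name := range objs {
			names = append(names, name)
		}
		sort.Strings(names)
		for _, name := range names {
			if err := sink(path.Join(kind, name+".yaml"), []byte(objs[name])); err != nil {
				failed = append(failed, kind)
				break
			}
		}
	}
	return failed
}

// dirSink writes artifacts under root, creating directories as needed.
func dirSink(root string) Sink {
	return func(relPath string, data []byte) error {
		full := filepath.Join(root, filepath.FromSlash(relPath))
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return err
		}
		return os.WriteFile(full, data, 0o644)
	}
}

// fakeLister is an in-memory Lister used only to exercise the sketch.
type fakeLister map[string]map[string]string

func (f fakeLister) List(kind string) (map[string]string, error) {
	objs, ok := f[kind]
	if !ok {
		return nil, fmt.Errorf("kind %q not available", kind)
	}
	return objs, nil
}

func main() {
	out, err := os.MkdirTemp("", "must-gather-")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Illustrative kinds only; "nodes" is deliberately missing from the fake.
	l := fakeLister{"clusterpolicies": {"cluster-policy": "kind: ClusterPolicy\n"}}
	failed := Gather(l, []string{"clusterpolicies", "nodes"}, dirSink(out))
	fmt.Printf("artifacts written under %s, failed kinds: %v\n", out, failed)
}
```

The point of the sketch is that no shell and no kubectl/oc binary are involved, which is what a distroless base without -dev requires. Whether this logic lives in nvidia-validator or a dedicated binary is still the open question in option 1.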

Acceptance:

  • Decision is made and recorded
  • Customer-facing must-gather workflow is documented

Metadata

Labels: enhancement
