Bump PyTorch pin to nightly dev20260517 #19630

Open
pytorchupdatebot wants to merge 1 commit into main from automated/pytorch-pin-bump-dev20260517

@pytorchupdatebot
Collaborator

Summary

Automated weekly PyTorch pin bump.

  • Updates NIGHTLY_VERSION in torch_pin.py to dev20260517
  • Updates .ci/docker/ci_commit_pins/pytorch.txt to the corresponding nightly commit hash
  • Syncs c10 headers from PyTorch into runtime/core/portable_type/c10/
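The first step in the list above can be sketched as a small text rewrite. This is only an illustration: the layout of torch_pin.py is not shown in this PR, so the sketch assumes the file contains a single line of the form `NIGHTLY_VERSION = "devYYYYMMDD"`, and the helper name `bump_nightly_pin` is hypothetical.

```python
import re

# Assumption: torch_pin.py holds exactly one line such as
#   NIGHTLY_VERSION = "dev20260510"
# The real file layout is not shown in this PR.
_PIN_RE = re.compile(r'^(NIGHTLY_VERSION\s*=\s*)(["\'])dev\d{8}\2', re.MULTILINE)


def bump_nightly_pin(source: str, new_version: str) -> str:
    """Return `source` with the NIGHTLY_VERSION value replaced by `new_version`."""
    if not re.fullmatch(r"dev\d{8}", new_version):
        raise ValueError(f"unexpected nightly version format: {new_version!r}")
    # Keep the original assignment prefix and quote character; swap only the value.
    updated, count = _PIN_RE.subn(
        lambda m: f"{m.group(1)}{m.group(2)}{new_version}{m.group(2)}", source
    )
    if count != 1:
        raise ValueError(f"expected exactly one NIGHTLY_VERSION line, found {count}")
    return updated
```

Failing loudly when the line is missing or duplicated keeps a malformed pin file from being rewritten silently.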

This PR was created automatically. If CI fails, Claude will attempt to fix issues (up to 3 attempts). If CI still fails, human review will be requested.

cc @jakeszwe

@pytorch-bot

pytorch-bot Bot commented May 18, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19630

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 64 New Failures, 167 Cancelled Jobs, 11 Unclassified Failures

As of commit 483afb7 with merge base c531386 (image):

NEW FAILURES - The following jobs have failed:

UNCLASSIFIED FAILURES - DrCI could not classify the following jobs because the workflow did not run on the merge base. The failures may be pre-existing on trunk or introduced by this PR:

CANCELLED JOBS - The following jobs were cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
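For scripted triage, the headline counts in the Dr. CI comment can be pulled out with a regular expression. This is a minimal sketch that assumes the summary line keeps the "count label, count label" shape seen above; `parse_drci_summary` is a hypothetical helper and not part of Dr. CI.

```python
import re


def parse_drci_summary(line: str) -> dict[str, int]:
    """Map each category in a Dr. CI summary line to its count."""
    # Each entry is a number followed by one or more space-separated words.
    pairs = re.findall(r"(\d+)\s+([A-Za-z]+(?: [A-Za-z]+)*)", line)
    return {label: int(count) for count, label in pairs}


summary = parse_drci_summary(
    "❌ 64 New Failures, 167 Cancelled Jobs, 11 Unclassified Failures"
)
print(summary)
```

Here `summary["New Failures"]` is 64, which is the number the fix attempt has to account for.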

@meta-cla meta-cla Bot added the CLA Signed label May 18, 2026
@pytorchupdatebot
Collaborator Author

@claude [ci-fix-attempt 1/3]

The trunk CI workflow has failed on this automated PyTorch pin bump PR.
Failed run: https://github.com/pytorch/executorch/actions/runs/26027997076

Please:

  1. Read the Dr. CI comment on this PR for a summary of which jobs failed and whether they are flaky. Ignore failures marked as FLAKY.
  2. Use your CI tools to download the failure logs for the non-flaky failing jobs
  3. Identify the root cause of the failure
  4. If this is a build or test failure caused by PyTorch API changes, fix the ExecuTorch code to be compatible with the new PyTorch version
  5. If this is a c10 header sync issue, the headers have already been synced by the pin bump script — the issue is likely in ExecuTorch code that uses those headers
  6. Run lintrunner -a on any files you change
  7. Push your fix as a new commit to this PR branch

Important constraints:

  • Do NOT modify torch_pin.py or .ci/docker/ci_commit_pins/pytorch.txt — the pin itself is correct
  • Do NOT modify files under runtime/core/portable_type/c10/ unless the sync introduced a new API that ExecuTorch code needs to adapt to
  • Focus on fixing ExecuTorch code to be compatible with the new PyTorch APIs
  • If this is a major BC-breaking change that requires architectural discussion, say so clearly and stop — do not attempt a fix
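The retry budget described in this PR (up to 3 fix attempts, then human review) can be read off the `[ci-fix-attempt N/M]` marker. The sketch below is an assumption about how such a check might look; the helper names `parse_attempt` and `needs_human_review` are hypothetical and the bot's actual logic is not shown here.

```python
import re
from typing import Optional, Tuple

_ATTEMPT_RE = re.compile(r"\[ci-fix-attempt (\d+)/(\d+)\]")


def parse_attempt(comment: str) -> Optional[Tuple[int, int]]:
    """Return (attempt, limit) from a '[ci-fix-attempt N/M]' marker, or None."""
    match = _ATTEMPT_RE.search(comment)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def needs_human_review(comment: str, ci_still_failing: bool) -> bool:
    """True when CI still fails and the last allowed attempt has been used."""
    parsed = parse_attempt(comment)
    if parsed is None:
        return False  # No marker: this comment is not part of the retry loop.
    attempt, limit = parsed
    return ci_still_failing and attempt >= limit
```

With the comment above, `parse_attempt("@claude [ci-fix-attempt 1/3]")` gives `(1, 3)`, so two more automated attempts would remain before escalation.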

@claude

claude Bot commented May 18, 2026

Claude finished @pytorchupdatebot's task in 8m 1s


CI Fix Attempt (1/3)

Analyzed CI failures from the PyTorch pin bump to dev20260517.

  • Analyze Dr. CI failure summary and categorize failures
  • Download and inspect failure logs for root cause
  • Identify whether failures are in ExecuTorch code or third-party dependencies
  • Determine whether any ExecuTorch code fix is possible: none is possible from the ExecuTorch side

Root Cause Analysis

After downloading and analyzing failure logs, I've identified two distinct upstream issues — neither is caused by ExecuTorch code:

1. fbgemm inline assembly compilation errors (~45 failures)

Most failures share one compiler error, reported at third_party/fbgemm/src/fp32/FbgemmFP32UKernelsAvx512_256.cc:47:8 and in similar files: error: invalid operand for instruction. The fbgemm source uses Intel-syntax inline assembly (-masm=intel), and the assembler in the CI Docker images (clang-12) rejects instructions such as vcomiss xmm31, xmm0 as well as the DWORD PTR syntax.

This is a bug inside PyTorch's third_party/fbgemm — the code is compiled during Docker image builds and ExecuTorch test jobs where PyTorch is built from source. ExecuTorch has no control over this code.

2. torchvision incompatibility (~15 failures)

The macOS unittest failures, CoreML tests, arm-backend-zephyr tests, and MPS tests all fail with:

RuntimeError: operator torchvision::nms does not exist

This occurs because the new torch nightly dev20260517 is incompatible with the pre-built torchvision wheel installed in CI. When torchvision initializes, it tries to register a fake kernel for torchvision::nms, but the operator was never registered by the torchvision C extension (which was built against a different torch version).
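The two failure groups above can be separated mechanically by matching the quoted error strings against each job log. This is a rough sketch: the signatures are copied from the messages above, while the function names and the job names in the usage example are made up for illustration.

```python
from collections import Counter
from typing import Dict, Tuple

# Substrings that must all appear in a log for it to count toward a cause.
# The strings come from the two error messages quoted in this comment.
SIGNATURES: Dict[str, Tuple[str, ...]] = {
    "fbgemm-inline-asm": ("third_party/fbgemm", "error: invalid operand for instruction"),
    "torchvision-nms": ("operator torchvision::nms does not exist",),
}


def classify(log_text: str) -> str:
    """Return the first cause whose signature substrings all occur in the log."""
    for cause, needles in SIGNATURES.items():
        if all(needle in log_text for needle in needles):
            return cause
    return "unknown"


def tally(logs: Dict[str, str]) -> Counter:
    """Count failing jobs per cause, given a mapping of job name to log text."""
    return Counter(classify(text) for text in logs.values())
```

For example, with hypothetical job names:

```python
logs = {
    "linux-build": "third_party/fbgemm/src/fp32/FbgemmFP32UKernelsAvx512_256.cc:47:8: "
                   "error: invalid operand for instruction",
    "macos-unittest": "RuntimeError: operator torchvision::nms does not exist",
    "lint": "some unrelated failure",
}
print(tally(logs))
```

Anything that lands in "unknown" still needs a manual look, so the tally is a starting point rather than a verdict.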

Conclusion

Neither issue can be fixed by modifying ExecuTorch code. Both are upstream issues, one in PyTorch's bundled fbgemm and one in the torchvision nightly build:

  1. The fbgemm assembly error is a regression in PyTorch's bundled fbgemm, where new AVX-512 kernel code is incompatible with clang-12's assembler. The new kernels use extended registers (xmm16-xmm31) and EVEX-encoded instructions, which need AVX-512 target flags (most likely -mavx512f and -mavx512vl; -mavx512fp16 covers only the FP16 instructions) that do not appear to be passed.
  2. The torchvision incompatibility is a nightly version skew: either a torchvision nightly built against torch dev20260517 has not been published yet, or CI is fetching an older cached torchvision wheel.
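The version skew in the second item can be spotted before torchvision is imported by comparing the nightly date suffixes of the two installed versions. The sketch below is a coarse heuristic under stated assumptions: both version strings are assumed to carry a `.devYYYYMMDD` suffix, the example version numbers are invented, and matching dates do not by themselves guarantee ABI compatibility.

```python
import re
from typing import Optional


def nightly_date(version: str) -> Optional[str]:
    """Extract YYYYMMDD from a version like '2.12.0.dev20260517+cpu', else None."""
    match = re.search(r"\.dev(\d{8})", version)
    return match.group(1) if match else None


def has_nightly_skew(torch_version: str, torchvision_version: str) -> bool:
    """True when both versions are nightlies and their build dates differ.

    Returns False when either version has no nightly date, because the
    heuristic cannot say anything about release builds.
    """
    torch_date = nightly_date(torch_version)
    vision_date = nightly_date(torchvision_version)
    if torch_date is None or vision_date is None:
        return False
    return torch_date != vision_date
```

In a real environment the inputs could come from `importlib.metadata.version("torch")` and `importlib.metadata.version("torchvision")`, which read package metadata without importing either package.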

The c10 header changes in this PR (Macros.h, BFloat16.h) are benign refactors around ROCm/CUDA warp size handling and BFloat16 conversions that don't affect ExecuTorch's use of these headers.

Recommendation

This pin bump should wait for upstream fixes:

  • The fbgemm assembly issue needs to be fixed in PyTorch's fbgemm submodule (the new FP32/FP16 ukernels use AVX-512 extended registers without proper ISA flags)
  • The torchvision issue will resolve once a compatible torchvision nightly is published matching dev20260517

Human review is needed to decide whether to wait for PyTorch to fix fbgemm, or skip this nightly pin.



Labels

ci/pytorch-pin-bump ci-no-td CLA Signed
