Skip to content

[lld][MachO] Add ARM64e pointer authentication linking support#188378

Open
oskarwirga wants to merge 22 commits intollvm:mainfrom
oskarwirga:arm64e_lld
Open

[lld][MachO] Add ARM64e pointer authentication linking support#188378
oskarwirga wants to merge 22 commits intollvm:mainfrom
oskarwirga:arm64e_lld

Conversation

@oskarwirga
Copy link
Copy Markdown
Contributor

This PR implements ARM64e linking support in lld's Mach-O backend. This is a continuation of work discussed in #79543 and I included @BertalanD's suggestion to keep the Relocation object size small, thanks! This PR is part of several I have planned, but contains everything needed to link arm64e Mach-Os with LLD.

What this PR adds

  • Chained fixup pointer structures: Adds bitfield definitions covering both legacy (16-bit ordinal) and USERLAND24 (24-bit ordinal) formats.
  • AUTH relocation parsing: Extends lld's Relocation struct with auth metadata (diversity, key, address-diversity) parsed from ARM64_RELOC_AUTHENTICATED_POINTER relocations. Auth fields are packed in a union with the 64-bit addend to keep sizeof(Relocation) at 24 bytes.
  • Authenticated stubs
  • Set default CPU_SUBTYPE_ARM64E with ptrauth ABI version 0, I got that from Apple's LD
  • Dual GOT: Separate __auth_got section for signed function pointers alongside the existing unsigned __got. Stubs load from __auth_got; address-taken data references use __got.
  • Symbol routing: Stub creation targets __auth_got; personality pointers in unwind info route to __auth_got; ObjC message send fast stubs use __auth_got for objc_msgSend lookup with braa authentication
  • Chained fixup encoding: Encodes auth-rebase, auth-bind, plain-rebase, and plain-bind entries. Automatically selects DYLD_CHAINED_PTR_ARM64E_USERLAND24 format based on deployment target (macOS 12.0+, iOS 15.0+, watchOS 8.0+, tvOS 15.0+)

Testing

  • arm64e-stubs.s — Verifies authenticated stub code sequence
  • arm64e-auth-got.s — Verifies dual GOT (__auth_got + __got) creation and routing
  • arm64e-auth-reloc.s — Verifies ARM64_RELOC_AUTHENTICATED_POINTER parsing and chained fixup encoding
  • arm64e-auth-data.s — Verifies auth metadata (key, diversity, addrDiv) flows through the Relocation union into chained fixups
  • arm64e-chained-fixups.s — Verifies chained fixup format selection and encoding
  • arm64e-reject-mixed.s — Verifies arm64/arm64e object mixing rejection
  • arm64e-no-fixup-chains.s — Verifies -no_fixup_chains is rejected for arm64e
  • End-to-end tested with iOS/macOS apps

AI Disclosure

I used Claude heavily throughout development, primarily to iterate through linker errors. All generated code was manually reviewed and tested on ARM64e hardware. I feel very confident in these changes being functionally correct because I tested on hardware extensively, architecturally they may be lacking!

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Mar 24, 2026

@llvm/pr-subscribers-llvm-binary-utilities

@llvm/pr-subscribers-lld

Author: Oskar Wirga (oskarwirga)

Changes

This PR implements ARM64e linking support in lld's Mach-O backend. This is a continuation of work discussed in #79543 and I included @BertalanD's suggestion to keep the Relocation object size small, thanks! This PR is part of several I have planned, but contains everything needed to link arm64e Mach-Os with LLD.

What this PR adds

  • Chained fixup pointer structures: Adds bitfield definitions covering both legacy (16-bit ordinal) and USERLAND24 (24-bit ordinal) formats.
  • AUTH relocation parsing: Extends lld's Relocation struct with auth metadata (diversity, key, address-diversity) parsed from ARM64_RELOC_AUTHENTICATED_POINTER relocations. Auth fields are packed in a union with the 64-bit addend to keep sizeof(Relocation) at 24 bytes.
  • Authenticated stubs
  • Set default CPU_SUBTYPE_ARM64E with ptrauth ABI version 0, I got that from Apple's LD
  • Dual GOT: Separate __auth_got section for signed function pointers alongside the existing unsigned __got. Stubs load from __auth_got; address-taken data references use __got.
  • Symbol routing: Stub creation targets __auth_got; personality pointers in unwind info route to __auth_got; ObjC message send fast stubs use __auth_got for objc_msgSend lookup with braa authentication
  • Chained fixup encoding: Encodes auth-rebase, auth-bind, plain-rebase, and plain-bind entries. Automatically selects DYLD_CHAINED_PTR_ARM64E_USERLAND24 format based on deployment target (macOS 12.0+, iOS 15.0+, watchOS 8.0+, tvOS 15.0+)

Testing

  • arm64e-stubs.s — Verifies authenticated stub code sequence
  • arm64e-auth-got.s — Verifies dual GOT (__auth_got + __got) creation and routing
  • arm64e-auth-reloc.s — Verifies ARM64_RELOC_AUTHENTICATED_POINTER parsing and chained fixup encoding
  • arm64e-auth-data.s — Verifies auth metadata (key, diversity, addrDiv) flows through the Relocation union into chained fixups
  • arm64e-chained-fixups.s — Verifies chained fixup format selection and encoding
  • arm64e-reject-mixed.s — Verifies arm64/arm64e object mixing rejection
  • arm64e-no-fixup-chains.s — Verifies -no_fixup_chains is rejected for arm64e
  • End-to-end tested with iOS/macOS apps

AI Disclosure

I used Claude heavily throughout development, primarily to iterate through linker errors. All generated code was manually reviewed and tested on ARM64e hardware. I feel very confident in these changes being functionally correct because I tested on hardware extensively, architecturally they may be lacking!


Patch is 66.98 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/188378.diff

26 Files Affected:

  • (modified) lld/MachO/Arch/ARM64Common.cpp (+8-1)
  • (added) lld/MachO/Arch/ARM64e.cpp (+268)
  • (modified) lld/MachO/CMakeLists.txt (+1)
  • (modified) lld/MachO/Driver.cpp (+8-1)
  • (modified) lld/MachO/InputFiles.cpp (+29-2)
  • (modified) lld/MachO/InputSection.cpp (+19-5)
  • (modified) lld/MachO/InputSection.h (+1)
  • (modified) lld/MachO/MapFile.cpp (+3-1)
  • (modified) lld/MachO/OutputSegment.cpp (+1)
  • (modified) lld/MachO/Relocations.h (+36-2)
  • (modified) lld/MachO/Symbols.cpp (+17-3)
  • (modified) lld/MachO/Symbols.h (+8-1)
  • (modified) lld/MachO/SyntheticSections.cpp (+242-31)
  • (modified) lld/MachO/SyntheticSections.h (+22-6)
  • (modified) lld/MachO/Target.h (+1)
  • (modified) lld/MachO/UnwindInfoSection.cpp (+22-6)
  • (modified) lld/MachO/Writer.cpp (+27-6)
  • (modified) lld/MachO/Writer.h (+2-1)
  • (added) lld/test/MachO/arm64e-auth-data.s (+49)
  • (added) lld/test/MachO/arm64e-auth-got.s (+51)
  • (added) lld/test/MachO/arm64e-auth-reloc.s (+46)
  • (added) lld/test/MachO/arm64e-chained-fixups.s (+52)
  • (added) lld/test/MachO/arm64e-no-fixup-chains.s (+21)
  • (added) lld/test/MachO/arm64e-reject-mixed.s (+43)
  • (added) lld/test/MachO/arm64e-stubs.s (+51)
  • (modified) llvm/include/llvm/BinaryFormat/MachO.h (+71)
diff --git a/lld/MachO/Arch/ARM64Common.cpp b/lld/MachO/Arch/ARM64Common.cpp
index 599f1e18efda6..e137cdb5679f5 100644
--- a/lld/MachO/Arch/ARM64Common.cpp
+++ b/lld/MachO/Arch/ARM64Common.cpp
@@ -19,7 +19,8 @@ using namespace lld::macho;
 int64_t ARM64Common::getEmbeddedAddend(MemoryBufferRef mb, uint64_t offset,
                                        const relocation_info rel) const {
   if (rel.r_type != ARM64_RELOC_UNSIGNED &&
-      rel.r_type != ARM64_RELOC_SUBTRACTOR) {
+      rel.r_type != ARM64_RELOC_SUBTRACTOR &&
+      rel.r_type != ARM64_RELOC_AUTHENTICATED_POINTER) {
     // All other reloc types should use the ADDEND relocation to store their
     // addends.
     // TODO(gkm): extract embedded addend just so we can assert that it is 0
@@ -28,6 +29,12 @@ int64_t ARM64Common::getEmbeddedAddend(MemoryBufferRef mb, uint64_t offset,
 
   const auto *buf = reinterpret_cast<const uint8_t *>(mb.getBufferStart());
   const uint8_t *loc = buf + offset + rel.r_address;
+
+  if (rel.r_type == ARM64_RELOC_AUTHENTICATED_POINTER) {
+    // Only the low 32 bits are the addend; upper bits hold ptrauth fields.
+    return llvm::SignExtend64<32>(read32le(loc));
+  }
+
   switch (rel.r_length) {
   case 2:
     return static_cast<int32_t>(read32le(loc));
diff --git a/lld/MachO/Arch/ARM64e.cpp b/lld/MachO/Arch/ARM64e.cpp
new file mode 100644
index 0000000000000..be89991950f80
--- /dev/null
+++ b/lld/MachO/Arch/ARM64e.cpp
@@ -0,0 +1,268 @@
+//===- ARM64e.cpp ---------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "Arch/ARM64Common.h"
+#include "InputFiles.h"
+#include "Symbols.h"
+#include "SyntheticSections.h"
+#include "Target.h"
+
+#include "lld/Common/ErrorHandler.h"
+#include "mach-o/compact_unwind_encoding.h"
+#include "llvm/BinaryFormat/MachO.h"
+
+using namespace llvm;
+using namespace llvm::MachO;
+using namespace lld;
+using namespace lld::macho;
+
+namespace {
+
+struct ARM64e : ARM64Common {
+  ARM64e();
+  void writeStub(uint8_t *buf, const Symbol &, uint64_t) const override;
+  void writeStubHelperHeader(uint8_t *buf) const override;
+  void writeStubHelperEntry(uint8_t *buf, const Symbol &,
+                            uint64_t entryAddr) const override;
+
+  void writeObjCMsgSendStub(uint8_t *buf, Symbol *sym, uint64_t stubsAddr,
+                            uint64_t &stubOffset, uint64_t selrefVA,
+                            Symbol *objcMsgSend) const override;
+  void populateThunk(InputSection *thunk, Symbol *funcSym) override;
+
+  void initICFSafeThunkBody(InputSection *thunk,
+                            Symbol *targetSym) const override;
+  Symbol *getThunkBranchTarget(InputSection *thunk) const override;
+  uint32_t getICFSafeThunkSize() const override;
+};
+
+} // namespace
+
+// Random notes on reloc types:
+// ADDEND always pairs with BRANCH26, PAGE21, or PAGEOFF12
+// POINTER_TO_GOT: ld64 supports a 4-byte pc-relative form as well as an 8-byte
+// absolute version of this relocation. The semantics of the absolute relocation
+// are weird -- it results in the value of the GOT slot being written, instead
+// of the address. Let's not support it unless we find a real-world use case.
+static constexpr std::array<RelocAttrs, 12> relocAttrsArray{{
+#define B(x) RelocAttrBits::x
+    {"UNSIGNED",
+     B(UNSIGNED) | B(ABSOLUTE) | B(EXTERN) | B(LOCAL) | B(BYTE4) | B(BYTE8)},
+    {"SUBTRACTOR", B(SUBTRAHEND) | B(EXTERN) | B(BYTE4) | B(BYTE8)},
+    {"BRANCH26", B(PCREL) | B(EXTERN) | B(BRANCH) | B(BYTE4)},
+    {"PAGE21", B(PCREL) | B(EXTERN) | B(BYTE4)},
+    {"PAGEOFF12", B(ABSOLUTE) | B(EXTERN) | B(BYTE4)},
+    {"GOT_LOAD_PAGE21", B(PCREL) | B(EXTERN) | B(GOT) | B(BYTE4)},
+    {"GOT_LOAD_PAGEOFF12",
+     B(ABSOLUTE) | B(EXTERN) | B(GOT) | B(LOAD) | B(BYTE4)},
+    {"POINTER_TO_GOT", B(PCREL) | B(EXTERN) | B(GOT) | B(POINTER) | B(BYTE4)},
+    {"TLVP_LOAD_PAGE21", B(PCREL) | B(EXTERN) | B(TLV) | B(BYTE4)},
+    {"TLVP_LOAD_PAGEOFF12",
+     B(ABSOLUTE) | B(EXTERN) | B(TLV) | B(LOAD) | B(BYTE4)},
+    {"ADDEND", B(ADDEND)},
+    // ARM64e-specific: AUTHENTICATED_POINTER (64-bit absolute, external or
+    // local)
+    {"AUTHENTICATED_POINTER",
+     B(ABSOLUTE) | B(UNSIGNED) | B(EXTERN) | B(LOCAL) | B(BYTE8) | B(AUTH)},
+#undef B
+}};
+
+// ARM64e uses authenticated stubs with braa instruction.
+// These are 16 bytes (4 instructions) instead of the regular 12 bytes.
+// The stub computes the GOT address in x17 for use as authentication context.
+static constexpr uint32_t stubCode[] = {
+    0x90000011, // 00: adrp  x17, __auth_got@page
+    0x91000231, // 04: add   x17, x17, __auth_got@pageoff
+    0xf9400230, // 08: ldr   x16, [x17]
+    0xd71f0a11, // 0c: braa  x16, x17  ; authenticate with IA key, context=x17
+};
+
+void ARM64e::writeStub(uint8_t *buf8, const Symbol &sym,
+                       uint64_t pointerVA) const {
+  auto *buf32 = reinterpret_cast<uint32_t *>(buf8);
+  constexpr size_t stubCodeSize = 4 * sizeof(uint32_t);
+  SymbolDiagnostic d = {&sym, "stub"};
+  uint64_t stubAddr = in.stubs->addr + sym.stubsIndex * stubCodeSize;
+  uint64_t pcPageBits = pageBits(stubAddr);
+  uint64_t targetPageBits = pageBits(pointerVA);
+  int64_t pageDiff = static_cast<int64_t>(targetPageBits - pcPageBits);
+  // adrp x17, __auth_got@page
+  encodePage21(&buf32[0], d, stubCode[0], pageDiff);
+  // add x17, x17, __auth_got@pageoff
+  encodePageOff12(&buf32[1], d, stubCode[1], pointerVA);
+  // ldr x16, [x17]
+  buf32[2] = stubCode[2];
+  // braa x16, x17
+  buf32[3] = stubCode[3];
+}
+
+static constexpr uint32_t stubHelperHeaderCode[] = {
+    0x90000011, // 00: adrp  x17, _dyld_private@page
+    0x91000231, // 04: add   x17, x17, _dyld_private@pageoff
+    0xa9bf47f0, // 08: stp   x16/x17, [sp, #-16]!
+    0x90000010, // 0c: adrp  x16, dyld_stub_binder@page
+    0xf9400210, // 10: ldr   x16, [x16, dyld_stub_binder@pageoff]
+    0xd61f0200, // 14: br    x16
+};
+
+void ARM64e::writeStubHelperHeader(uint8_t *buf8) const {
+  ::writeStubHelperHeader<LP64>(buf8, stubHelperHeaderCode);
+}
+
+static constexpr uint32_t stubHelperEntryCode[] = {
+    0x18000050, // 00: ldr  w16, l0
+    0x14000000, // 04: b    stubHelperHeader
+    0x00000000, // 08: l0: .long 0
+};
+
+void ARM64e::writeStubHelperEntry(uint8_t *buf8, const Symbol &sym,
+                                  uint64_t entryVA) const {
+  ::writeStubHelperEntry(buf8, stubHelperEntryCode, sym, entryVA);
+}
+
+// ARM64e uses authenticated ObjC stubs with braa instruction.
+// Uses x17 as both the address register and authentication context,
+// matching the pattern used in ARM64e auth stubs.
+static constexpr uint32_t objcStubsFastCode[] = {
+    0x90000001, // adrp  x1, __objc_selrefs@page
+    0xf9400021, // ldr   x1, [x1, @selector("foo")@pageoff]
+    0x90000011, // adrp  x17, __auth_got@page
+    0x91000231, // add   x17, x17, __auth_got@pageoff
+    0xf9400230, // ldr   x16, [x17]
+    0xd71f0a11, // braa  x16, x17  ; authenticate with IA key
+    0xd4200020, // brk   #0x1
+    0xd4200020, // brk   #0x1
+};
+
+static constexpr uint32_t objcStubsSmallCode[] = {
+    0x90000001, // adrp  x1, __objc_selrefs@page
+    0xf9400021, // ldr   x1, [x1, @selector("foo")@pageoff]
+    0x14000000, // b     _objc_msgSend
+};
+
+void ARM64e::writeObjCMsgSendStub(uint8_t *buf, Symbol *sym, uint64_t stubsAddr,
+                                  uint64_t &stubOffset, uint64_t selrefVA,
+                                  Symbol *objcMsgSend) const {
+  uint64_t objcMsgSendAddr;
+  uint64_t objcStubSize;
+  uint64_t objcMsgSendIndex;
+
+  if (config->objcStubsMode == ObjCStubsMode::fast) {
+    objcStubSize = target->objcStubsFastSize;
+    // ARM64e uses authgot for objc_msgSend.
+    objcMsgSendAddr = in.authgot->addr;
+    objcMsgSendIndex = objcMsgSend->authGotIndex;
+    ::writeObjCMsgSendFastStub<LP64>(buf, objcStubsFastCode, sym, stubsAddr,
+                                     stubOffset, selrefVA, objcMsgSendAddr,
+                                     objcMsgSendIndex);
+  } else {
+    assert(config->objcStubsMode == ObjCStubsMode::small);
+    objcStubSize = target->objcStubsSmallSize;
+    if (auto *d = dyn_cast<Defined>(objcMsgSend)) {
+      objcMsgSendAddr = d->getVA();
+      objcMsgSendIndex = 0;
+    } else {
+      objcMsgSendAddr = in.stubs->addr;
+      objcMsgSendIndex = objcMsgSend->stubsIndex;
+    }
+    ::writeObjCMsgSendSmallStub<LP64>(buf, objcStubsSmallCode, sym, stubsAddr,
+                                      stubOffset, selrefVA, objcMsgSendAddr,
+                                      objcMsgSendIndex);
+  }
+  stubOffset += objcStubSize;
+}
+
+// A thunk is the relaxed variation of stubCode. We don't need the
+// extra indirection through a lazy pointer because the target address
+// is known at link time.
+static constexpr uint32_t thunkCode[] = {
+    0x90000010, // 00: adrp  x16, <thunk.ptr>@page
+    0x91000210, // 04: add   x16, [x16,<thunk.ptr>@pageoff]
+    0xd61f0200, // 08: br    x16
+};
+
+void ARM64e::populateThunk(InputSection *thunk, Symbol *funcSym) {
+  thunk->align = 4;
+  thunk->data = {reinterpret_cast<const uint8_t *>(thunkCode),
+                 sizeof(thunkCode)};
+  thunk->relocs.emplace_back(/*type=*/ARM64_RELOC_PAGEOFF12,
+                             /*pcrel=*/false, /*length=*/2,
+                             /*offset=*/4, /*addend=*/0,
+                             /*referent=*/funcSym);
+  thunk->relocs.emplace_back(/*type=*/ARM64_RELOC_PAGE21,
+                             /*pcrel=*/true, /*length=*/2,
+                             /*offset=*/0, /*addend=*/0,
+                             /*referent=*/funcSym);
+}
+
+// Just a single direct branch to the target function.
+static constexpr uint32_t icfSafeThunkCode[] = {
+    0x14000000, // 00: b    target
+};
+
+void ARM64e::initICFSafeThunkBody(InputSection *thunk,
+                                  Symbol *targetSym) const {
+  // The base data here will not be itself modified, we'll just be adding a
+  // reloc below. So we can directly use the constexpr above as the data.
+  thunk->data = {reinterpret_cast<const uint8_t *>(icfSafeThunkCode),
+                 sizeof(icfSafeThunkCode)};
+
+  thunk->relocs.emplace_back(/*type=*/ARM64_RELOC_BRANCH26,
+                             /*pcrel=*/true, /*length=*/2,
+                             /*offset=*/0, /*addend=*/0,
+                             /*referent=*/targetSym);
+}
+
+Symbol *ARM64e::getThunkBranchTarget(InputSection *thunk) const {
+  assert(thunk->relocs.size() == 1 &&
+         "expected a single reloc on ARM64 ICF thunk");
+  auto &reloc = thunk->relocs[0];
+  assert(isa<Symbol *>(reloc.referent) &&
+         "ARM64 thunk reloc is expected to point to a Symbol");
+
+  return cast<Symbol *>(reloc.referent);
+}
+
+uint32_t ARM64e::getICFSafeThunkSize() const {
+  return sizeof(icfSafeThunkCode);
+}
+
+ARM64e::ARM64e() : ARM64Common(LP64()) {
+  cpuType = CPU_TYPE_ARM64;
+  // ARM64e-specific: Use ARM64E subtype with pointer authentication ABI version 0
+  cpuSubtype = CPU_SUBTYPE_ARM64E_WITH_PTRAUTH_VERSION(/*version*/ 0,
+                                                       /*kernel*/ false);
+
+  stubSize = sizeof(stubCode);
+  thunkSize = sizeof(thunkCode);
+
+  objcStubsFastSize = sizeof(objcStubsFastCode);
+  objcStubsFastAlignment = 32;
+  objcStubsSmallSize = sizeof(objcStubsSmallCode);
+  objcStubsSmallAlignment = 4;
+
+  // Branch immediate is two's complement 26 bits, which is implicitly
+  // multiplied by 4 (since all functions are 4-aligned: The branch range
+  // is -4*(2**(26-1))..4*(2**(26-1) - 1).
+  backwardBranchRange = 128 * 1024 * 1024;
+  forwardBranchRange = backwardBranchRange - 4;
+
+  modeDwarfEncoding = UNWIND_ARM64_MODE_DWARF;
+  subtractorRelocType = ARM64_RELOC_SUBTRACTOR;
+  unsignedRelocType = ARM64_RELOC_UNSIGNED;
+
+  stubHelperHeaderSize = sizeof(stubHelperHeaderCode);
+  stubHelperEntrySize = sizeof(stubHelperEntryCode);
+
+  relocAttrs = {relocAttrsArray.data(), relocAttrsArray.size()};
+}
+
+TargetInfo *macho::createARM64eTargetInfo() {
+  static ARM64e t;
+  return &t;
+}
diff --git a/lld/MachO/CMakeLists.txt b/lld/MachO/CMakeLists.txt
index 72631f11511bf..e8afdc06609e9 100644
--- a/lld/MachO/CMakeLists.txt
+++ b/lld/MachO/CMakeLists.txt
@@ -6,6 +6,7 @@ include_directories(${LLVM_MAIN_SRC_DIR}/../libunwind/include)
 
 add_lld_library(lldMachO
   Arch/ARM64.cpp
+  Arch/ARM64e.cpp
   Arch/ARM64Common.cpp
   Arch/ARM64_32.cpp
   Arch/X86_64.cpp
diff --git a/lld/MachO/Driver.cpp b/lld/MachO/Driver.cpp
index 58fbe64c2d1f9..8b85a7ae33d37 100644
--- a/lld/MachO/Driver.cpp
+++ b/lld/MachO/Driver.cpp
@@ -951,6 +951,10 @@ static TargetInfo *createTargetInfo(InputArgList &args) {
   case CPU_TYPE_X86_64:
     return createX86_64TargetInfo();
   case CPU_TYPE_ARM64:
+    if (cpuSubtype == CPU_SUBTYPE_ARM64E ||
+        cpuSubtype == CPU_SUBTYPE_ARM64E_VERSIONED_PTRAUTH_ABI_MASK ||
+        cpuSubtype == CPU_SUBTYPE_ARM64E_WITH_PTRAUTH_VERSION(0, 0))
+      return createARM64eTargetInfo();
     return createARM64TargetInfo();
   case CPU_TYPE_ARM64_32:
     return createARM64_32TargetInfo();
@@ -1254,7 +1258,8 @@ static bool shouldEmitChainedFixups(const InputArgList &args) {
     return false;
   }
 
-  if (!is_contained({AK_x86_64, AK_x86_64h, AK_arm64}, config->arch())) {
+  if (!is_contained({AK_x86_64, AK_x86_64h, AK_arm64, AK_arm64e},
+                    config->arch())) {
     if (requested)
       error("-fixup_chains is only supported on x86_64 and arm64 targets");
 
@@ -1975,6 +1980,8 @@ bool link(ArrayRef<const char *> argsArr, llvm::raw_ostream &stdoutOS,
   config->emitDataInCodeInfo =
       args.hasFlag(OPT_data_in_code_info, OPT_no_data_in_code_info, true);
   config->emitChainedFixups = shouldEmitChainedFixups(args);
+  if (config->arch() == AK_arm64e && !config->emitChainedFixups)
+    error("arm64e requires chained fixups; cannot use -no_fixup_chains");
   config->emitInitOffsets =
       config->emitChainedFixups || args.hasArg(OPT_init_offsets);
   config->emitRelativeMethodLists = shouldEmitRelativeMethodLists(args);
diff --git a/lld/MachO/InputFiles.cpp b/lld/MachO/InputFiles.cpp
index cc7eae51175bc..e2bb43c3e3598 100644
--- a/lld/MachO/InputFiles.cpp
+++ b/lld/MachO/InputFiles.cpp
@@ -201,6 +201,15 @@ static bool compatWithTargetArch(const InputFile *file, const Header *hdr) {
     return false;
   }
 
+  // Reject arm64 objects when linking for arm64e.
+  if (config->arch() == AK_arm64e &&
+      hdr->cputype == CPU_TYPE_ARM64 &&
+      (hdr->cpusubtype & ~CPU_SUBTYPE_MASK) == CPU_SUBTYPE_ARM64_ALL) {
+    warn(toString(file) + " has architecture arm64 which is incompatible with "
+         "target architecture arm64e (arm64e requires pointer authentication)");
+    return false;
+  }
+
   return checkCompatibility(file);
 }
 
@@ -263,7 +272,7 @@ std::optional<MemoryBufferRef> macho::readFile(StringRef path) {
     // FIXME: LD64 has a more complex fallback logic here.
     // Consider implementing that as well?
     if (cpuType != static_cast<uint32_t>(target->cpuType) ||
-        cpuSubtype != target->cpuSubtype) {
+        cpuSubtype != (target->cpuSubtype & ~MachO::CPU_SUBTYPE_MASK)) {
       archs.emplace_back(getArchName(cpuType, cpuSubtype));
       continue;
     }
@@ -580,9 +589,27 @@ void ObjFile::parseRelocations(ArrayRef<SectionHeader> sectionHeaders,
     r.pcrel = relInfo.r_pcrel;
     r.length = relInfo.r_length;
     r.offset = relInfo.r_address;
+
+    // For ARM64e authenticated pointer relocations, extract the auth info
+    // (diversity, key, addrDiv) from the upper bits of the raw pointer value
+    // and store them in the union's authData member.
+    if (target->hasAttr(relInfo.r_type, RelocAttrBits::AUTH)) {
+      const uint8_t *loc = buf + sec.offset + relInfo.r_address;
+      uint64_t raw = read64le(loc);
+      // The auth bit (bit 63) should be set for authenticated pointers
+      if ((raw >> 63) & 1) {
+        r.hasAuth = true;
+        r.authData.addend = isSubtrahend ? 0 : static_cast<int32_t>(totalAddend);
+        r.authData.info.diversity = (raw >> 32) & 0xFFFF;
+        r.authData.info.addrDiv = (raw >> 48) & 0x1;
+        r.authData.info.key = (raw >> 49) & 0x3;
+      }
+    }
+
     if (relInfo.r_extern) {
       r.referent = symbols[relInfo.r_symbolnum];
-      r.addend = isSubtrahend ? 0 : totalAddend;
+      if (!r.hasAuth)
+        r.addend = isSubtrahend ? 0 : totalAddend;
     } else {
       assert(!isSubtrahend);
       const SectionHeader &referentSecHead =
diff --git a/lld/MachO/InputSection.cpp b/lld/MachO/InputSection.cpp
index 34847adc85954..04dd0ded61b52 100644
--- a/lld/MachO/InputSection.cpp
+++ b/lld/MachO/InputSection.cpp
@@ -101,7 +101,14 @@ static uint64_t resolveSymbolOffsetVA(const Symbol *sym, uint8_t type,
     // There's no meaningful way to "interpose" an interior offset.
     symVA = (offset != 0) ? sym->getVA() : sym->resolveBranchVA();
   } else if (relocAttrs.hasAttr(RelocAttrBits::GOT)) {
-    symVA = sym->resolveGotVA();
+    // GOT_LOAD (no POINTER attr) should use regular __got when available,
+    // because the compiler applies paciza on the loaded value and needs
+    // a raw (non-auth) pointer. POINTER_TO_GOT (has POINTER attr) should
+    // use __auth_got (the default from getGotVA/resolveGotVA).
+    if (!relocAttrs.hasAttr(RelocAttrBits::POINTER) && sym->isInGot())
+      symVA = in.got->getVA(sym->gotIndex);
+    else
+      symVA = sym->resolveGotVA();
   } else if (relocAttrs.hasAttr(RelocAttrBits::TLV)) {
     symVA = sym->resolveTlvVA();
   } else {
@@ -253,7 +260,7 @@ void ConcatInputSection::writeTo(uint8_t *buf) {
         target->handleDtraceReloc(referentSym, r, loc);
         continue;
       }
-      referentVA = resolveSymbolOffsetVA(referentSym, r.type, r.addend);
+      referentVA = resolveSymbolOffsetVA(referentSym, r.type, r.getAddend());
 
       if (isThreadLocalVariables(getFlags()) && isa<Defined>(referentSym)) {
         // References from thread-local variable sections are treated as offsets
@@ -262,15 +269,22 @@ void ConcatInputSection::writeTo(uint8_t *buf) {
         // contiguous).
         referentVA -= firstTLVDataSection->addr;
       } else if (needsFixup) {
-        writeChainedFixup(loc, referentSym, r.addend);
+        writeChainedFixup(loc, referentSym, r);
         continue;
       }
     } else if (auto *referentIsec = r.referent.dyn_cast<InputSection *>()) {
       assert(!::shouldOmitFromOutput(referentIsec));
-      referentVA = referentIsec->getVA(r.addend);
+      referentVA = referentIsec->getVA(r.getAddend());
 
       if (needsFixup) {
-        writeChainedRebase(loc, referentVA);
+        AuthInfo aiStorage;
+        const AuthInfo *ai = nullptr;
+        if (r.hasAuth) {
+          aiStorage = r.getAuthInfo();
+          ai = &aiStorage;
+        }
+        uint64_t segmentBase = referentIsec->parent->parent->addr;
+        writeChainedRebase(loc, referentVA, segmentBase, ai);
         continue;
       }
     }
diff --git a/lld/MachO/InputSection.h b/lld/MachO/InputSection.h
index e0a90a2edc0af..ac23b9a9b4e57 100644
--- a/lld/MachO/InputSection.h
+++ b/lld/MachO/InputSection.h
@@ -363,6 +363,7 @@ constexpr const char staticInit[] = "__StaticInit";
 constexpr const char stringTable[] = "__string_table";
 constexpr const char stubHelper[] = "__stub_helper";
 constexpr const char stubs[] = "__stubs";
+constexpr const char authStubs[] = "__auth_stubs";
 constexpr const char swift[] = "__swift";
 constexpr const char symbolTable[] = "__symbol_table";
 constexpr const char textCoalNt[] = "__textcoal_nt";
diff --git a/lld/MachO/MapFile.cpp b/lld/MachO/MapFile.cpp
index 29ebcdcf9a832..352a47a83355f 100644
--- a/lld/MachO/MapFile.cpp
+++ b/lld/MachO/MapFile.cpp
@@ -149,7 +149,9 @@ static void printNonLazyPointerSection(raw_fd_ostream &os,
   // associations.
   for (const Symbol *sym : osec->getEntries())
     os << format("0x%08llX\t0x%08zX\t[  0] non-lazy-pointer-to-local: %s\n",
-                 osec->addr + sym->gotIndex * target->wordSize,
+                 osec->addr + (osec->getIsAuth() ? sym->authGotIndex
+                                                 : sym->...
[truncated]

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 24, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Copy Markdown
Contributor

@ojhunt ojhunt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than the various masking and shift operations to pull out of the various constants, I think should switch to struct defs with correct bitfields instead. But if we retain shifts, the literals should be replaced with named constants.

Comment thread lld/MachO/Arch/ARM64e.cpp
Comment thread lld/MachO/Arch/ARM64e.cpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just verifying lld style allows auto instead of a named type? I'd certainly prefer a type here, but I don't know if it's required

Also verifying lowercase initial letter is correct naming style?

Both of these exist throughout this PR but I'll just comment/question here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specific instance in lld/MachO/Arch/ARM64e.cpp is copied over from lld/MachO/Arch/ARM64.cpp and I didn't want the two to diverge too much.

In general, I tried to keep auto's inline with the LLD Mach-O writing style which also uses lowercase initial letter. I don't feel strongly about this so I am happy to change it or leave it for a future NFC.

Comment thread lld/MachO/InputFiles.cpp
Comment thread lld/MachO/MapFile.cpp Outdated
Comment thread lld/MachO/OutputSegment.cpp
Comment thread lld/MachO/Relocations.h
Comment on lines 85 to 88
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using this for converting values is UB.

Hahaha, I joke (it is UB) using memcpy to do this is miserable

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL! I modified lld/MachO/ICF.cpp to not touch the raw addend so now we should be avoiding potential for reading the raw union.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so now we should be avoiding potential for reading the raw union.

Would it be beneficial to make this union private to force all future users to use getAddend() instead of addend?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but I'd prefer saving it for a later follow-up because there's a lot of .addend callsites and this PR already is quite unwieldy and I've already promised a few other NFCs to follow.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth to break up this PR into smaller ones. It might also help the review.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth to break up this PR into smaller ones.

I am not opposed to the idea but these changes are quite interdependent and decomposing this would result in functionally dead code, at least for the first couple PRs

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth to break up this PR into smaller ones. It might also help the review.

Its become a lot more sprawling and might be too ambitious in one go atp. Any suggestions on if/how I should split this up?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not big ones. You probably have a better picture of the code than me.

There's things that are obvious refactors that should be able to be broken up in earlier PRs. For example adding getAddend() and changing all usages of r.addend. There's probably others that are similar to those.

Comment thread lld/MachO/SyntheticSections.h Outdated
Comment thread lld/MachO/Writer.cpp
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd kind of prefer InStruct (I think that's what in is but it took a while to track this down) to handle this with something like InStruct::AddGOTEntry(sim, isauthgot), but that implies a structural change outside of the scope of this PR

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that would be cleaner, again I think I can start working on a separate NFC for this stuff.

Comment thread lld/MachO/Writer.cpp Outdated
Comment thread lld/MachO/Writer.cpp Outdated
Comment thread lld/MachO/Relocations.h
Comment on lines 85 to 88
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so now we should be avoiding potential for reading the raw union.

Would it be beneficial to make this union private to force all future users to use getAddend() instead of addend?

Comment thread lld/MachO/SyntheticSections.cpp
Copy link
Copy Markdown
Contributor

@drodriguez drodriguez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipped some of the bigger files for the moment. Most of my comments for the moment might be considered minor or nit.

Comment thread lld/MachO/Driver.cpp Outdated
Comment thread lld/MachO/Driver.cpp Outdated
Comment thread lld/MachO/InputFiles.cpp Outdated
Comment thread lld/MachO/OutputSegment.cpp
Comment thread lld/MachO/Relocations.h Outdated
Comment thread lld/MachO/Relocations.h
Comment on lines 85 to 88
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth to break up this PR into smaller ones. It might also help the review.

Comment thread lld/MachO/UnwindInfoSection.cpp Outdated
@oskarwirga oskarwirga requested a review from int3 April 6, 2026 15:37
oskarwirga added a commit that referenced this pull request Apr 8, 2026
…rAuth (#189474)

This is part of work being done in #188378 and #188638, split out from
#188650.

`lowerConstantPtrAuth` silently miscompiled `ConstantPtrAuth` constants
with non-`GlobalValue` pointer bases — emitting `0@AUTH(da,0)` instead
of erroring.

Changes:
- Handle `ConstantPointerNull` bases explicitly
- Error via `reportFatalUsageError` on any remaining unresolved base
(e.g. nested ptrauth) instead of silently miscompiling

This PR was mostly developed with LLM assistance
@ojhunt
Copy link
Copy Markdown
Contributor

ojhunt commented Apr 9, 2026

@rjmccall @ahmedbougacha @ahatanak @lgerbarg could some of you folk look at this to see if it's reasonable? it's outside my domain and I have ~negative amounts of time to look for at least another week or two

Comment thread lld/MachO/Arch/ARM64e.cpp Outdated
Comment thread lld/MachO/Arch/ARM64e.cpp Outdated
Comment thread lld/MachO/Arch/ARM64e.cpp Outdated
Comment thread lld/MachO/InputSection.cpp Outdated
Comment thread lld/MachO/Relocations.h Outdated
Comment thread lld/MachO/Relocations.h Outdated
Comment thread lld/MachO/Symbols.cpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we changed this value before? What kind of testing/benchmarking is expected before we can change this value? At least, it would be good to verify that runtime and memory usage doesn't increase too much for a few builds.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't expect to see significant impact from this given 56 byte allocation would probably take the same amount of memory as 64, but out of curiosity as well I ran a benchmark linking LLD before and after these changes & got no noticeable regression. It was all within range of noise on my macbook.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alignment of uint64_t is 8 bytes, the alignment of uint32_t is 4 bytes, so those 56 bytes were aligned, and it would not have taken 64 bytes (in theory… I haven't measure the alignment before and after). It is still packed as much as possible, but it is a 14% increase in memory for the symbol structures. Also both any architecture is paying the price for something that only works in arm64e.

It is also bad that you added a single uint32_t authGotIndex, but the size jumped 8 bytes. The layout of the structure might not be optimal for the data it contains (from what I can see, there's a pointer after all those uint32_t, which probably became unaligned with this change).

Comment thread lld/MachO/Symbols.cpp Outdated
@drodriguez
Copy link
Copy Markdown
Contributor

Sorry. I just merged #191808, which modifies some of the functions in lld/MachO/Arch/ARM64.cpp which moved in these commits. You might need to merge and adapt to the new signatures before this can be merged or reviewed further.

Comment thread lld/MachO/SyntheticSections.cpp Outdated
Comment thread lld/MachO/InputSection.cpp Outdated
Comment thread lld/MachO/SyntheticSections.cpp Outdated
Comment thread lld/MachO/SyntheticSections.cpp Outdated
Comment thread lld/MachO/SyntheticSections.cpp Outdated
Comment thread lld/MachO/SyntheticSections.cpp Outdated
Comment thread lld/MachO/SyntheticSections.cpp Outdated
Comment thread lld/MachO/ICF.cpp
Comment thread lld/MachO/InputFiles.cpp
Comment thread lld/MachO/Symbols.cpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alignment of uint64_t is 8 bytes, the alignment of uint32_t is 4 bytes, so those 56 bytes were aligned, and it would not have taken 64 bytes (in theory… I haven't measure the alignment before and after). It is still packed as much as possible, but it is a 14% increase in memory for the symbol structures. Also both any architecture is paying the price for something that only works in arm64e.

It is also bad that you added a single uint32_t authGotIndex, but the size jumped 8 bytes. The layout of the structure might not be optimal for the data it contains (from what I can see, there's a pointer after all those uint32_t, which probably became unaligned with this change).

Add struct definitions for the arm64e chained fixup pointer formats
used by dyld. These correspond to the DYLD_CHAINED_PTR_ARM64E and
DYLD_CHAINED_PTR_ARM64E_USERLAND24 format types already enumerated
in this header. The structs encode rebase, bind, and authenticated
bind/rebase fixup entries, and will be consumed by the arm64e chained
fixup encoding implementation in LLD.

Layouts match Apple's dyld fixup-chains.h definitions.
Add data structures to support arm64e pointer authentication in the
MachO linker. The AUTH relocation attribute marks relocations that
carry ptrauth metadata (key, diversity, address discrimination).
The Relocation struct's addend slot is overlaid with an AuthReloc
union member (32-bit addend + AuthInfo) to keep sizeof(Relocation)
unchanged.

Also adds the __auth_stubs section name constant.
Add the ARM64e target to LLD's MachO linker, enabling linking of
arm64e binaries with pointer authentication support.

Key components:
- ARM64e.cpp: New target with 16-byte authenticated stubs using braa
  (authenticated branch) with x17 as the context register, plus
  authenticated ObjC stubs, thunks, and ICF safe thunks
- ARM64Common.cpp: Handle ARM64_RELOC_AUTHENTICATED_POINTER in
  getEmbeddedAddend (low 32 bits only) and relocateOne
- Driver.cpp: Route arm64e CPU subtypes to the new target, add arm64e
  to chained fixups support
- InputFiles.cpp: Parse ptrauth metadata (key, diversity, addrDiv)
  from high bits of authenticated pointer relocations
oskarwirga and others added 17 commits April 20, 2026 12:23
Add the authenticated GOT (__auth_got) section for arm64e pointer
authentication. On arm64e, symbols accessed through authenticated
pointers need a separate GOT section whose entries are signed with
address-diversified PAC. The __auth_got is placed before __got in
section ordering so the linker processes authenticated entries first.

This commit adds the infrastructure: AuthGotSection class, authgot
pointer in InStruct, auth-aware writeChainedRebase/writeChainedFixup
signatures, and the forceOutline parameter for bindings that need
full auth encoding. Population and routing are in follow-up commits.
Route symbols to the authenticated GOT (__auth_got) when arm64e pointer
authentication requires it. Symbols that are called (BRANCH relocations)
need stubs, and arm64e stubs use authgot for braa-based authenticated
jumps. A pre-scan of relocations identifies these symbols so that
subsequent GOT_LOAD relocations for the same symbols are also routed
to authgot.

Also routes personality pointers in unwind info to authgot on arm64e,
since exception handling needs authenticated personality function
pointers.
…port

Implement the chained fixup encoding for arm64e using the
DYLD_CHAINED_PTR_ARM64E_USERLAND24 format, and add dual GOT support
for symbols that need both authenticated and raw GOT entries.

Chained fixup encoding:
- Auth rebases use dyld_chained_ptr_arm64e_auth_rebase with PAC
  key/diversity/addrDiv fields
- Auth binds use dyld_chained_ptr_arm64e_auth_bind24 (24-bit ordinal)
- Non-auth arm64e fixups use corresponding non-auth structs
- Handles authEncodingBits for C++ RTTI pointers in UNSIGNED relocs

Dual GOT routing:
- Symbols with both BRANCH and GOT_LOAD relocations need different
  GOT entries: BRANCH -> __auth_got (braa), GOT_LOAD -> __got (paciza)
- Adds authGotIndex alongside gotIndex in Symbol
- GOT_LOAD resolves through __got to prevent double-signing (Crash 7)
Replace bare shift/mask operations in InputFiles.cpp with a
arm64e_auth_embedded_pointer bitfield struct defined in MachO.h.
This matches the pattern used by the dyld_chained_ptr_arm64e_*
structs for chained fixup encoding.
Compare auth metadata (diversity, key, addrDiv) explicitly in
equalsConstant() so that sections with different signing schemas
are never folded together. Use getAddend() instead of raw union
access to avoid reading the wrong union member.
Extract complex expressions into local variables:
- MapFile: GOT index ternary moved to a local before the format call
- Writer: reinterpret_cast results in buildFixupChains stored in locals
ARM64E userland formats use 8-byte stride, not 4. Replace the
hardcoded stride constant with a per-architecture value and use
it uniformly in both the stride check and the next-field encoding.
- Driver.cpp: Fix cpuSubtype check to match any arm64e variant using
  mask instead of listing specific values
- Driver.cpp: Move arm64e chained fixups requirement into
  shouldEmitChainedFixups(); warn and force-enable instead of erroring,
  matching ld64 behavior
- InputFiles.cpp: Check != CPU_SUBTYPE_ARM64E instead of
  == CPU_SUBTYPE_ARM64_ALL to also reject CPU_SUBTYPE_ARM64_V8
- Relocations.h: Add static_assert for AuthReloc size
- Update arm64e-no-fixup-chains.s test for warn behavior
- Use sizeof(stubCode) instead of manual 4 * sizeof(uint32_t)
- Deduplicate stub helper, thunk, and ICF code shared between ARM64
  and ARM64e by moving to ARM64Common
- Change getAuthInfo() to return const AuthInfo* (nullptr when !hasAuth)
  to simplify callers in InputSection, SyntheticSections, and ICF
- Use sizeof(int64_t) in AuthReloc static_assert
- Reuse getAuthGotVA() from getGotVA()
- Extract addPersonalityGotEntry() helper in UnwindInfoSection
…p writers

writeChainedRebase, writeChainedFixup, and the per-reloc loop in
ConcatInputSection::writeTo all passed segmentBase that was never read
(the writers normalize against in.header->addr instead). Removes one
parent->parent->addr traversal per relocation in the hot path.
The static constexpr arrays in ARM64Common.h were duplicated across each
of the 5 translation units that include the header. inline constexpr
gives them external linkage with the C++17 inline-variable rule, so
there is exactly one copy in the final binary.
Replaces the '0' magic in NonLazyPointerSectionBase::writeTo with a named
PtrAuthKey::IA constant. The enum lives next to AuthInfo in Relocations.h
since LLVMAArch64Utils' AArch64PACKey is not in lld's link set.
…GotIndex

getAuthGotVA already asserts isInAuthGot(); routing the StubsSection and
UnwindInfoSection personality lookups through it makes the precondition
checked. ARM64e::writeObjCMsgSendStub still needs the raw index but now
asserts isInAuthGot() before reading.
…ation)

The high8 byte and the target field are two halves of one logical value
that dyld OR-folds at load time. In USERLAND24 the unified value is the
runtime offset (targetValue), not the VA, so high8 must be extracted
from targetValue to keep both halves consistent.

In practice Mach-O image bases never set bits [56:63], so the
discrepancy is unobservable at runtime; the fix is for self-consistency.

Add a smoke test that verifies the high8 byte round-trips through the
linker into the rebase entry in both USERLAND24 and legacy
DYLD_CHAINED_PTR_ARM64E pointer formats.
The Relocation union overlays AuthReloc {int32_t addend; AuthInfo info;}
on the 64-bit addend slot. The non-extern (section-relative) AUTH path
in ObjFile::parseRelocations was writing the full 64-bit r.addend after
auth decoding, silently zeroing diversity/key/addrDiv.

Use authData.addend (the int32_t half) when r.hasAuth so the auth
metadata survives, mirroring what the extern path already does.
Adding authGotIndex pushed Symbol from 56 to 64 bytes — the 6th uint32_t
forced a 4-byte pad before nameData (a pointer needing 8-byte alignment).
Pack the Symbol::Kind enum into 3 bits in the bitfield region (the enum
has 7 values; LLVM_PREFERRED_TYPE preserves type-safety via the kind()
accessor) so symbolKind shares its byte with isUsedInRegularObj and
used. Restores the 56-byte target for all architectures, including
non-arm64e which doesn't read authGotIndex but still incurred its
alignment cost.

Defined drops correspondingly from 96 to 88 bytes.
@oskarwirga
Copy link
Copy Markdown
Contributor Author

I force-pushed after rebasing but RE: #188378 (comment)

@drodriguez I was able to keep the symbol size the same by packing kind as 3 bits + reordering the members

YonahGoldberg pushed a commit to YonahGoldberg/llvm-project that referenced this pull request Apr 21, 2026
…rAuth (llvm#189474)

This is part of work being done in llvm#188378 and llvm#188638, split out from
llvm#188650.

`lowerConstantPtrAuth` silently miscompiled `ConstantPtrAuth` constants
with non-`GlobalValue` pointer bases — emitting `0@AUTH(da,0)` instead
of erroring.

Changes:
- Handle `ConstantPointerNull` bases explicitly
- Error via `reportFatalUsageError` on any remaining unresolved base
(e.g. nested ptrauth) instead of silently miscompiling

This PR was mostly developed with LLM assistance
vikramRH pushed a commit to ROCm/llvm-project that referenced this pull request Apr 22, 2026
…rAuth (llvm#189474)

This is part of work being done in llvm#188378 and llvm#188638, split out from
llvm#188650.

`lowerConstantPtrAuth` silently miscompiled `ConstantPtrAuth` constants
with non-`GlobalValue` pointer bases — emitting `0@AUTH(da,0)` instead
of erroring.

Changes:
- Handle `ConstantPointerNull` bases explicitly
- Error via `reportFatalUsageError` on any remaining unresolved base
(e.g. nested ptrauth) instead of silently miscompiling

This PR was mostly developed with LLM assistance
Copy link
Copy Markdown
Contributor

@drodriguez drodriguez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to use merges instead of force pushes, specially for PRs with a lot of commits. Now it is very difficult to find what might have changed in the older commits.

Some more nits and details. The ones that are easy to find looking at the code. Hopefully other people can see other things and understand better the PR holistically than me.

Comment thread lld/MachO/Relocations.h
Comment on lines 85 to 88
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not big ones. You probably have a better picture of the code than me.

There's things that are obvious refactors that should be able to be broken up in earlier PRs. For example adding getAddend() and changing all usages of r.addend. There's probably others that are similar to those.

Comment thread lld/MachO/Relocations.h
Comment thread lld/MachO/SyntheticSections.cpp
Comment thread lld/MachO/Arch/ARM64Common.cpp
Comment thread lld/MachO/SyntheticSections.cpp Outdated
Comment thread lld/MachO/Relocations.h
…access

Adds Relocation::setAddend() as the symmetric pair to the existing
getAddend(), and migrates all callers of the raw .addend field to use
the accessors.

While auditing for this change, found four r.addend = 0 writes that
would clobber AuthInfo on AUTH relocations (the addend lives in a union
overlaid on the upper 32 bits when hasAuth is true):
  - UnwindInfoSection.cpp: personality reloc (auth-relevant on arm64e)
  - ObjC.cpp: template-copied reloc inherits hasAuth
  - ConcatOutputSection.cpp: BRANCH26 paths (defensive)
  - InputFiles.cpp: eh_frame SUBTRACTOR/UNSIGNED pair (defensive)

These writes now go through setAddend() which dispatches based on
r.hasAuth.
- AuthInfo::key is now PtrAuthKey directly instead of uint8_t. The
  cast moves from defaultAuthInfo to the parse site in InputFiles.cpp
  and the three dyld-bitfield writes in writeChainedRebase /
  writeChainedBind, where the dyld struct uses a uint64_t-backed
  bit-field that requires uint8_t.

- defaultAuthInfo is now static constexpr instead of static const.

- writeChainedRebase's arm64e non-auth path now reports a hard error
  if the encoded target value cannot round-trip through the 43-bit
  target field plus 8-bit high8. The existing recommendation to
  disable -no_fixup_chains does not apply since Driver.cpp rejects
  that flag for arm64e.
Copy link
Copy Markdown
Contributor

@drodriguez drodriguez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last two commits deal with the last bits of feedback I got. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants