Enable SHA-256 ARM NEON acceleration on aarch64 Linux #1478

@chfast

Description

The hardware-accelerated SHA-256 implementation (sha_256_arm_v8 using ARM SHA2 crypto extensions) is currently restricted to Apple platforms by the preprocessor guard at sha256.cpp:22:

#elif defined(__aarch64__) && defined(__APPLE__)

The Linux runtime detection code (getauxval(AT_HWCAP) & HWCAP_SHA2 at line 683) already exists inside this block but is dead code since the outer guard requires __APPLE__.

Why Apple-only

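All Apple Silicon chips implement the SHA2 extensions, so on Apple platforms the intrinsics can be compiled and executed unconditionally. On Linux aarch64 the extension is optional: some cores (e.g., Cortex-A53 parts built without the crypto extension) lack SHA2 support, so the instructions cannot be assumed at runtime, and the intrinsics may fail to compile unless SHA2 is enabled for the translation unit (e.g., with -march=armv8-a+crypto).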

Proposed fix

  1. Widen the outer guard to #elif defined(__aarch64__)
  2. Add __attribute__((target("sha2"))) (or the "+sha2" spelling, depending on the compiler) to sha_256_arm_v8() so the compiler enables SHA instructions for that function only, regardless of the global -march
  3. Keep the existing getauxval runtime detection at lines 683-684, which already gates whether sha_256_arm_v8 is actually called at runtime

This would enable hardware SHA-256 acceleration on Linux ARM servers (AWS Graviton, Ampere Altra, etc.).

Found during security audit of cryptographic precompile code.
