The hardware-accelerated SHA-256 implementation (sha_256_arm_v8 using ARM SHA2 crypto extensions) is currently restricted to Apple platforms by the preprocessor guard at sha256.cpp:22:
#elif defined(__aarch64__) && defined(__APPLE__)
The Linux runtime detection code (getauxval(AT_HWCAP) & HWCAP_SHA2 at line 683) already exists inside this block but is dead code since the outer guard requires __APPLE__.
Why Apple-only
All Apple Silicon chips guarantee SHA2 extensions, so compiling the intrinsics is always safe. On Linux aarch64, older cores (e.g., Cortex-A53) may lack SHA2 support, and compiling the intrinsics without -march=armv8-a+crypto could fail.
Proposed fix
- Widen the outer guard to #elif defined(__aarch64__)
- Add __attribute__((target("sha2"))) to sha_256_arm_v8() so the compiler emits SHA instructions only for that function, regardless of the global -march. The exact target string is compiler-dependent: Clang accepts "sha2", while GCC expects the "+sha2" form.
- The existing getauxval runtime detection at lines 683-684 already gates whether sha_256_arm_v8 is actually called at runtime, so no new detection code is needed.
This would enable hardware SHA-256 acceleration on Linux ARM servers (AWS Graviton, Ampere Altra, etc.).
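The three steps above could look roughly like the sketch below. It is a minimal illustration, not the actual sha256.cpp code. The names SHA2_TARGET, Sha256Impl, cpu_has_sha2, select_impl and the two *_stub functions are hypothetical, and the stubs stand in for the real implementations without using any SHA intrinsics. The per-compiler attribute spellings are assumptions to verify against the toolchains in use.

```cpp
#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP
#ifndef HWCAP_SHA2
#define HWCAP_SHA2 (1UL << 6)  // arm64 hwcap bit for SHA2; fallback if the libc header omits it
#endif
#endif

// Per-function target attribute, so SHA instructions are enabled only where
// requested. Spellings are assumed: "sha2" for Clang, "+sha2" for GCC.
#if defined(__aarch64__) && defined(__clang__)
#define SHA2_TARGET __attribute__((target("sha2")))
#elif defined(__aarch64__) && defined(__GNUC__)
#define SHA2_TARGET __attribute__((target("+sha2")))
#else
#define SHA2_TARGET
#endif

enum class Sha256Impl { Generic, ArmV8 };

// Runtime check: true only when the CPU is known to support SHA2.
inline bool cpu_has_sha2() {
#if defined(__aarch64__) && defined(__APPLE__)
    return true;  // all Apple Silicon chips provide SHA2
#elif defined(__aarch64__) && defined(__linux__)
    return (getauxval(AT_HWCAP) & HWCAP_SHA2) != 0;
#else
    return false;  // other platforms fall back to the generic path
#endif
}

// Hypothetical stand-in for sha_256_arm_v8(); the real body would use the
// SHA2 intrinsics, which is why it carries the target attribute.
SHA2_TARGET inline Sha256Impl sha_256_arm_v8_stub() { return Sha256Impl::ArmV8; }

// Hypothetical stand-in for the portable implementation.
inline Sha256Impl sha_256_generic_stub() { return Sha256Impl::Generic; }

// Dispatch: the accelerated path is reached only when the check passed.
inline Sha256Impl select_impl(bool has_sha2) {
    return has_sha2 ? sha_256_arm_v8_stub() : sha_256_generic_stub();
}
```

A caller would use select_impl(cpu_has_sha2()). Because the runtime check gates every call, the accelerated function can be compiled into a binary built with a baseline -march and still never execute on a core that lacks the extension.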
Found during security audit of cryptographic precompile code.