Hi there,
I'm reaching out to request optimized support for Apple Silicon (M1, M2, M3) chips.
While HPTT does have ARM/NEON support, it seems to be tuned primarily for generic ARMv8 targets. It would be very valuable to have kernels tuned for Apple’s custom cores (https://developer.apple.com/documentation/simd?changes=_4). These CPUs have specific microarchitectural traits (several 128-bit NEON execution units, fused multiply-add behavior, etc.) that could be leveraged.
I'd be happy to help test patches on an Apple M3 Pro (macOS 15.x, with GCC 14 and Clang) if needed. The scalar code path already works, but SIMD kernels would greatly improve performance on Apple hardware.
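
To make the request a bit more concrete, here is a rough sketch of the kind of compile-time dispatch I have in mind. It is not HPTT code: the names `SKETCH_HAVE_NEON`, `sketch_target`, and `sketch_transpose4x4` are hypothetical, and the 4x4 tile kernel is only an illustration of a NEON path with a scalar fallback, not a tuned implementation. The only real identifiers are the compiler macros `__APPLE__`, `__aarch64__`, `__ARM_NEON` and the intrinsics from `<arm_neon.h>`.

```cpp
// Sketch only: compile-time selection between a NEON path and a scalar path.
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#else
#define SKETCH_HAVE_NEON 0
#endif

// Hypothetical helper: reports which code path was compiled in.
inline const char* sketch_target() {
#if SKETCH_HAVE_NEON && defined(__APPLE__)
    return "apple-arm64-neon";    // where Apple-specific tuning could hook in
#elif SKETCH_HAVE_NEON
    return "generic-arm64-neon";
#else
    return "scalar";
#endif
}

// Hypothetical micro-kernel: transposes a row-major 4x4 float tile,
// i.e. out[c * 4 + r] = in[r * 4 + c].
inline void sketch_transpose4x4(const float* in, float* out) {
#if SKETCH_HAVE_NEON
    const float32x4_t r0 = vld1q_f32(in + 0);
    const float32x4_t r1 = vld1q_f32(in + 4);
    const float32x4_t r2 = vld1q_f32(in + 8);
    const float32x4_t r3 = vld1q_f32(in + 12);

    // Interleave 32-bit lanes of neighbouring rows.
    const float32x4_t t0 = vtrn1q_f32(r0, r1);  // r0[0] r1[0] r0[2] r1[2]
    const float32x4_t t1 = vtrn2q_f32(r0, r1);  // r0[1] r1[1] r0[3] r1[3]
    const float32x4_t t2 = vtrn1q_f32(r2, r3);  // r2[0] r3[0] r2[2] r3[2]
    const float32x4_t t3 = vtrn2q_f32(r2, r3);  // r2[1] r3[1] r2[3] r3[3]

    // Recombine 64-bit halves so each output row holds one input column.
    vst1q_f32(out + 0,  vcombine_f32(vget_low_f32(t0),  vget_low_f32(t2)));
    vst1q_f32(out + 4,  vcombine_f32(vget_low_f32(t1),  vget_low_f32(t3)));
    vst1q_f32(out + 8,  vcombine_f32(vget_high_f32(t0), vget_high_f32(t2)));
    vst1q_f32(out + 12, vcombine_f32(vget_high_f32(t1), vget_high_f32(t3)));
#else
    // Portable fallback with the same semantics.
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            out[c * 4 + r] = in[r * 4 + c];
        }
    }
#endif
}
```

An Apple-tuned variant would presumably differ in tile size, unrolling, and prefetching rather than in the intrinsics themselves, which is the part I can help benchmark.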
Thanks again for your work!