export Spopcount and use extrinsics when possible#1037
f4700a2 to 75785e5
|
The reason to have popcount implemented in a header file is so that uses can be inlined by the C compiler. How about taking the implementation in "popcount.h" and moving it to "mkheader.ss" so that it can be part of a generated "scheme.h"? Your improvements to use |
|
Wow, the clang, gcc, and Microsoft Visual Studio 2026 compilers all recognize the "slow" code in popcount.h that computes popcount and emit the instruction for it on x64, x86, and arm64! As a result, we don't need to bother with the builtins at all if we don't want to! |
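For readers who have not seen this kind of "slow" code, here is a minimal sketch of two portable popcount forms. It is an illustration only, not the contents of popcount.h: the function names `popcount_loop` and `popcount_swar32` are hypothetical, and whether a particular compiler version turns either form into a single popcount instruction is something to confirm by inspecting its generated assembly.

```c
#include <stdint.h>

/* Hypothetical reference version: clear the lowest set bit until none remain. */
static unsigned popcount_loop(uint32_t x) {
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;   /* drops the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical branch-free version: sum the bits in 2-bit, 4-bit,
   then 8-bit fields, and add the four byte counts with one multiply. */
static unsigned popcount_swar32(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);                  /* 2-bit field sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);  /* 4-bit field sums */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                  /* 8-bit field sums */
    return (unsigned)((uint32_t)(x * 0x01010101u) >> 24); /* total lands in the top byte */
}
```

Both functions return the same count for every 32-bit input; the second avoids a data-dependent loop, which matters only when the compiler does not replace the code with a hardware instruction.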
|
Ugh, if I define |
|
Calling Spopcount as a function does increase the overhead, but only by a few clock cycles. Since it makes scheme.h much simpler, I'm going to use that approach. I will eliminate the __popcnt intrinsics on Windows because they're not easy to use correctly, and the latest Microsoft compiler converts our C code into the popcnt instruction. |
|
I can use |
I think the way to suppress those warnings would be to use |
|
Thank you, that worked! What do you think of this in scheme.h for 64-bit machines? |
eliminate __popcnt on Windows
|
I'm suspicious of the cast to … I lose track of what … I can believe that the non-intrinsic implementation could work for 64 bits. :) If compilers recognize it and convert it to a popcount instruction, then I'm completely convinced, of course. |
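On the 64-bit question, a minimal sketch of a non-intrinsic implementation, checked against a naive bit-by-bit count. It assumes the fixed-width `uint64_t` from `<stdint.h>` rather than whatever types scheme.h actually uses, and the names `popcount_swar64` and `popcount_naive64` are hypothetical. Fixed-width types sidestep casts through types whose size varies by platform; for example, `unsigned long` is 32 bits on 64-bit Windows.

```c
#include <stdint.h>

/* Hypothetical 64-bit field-summing popcount; every constant is
   explicitly 64 bits wide so no step is truncated to 32 bits. */
static unsigned popcount_swar64(uint64_t x) {
    x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
    x = (x & UINT64_C(0x3333333333333333))
      + ((x >> 2) & UINT64_C(0x3333333333333333));
    x = (x + (x >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    /* the multiply adds the eight byte counts into the top byte */
    return (unsigned)((x * UINT64_C(0x0101010101010101)) >> 56);
}

/* Hypothetical naive reference: test one bit at a time. */
static unsigned popcount_naive64(uint64_t x) {
    unsigned n = 0;
    for (; x != 0; x >>= 1)
        n += (unsigned)(x & 1);
    return n;
}
```

Comparing the two over values with bits set in the upper half is a quick way to catch a cast that silently drops the high 32 bits.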
Addresses #1032