BLD: Add xsimd Dependency and SIMD detection#65471

Open
Alvaro-Kothe wants to merge 8 commits into
pandas-dev:mainfrom
Alvaro-Kothe:build/simd-detection

@Alvaro-Kothe
Member

@Alvaro-Kothe Alvaro-Kothe commented May 5, 2026


This PR adds boilerplate for SIMD detection, feature toggle and a CI job disabling SIMD.

This was part of #64905, with a few modifications:

  • Created a simd directory so that all SIMD logic is encapsulated there.
  • It always defines PANDAS_HAVE_SCALAR, so that a flaw in our SIMD dispatch logic won't cause a SIGILL.
  • Uses dependency instead of subproject for xsimd, but requires xsimd >=14.0 to avoid having to differentiate between xsimd::common and xsimd::generic, and requires the latest xsimd (>=14.2) for MSVC + ARM compatibility.

It doesn't add any SIMD code; the main visible change is the creation of pandas_simd_config.h:

$ fd --extension h . build/cp314 --no-ignore
build/cp314/pandas/_libs/simd/pandas/pandas_simd_config.h

$ cat build/cp314/pandas/_libs/simd/pandas/pandas_simd_config.h 
/*
 * Autogenerated by the Meson build system.
 * Do not edit, your changes will be lost.
 */

#pragma once

#define PANDAS_HAVE_AVX2 1

#define PANDAS_HAVE_AVX512CD 1

#define PANDAS_HAVE_SCALAR 1

#define PANDAS_HAVE_SSE2 1

cc @WillAyd

Member

I still think we should just remove this new CI job and setup argument; I understand what you are trying to do by covering platforms that we might not be testing, but on the flip side I'd just stick with our current support model of "we only support platforms that we have in CI."

It makes the communication model much easier; otherwise this CI job / setup argument tends to hang around forever and we spend time supporting very niche (and also vague) use cases.

Member Author

Removed the job, but I think the option is still beneficial for running the tests (at least locally) against the scalar version.

Comment thread pandas/_libs/simd/meson.build Outdated
'sse2': is_msvc_syntax ? ['/arch:SSE2'] : ['-msse2'],
'avx2': is_msvc_syntax ? ['/arch:AVX2'] : ['-mavx2'],
'avx512cd': is_msvc_syntax ? ['/arch:AVX512'] : ['-mavx512cd'],
'neon': [],
Member

If this doesn't have any flags, why include it here?

Comment thread pandas/_libs/simd/meson.build Outdated
supported_simd_archs += {name: flags}
endif
endif
elif host_machine.cpu_family() == 'aarch64'
Member

Related to the other comment, I think it makes more sense to move this out of the loop.

Comment thread pandas/_libs/simd/meson.build Outdated
endif

foreach arch_name, arch_flags : supported_simd_archs
simd_config.set('PANDAS_HAVE_@0@'.format(arch_name.to_upper()), 1)
Member

I think it's easier to just set this in the loop above rather than in its own dedicated one.

Comment thread pandas/_libs/meson.build Outdated
@Alvaro-Kothe Alvaro-Kothe force-pushed the build/simd-detection branch 2 times, most recently from eefe359 to 0fefde7 Compare May 7, 2026 19:04
Member

@WillAyd WillAyd left a comment

minor comment but otherwise lgtm. anyone else care to look? maybe @mroeschke @jorisvandenbossche @jbrockmendel

Comment thread meson.options Outdated
@@ -0,0 +1,5 @@
option(
Member

Can we get rid of this option? Or is someone asking for this?

Since it's opt-out, it seems really unlikely to be used all that much; if someone wants to champion it in the future, let's leave it to them to do so separately.

Member Author

Removed.

@jorisvandenbossche
Member

> This PR adds boilerplate for SIMD detection, feature toggle and a CI job disabling SIMD.

How does the feature toggle work exactly? Can you give some examples?
(I am also not seeing any CI changes in the diff?)

@Alvaro-Kothe
Member Author

> > This PR adds boilerplate for SIMD detection, feature toggle and a CI job disabling SIMD.
>
> How does the feature toggle work exactly? Can you give some examples? (I am also not seeing any CI changes in the diff?)

Oh, sorry, this is outdated. @WillAyd requested to remove it.

The CI job was removed in ed588ad.

The meson option to disable SIMD was removed in effdd43.

Previously, there was a Meson option that allowed disabling SIMD by passing -Dsimd=disabled to meson setup.

@jorisvandenbossche
Member

Ah, I see there is some relevant content in the inline review discussions with Will

> Can we get rid of this option? Or is someone asking for this?

I think we ourselves already need such an option? I assume this PR enables building with AVX extensions, but we do not want to use those for our wheel builds. So for that we would already need to be able to disable AVX.

Also, to avoid creating incompatible binaries, the default should maybe not entirely be opt-out. It could also default to some baseline (#64884 (comment))

> I still think we should just remove this new CI job and setup argument; I understand what you are trying to do by covering platforms that we might not be testing, but on the flip side I'd just stick with our current support model of "we only support platforms that we have in CI."

As long as we have scalar code included in pandas, I think we should ensure test coverage for that code. For me it is more about test coverage for the code (in case someone builds pandas from source with some compiler flag that disables optimizations) than about "platform support": we still support only the platforms in CI, but there are various options for how to build pandas on those platforms.

In any case, this also touches upon the more general question of what is needed for us to start adding SIMD code, so you are very welcome to chime in on #64884.

@jorisvandenbossche
Member

> I assume this PR enables building with AVX extensions, but we do not want to use those for our wheel builds. So for that we would already need to be able to disable AVX.

What I said above might not actually be true (the "detection" is based on existing compiler arguments, not on actually detecting what is supported on the machine?). So it might be good to clarify what exactly this PR is adding.

@Alvaro-Kothe
Member Author

> So it might be good to clarify what exactly this PR is adding.

This PR adds boilerplate in the build system for SIMD capability. It adds xsimd as a dependency and checks compiler capability to compile SIMD instructions for arm64 and x86.

This PR is also a base for #64905, which uses CPU feature detection from xsimd for runtime dispatch. There, I compile scalar, SSE2, AVX2 and AVX512 versions of the code.

@Alvaro-Kothe
Member Author

> I assume this PR enables building with AVX extensions, but we do not want to use those for our wheel builds. So for that we would already need to be able to disable AVX.

Should I remove the AVX2 and AVX512 targets so that we don't have to deal with runtime dispatch for now?

It also raises the question of whether the 32-bit x86 build should only have the scalar version, since it's not guaranteed that the CPU supports SSE2 instructions.

@jorisvandenbossche
Member

jorisvandenbossche commented May 19, 2026

> This PR adds boilerplate in the build system for SIMD capability. It adds xsimd as a dependency and checks compiler capability to compile SIMD instructions for arm64 and x86.

So it defines the PANDAS_HAVE_.. definitions purely based on compiler capability on the build machine? Thus, as I understand, this is only useful in case those are used for implementing dynamic runtime dispatch?

I just want us to be very explicit about what is being added here. For example, #64515 (before the last 2 commits that pushed a refactor with xsimd) was also defining (inline) e.g. PANDAS_HAS_SSE2, but there it was for compile-time dispatch. The variables defined here would not be usable for such a use case?

@jorisvandenbossche
Member

For example, in Arrow C++ they distinguish between ARROW_HAVE_.. and ARROW_HAVE_RUNTIME_.. variables, for what is available as-is (can be compiled unconditionally) vs what is available (to be compiled) behind a runtime check

@Alvaro-Kothe
Member Author

> So it defines the PANDAS_HAVE_.. definitions purely based on compiler capability on the build machine?

Yes. But probing for the macros __ARM_NEON, __ARM_NEON__, and __SSE2__ is also a way of checking compiler capability on the build machine.
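
As a rough illustration of the inline macro probing mentioned here, the following is a minimal sketch and not code from this PR. `pandas_baseline_simd_name` is a hypothetical helper. The checks for `_M_X64` and `_M_IX86_FP` are an added assumption to cover MSVC, which does not define `__SSE2__`; any other compiler or target simply falls through to the scalar branch in this sketch.

```c
/* Hypothetical helper (not part of this PR): report which SIMD baseline the
 * compiler was targeting when this translation unit was built. */
static const char *pandas_baseline_simd_name(void) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  return "neon";
#elif defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
  /* GCC/Clang define __SSE2__; MSVC signals SSE2 through _M_X64 (x64 builds)
   * or _M_IX86_FP >= 2 (32-bit builds using /arch:SSE2 or higher). */
  return "sse2";
#else
  return "scalar";
#endif
}
```

Note that the answer depends only on the compiler flags in effect, not on the CPU that later runs the binary.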

> Thus, as I understand, this is only useful in case those are used for implementing dynamic runtime dispatch?

No, it's also useful for compile-time dispatch. The PANDAS_HAVE_<ARCH> macros were intended to be used in #ifdef and #if blocks to select SIMD code paths at compile time. Their purpose is the same as in #64515 (before the xsimd refactor); only the detection method differs (Meson probing vs. inline compiler-macro checks).

xsimd also provides default_arch, an alias for the best SIMD target available in the current translation unit. But there are some problems with using it: for example, it fails to compile with Emscripten (Pyodide) unless the appropriate compiler flag is set; the error is visible in @jbrockmendel's PR (#64515). It would also have problems with an architecture that xsimd doesn't support.

> Because eg #65471 (before the last 2 commits that pushed a refactor with xsimd) was also defining (inline) eg PANDAS_HAS_SSE2, but there it was for compile time dispatch. The variables defined here would not be usable for such a use case?

I think you meant #64515. The variables defined here serve the same purpose as the inline definitions there; they can be used identically. So, instead of using #if defined(...) on compiler macros, you would use #include "pandas_simd_config.h" and check directly for PANDAS_HAVE_<ARCH>.
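
A minimal sketch of that usage pattern, under stated assumptions: `pandas_sum_u32` is a hypothetical kernel, and because the generated header is not available outside a pandas build, the two `#define`s below stand in for `#include "pandas_simd_config.h"` (imitating what the Meson probe would emit on an SSE2-capable compiler).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for `#include "pandas_simd_config.h"` so the sketch compiles on
 * its own. PANDAS_HAVE_SCALAR is always defined per the PR description. */
#ifndef PANDAS_HAVE_SCALAR
#define PANDAS_HAVE_SCALAR 1
#endif
#if defined(__SSE2__) && !defined(PANDAS_HAVE_SSE2)
#define PANDAS_HAVE_SSE2 1
#endif

#if defined(PANDAS_HAVE_SSE2)
#include <emmintrin.h>
#endif

/* Hypothetical kernel: wrapping sum of n unsigned 32-bit values. */
uint32_t pandas_sum_u32(const uint32_t *values, size_t n) {
  assert(values != NULL || n == 0);
  uint32_t total = 0;
  size_t i = 0;
#if defined(PANDAS_HAVE_SSE2)
  /* SSE2 path: add four 32-bit lanes per iteration. */
  __m128i acc = _mm_setzero_si128();
  for (; i + 4 <= n; i += 4) {
    acc = _mm_add_epi32(acc, _mm_loadu_si128((const __m128i *)(values + i)));
  }
  uint32_t lanes[4];
  _mm_storeu_si128((__m128i *)lanes, acc);
  total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
  /* Scalar path: handles everything when SSE2 is absent, and the tail
   * elements otherwise. */
  for (; i < n; i++) {
    total += values[i];
  }
  return total;
}
```

The selection happens entirely in the preprocessor, so the resulting binary contains exactly one code path per build.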


There is also an ongoing discussion with @WillAyd about removing these macros in #64905, but we haven't found a nice replacement yet.

@Alvaro-Kothe
Member Author

I will modify this PR and #64905 to only target the aarch64 and x86-64 baselines. Probably it's better to leave dispatch for later.

@Alvaro-Kothe Alvaro-Kothe force-pushed the build/simd-detection branch from 772767c to aab9418 Compare May 20, 2026 00:28
@jorisvandenbossche
Member

> Yes. But probing for the macros __ARM_NEON, __ARM_NEON__, and __SSE2__ is also a way of checking compiler capability on the build machine.

Sure, in both cases the definitions are evaluated at compile time, but the main point I tried to make is that they have a different intended usage pattern:

> > Thus, as I understand, this is only useful in case those are used for implementing dynamic runtime dispatch?
>
> No, it's also useful for compile-time dispatch. The PANDAS_HAVE_<ARCH> macros were intended to be used in #ifdef and #if blocks to select SIMD code paths at compile time. Their purpose is the same as in #64515 (before the xsimd refactor); only the detection method differs (Meson probing vs. inline compiler-macro checks).

But then I'll repeat from above (#65471 (comment)): the way this PR was defining those variables would not be useful for actual compile-time dispatch without a build flag to control it.
For example, we are building the wheels on a machine that is capable of compiling at the AVX level, so that would set PANDAS_HAVE_AVX2 to true, which we should not do for generic wheels (if used for static dispatch). At that point, we would need a build flag to control which static compile-time intrinsics to allow.

So a PANDAS_HAVE_.. flag can either be meant to unconditionally include SIMD intrinsics statically at compile time, or be meant to include the intrinsics at compile time but gated by a runtime check. Those are two different cases, and the flags should sometimes have different values for them.
That is the main differentiation I was trying to make, and I was trying to clarify which of the two this PR is targeting.

(and e.g. Arrow C++ has two sets of flags, one for each case)
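
To make the two cases concrete, here is a hedged sketch rather than anything this PR implements. `PANDAS_HAVE_RUNTIME_AVX2` is a hypothetical name that only mirrors the ARROW_HAVE_RUNTIME_.. pattern, the `#define`s stand in for build-system output, and `pandas_sum`, `sum_baseline` and `sum_avx2` are illustrative helpers. The runtime check relies on the GCC/Clang builtin `__builtin_cpu_supports`, so that branch is limited to those compilers on x86-64.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for build-system output (assumptions for this sketch). */
#if defined(__AVX2__)
#define PANDAS_HAVE_AVX2 1 /* whole build targets AVX2: usable unconditionally */
#elif defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define PANDAS_HAVE_RUNTIME_AVX2 1 /* can compile AVX2, must check the CPU */
#endif

/* Compiled with whatever baseline flags the translation unit uses. */
static uint64_t sum_baseline(const uint32_t *v, size_t n) {
  uint64_t total = 0;
  for (size_t i = 0; i < n; i++) total += v[i];
  return total;
}

#if defined(PANDAS_HAVE_RUNTIME_AVX2)
/* Only this function is compiled with AVX2 enabled; the loop is left to the
 * compiler's auto-vectorizer to keep the sketch short. */
__attribute__((target("avx2")))
static uint64_t sum_avx2(const uint32_t *v, size_t n) {
  uint64_t total = 0;
  for (size_t i = 0; i < n; i++) total += v[i];
  return total;
}
#endif

uint64_t pandas_sum(const uint32_t *v, size_t n) {
#if defined(PANDAS_HAVE_AVX2)
  /* Static case: the baseline kernel is already built with AVX2, no check. */
  return sum_baseline(v, n);
#elif defined(PANDAS_HAVE_RUNTIME_AVX2)
  /* Runtime-gated case: ask the CPU before entering the AVX2 kernel. */
  if (__builtin_cpu_supports("avx2")) {
    return sum_avx2(v, n);
  }
  return sum_baseline(v, n);
#else
  return sum_baseline(v, n);
#endif
}
```

Under this scheme a generic wheel would be built with the static flag unset and only the runtime flag set, while a source build with `-mavx2` would take the static branch.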


Now, in the meantime you updated the PR to only cover the baseline, which is exactly the situation where there is no difference between the two cases ;)
But once we go into runtime dispatch, we will again have to discuss the differentiation I am trying to make above.


> I think you meant #64515.

Whoops yes, updated!
