Fix FP16 NEON build on AArch64 CPUs without FP16FML support by Harish-endee · Pull Request #168 · endee-io/endee

Harish-endee · 2026-04-07T06:04:26Z

FP16 NEON distance functions (L2Sqr, InnerProductSim) rely on vfmlalq_low_f16 and vfmlalq_high_f16 intrinsics. These intrinsics require the AArch64 FP16FML extension. Not all AArch64 CPUs support FP16FML, leading to compilation failures on unsupported targets.
Added compatibility fallback implementations:

- The fallback uses baseline AArch64 NEON intrinsics: vcvt_f32_f16 and vfmaq_f32.
- Compile-time macro dispatch selects the appropriate implementation.
- CPUs with FP16FML continue to use the optimized single-instruction path with no overhead.
- The fallback code is compiled only when the FP16FML extension is unavailable.

Some AArch64 CPUs don't support the FP16FML extension, which causes builds to fail due to missing vfmlalq_low_f16 and vfmlalq_high_f16 intrinsics. This adds compatibility fallbacks that use universally available NEON instructions (vcvt_f32_f16 + vfmaq_f32) instead, with automatic compile-time dispatch so CPUs with FP16FML still use the native single-instruction path.
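
The dispatch described above might look roughly like the sketch below. It is not the code from this PR: the helper names (`fmlal_low`, `fmlal_high`, `inner_product_f16`, `half_bits_to_float`) are hypothetical, inputs are assumed to be raw IEEE half-precision bit patterns stored as `uint16_t`, and the guard uses the ACLE feature macro `__ARM_FEATURE_FP16_FML`, which may differ from the macro the PR actually introduces. A scalar path is included only so the sketch also builds on non-ARM hosts.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Scalar half -> float conversion on raw bits (used for tails and non-ARM builds).
static inline float half_bits_to_float(std::uint16_t h) {
    std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    std::uint32_t exp = (h >> 10) & 0x1Fu;
    std::uint32_t man = h & 0x3FFu;
    std::uint32_t bits;
    if (exp == 0) {
        if (man == 0) {
            bits = sign;  // signed zero
        } else {
            int e = -1;   // subnormal half: renormalize the mantissa
            do { man <<= 1; ++e; } while ((man & 0x400u) == 0);
            bits = sign | (static_cast<std::uint32_t>(112 - e) << 23) | ((man & 0x3FFu) << 13);
        }
    } else if (exp == 31) {
        bits = sign | 0x7F800000u | (man << 13);  // infinity or NaN
    } else {
        bits = sign | ((exp + 112u) << 23) | (man << 13);  // normal number
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

#if defined(__aarch64__) && defined(__ARM_NEON)
// acc += a[0..3] * b[0..3], with the FP16 inputs widened to FP32.
static inline float32x4_t fmlal_low(float32x4_t acc, float16x8_t a, float16x8_t b) {
#if defined(__ARM_FEATURE_FP16_FML)
    return vfmlalq_low_f16(acc, a, b);  // native single-instruction path
#else
    return vfmaq_f32(acc, vcvt_f32_f16(vget_low_f16(a)), vcvt_f32_f16(vget_low_f16(b)));
#endif
}

// acc += a[4..7] * b[4..7], with the FP16 inputs widened to FP32.
static inline float32x4_t fmlal_high(float32x4_t acc, float16x8_t a, float16x8_t b) {
#if defined(__ARM_FEATURE_FP16_FML)
    return vfmlalq_high_f16(acc, a, b);
#else
    return vfmaq_f32(acc, vcvt_f32_f16(vget_high_f16(a)), vcvt_f32_f16(vget_high_f16(b)));
#endif
}
#endif

// Inner product of two FP16 vectors given as raw bit patterns.
float inner_product_f16(const std::uint16_t* a, const std::uint16_t* b, std::size_t n) {
    std::size_t i = 0;
    float sum = 0.0f;
#if defined(__aarch64__) && defined(__ARM_NEON)
    float32x4_t acc = vdupq_n_f32(0.0f);
    for (; i + 8 <= n; i += 8) {
        float16x8_t va = vreinterpretq_f16_u16(vld1q_u16(a + i));
        float16x8_t vb = vreinterpretq_f16_u16(vld1q_u16(b + i));
        acc = fmlal_low(acc, va, vb);
        acc = fmlal_high(acc, va, vb);
    }
    sum = vaddvq_f32(acc);  // horizontal add of the four FP32 lanes
#endif
    for (; i < n; ++i) {    // scalar tail, or the whole vector on non-ARM hosts
        sum += half_bits_to_float(a[i]) * half_bits_to_float(b[i]);
    }
    return sum;
}
```

Since converting FP16 to FP32 is exact and `vfmaq_f32` is a fused multiply-add, the fallback should in the common case accumulate the same values as the widening instruction, at the cost of extra conversion instructions per iteration.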

github-actions · 2026-04-07T06:05:11Z

VectorDB Benchmark - Ready To Run

CI passed ([lint + unit tests](https://github.com/endee-io/endee/actions/runs/24066965981)); benchmark options unlocked.

Post one of the commands below. Only members with write access can trigger runs.

Available Modes

| Mode | Command | What runs |
| --- | --- | --- |
| Dense | `/correctness_benchmarking dense` | HNSW insert throughput · query P50/P95/P99 · recall@10 · concurrent QPS |
| Hybrid | `/correctness_benchmarking hybrid` | Dense + sparse BM25 fusion · same suite + fusion latency overhead |

Infrastructure

| Server | Role | Instance |
| --- | --- | --- |
| Endee Server | Endee VectorDB (code from this branch) | `t2.large` |
| Benchmark Server | Benchmark runner | `t3a.large` |

Both servers start on demand and are always terminated after the run — pass or fail.

How Correctness Benchmarking Works

1. Post /correctness_benchmarking <mode>
2. Endee Server is created  →  this branch's code is deployed  →  Endee starts in the chosen mode
3. Benchmark Server is created  →  benchmark suite is transferred
4. Benchmark Server runs correctness benchmarking against Endee Server
5. Results posted back here  →  pass/fail + full metrics table
6. Both servers terminated   →  always, even on failure

After a new push, CI must pass again before this menu reappears.

Harish-endee assigned hemant-endee and shaleenji Apr 7, 2026

Harish-endee requested review from hemant-endee and shaleenji April 7, 2026 06:26

Harish-endee unassigned hemant-endee and shaleenji Apr 7, 2026

Harish-endee wants to merge 1 commit into master from harish/fix-fp16-compat

3 participants
