Not sure it worked as expected:
Before converting I got:
loretoparisi@MacBook-Air-di-Loreto ANE-LM % ./build/ane-lm generate --model Qwen3.5-0.8B --prompt "Hello"
==========
Hello! How can I help you today?
==========
Prompt: 13 tokens, 21.170 tokens-per-sec
Generation: 9 tokens, 21.906 tokens-per-sec
Then I converted BF16 -> FP16:
loretoparisi@MacBook-Air-di-Loreto ANE-LM % ./build/ane-lm convert --model Qwen3.5-0.8B
Wrote 452 ANE blobs to Qwen3.5-0.8B/ane_weights
Done in 1495.4 ms
After converting I got (two runs):
loretoparisi@MacBook-Air-di-Loreto ANE-LM % ./build/ane-lm generate --model Qwen3.5-0.8B --prompt "Hello"
==========
Hello! How can I help you today? 😊
==========
Prompt: 13 tokens, 21.708 tokens-per-sec
Generation: 11 tokens, 22.341 tokens-per-sec
loretoparisi@MacBook-Air-di-Loreto ANE-LM % ./build/ane-lm generate --model Qwen3.5-0.8B --prompt "Hello"
==========
Hello! How can I help you today?
==========
Prompt: 13 tokens, 21.606 tokens-per-sec
Generation: 9 tokens, 22.150 tokens-per-sec
For a more complex and longer sequence of about 1,000 tokens I got:
Prompt: 37 tokens, 21.824 tokens-per-sec
Generation: 925 tokens, 20.293 tokens-per-sec
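To check whether the conversion changed anything measurable, the stats lines from the runs above can be compared directly. This is only a rough sketch: `parse_stats` and `percent_change` are hypothetical helper names, not part of ane-lm, and the regex assumes the exact `Prompt:` / `Generation:` line format shown in the logs above.

```python
import re

# Matches the stats lines printed in the logs above,
# e.g. "Generation: 9 tokens, 21.906 tokens-per-sec".
STAT_RE = re.compile(r"^(Prompt|Generation): (\d+) tokens, ([\d.]+) tokens-per-sec$")


def parse_stats(log: str) -> dict:
    """Return {"Prompt": (tokens, tok_per_sec), "Generation": (...)} for one run."""
    stats = {}
    for line in log.splitlines():
        m = STAT_RE.match(line.strip())
        if m:
            stats[m.group(1)] = (int(m.group(2)), float(m.group(3)))
    return stats


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Numbers copied from the runs above (before conversion vs. second run after).
before = parse_stats(
    "Prompt: 13 tokens, 21.170 tokens-per-sec\n"
    "Generation: 9 tokens, 21.906 tokens-per-sec"
)
after = parse_stats(
    "Prompt: 13 tokens, 21.606 tokens-per-sec\n"
    "Generation: 9 tokens, 22.150 tokens-per-sec"
)

for phase in ("Prompt", "Generation"):
    b, a = before[phase][1], after[phase][1]
    print(f"{phase}: {b:.3f} -> {a:.3f} tok/s ({percent_change(b, a):+.2f}%)")
```

On these numbers the difference is roughly 1 to 2%, which is about the same as the spread between the two post-conversion runs (22.341 vs 22.150 tokens-per-sec), so a handful of short runs like this probably can't tell whether the converted weights are being used.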