Local SLM (small language model) testing. No wrappers; direct hardware utilization via compiled binaries.
Hardware: Intel i7-1165G7 (Tiger Lake, AVX-512 capable), 12 GB RAM. See hardware-profile.json.
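To confirm the CPU actually exposes AVX-512 before building (Tiger Lake supports it, but some firmware disables it), a quick check like this works on Linux. The flag names below are whatever the kernel reports, not values taken from hardware-profile.json:

```sh
# List AVX-512 feature flags as reported by the kernel (Linux).
# Tiger Lake typically exposes avx512f, avx512bw, avx512vl, among others.
grep -o 'avx512[a-z0-9_]*' /proc/cpuinfo | sort -u
```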
llama.cpp compiled from source with AVX-512 flags enabled. See inference-config.json.
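A minimal sketch of the kind of build invocation implied here, assuming a current llama.cpp checkout with its CMake build; the flags actually used for these tests are recorded in inference-config.json and may differ:

```sh
# Hypothetical build sketch; see inference-config.json for the exact flags used.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# Enable AVX-512 explicitly instead of relying on -march=native autodetection.
cmake -B build -DGGML_AVX512=ON -DGGML_NATIVE=OFF
cmake --build build --config Release -j
```

Turning off GGML_NATIVE pins the instruction set explicitly, which keeps builds reproducible across kernels and compiler versions on the same machine.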
Tests are in no particular order; see the per-model .md files for test results.