We should be able to use the code, CMake files, etc. from llama.cpp as is; see https://github.com/docker/model-runner/pull/471