llm-inference fastapi-llama-cpp

Links about FastAPI:
- fastapi
- openai-compatible-fastapi
- FastAPI Qwen-VL example
- FastAPI Qwen2.5-7B-Instruct example

Links about llama-cpp:
- https://github.com/ggml-org/llama.cpp
- ggml-org/llama.cpp#559
- https://llama-cpp-python.readthedocs.io/en/latest/
- https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md