The embedding process is much slower than it could be. Breadcrumbs:
- GPU compute utilisation is only around 30% while embedding.
- Other RAG apps such as AnythingLLM max out the GPU when embedding.
- Embedding is done through the LangChain ChromaDB API, which might not have been implemented efficiently.
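
One possible cause of low GPU utilisation is that texts reach the embedding model in very small batches, so the GPU idles between calls. This is an assumption, not something confirmed for the LangChain ChromaDB path. A minimal sketch of explicit batching follows; `embed_in_batches` and `batched` are hypothetical helpers, and `embed_fn` stands in for whatever callable takes a list of strings and returns one vector per string (for example an embedding model's `embed_documents` method).

```python
from typing import Callable, Iterator, List, Sequence

Vector = List[float]


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], List[Vector]],  # hypothetical stand-in
    batch_size: int = 64,  # assumed starting point; tune against GPU memory
) -> List[Vector]:
    """Call `embed_fn` once per batch instead of once per text."""
    vectors: List[Vector] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_fn(list(batch)))
    return vectors


if __name__ == "__main__":
    # Dummy embedder so the sketch runs without a model or a GPU.
    def dummy_embed(batch: List[str]) -> List[Vector]:
        return [[float(len(text))] for text in batch]

    chunks = ["first chunk", "second chunk", "third chunk"]
    print(len(embed_in_batches(chunks, dummy_embed, batch_size=2)))
```

Timing a run like this with a real embedder at a few batch sizes, while watching GPU utilisation, would show whether batch size is the bottleneck or whether the time goes elsewhere (tokenisation, database writes).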