Observed init behavior on a large repo, on a 24-core / 48-thread (HT) box with a 3090 GPU:
- The GPU is busy only intermittently; while it is busy, CPU usage is minimal (single-digit percent)
- When the GPU is not busy, CPU usage spikes to 3-5 (HT) cores
Educated guess:
- The GPU is used primarily for computing embeddings up front [this is surprising given what I remember about the original PLAID]
- Index construction is CPU-bound and poorly parallelized
- Possibly there is also a pipelining issue where the GPU sits idle while waiting for the CPU (but it is also possible that the pipeline is simply not deep and the CPU is the bottleneck)
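To make the two hypotheses in the last bullet concrete, here is a toy model of a two-stage pipeline (a "GPU" stage feeding a "CPU" stage through a bounded queue). This is not the tool's actual code; the function name, the per-batch timings, and the queue depths are all made up for illustration. It only shows that once the CPU stage is the slower one, a deeper queue does not shorten the run.

```python
def simulate(n_batches, gpu_s, cpu_s, queue_depth):
    """Toy two-stage pipeline: GPU produces batches, CPU consumes them.

    queue_depth is how many finished batches may wait between the stages.
    With queue_depth=0 the GPU cannot start batch i until the CPU has
    finished batch i-1, i.e. the stages run strictly one after another.
    All numbers are hypothetical; nothing here is measured.
    Returns (total_time, gpu_busy_fraction).
    """
    gpu_end = []
    cpu_end = []
    for i in range(n_batches):
        # GPU is free once it finished the previous batch.
        gpu_start = gpu_end[i - 1] if i > 0 else 0
        # Back-pressure: wait until the CPU has drained enough batches.
        blocker = i - 1 - queue_depth
        if blocker >= 0:
            gpu_start = max(gpu_start, cpu_end[blocker])
        gpu_end.append(gpu_start + gpu_s)
        # CPU needs this batch from the GPU and must be done with the last one.
        cpu_free = cpu_end[i - 1] if i > 0 else 0
        cpu_end.append(max(gpu_end[i], cpu_free) + cpu_s)
    total = cpu_end[-1]
    return total, n_batches * gpu_s / total


if __name__ == "__main__":
    # Assumed timings: 1 time unit of GPU work, 4 of CPU work per batch.
    for depth in (0, 1, 8):
        total, gpu_busy = simulate(10, 1, 4, depth)
        print(f"depth={depth}: total={total}, gpu busy={gpu_busy:.0%}")
```

In this model, going from no overlap (depth 0) to a queue of one batch helps somewhat, but any deeper queue gives the same total time, and the GPU stays mostly idle either way. If the real init looks like this, the fix would be speeding up or parallelizing the CPU stage rather than deepening the pipeline.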
I do see thread counts briefly burst from ~40 to ~100, but these bursts do not correlate strongly with higher %CPU as reported by top
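A quick back-of-the-envelope check on those numbers, assuming top is in its default per-process mode where 100% means one fully busy logical CPU. The helper names are made up, and the inputs are just the rough figures from the notes above, not precise measurements.

```python
def busy_cores(top_pct_cpu):
    """Convert top's per-process %CPU to an equivalent number of busy logical CPUs.

    Assumes top's default mode, where 100% corresponds to one logical CPU.
    """
    return top_pct_cpu / 100.0


def avg_running_fraction(top_pct_cpu, n_threads):
    """Upper bound on the average fraction of threads that are on-CPU at once."""
    return min(1.0, busy_cores(top_pct_cpu) / n_threads)


if __name__ == "__main__":
    logical_cpus = 48   # 24 cores with hyperthreading, from the notes
    peak_pct = 500      # "3-5 cores" taken at the high end, roughly 500% in top
    for threads in (40, 100):
        frac = avg_running_fraction(peak_pct, threads)
        print(f"{threads} threads: at most {frac:.1%} running on average")
    print(f"box utilization: {busy_cores(peak_pct) / logical_cpus:.1%}")
```

Even at the high end of the observed CPU usage, only a small fraction of the ~100 threads can be running at any moment, so most of them must be blocked or idle. That is consistent with the thread burst not showing up as more %CPU.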