This project is built on Node.js, whose event loop executes JavaScript on a single thread. Because of that single thread, we have to carefully balance I/O operations against CPU-bound calculations. Currently, the project has a severe CPU-bound problem: when heavy tasks like Vector Search run, they block the main thread entirely and the system freezes.
To measure exactly how long these heavy tasks block the main thread, I used the Vector Search operation as a benchmark:
1. Load 100,000 dummy memory vectors into VectorIndex.
2. Run a setInterval heartbeat ping every 50ms to monitor Event Loop health.
3. Trigger a VectorIndex.search().
The heartbeat stops completely, and the server freezes for roughly 200ms to 2000ms depending on hardware. All other async work is queued and delayed until the math finishes.
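The benchmark above can be sketched as follows. `VectorIndex` is the project's own class, so this sketch stands in a plain brute-force dot-product scan for `VectorIndex.search()`; the vector count, dimension, and `search` function here are illustrative assumptions, not the project's real API.

```javascript
const DIM = 128;
const COUNT = 100_000;

// 1. Load dummy vectors (one flat Float32Array, COUNT blocks of DIM floats).
const vectors = new Float32Array(COUNT * DIM).map(() => Math.random());

// 2. Heartbeat: report how far each 50ms tick drifts from schedule.
let last = Date.now();
const heartbeat = setInterval(() => {
  const now = Date.now();
  const drift = now - last - 50;
  if (drift > 10) console.log(`event loop blocked ~${drift}ms`);
  last = now;
}, 50);

// 3. Synchronous "search": brute-force dot products against a query.
//    While this loop runs, nothing else on the main thread can execute.
function search(query) {
  let best = -Infinity, bestIdx = -1;
  for (let i = 0; i < COUNT; i++) {
    let dot = 0;
    const base = i * DIM;
    for (let d = 0; d < DIM; d++) dot += vectors[base + d] * query[d];
    if (dot > best) { best = dot; bestIdx = i; }
  }
  return bestIdx;
}

setTimeout(() => {
  const t0 = Date.now();
  search(new Float32Array(DIM).fill(0.5));
  console.log(`search blocked the thread for ${Date.now() - t0}ms`);
  clearInterval(heartbeat);
}, 200);
```

While `search` runs, the heartbeat callback cannot fire; the drift it reports afterwards is the blocking time the paragraph above describes.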
Proposed Solution: Worker Threads & Memory Management
To prevent Event Loop starvation, we need to offload these heavy CPU-bound calculations to worker_threads (or a worker pool). This will ensure the Main Thread remains completely free to handle concurrent I/O operations and incoming requests without freezing.
However, we must carefully address the Serialization Overhead (Memory Problem). If we pass an array of 100,000 vectors to a worker via a standard postMessage, Node.js structured-clones the entire payload. This roughly doubles our RAM consumption and adds significant IPC (Inter-Process Communication) delay on every call.
To make this architecture work without spiking memory, we should look into using SharedArrayBuffer. This will allow the Main Thread and Worker Thread to share the exact same memory block with zero-copy overhead, keeping the system fast and memory-efficient.