Slowness at DGX, server CPUs (Intel Xeon Platinum 8480C, 8570) #784

@Joeycho

Hi Blosc Team,

I have observed a significant slowdown (2-5 times slower than on a workstation) in compression/decompression speed on DGX systems (the CPUs are Intel Xeon Platinum 8480C and 8570). Reducing the number of threads helps, but according to the documentation below, the speed should be similar. That comparison may be limited, though, since the test sizes are small: 32 (blocks), 64 (chunks).
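To make the thread-count effect easier to reproduce on both machines, a small timing harness along these lines could be used. This is only a sketch: `sweep_threads` is a hypothetical helper, not part of Blosc, and the demo uses the standard library's `zlib.compress` with a no-op thread setter as a stand-in so that it runs without extra dependencies.

```python
import time
import zlib
from typing import Callable, Dict, Iterable


def sweep_threads(
    compress: Callable[[bytes], bytes],
    set_nthreads: Callable[[int], object],
    data: bytes,
    thread_counts: Iterable[int],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return best-of-`repeats` compression throughput (MB/s) per thread count."""
    results: Dict[int, float] = {}
    for n in thread_counts:
        set_nthreads(n)  # configure the compressor before timing
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            compress(data)
            best = min(best, time.perf_counter() - start)
        # Guard against a zero interval on very small inputs.
        results[n] = len(data) / max(best, 1e-9) / 1e6
    return results


if __name__ == "__main__":
    # Stand-in demo: zlib is single-threaded, so the setter does nothing here.
    payload = bytes(range(256)) * 4096  # 1 MiB of mildly compressible data
    rates = sweep_threads(zlib.compress, lambda n: None, payload, [1, 2, 4])
    for n, rate in rates.items():
        print(f"{n} thread(s): {rate:.1f} MB/s")
```

With python-blosc2 installed, the stand-ins could presumably be swapped for `blosc2.set_nthreads` and a small wrapper around `blosc2.compress`; the numbers above say nothing about Blosc itself until that is done.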

In order to reduce the overhead of threads as much as possible, I've

Or is the current roadmap for 3.0 aimed exactly at optimizing for DGX servers? I found the points below very relevant.

* Optimization for multi-socket machines: right now, C-Blosc2 is optimized for single-socket machines. However, in multi-socket machines, memory access is not uniform (NUMA architecture), so optimizations are needed to make sure that every thread is accessing to local memory as much as possible. This would require to use e.g. `numactl <https://linux.die.net/man/8/numactl>`_ or `libnuma <https://man7.org/linux/man-pages/man3/numa.3.html>`_ so as to pin threads and memory allocations to the local socket.

* Support for GPUs: nowadays, GPUs are becoming more and more powerful, and having support for them in C-Blosc2 would be a great addition. The idea is to offload the compression, but most importantly, decompression tasks to the GPU, so that the CPU is free to do other tasks. This would require to use e.g. `CUDA <https://developer.nvidia.com/cuda-toolkit>`_ or `ROCm <https://rocm.docs.amd.com/>`_ so as to access to the GPU capabilities.
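As a rough illustration of the multi-socket point, the process can be restricted to the CPUs of a single NUMA node before any worker threads are created, which is similar in spirit to running under `numactl --cpunodebind`. The sketch below is Linux-only and uses just the standard library; `parse_cpulist` and `pin_to_numa_node` are hypothetical helper names, and the sysfs path is the usual Linux location for a node's CPU list. It pins CPUs only; memory placement then relies on the kernel's default local-allocation behaviour rather than on an explicit libnuma policy.

```python
import os
from typing import Set


def parse_cpulist(text: str) -> Set[int]:
    """Parse a Linux cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pin_to_numa_node(node: int = 0) -> Set[int]:
    """Restrict the current process to the CPUs of one NUMA node (Linux only)."""
    path = f"/sys/devices/system/node/node{node}/cpulist"
    with open(path) as f:
        node_cpus = parse_cpulist(f.read())
    # Keep only CPUs this process is already allowed to use (e.g. in containers).
    allowed = node_cpus & os.sched_getaffinity(0)
    if not allowed:
        raise RuntimeError(f"no usable CPUs on NUMA node {node}")
    os.sched_setaffinity(0, allowed)
    return allowed
```

Calling `pin_to_numa_node(0)` at the very start of the benchmark script, before the compression library spawns its threads, means those threads inherit the restricted affinity. The thread count would then need to be lowered to at most the number of CPUs on that node.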

For now, it would be very helpful if you could confirm whether blosc2 compression/decompression on DGX is expected to be slower than on the most modern workstations. I would also appreciate any advice on reducing latency with the current blosc2 on DGX systems.

Btw, it seems clear that the per-core clock speed of the Xeon Platinum 8480C and 8570 is lower than that of the workstation CPU (AMD Threadripper Pro 5975WX).

Image
