Optimize comparison (less-than) with AVX2 #302

@chfast

https://godbolt.org/z/xEcGzqKo9

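#include <bit>          // std::bit_cast (C++20)
#include <immintrin.h>  // AVX2 intrinsics

// Index of the most significant set bit; m must be non-zero.
unsigned bsr(unsigned m)
{
    return 31 - __builtin_clz(m);
}

// Returns x < y. Assumes u256 holds four 64-bit words with w[0] the least significant.
auto lt_avx(const u256& x, const u256& y)
{
    auto xv = std::bit_cast<__m256i>(x);
    auto yv = std::bit_cast<__m256i>(y);
    auto e = _mm256_cmpeq_epi64(xv, yv);  // all-ones in every 64-bit lane where x and y are equal
    auto ed = std::bit_cast<__m256d>(e);
    unsigned m = _mm256_movemask_pd(ed);  // one bit per lane: 1 = equal
    auto f = m ^ 0xf;  // flip mask (4 bits): 1 = lanes differ
    auto g = f | 1;  // fixup eq: keeps g non-zero, so x == y falls back to word 0
    auto i = bsr(g);  // index of the most significant differing word
    return x.w[i] < y.w[i];
}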
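
For checking the idea without AVX2 hardware, the same mask trick can be modelled in plain scalar code and compared against a straightforward word-by-word comparison. This is only a sketch: `u256_sketch`, `highest_bit`, `lt_mask_scalar` and `lt_reference` are hypothetical names introduced here, and the layout (four 64-bit words, `w[0]` least significant) is an assumption inferred from how `lt_avx` indexes `w`.

```cpp
#include <cstdint>

// Hypothetical stand-in for u256: four 64-bit words, w[0] least significant (assumed layout).
struct u256_sketch
{
    std::uint64_t w[4];
};

// Index of the most significant set bit; m must be non-zero.
// Portable replacement for 31 - __builtin_clz(m).
inline unsigned highest_bit(unsigned m)
{
    unsigned index = 0;
    while (m >>= 1)
        ++index;
    return index;
}

// Scalar model of the mask trick: build a 4-bit "words differ" mask,
// force bit 0 so the mask is never zero, then compare only the most
// significant differing word.
inline bool lt_mask_scalar(const u256_sketch& x, const u256_sketch& y)
{
    unsigned ne = 0;
    for (unsigned k = 0; k < 4; ++k)
        ne |= static_cast<unsigned>(x.w[k] != y.w[k]) << k;

    const unsigned i = highest_bit(ne | 1);
    return x.w[i] < y.w[i];
}

// Straightforward reference: scan from the most significant word down.
inline bool lt_reference(const u256_sketch& x, const u256_sketch& y)
{
    for (int k = 3; k >= 0; --k)
    {
        if (x.w[k] != y.w[k])
            return x.w[k] < y.w[k];
    }
    return false;  // all words equal
}
```

When all four words are equal the mask is zero, the forced bit 0 selects word 0, and `x.w[0] < y.w[0]` is false, which is the expected result for `x == y`.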
