hrluo93/currentNe_GPU

currentNe_GPU

GPU-accelerated fork of currentNe (https://github.com/esrud/currentNe) that adds PED/MAP and VCF input support and provides end-to-end Nₑ estimation with confidence intervals. The GPU path computes weighted LD (d²) in FP64 using atomicAdd(double*), while Nₑ and CIs follow the original integration and neural-network variance model.

Requires: NVIDIA GPU ≥ Pascal (SM ≥ 6.0), NVIDIA driver + CUDA Toolkit (12+), gcc/g++ & make, and ~1 GB free GPU memory (more for large datasets).

An OpenCL version is available under Releases as currentNe-ocl.zip and can be used on NVIDIA, AMD, and Intel GPUs. The CUDA and OpenCL implementations produce results that are fully consistent with the original CPU version.

An Apple Metal FP32 version is also available, provided only as a test of Metal GPU computing. Because Metal does not support FP64, the d² and Ne values estimated in FP32 may differ from those of the CPU version. If you need it, please contact the currentNe_GPU author: hrluo93@foxmail.com.

Benchmark

[Benchmark figure]

Cooling note: Running on passively cooled (fanless) Tesla GPUs is not recommended without server-grade, front-to-back airflow. The FP64 path saturates the floating-point units for extended periods, creating a thermal load comparable to a stress test. Inadequate airflow will cause throttling or faults.

Original currentNe authors: Enrique Santiago, Carlos Köpke

Citations:

Santiago, E., Caballero, A., Köpke, C., & Novo, I. (2024). Estimation of the contemporary effective population size from SNP data while accounting for mating structure. Molecular Ecology Resources, 24, e13890. https://doi.org/10.1111/1755-0998.13890

Santiago, E., Köpke, C., & Caballero, A. (2025). Accounting for population structure and data quality in demographic inference with linkage disequilibrium methods. Nature Communications, 16, 6054. https://doi.org/10.1038/s41467-025-61378-w

CUDA build (recommended)

unzip currentNe_gpu_full.zip
cd currentNe_gpu_full
make ARCH=sm_89        # set ARCH to your GPU's SM architecture (sm_70, sm_80, sm_86, sm_89, ...); also set `ARCH ?=sm_89` in the Makefile accordingly

This creates ./currentNe_gpu.
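
To choose the ARCH value, the compute capability reported by the driver can be converted into an sm_XX string. The following is a minimal sketch, assuming a driver recent enough that nvidia-smi supports the compute_cap query field. The helper name cap_to_arch is hypothetical and not part of this repository.

```shell
#!/bin/sh
# cap_to_arch: hypothetical helper (not part of currentNe_GPU).
# Turns a compute capability such as "8.9" into "sm_89".
# Fails for malformed input and for capabilities below 6.0,
# which lack the double-precision atomicAdd used by the GPU path.
cap_to_arch() {
    case "$1" in
        [0-9]*.[0-9]*) ;;
        *) echo "unrecognised compute capability: '$1'" >&2; return 1 ;;
    esac
    major=${1%%.*}
    minor=${1#*.}
    if [ "$major" -lt 6 ]; then
        echo "compute capability $1 is below 6.0" >&2
        return 1
    fi
    echo "sm_${major}${minor}"
}

# Query the first GPU only when nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
    arch=$(cap_to_arch "$cap") && echo "make ARCH=$arch"
fi
```

The script only prints the suggested make command, so it can be inspected before building.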

CPU fallback

make cpu

This creates ./currentNe_gpu_cpu (OpenMP).

OpenCL build (tested on an NVIDIA GPU)

unzip currentNe-ocl.zip
cd currentNe-ocl
make -f Makefile.opencl

This creates ./currentNe_ocl.

Run

General form:

ulimit -s unlimited    # Maxloci defaults to 20 million; it can be increased in the cpp file
./currentNe_gpu <datafile> <num_chromosomes> [options]
  • <datafile>: one of
    • prefix.vcf
    • prefix.ped (requires prefix.map in the same folder)
    • prefix.tped (with individuals as columns following the first 4 fields)
  • <num_chromosomes>: required (e.g., 22 for human-like autosomes, or the true count for your organism's autosomes).
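
As an illustration of the TPED layout described above, here is a small sketch that writes a toy file and checks its shape. It assumes the standard PLINK transposed format (chromosome, SNP id, genetic distance, base-pair position, then two allele columns per individual). The file name toy.tped and the helper check_tped are hypothetical, and a file this small is far too small for a meaningful Ne estimate.

```shell
#!/bin/sh
# Write a toy TPED: 3 SNPs (rows) x 2 individuals (two allele columns each).
# Fields 1-4: chromosome, SNP id, genetic distance, base-pair position.
cat > toy.tped <<'EOF'
1 snp1 0 1000 A A A G
1 snp2 0 2000 C T T T
2 snp3 0 1500 G G G A
EOF

# check_tped: hypothetical sanity check (not part of currentNe_GPU).
# Succeeds when the file is non-empty, every row has the same number of
# fields, and the genotype part (fields 5 onward) has an even, non-zero width.
check_tped() {
    awk 'NR == 1 { n = NF }
         NF != n || NF <= 4 || (NF - 4) % 2 != 0 { bad = 1 }
         END { exit (bad || NR == 0) }' "$1"
}

if check_tped toy.tped; then
    echo "toy.tped: layout looks consistent"
fi
```

A check like this catches truncated rows before a long run is started.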

Common options:

  • -s <N> Number of SNPs to use (default: all segregating)
  • -t <T> CPU threads (for non-GPU parts; default: OpenMP auto)
  • -o <file> Output filename (default: <prefix>_currentNe_OUTPUT.txt)
  • -k <int> Important: see the description of this option in the original currentNe documentation
  • -q Quiet: only print Ne (and with -v also 50% & 90% CI)
  • -v With -q, also print CIs
  • -p Print full analysis to stdout instead of file

Examples:

# TPED
./currentNe_gpu mydata.tped 19 -t 8

# PED/MAP
./currentNe_gpu mypop.ped 19 -t 8

# VCF
./currentNe_gpu cohort.vcf 19 -t 8
./currentNe_gpu cohort.vcf 19 -t 8 -k 1 

In practice, -t 8 is enough for the CPU-side (non-GPU) parts.

Output

  • Full report file (unless -p): <prefix>_currentNe_OUTPUT.txt
    Includes: input stats, d², expected/observed het, Ne point estimate, 50%/90% CI.
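
When several datasets are processed in one go, it helps to know where each report will land. The sketch below assumes that <prefix> is the data file path with its last extension removed; that is an assumption, not confirmed behaviour of the program. The helper default_output is hypothetical, and the chromosome count 19 is copied from the examples above.

```shell
#!/bin/sh
# default_output: hypothetical helper (not part of currentNe_GPU).
# Mirrors the documented default report name <prefix>_currentNe_OUTPUT.txt,
# assuming <prefix> is the data file path minus its last extension.
default_output() {
    echo "${1%.*}_currentNe_OUTPUT.txt"
}

# Run every VCF in the current directory and print the expected report name.
# Nothing happens unless the binary has been built here.
if [ -x ./currentNe_gpu ]; then
    for f in ./*.vcf; do
        [ -e "$f" ] || continue          # no VCF files matched the pattern
        ./currentNe_gpu "$f" 19 -t 8     # 19 must match your organism
        echo "report: $(default_output "$f")"
    done
fi
```

If the printed name does not match the file actually written, pass -o explicitly.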

Notes

  • Double atomicAdd requires GPU architecture sm_60+; set ARCH accordingly.
  • For very large SNP counts, genotype storage takes L × N bytes (one char per genotype). Consider limiting the SNP count with -s or thinning SNPs beforehand.
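
The memory note can be turned into a quick estimate before a run. This is a rough sketch, assuming L is the number of SNPs and N the number of individuals; it covers only the genotype matrix, not any other buffers the program allocates. The helpers geno_bytes and to_gib are hypothetical.

```shell
#!/bin/sh
# geno_bytes: hypothetical helper, L x N bytes at one char per genotype.
# Assumes L = number of SNPs and N = number of individuals.
geno_bytes() {
    echo $(( $1 * $2 ))
}

# to_gib: convert a byte count to GiB with two decimals.
to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

L=20000000   # the default Maxloci limit
N=100        # example sample size
bytes=$(geno_bytes "$L" "$N")
echo "$bytes bytes = $(to_gib "$bytes") GiB"
```

Comparing this figure against free GPU memory (for example from nvidia-smi) gives a first indication of whether -s or thinning is needed.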
