Unified Memory allows the GPU to spill into system RAM when its VRAM is insufficient. It is essential for running large structures such as 6QNR on consumer GPUs.
Enable unified memory for:
- Large structures exceeding GPU memory
  - 6QNR requires ~15GB; the RTX 4080 has 16GB in total
- Runs that fail with out-of-memory errors
- Consumer GPUs with limited VRAM
  - RTX 4080 (16GB) + 6QNR ✅
  - RTX 3080 (10GB) + large proteins ✅

It is not needed on:
- High-end GPUs with sufficient VRAM
  - H100 (80GB) ❌
  - A100 (40GB) ❌
  - RTX 4090 (24GB) ❌
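The guidance above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of AFSysBench: the function names and the 2 GB `headroom_gb` default are assumptions chosen only so that the examples listed above (RTX 4080 + 6QNR needs unified memory, 24GB+ cards do not) come out as described. Total VRAM is read with `nvidia-smi`, which reports it in MiB.

```python
import subprocess


def parse_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one integer (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for the total memory of each GPU, in MiB."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_vram_mib(out)


def needs_unified_memory(required_gb: float, vram_gb: float, headroom_gb: float = 2.0) -> bool:
    """Hypothetical rule of thumb: enable unified memory when the structure's
    footprint plus some headroom (an assumed value) exceeds the card's VRAM."""
    return required_gb + headroom_gb > vram_gb


# Figures taken from the lists above: 6QNR needs ~15GB.
print(needs_unified_memory(15, 16))  # RTX 4080 -> True
print(needs_unified_memory(15, 80))  # H100     -> False
```

On a machine with an NVIDIA driver installed, `query_vram_mib()[0] / 1024` gives the first GPU's VRAM in GiB, which can be passed as `vram_gb`.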
Edit your config file:
nano my_system.config
# Change this line:
UNIFIED_MEMORY=true

When enabled, AFSysBench automatically sets:
XLA_PYTHON_CLIENT_PREALLOCATE=false # Don't pre-allocate GPU memory
TF_FORCE_UNIFIED_MEMORY=true # Enable TensorFlow unified memory
XLA_CLIENT_MEM_FRACTION=3.2 # Request 320% of GPU memory

| Configuration | 6QNR Runtime | Notes |
|---|---|---|
| H100 (80GB) | ~5 minutes | No unified memory needed |
| RTX 4080 + Unified | ~8-10 minutes | 60-100% slower but works |
| RTX 4080 without | Fails | Out of memory |
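To make the numbers above concrete, here is a minimal sketch of what the three environment variables amount to and how the slowdown figure in the table is derived. How AFSysBench applies the variables internally is not shown in this document, so `unified_memory_env`, `addressable_gb`, and `slowdown_percent` are illustrative helpers, not the tool's API.

```python
import os

# The three variables listed above, exactly as documented.
UNIFIED_MEMORY_ENV = {
    "XLA_PYTHON_CLIENT_PREALLOCATE": "false",
    "TF_FORCE_UNIFIED_MEMORY": "true",
    "XLA_CLIENT_MEM_FRACTION": "3.2",
}


def unified_memory_env(base_env=None):
    """Return a copy of the environment with the unified-memory variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(UNIFIED_MEMORY_ENV)
    return env


def addressable_gb(vram_gb, mem_fraction):
    """Memory the process may request: VRAM times the configured fraction."""
    return vram_gb * mem_fraction


def slowdown_percent(baseline_min, observed_min):
    """Relative slowdown of a run against a baseline, in percent."""
    return (observed_min / baseline_min - 1) * 100


# RTX 4080: 16GB * 3.2; anything beyond the physical 16GB comes from system RAM.
print(round(addressable_gb(16, 3.2), 1))   # 51.2
# H100 baseline ~5 min vs. 8-10 min with unified memory.
print(round(slowdown_percent(5, 8)))       # 60
print(round(slowdown_percent(5, 10)))      # 100
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching a job by hand; with `UNIFIED_MEMORY=true` in the config file this should not be necessary.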
# 1. Enable unified memory
nano my_system.config
# Set: UNIFIED_MEMORY=true
# 2. Run benchmark
python af_bench_runner_updated.py -c my_system.config inference -i 6QNR_subset_data.json -t 1
# 3. Expected output in log:
# Running model inference with seed 1 took 486.67 seconds
# Fold job 6QNR_subset done, output written to /output/6qnr_subset

- ✅ Log shows: "Running model inference with seed 1 took XXX seconds"
- ✅ Output contains: *.cif, *_confidences.json files
- ✅ GPU memory stays at ~97% (not crashing)
- ✅ System RAM usage increases during run
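The first two success checks above can be automated. This is a minimal sketch that assumes the log contains the line format quoted above and that the output files sit somewhere under the output directory; the function names are hypothetical.

```python
import re
from pathlib import Path

# Matches the log line quoted above, capturing the seed and the duration.
INFERENCE_RE = re.compile(
    r"Running model inference with seed (\d+) took ([\d.]+) seconds"
)


def inference_times(log_text):
    """Map each seed to its reported inference time in seconds."""
    return {int(seed): float(secs) for seed, secs in INFERENCE_RE.findall(log_text)}


def has_expected_outputs(output_dir):
    """True if at least one *.cif and one *_confidences.json file exist."""
    root = Path(output_dir)
    has_cif = any(root.rglob("*.cif"))
    has_conf = any(root.rglob("*_confidences.json"))
    return has_cif and has_conf


sample = "Running model inference with seed 1 took 486.67 seconds"
print(inference_times(sample))  # {1: 486.67}
```

An empty dictionary from `inference_times` means the run never reached the end of inference, which usually points back to an out-of-memory failure.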
- Verify: `grep UNIFIED_MEMORY my_system.config` shows `true`
- Check: Sufficient system RAM available (`free -h`)
- Ensure: No other GPU processes running
- Close other GPU applications
- Ensure fast system RAM (DDR4-3200+ recommended)
- Consider upgrading to 24GB+ GPU for production use
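The "Verify" and "Check" steps above can also be scripted. The sketch below assumes the config file uses plain `KEY=value` lines, as the `UNIFIED_MEMORY=true` example suggests, and reads available RAM from the text of `/proc/meminfo` on Linux. Both helpers are hypothetical and not part of AFSysBench.

```python
def unified_memory_enabled(config_text):
    """True if the config text contains UNIFIED_MEMORY=true (comments ignored)."""
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line.startswith("UNIFIED_MEMORY="):
            value = line.split("=", 1)[1].strip().strip("\"'")
            return value.lower() == "true"
    return False


def mem_available_gib(meminfo_text):
    """Return MemAvailable from /proc/meminfo text in GiB, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) / (1024 * 1024)  # value is in kB
    return None


print(unified_memory_enabled("UNIFIED_MEMORY=true  # enable spill to RAM"))  # True
print(mem_available_gib("MemAvailable:   33554432 kB"))                      # 32.0
```

In practice the inputs would come from `open("my_system.config").read()` and `open("/proc/meminfo").read()`.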