ELITE: Efficient Gaussian Head Avatar from a Monocular Video via Learned Initialization and TEst-time Generative Adaptation
Kim Youwang · Lee Hyoseok · Park Subin · Gerard Pons-Moll · Tae-Hyun Oh
Official implementation of the CVPR 2026 paper, "ELITE: Efficient Gaussian Head Avatar from a Monocular Video via Learned Initialization and TEst-time Generative Adaptation".
- ELITE is an efficient system for synthesizing high-fidelity, animatable Gaussian avatars from a short monocular video.
- ELITE leverages a mutually reinforcing synergy of 2D and 3D face priors (generative and data priors) to tackle a longstanding challenge in monocular synthesis of animatable, photorealistic 3D face avatars: balancing in-the-wild generalization with efficient synthesis.
Tested on Ubuntu 24.04, NVIDIA RTX A6000 (48GB), CUDA-11.8, gcc-11, g++-11. Later versions should work but have not been tested.
git clone git@github.com:kaist-ami/ELITE.git
cd ELITE
git submodule update --init --recursive
export CUDA_HOME=/usr/local/cuda-11.8
export CC=/usr/bin/gcc-11
export CXX=/usr/bin/g++-11
conda create -n ELITE python=3.10
conda activate ELITE
pip install -r requirements.txt
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py310_cu118_pyt201/download.html
CC=gcc-11 CXX=g++-11 pip install --no-build-isolation git+https://github.com/hbb1/diff-surfel-rasterization.git
pip install -e ./vhap
Download ELITE's pre-trained 3D prior model and 2D generative prior model checkpoints from this link. Put the downloaded checkpoints under checkpoints/{2d_prior,3d_prior}.pth.
Download the FLAME-related assets from FLAME. Before you continue, you must register at https://flame.is.tue.mpg.de/ and agree to the license terms.
- First, download FLAME 2023 (revised eye region, improved expressions, versions w/ and w/o jaw rotation) and put it under asset/flame/flame2023.pkl.
- Then, download FLAME Vertex Masks and put it under asset/flame/FLAME_masks.pkl.
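Because the checkpoints and FLAME assets are downloaded by hand, it can help to confirm that every file is in place before running the pipeline. The following is a minimal sketch; `check_elite_assets` is a hypothetical helper that is not part of the repository, and it only checks the four paths named above.

```shell
# Hypothetical helper (not part of ELITE): report which required files are missing.
# The paths are the ones named in the setup instructions above.
# Usage: check_elite_assets [REPO_ROOT]   (REPO_ROOT defaults to the current directory)
check_elite_assets() {
    root="${1:-.}"
    missing=0
    for f in \
        checkpoints/2d_prior.pth \
        checkpoints/3d_prior.pth \
        asset/flame/flame2023.pkl \
        asset/flame/FLAME_masks.pkl
    do
        if [ ! -f "$root/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Exit status 0 only when every file was found.
    return "$missing"
}

# Example, run from the ELITE repository root:
#   check_elite_assets && echo "all assets in place"
```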
We mainly use monocular videos from the INSTA dataset (Zielonka, CVPR'23) to create personalized avatars. Check their instructions to download the videos.
Open configs/paths.sh and edit the variables at the top to match your environment — in particular CONDA_INIT_SCRIPT (path to your conda initializer) and CUDA_HOME. Then activate the config:
source configs/paths.sh
ELITE is an end-to-end pipeline to create and render a personalized avatar from a monocular video.
Place the input video (e.g., bala.mp4 from INSTA) at data/source/input_videos/{ID}.mp4. A sample input video can be downloaded from this link.
Note: We currently support two resolutions (height x width): 512x512 and 802x550.
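To catch an unsupported input early, the resolution can be checked before preprocessing. This is a sketch only: `is_supported_resolution` is a hypothetical helper, and the commented example assumes `ffprobe` (from FFmpeg) is installed, which the instructions above do not state.

```shell
# Hypothetical helper (not part of ELITE): succeed only for the two supported
# resolutions. Arguments are HEIGHT then WIDTH, matching the note's order.
is_supported_resolution() {
    case "${1}x${2}" in
        512x512|802x550) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (assumes ffprobe is available). ffprobe prints WIDTHxHEIGHT here,
# so the two halves are swapped when passed to the helper:
#   dims=$(ffprobe -v error -select_streams v:0 \
#          -show_entries stream=width,height -of csv=s=x:p=0 \
#          data/source/input_videos/bala.mp4)
#   is_supported_resolution "${dims#*x}" "${dims%x*}" \
#       || echo "unsupported resolution (width x height): $dims"
```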
Runs video preprocessing, face 3DMM tracking, and NeRF/3DGS-format export.
# bash scripts/process_source_video.sh {ID}
bash scripts/process_source_video.sh bala
# to use a specific GPU index
bash scripts/process_source_video.sh bala 4
The processed artifacts are saved at:
# pre-processed video
data/source/input_videos/{ID}
# 3DMM tracked dataset
data/source/tracked/{ID}_whiteBg_staticOffset
# post-processed data in NeRF/3DGS-format
data/source/processed/{ID}_whiteBg_staticOffset_maskBelowLine
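The artifact locations listed above follow a fixed naming pattern, so they can be derived from the ID and checked after preprocessing finishes. A minimal sketch, assuming the commands are run from the repository root; `elite_artifact_paths` is a hypothetical helper and not part of the repository.

```shell
# Hypothetical helper (not part of ELITE): print the three artifact paths
# that the preprocessing step is documented to produce for a given ID.
elite_artifact_paths() {
    id="$1"
    echo "data/source/input_videos/${id}"
    echo "data/source/tracked/${id}_whiteBg_staticOffset"
    echo "data/source/processed/${id}_whiteBg_staticOffset_maskBelowLine"
}

# Example: report which of the expected paths exist for the ID "bala".
#   for p in $(elite_artifact_paths bala); do
#       if [ -e "$p" ]; then echo "ok: $p"; else echo "missing: $p"; fi
#   done
```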
Personalize the generalized 3D prior model to the target identity via ELITE's test-time generative adaptation.
# bash scripts/personalize.sh {ID}
bash scripts/personalize.sh bala
# to use a specific GPU index
bash scripts/personalize.sh bala 4
Renders the personalized avatar driven by a motion sequence. Please download the preset driver motion sequences from this link and put them under data/drive.
# bash scripts/render_videos.sh {ID} {MOTION_NAME}
bash scripts/render_videos.sh bala nersemble_416_EMO-1-shout+laugh
# to use a specific GPU index or FPS
bash scripts/render_videos.sh bala nersemble_416_EMO-1-shout+laugh 4 30
To use your own motion sequence as the driving signal, run the preprocessing script from Step 1 (scripts/process_source_video.sh) on it and put the resulting "processed" data directory under data/drive.
The rendered output videos are saved at:
outputs/{ID}/vis_motion/{ID}_rgb_{MOTION_NAME}.mp4 # RGB render
outputs/{ID}/vis_motion/{ID}_nrm_{MOTION_NAME}.mp4 # Normal map render
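The three steps above can be chained so that a later step runs only if the earlier one succeeded. The sketch below is an assumption-laden convenience wrapper, not part of the repository: `run_elite_pipeline` and the `ELITE_DRY_RUN` variable are hypothetical names, and the optional GPU index and FPS arguments are omitted.

```shell
# Hypothetical wrapper (not part of ELITE): run preprocessing, personalization,
# and rendering in order, stopping at the first failure.
# Set ELITE_DRY_RUN=1 to print the commands instead of executing them.
run_elite_pipeline() {
    id="$1"
    motion="$2"
    # Expands to "echo" in dry-run mode and to nothing otherwise.
    run="${ELITE_DRY_RUN:+echo}"
    $run bash scripts/process_source_video.sh "$id" &&
    $run bash scripts/personalize.sh "$id" &&
    $run bash scripts/render_videos.sh "$id" "$motion" &&
    # On success, print where the RGB render is documented to be saved.
    echo "outputs/${id}/vis_motion/${id}_rgb_${motion}.mp4"
}

# Example, run from the ELITE repository root after `source configs/paths.sh`:
#   run_elite_pipeline bala nersemble_416_EMO-1-shout+laugh
```

Running with `ELITE_DRY_RUN=1` first is a cheap way to confirm the ID and motion name are spelled as intended before committing GPU time.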
If you find our code or paper helpful, please consider citing:
@inproceedings{youwang2026elite,
title = {ELITE: Efficient Gaussian Head Avatar from a Monocular Video via Learned Initialization and TEst-time Generative Adaptation},
author = {Youwang, Kim and Hyoseok, Lee and Subin, Park and Pons-Moll, Gerard and Oh, Tae-Hyun},
booktitle = {CVPR},
year = {2026}
}
Kim Youwang (youwang.kim@postech.ac.kr)
