xianzuwu/Niagara

🐃Niagara
Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View TCSVT 2025

Xianzu Wu · Zhenxin Ai · Harry Yang · Sernam Lim · Jun Liu · Huan Wang

🦉 ToDo List

  • 📢18.03.2025: release code and paper.
  • Release the complete checkpoint.

🎉 Key Result

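| Method | PSNR (5f) | SSIM (5f) | LPIPS (5f) | PSNR (10f) | SSIM (10f) | LPIPS (10f) | PSNR (u[-30,30]f) | SSIM (u[-30,30]f) | LPIPS (u[-30,30]f) |
|---|---|---|---|---|---|---|---|---|---|
| Syn-Sin | - | - | - | - | - | - | 22.30 | 0.740 | - |
| SV-MPI | 27.10 | 0.870 | - | 24.40 | 0.812 | - | 23.52 | 0.785 | - |
| BTS | - | - | - | - | - | - | 24.00 | 0.755 | 0.194 |
| Splatter Image | 28.15 | 0.894 | 0.110 | 25.34 | 0.842 | 0.144 | 24.15 | 0.810 | 0.177 |
| MINE | 28.45 | 0.897 | 0.111 | 25.89 | 0.850 | 0.150 | 24.75 | 0.820 | 0.179 |
| Flash3D | *28.46* | *0.899* | *0.100* | *25.94* | *0.857* | *0.133* | *24.93* | *0.833* | *0.160* |
| Ours | **29.00** | **0.904** | **0.099** | **26.30** | **0.862** | **0.131** | **25.28** | **0.836** | **0.156** |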
Novel view synthesis comparison on the RealEstate10K dataset. Following Flash3D, we evaluate our method on the in-domain novel view synthesis task. Our model outperforms existing methods at every frame setting (f denotes frames: 5 frames, 10 frames, and u[-30,30] frames) in terms of PSNR, SSIM, and LPIPS. The best results are in bold and the second best are in italics. More results can be viewed through the two buttons below.


Visual Result Project Page
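
For readers unfamiliar with the metrics in the table: higher PSNR and SSIM are better, and lower LPIPS is better. As a reference point, the standard PSNR definition can be sketched in a few lines of Python. This is a minimal illustration of the textbook formula, not the repository's evaluation code; the names `psnr` and `max_value` are assumptions made for this example.

```python
import math

def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy example: every pixel is off by 0.1 on a [0, 1] scale, so MSE is 0.01.
print(round(psnr([0.5, 0.6, 0.7], [0.4, 0.5, 0.6]), 2))  # prints 20.0
```

Because PSNR is logarithmic in the error, the gap between 28.46 dB and 29.00 dB corresponds to a lower mean squared error, not a linear difference in pixel values.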

🚀Setup

🛠️Create a python environment

Niagara has been trained and tested with the following software versions:

  • Python 3.10
  • PyTorch 2.2.2
  • CUDA 11.8
  • GCC 11.2 (or more recent)

Begin by installing CUDA 11.8 and adding the directory containing the nvcc compiler to the PATH environment variable. The Python environment can then be created either via conda:

conda create -y python=3.10 -n niagara
conda activate niagara

or using Python's venv module (assuming you already have access to Python 3.10 on your system):

python3.10 -m venv .venv
. .venv/bin/activate

Finally, install the required packages as follows:

pip install -r requirements-torch.txt --extra-index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
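
After installation, a quick sanity check can catch a mismatched interpreter or a CPU-only PyTorch build before training starts. The script below is a minimal sketch and is not part of the repository; the function name `check_environment` is hypothetical, and it only checks the Python version, the presence of `nvcc` on PATH, and whether PyTorch can see CUDA.

```python
import importlib.util
import shutil
import sys

def check_environment(expected_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means the basics look fine."""
    problems = []
    if tuple(sys.version_info[:2]) != tuple(expected_python):
        problems.append(
            f"Python {sys.version_info.major}.{sys.version_info.minor} found, "
            f"expected {expected_python[0]}.{expected_python[1]}"
        )
    if shutil.which("nvcc") is None:
        problems.append("nvcc not found on PATH")
    if importlib.util.find_spec("torch") is None:
        problems.append("torch is not installed")
    else:
        import torch  # imported lazily so the check still runs without PyTorch
        if torch.version.cuda is None:
            problems.append("PyTorch build has no CUDA support")
        elif not torch.cuda.is_available():
            problems.append("CUDA runtime or GPU not visible to PyTorch")
    return problems

if __name__ == "__main__":
    for line in check_environment() or ["environment looks consistent"]:
        print(line)
```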

🛠️Add 3DGS python environment

# Register the pose-aware Gaussian rasterizer as a submodule (this also clones it)
git submodule add -b main https://github.com/eldar/diff-gaussian-rasterization-w-pose third_party/diff-gaussian-rasterization-w-pose
git submodule update --init --recursive

📝 Download training data

🧩RealEstate10K dataset

Our instructions for downloading the RealEstate10K dataset are based on the Behind The Scenes scripts. First, download the video sequence metadata, including camera poses, from https://google.github.io/realestate10k/download.html and unpack it into data/ so that the folder layout is as follows:

data/RealEstate10K/train
data/RealEstate10K/test
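
Before starting the multi-day download, it can be worth confirming that the metadata was unpacked into the layout shown above. The helper below is a minimal sketch, not a script shipped with the repository; `missing_splits` is a hypothetical name, and it checks only that the two split directories exist.

```python
from pathlib import Path

def missing_splits(root="data/RealEstate10K", splits=("train", "test")):
    """Return the split directories that are absent under the dataset root."""
    root = Path(root)
    return [str(root / s) for s in splits if not (root / s).is_dir()]

if __name__ == "__main__":
    missing = missing_splits()
    print("layout OK" if not missing else f"missing: {missing}")
```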

Then download the training and test sets of the dataset with the following commands:

python datasets/download_realestate10k.py -d data/RealEstate10K -o data/RealEstate10K -m train
python datasets/download_realestate10k.py -d data/RealEstate10K -o data/RealEstate10K -m test

This step will take several days to complete. Next, download additional data for the RealEstate10K dataset: we provide a pre-processed COLMAP cache containing sparse point clouds, which are used to estimate the scaling factor for depth predictions. The last two commands below filter out any missing video sequences from the training and test sets.

sh datasets/dowload_realestate10k_colmap.sh
python -m datasets.preprocess_realestate10k -d data/RealEstate10K -s train
python -m datasets.preprocess_realestate10k -d data/RealEstate10K -s test

🧩KITTI dataset

For downloading the KITTI dataset, we base our instructions on the versatran01 scripts.

cd kitti_raw
wget -nc -i kitti_archives.txt

This step will take some time to complete. Then extract the downloaded KITTI archives:

unzip "*drive*.zip" "*/*/image*"
unzip "*drive*.zip" "*/*/oxts*"
unzip "*calib*.zip"
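
On systems without the `unzip` tool, the same selective extraction can be approximated with Python's standard library. This is a minimal sketch rather than a repository script; `extract_matching` is a hypothetical helper. It relies on `fnmatch` patterns in which `*` also matches `/`, which mirrors how the member patterns in the commands above are used.

```python
import fnmatch
import zipfile
from pathlib import Path

def extract_matching(archive, patterns=None, dest="."):
    """Extract members whose names match any glob pattern (all members if patterns is None)."""
    extracted = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if patterns is None or any(fnmatch.fnmatchcase(name, p) for p in patterns):
                zf.extract(name, dest)
                extracted.append(name)
    return extracted

if __name__ == "__main__":
    # Image and OXTS folders from the drive archives, everything from the calibration archives.
    for archive in sorted(Path(".").glob("*drive*.zip")):
        extract_matching(archive, ["*/*/image*", "*/*/oxts*"])
    for archive in sorted(Path(".").glob("*calib*.zip")):
        extract_matching(archive)
```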

🧩Download and evaluate the pretrained model

We provide model weights that can be downloaded and evaluated on the RealEstate10K test set:

python -m misc.download_pretrained_models -o exp/re10k_v2
sh evaluate.sh exp/re10k_v2

Hugging Face login (users in mainland China should first run export HF_ENDPOINT=https://hf-mirror.com)

huggingface-cli login 
# Then input your huggingface token for authentication

🏃‍♂️ Run the Code

🧩Training

To train the model on the RealEstate10K dataset, run this command:

python train.py \
  +experiment=layered_re10k \
  model.depth.version=v1 \
  train.logging=false 
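
When launching several runs with different settings, it can be convenient to assemble this command from Python instead of editing it by hand. The sketch below is illustrative only: `build_train_command` is a hypothetical helper, and treating the arguments as `key=value` overrides (they look like Hydra-style overrides) is an inference from the syntax above, not something this README states.

```python
import shlex
import subprocess
import sys

def build_train_command(overrides, script="train.py"):
    """Return the argv list for one training run with key=value overrides."""
    for item in overrides:
        if "=" not in item:
            raise ValueError(f"override must look like key=value: {item!r}")
    return [sys.executable, script, *overrides]

cmd = build_train_command([
    "+experiment=layered_re10k",
    "model.depth.version=v1",
    "train.logging=false",
])
print(shlex.join(cmd))  # show the equivalent shell command
# subprocess.run(cmd, check=True)  # uncomment to launch training from the repo root
```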

For multi-GPU training, run this command:

bash train.sh

🧩Inference

bash evaluate.sh

📑BibTeX

@article{wu2025niagara,
  author={Wu, Xianzu and Ai, Zhenxin and Yang, Harry and Lim, Sernam and Liu, Jun and Wang, Huan},
  journal={IEEE Transactions on Circuits and Systems for Video Technology}, 
  title={Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View}, 
  year={2025},
  doi={10.1109/TCSVT.2025.3643728}
}

📖Acknowledgement

A large portion of the code in this repo is based on Flash3D; some of the code is borrowed from:
