Ruichuan An*, Sihan Yang*, Ziyu Guo, Wei Dai, Zijun Shen, Haodong Li
Renrui Zhang†, Xinyu Wei, Guopeng Li, Wenshan Wu, Wentao Zhang‡
PKU, CUHK, StepFun, PolyU, MSRA.
* Equal Contribution † Project Leader ‡ Corresponding Author
- 2026.02.11: 🌟 Release of the evaluation code and the core test dataset.
- TBD: Integration of more model inference scripts.
The dataset is available across multiple platforms for your convenience:
- Hugging Face: GENIUS
- Google Drive: Download Link
- Baidu Netdisk: Download Link (Password: iek1)
Clone the repository and prepare your local environment:
git clone https://github.com/arctanxarc/GENIUS.git
cd GENIUS
After downloading the dataset, ensure your directory structure matches the following:
./
├── cal_score.py # Scoring script
├── dataset/ # Test dataset
│ ├── implicit_pattern
│ ├── multi_semantic
│ ├── prior_conflicting
│ ├── symbolic_constraint
│ └── visual_constraint
├── eval_prompt.py # Prompt management
├── eval.py # Main evaluation logic
├── eval.sh # Entry script
├── GENIUS.pdf # Paper
└── README.md
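A quick sanity check of the layout above can save a failed evaluation run. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` is hypothetical, and it only checks the script and dataset folder names shown in the listing.

```python
from pathlib import Path

# Names taken from the directory listing above.
REQUIRED_FILES = ["cal_score.py", "eval_prompt.py", "eval.py", "eval.sh"]
DIMENSIONS = [
    "implicit_pattern",
    "multi_semantic",
    "prior_conflicting",
    "symbolic_constraint",
    "visual_constraint",
]


def missing_paths(root="."):
    """Return the expected files and dataset folders that are absent under root."""
    # Hypothetical helper for illustration; the repository does not ship it.
    root = Path(root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [
        f"dataset/{dim}" for dim in DIMENSIONS if not (root / "dataset" / dim).is_dir()
    ]
    return missing
```

Calling `missing_paths()` from the repository root returns an empty list when every expected script and dataset folder is in place, and otherwise lists the relative paths that are missing.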
Place the images generated by your models into the outputs directory. Organize them using the following hierarchy: outputs/<model_name>/<task_name>/{id}.png.
Important
Each {id} must exactly match the id field of the corresponding entry in test_data.json. IDs are unique identifiers and do not necessarily form a continuous sequence starting from 0.
Example Structure:
./outputs/
└── nanobanana/                  # Example: Model Name
    ├── implicit_pattern/
    │   ├── 002.png              # Matches ID=002 in ./dataset/implicit_pattern/test_data.json
    │   ├── 003.png
    │   └── ...
    ├── multi_semantic/
    └── ...
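Because image names must match dataset ids exactly, it can help to check for missing files before running the evaluation. The sketch below is illustrative only: the helper name `missing_images` is hypothetical, and it assumes that test_data.json is a JSON list of objects that each carry an "id" key, which the README does not spell out.

```python
import json
from pathlib import Path


def missing_images(model_name, task_name, root="."):
    """List ids from test_data.json that have no matching PNG in outputs/."""
    root = Path(root)
    data_file = root / "dataset" / task_name / "test_data.json"
    # ASSUMPTION: the file holds a JSON list of objects, each with an "id" key.
    with open(data_file, encoding="utf-8") as f:
        entries = json.load(f)
    out_dir = root / "outputs" / model_name / task_name
    # Ids are used exactly as written (for example "002"); nothing is re-padded.
    return [
        str(entry["id"])
        for entry in entries
        if not (out_dir / f"{entry['id']}.png").is_file()
    ]
```

For example, `missing_images("nanobanana", "implicit_pattern")` would return the ids that still need an image under outputs/nanobanana/implicit_pattern/. If the real test_data.json uses a different structure, the loading step needs to be adapted.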
Configure your credentials and target models in eval.sh:
- Set your API_URL and API_KEY for LMM-as-a-judge.
- Define the evaluation scope:
DIMENSIONS=("implicit_pattern" "symbolic_constraint" "visual_constraint" "prior_conflicting" "multi_semantic")
MODELS=("your_model_name")
- Execute the evaluation script:
bash eval.sh
The dataset and code are released under CC-BY-NC 4.0 and are intended for academic research only. Commercial use is not permitted.
@misc{an2026geniusgenerativefluidintelligence,
title={GENIUS: Generative Fluid Intelligence Evaluation Suite},
author={Ruichuan An and Sihan Yang and Ziyu Guo and Wei Dai and Zijun Shen and Haodong Li and Renrui Zhang and Xinyu Wei and Guopeng Li and Wenshan Wu and Wentao Zhang},
year={2026},
eprint={2602.11144},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2602.11144},
}
- Issues: https://github.com/arctanxarc/GENIUS/issues
- Email: arctanxarc@gmail.com
