This is the official implementation of our paper “ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought”.
While Chain-of-Thought (CoT) significantly enhances the performance of Large Language Models (LLMs), explicit reasoning chains introduce substantial computational redundancy. Recent latent reasoning methods attempt to mitigate this by compressing reasoning processes into latent space, but they often suffer from severe performance degradation due to the lack of appropriate compression guidance. In this study, we propose Rendered CoT-Guided variational Latent Reasoning (ReGuLaR), a simple yet novel latent learning paradigm that resolves this issue. Fundamentally, we formulate latent reasoning within the Variational Auto-Encoding (VAE) framework, sampling the current latent reasoning state from a posterior distribution conditioned on the previous ones. Specifically, when learning this variational latent reasoning model, we render explicit reasoning chains as images, from which we extract dense visual-semantic representations to regularize the posterior distribution, thereby achieving efficient compression with minimal information loss. Extensive experiments demonstrate that ReGuLaR significantly outperforms existing latent reasoning methods in both computational efficiency and reasoning effectiveness, and even surpasses CoT through multi-modal reasoning, providing a new and insightful solution to latent reasoning.
Note: the LLM trained by ReGuLaR still follows the standard latent reasoning paradigm, accepting pure text inputs and imposing no extra computational cost during inference.
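To make the variational formulation above concrete, here is a minimal, self-contained sketch (not the paper's actual implementation) of one latent reasoning step: the latent state is sampled from a Gaussian posterior via the reparameterization trick, and a closed-form KL term pulls the posterior toward the rendered-CoT visual-semantic representation. All shapes, the unit-variance prior, and the random placeholder inputs are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(mu, logvar):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_to_visual_prior(mu, logvar, v):
    # Closed-form KL( N(mu, diag(exp(logvar))) || N(v, I) ): regularizes the
    # posterior toward the visual-semantic representation v extracted from
    # the rendered reasoning chain (simplified unit-variance prior assumed).
    return 0.5 * np.sum(np.exp(logvar) + (mu - v) ** 2 - 1.0 - logvar)

# Toy step: in the real model, mu and logvar would be predicted by the LLM
# conditioned on previous latent states; here they are random placeholders.
d = 8
mu, logvar = rng.standard_normal(d), np.zeros(d)
v = rng.standard_normal(d)       # pre-extracted visual-semantic vector
z = sample_latent(mu, logvar)    # next latent reasoning state
loss = kl_to_visual_prior(mu, logvar, v)
```

When the posterior mean matches the visual representation and the variance matches the prior, the KL term vanishes, which is the sense in which the rendered CoT guides the compression.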
- SOTA Performance: ReGuLaR significantly outperforms existing latent reasoning methods, achieving state-of-the-art performance with minimal reasoning length.
- Extreme Compression: Even when compressing all reasoning information into one latent reasoning state, ReGuLaR maintains superior performance across all model scales and datasets.
- Multi-Modal Reasoning: By rendering non-textual elements alongside text, ReGuLaR natively supports multi-modality within its latent reasoning processes, enabling it to surpass explicit CoT in complicated reasoning scenarios.
We have provided an env.yml file that contains the necessary environment dependencies. To set up your environment, please execute:

```
conda env create -f env.yml
conda activate ReGuLaR
```

Please download the required models from HuggingFace using the following script:
```
cd models
python model_download.py <YOUR_ACCESS_TOKEN>
```

ReGuLaR is designed to be compatible with any reasoning dataset, as long as each data sample within the dataset is formatted as the following JSON schema:
```json
{
  "image_idx": "Unique identifier for subsequent rendering",
  "question": "Problem statement",
  "steps": "Reasoning chain",
  "answer": "Final answer"
}
```

For reference, the GSM8K-Aug dataset has been provided in the ./datasets folder; please unzip it before use.
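As a quick sanity check before converting your own dataset, a sample can be validated against the schema above with a few lines of Python (the helper name and the toy sample below are illustrative, not part of the released code):

```python
import json

# The four fields required by the schema above.
REQUIRED_KEYS = {"image_idx", "question", "steps", "answer"}

def validate_sample(sample: dict) -> bool:
    # A sample is usable if it contains every required field.
    return REQUIRED_KEYS.issubset(sample)

# A toy sample in the expected format.
sample = json.loads("""{
  "image_idx": "example_0001",
  "question": "What is 2 + 2?",
  "steps": "2 + 2 = 4",
  "answer": "4"
}""")
```

Samples missing any of the four fields should be filtered out or repaired before rendering.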
Since the rendering function is predefined and the visual encoder remains frozen in our work, we pre-compute visual representations offline before training, thereby reducing computational overhead.
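The offline pre-computation described above amounts to a simple loop: render each reasoning chain to an image once, encode it with the frozen visual encoder, and cache the result keyed by `image_idx`. The sketch below illustrates this flow only; the function names, placeholder render/encode callables, and in-memory output are assumptions, not the repository's actual scripts.

```python
import io
import numpy as np

def precompute_representations(samples, render_fn, encode_fn, out_file):
    # Render each CoT once, encode it with the (frozen) visual encoder,
    # and cache the dense representation keyed by image_idx.
    cache = {}
    for s in samples:
        image = render_fn(s["steps"])            # rasterize the reasoning chain
        cache[s["image_idx"]] = encode_fn(image)
    np.savez(out_file, **cache)                  # persist for reuse at train time
    return cache

# Stand-in render/encode functions, for illustration only.
samples = [{"image_idx": "ex_0", "steps": "1 + 1 = 2"}]
buf = io.BytesIO()
cache = precompute_representations(
    samples,
    render_fn=lambda text: np.zeros((32, 32)),   # placeholder "image"
    encode_fn=lambda img: img.mean(axis=0),      # placeholder "encoder"
    out_file=buf,
)
```

Because the encoder is frozen, this cache never needs to be recomputed across training runs.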
```
cd data_precessing
python image_render.py GSM8K-Aug
python representation_extract.py GSM8K-Aug
```

To train the model, run:

```
bash train.sh GSM8K-Aug
```

To evaluate a trained checkpoint, run:

```
python run.py \
  --test_ckpt_path=/path/to/trained/model.ckpt \
  dataset_name=GSM8K-Aug
```

We extend our sincere gratitude to CoLaR, DeepSeek-OCR, and Glyph for their great work and codebases, which served as the foundation for developing ReGuLaR.
If you find ReGuLaR useful in your research, please consider citing it. 😊
```bibtex
@article{Wang2026ReGuLaR,
  title={ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought},
  author={Wang, Fanmeng and Liu, Haotian and Zhao, Guojiang and Xu, Hongteng and Gao, Zhifeng},
  journal={arXiv preprint arXiv:2601.23184},
  year={2026}
}
```


