A minimal AlphaZero-style training loop for chess using python-chess and PyTorch. This project is intentionally compact and educational rather than highly optimized.
Features:
- Chess environment wrapper around python-chess.
- Simple board and move encoding to build a fixed-size policy output.
- Lightweight CNN policy+value network in PyTorch.
- MCTS with PUCT for policy improvement.
- Self-play game generator and replay buffer.
- Supervised-style training on self-play targets.
- Simple evaluation arena to compare two checkpoints.
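As a rough sketch of the PUCT selection rule used during tree search (the function names, node layout, and the `c_puct` default here are illustrative, not this repo's actual API):

```python
import math

def puct_score(parent_visits, child_visits, child_value_sum, prior, c_puct=1.5):
    """PUCT: mean value Q plus a prior-weighted exploration bonus U."""
    q = child_value_sum / child_visits if child_visits > 0 else 0.0
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

def select_child(children, parent_visits, c_puct=1.5):
    """children maps move -> (visits, value_sum, prior); pick the max-PUCT move."""
    return max(children, key=lambda m: puct_score(parent_visits, *children[m], c_puct))
```

Unvisited children have Q = 0, so selection among them is driven by the network prior, which is what lets the policy head steer the search.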
Requirements:
- Python 3.9+
- PyTorch
- python-chess
- numpy
Install:
pip install torch python-chess numpy
For NVIDIA GPU training on Windows, install a CUDA-enabled PyTorch wheel (example for CUDA 12.8):
pip uninstall -y torch
pip install --index-url https://download.pytorch.org/whl/cu128 torch
Project structure:

minialphazero/
├── chess_env/
│ ├── board.py # State encoding, move encoding
│ ├── rules.py # Wrapper around python-chess
├── nn/
│ ├── model.py # PyTorch NN (policy + value)
│ ├── inference.py # NN interface used by MCTS
├── mcts/
│ ├── node.py
│ ├── mcts.py
│ ├── puct.py
├── selfplay/
│ ├── game.py
│ ├── buffer.py
├── training/
│ ├── train.py
│ ├── loss.py
├── eval/
│ ├── arena.py
├── config.py
├── main.py
└── README.md
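For orientation, training/loss.py presumably combines the two standard AlphaZero objectives. A minimal sketch, assuming batched tensors (the function name and shapes are illustrative, not this repo's exact API):

```python
import torch
import torch.nn.functional as F

def alphazero_loss(policy_logits, value_pred, target_pi, target_z):
    """Value MSE plus policy cross-entropy against MCTS visit counts.

    policy_logits: (B, num_moves) raw scores, illegal moves already masked
    value_pred:    (B,) predicted outcome in [-1, 1]
    target_pi:     (B, num_moves) normalized MCTS visit-count distribution
    target_z:      (B,) final game result from the mover's perspective
    """
    value_loss = F.mse_loss(value_pred, target_z)
    # Soft-target cross-entropy: -sum(pi * log softmax(logits))
    policy_loss = -(target_pi * F.log_softmax(policy_logits, dim=1)).sum(dim=1).mean()
    return value_loss + policy_loss
```

The L2 regularization term from the AlphaZero paper is typically handled by the optimizer's weight decay rather than inside the loss function.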
Usage:
- Activate the virtual environment: .\.venv\Scripts\Activate.ps1
- Fast training: .\.venv\Scripts\python.exe -m main --preset fast --self_play_games 12 --mcts_sims 24 --policy_topk 12 --alpha_beta_depth 0 --alpha_beta_topk 8
- Higher-quality training: .\.venv\Scripts\python.exe -m main --preset quality
- Resume from checkpoint: .\.venv\Scripts\python.exe -m main --preset fast --load checkpoints\fast\final.pt
- Lower learning rate: .\.venv\Scripts\python.exe -m main --preset fast --lr 3e-4
- Save all moves from the last arena game to a file: .\.venv\Scripts\python.exe -m arena --model_a checkpoints\fast\iter_011.pt --model_b checkpoints\fast\iter_012.pt --games 1 --sims 16 --alpha_beta_depth 0 --alpha_beta_topk 8 --forced_attack_color none --save_last_game_moves last_game.json --save_attack_pair_moves attack_pair.json
- Simple GUI visualizer: .\.venv\Scripts\python.exe -m visualizer_gui --moves_file last_game.json
- Open the attack-pair game: .\.venv\Scripts\python.exe -m visualizer_gui --moves_file attack_pair.json --game_index 0
GUI controls:
- First / Prev / Next / Last
- Autoplay with delay slider
- White pieces are drawn as uppercase cream-colored text; black pieces as lowercase dark text
Notes:
- The move space is a fixed-size 20,480-entry vector indexed by (from_square, to_square, promotion), where promotion ∈ {None, N, B, R, Q}: 64 × 64 × 5 = 20,480. Illegal moves are masked out before the softmax.
- Input planes: 12 piece-type planes (6 per color) + 1 side-to-move plane = shape (13, 8, 8).
- This is educational code; for performance and strength, many enhancements are possible (history planes, castling/en-passant info, a better architecture, Dirichlet noise, resignation logic, etc.).
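The (from_square, to_square, promotion) flattening can be sketched as follows; the exact ordering of the index is an assumption, since the README only fixes the tuple, not the layout:

```python
# Promotion slot order is assumed; None covers non-promotion moves.
PROMOS = (None, "n", "b", "r", "q")

def move_to_index(from_square: int, to_square: int, promotion=None) -> int:
    """Flatten (from, to, promotion) into [0, 64 * 64 * 5) = [0, 20480).

    from_square/to_square use the usual 0..63 square numbering (a1 = 0, h8 = 63).
    """
    return (from_square * 64 + to_square) * 5 + PROMOS.index(promotion)

# The highest possible index confirms the 20,480-entry policy vector.
assert move_to_index(63, 63, "q") == 20479
```

The same arithmetic run in reverse (divmod by 5, then by 64) recovers the move tuple from a policy index.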