Dominion Reinforcement Learning

A reinforcement learning implementation for a simplified version of the Dominion card game.

Overview

This project implements:

- A simplified, single-player version of Dominion as a reinforcement learning environment
- A Deep Q-Network (DQN) agent that learns to play the game
- A baseline fixed strategy ("Buy Menu" strategy) that can be used for imitation learning

The agent combines imitation learning from a predefined strategy with randomized exploration to discover strong play patterns.

A full write-up of the project, including graphs, can be found at https://www.solarflare.org.uk/dominion.

Components

Abstract base classes (environment.py, strategy.py)

- Environment defines a reinforcement learning environment, including states, actions and rewards
- Strategy defines a fixed strategy for an Environment
  - Fixed strategies can either be evaluated on their own, or used as starting points for training
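
As a rough illustration of how two such base classes can fit together, here is a minimal sketch. Only the class names Environment and Strategy and the method name choose_action come from this README; the methods reset, legal_actions and step, all signatures, and the toy CountdownEnv and AlwaysMax classes are assumptions made for the example and may not match environment.py or strategy.py.

```python
from abc import ABC, abstractmethod
from typing import Any, Sequence


class Environment(ABC):
    """Hypothetical interface covering states, actions and rewards."""

    @abstractmethod
    def reset(self) -> Any:
        """Start a new episode and return the initial state."""

    @abstractmethod
    def legal_actions(self) -> Sequence[int]:
        """Return the actions available in the current state."""

    @abstractmethod
    def step(self, action: int) -> tuple[Any, float, bool]:
        """Apply an action and return (next_state, reward, done)."""


class Strategy(ABC):
    """A fixed policy for an Environment (signature is an assumption)."""

    @abstractmethod
    def choose_action(self, state: Any, legal_actions: Sequence[int]) -> int:
        """Pick one of the legal actions."""


class CountdownEnv(Environment):
    """Toy environment: ends after `length` steps; the reward equals the action."""

    def __init__(self, length: int = 3) -> None:
        self.length = length
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def legal_actions(self) -> Sequence[int]:
        return [0, 1]

    def step(self, action: int) -> tuple[int, float, bool]:
        self.t += 1
        return self.t, float(action), self.t >= self.length


class AlwaysMax(Strategy):
    """Toy strategy: always take the largest legal action."""

    def choose_action(self, state: Any, legal_actions: Sequence[int]) -> int:
        return max(legal_actions)


def evaluate(env: Environment, strategy: Strategy) -> float:
    """Play one episode with a fixed strategy and return the total reward."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        action = strategy.choose_action(state, env.legal_actions())
        state, reward, done = env.step(action)
        total += reward
    return total
```

Keeping the strategy separate from the environment is what lets the same fixed strategy be scored on its own or handed to a learning agent as a starting point.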

Dominion Environment (dominion.py)

DominionEnv defines an Environment for single-player Dominion including:
- Card definitions with types, costs and effects
  - Approximately 20 cards so far, drawn from the base set, Alchemy and Seaside
- Game state management (deck, hand, discard pile etc.)
- Game phases (Action and Buy phases, together with special phases for handling certain cards)
- Game state representation for the RL agent
- Reward calculation (the agent is rewarded for scoring victory points)
- Game end conditions (currently the game just ends after a fixed number of turns; empty piles are ignored)
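
To make the state management and end condition concrete, the following sketch shows one way to track deck, hand and discard pile, with the standard Dominion rule that the discard pile is reshuffled into a new deck when the deck runs out. The Piles class, the game_over function and the MAX_TURNS value are hypothetical and are not taken from dominion.py.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Piles:
    """Hypothetical container for one player's cards."""
    deck: list[str] = field(default_factory=list)
    hand: list[str] = field(default_factory=list)
    discard: list[str] = field(default_factory=list)

    def draw(self, n: int, rng: random.Random) -> None:
        """Draw up to n cards, reshuffling the discard pile when the deck is empty."""
        for _ in range(n):
            if not self.deck:
                if not self.discard:
                    return  # nothing left to draw
                self.deck, self.discard = self.discard, []
                rng.shuffle(self.deck)
            self.hand.append(self.deck.pop())

    def cleanup(self, rng: random.Random) -> None:
        """End of turn: discard the hand and draw five new cards.

        Simplified: a full implementation would also discard cards played this turn.
        """
        self.discard.extend(self.hand)
        self.hand.clear()
        self.draw(5, rng)


MAX_TURNS = 20  # assumed value, for illustration only


def game_over(turn: int) -> bool:
    """Fixed-length game: end after MAX_TURNS turns, ignoring empty piles."""
    return turn >= MAX_TURNS
```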

Buy Menu Strategy (buy_menu_strategy.py)

BuyMenuStrategy implements the Strategy base class with a simple, priority-based approach to playing the game:
- Follows a predefined "buy menu" for purchasing cards
- Uses heuristics for playing action cards and trashing
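
The idea of a priority-based buy menu can be sketched in a few lines. The menu below is purely illustrative and is not the menu used by buy_menu_strategy.py; the function name choose_buy and the data layout are assumptions. The costs are the printed costs of the Dominion cards named.

```python
# Printed costs of the cards used in this example.
COSTS = {"Province": 8, "Gold": 6, "Wharf": 5, "Silver": 3}

# Illustrative menu, highest priority first (not the project's actual menu).
BUY_MENU = ["Province", "Gold", "Wharf", "Silver"]


def choose_buy(coins, menu=BUY_MENU, costs=COSTS):
    """Return the first affordable card on the menu, or None to buy nothing."""
    for card in menu:
        if costs[card] <= coins:
            return card
    return None
```

With 7 coins this picks Gold, since Province is too expensive and Gold is the next entry on the menu.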

Learning Agent (learning.py)

learning.py defines a reinforcement learning agent that attempts to learn an optimal policy for any given Environment, including:

- Neural network architecture for Q-value approximation
- Experience replay buffer for stable learning
- Epsilon-greedy exploration strategy
- (Optional) Hybrid learning that starts from a predefined Strategy and gradually transitions to self-learned policy
- Functions for training, saving, and loading the model
- Tools for evaluating the agent and plotting graphs of training progress
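
Two of the pieces listed above, the experience replay buffer and epsilon-greedy action selection, can be sketched with the standard library alone. The names Transition, ReplayBuffer and epsilon_greedy are hypothetical, and the real learning.py presumably works with PyTorch tensors rather than plain Python lists.

```python
import random
from collections import deque
from typing import NamedTuple, Sequence


class Transition(NamedTuple):
    """One step of experience."""
    state: tuple
    action: int
    reward: float
    next_state: tuple
    done: bool


class ReplayBuffer:
    """Fixed-capacity buffer; the oldest transitions are dropped first."""

    def __init__(self, capacity: int) -> None:
        self._items: deque[Transition] = deque(maxlen=capacity)

    def push(self, transition: Transition) -> None:
        self._items.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> list[Transition]:
        """Sample without replacement, which breaks up correlated consecutive steps."""
        return rng.sample(list(self._items), batch_size)

    def __len__(self) -> int:
        return len(self._items)


def epsilon_greedy(
    q_values: Sequence[float],
    legal: Sequence[int],
    epsilon: float,
    rng: random.Random,
) -> int:
    """With probability epsilon pick a random legal action, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.choice(list(legal))
    return max(legal, key=lambda a: q_values[a])
```

Restricting both branches to the legal actions matters in a card game, where most actions are unavailable in any given state.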

Example Usage

Run the main script to train the agent:

python main.py

The script will:

- Initialize the environment with a specific kingdom
- Create a DQN agent with reinforcement learning, starting from a randomized initial strategy
- Train the agent over 4,000 games
- Save graphs of training progress to a "plots" folder (which will be created if required)
- Save the trained model as dominion_dqn.pt

In the default kingdom, the agent learns a "Big Money + Wharf" strategy, which scores around 31 points in the time available.

To use the predefined strategy instead:

- Edit the DQNAgent creation code in main.py, changing predefined_strategy=None to predefined_strategy=buy_menu_strategy
- Delete the existing dominion_dqn.pt file if it exists
- Run python main.py again

In this case, the agent begins with an "imitation learning" phase in which it learns to copy the predefined strategy, scoring around 10 points. As training progresses, the agent shifts away from the predefined strategy and explores freely on its own. This allows it to discover a new strategy involving University, Alchemist, Wharf and Vineyard, scoring around 40 to 60 points on average. Seeing this may require training for longer than the default 4,000 episodes.
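
The shift from imitation to free exploration can be pictured as a probability of following the predefined strategy that falls over time. The sketch below uses a linear decay; the function names, the linear shape and the decay_episodes value are assumptions for illustration, and the actual schedule in learning.py may differ.

```python
import random


def strategy_probability(
    episode: int,
    start: float = 1.0,
    end: float = 0.0,
    decay_episodes: int = 2000,  # assumed value
) -> float:
    """Linearly interpolate from `start` to `end` over `decay_episodes` episodes."""
    frac = min(max(episode / decay_episodes, 0.0), 1.0)
    return start + (end - start) * frac


def select_action(episode, strategy_action, agent_action, rng):
    """Follow the predefined strategy with the scheduled probability,
    otherwise use the agent's own (for example epsilon-greedy) choice."""
    if rng.random() < strategy_probability(episode):
        return strategy_action
    return agent_action
```

Early in training almost every action comes from the predefined strategy, so the replay buffer fills with sensible games; later the agent's own choices dominate.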

You can also re-run main.py to resume training from the saved dominion_dqn.pt model. When doing so, consider decreasing the initial epsilon and/or the predefined strategy probability, since less exploration is needed when restarting from a previous run.

For further experimentation, training parameters can be adjusted by editing main.py, and the neural network architecture can be changed by editing the QNetwork class in learning.py. Different combinations of kingdom cards can also be tried by modifying the list near the top of main.py.

Customization

Adding New Cards

To add new cards, you need to:

- Add an entry to the CardId enum in dominion.py
- Create a CardInfo object with appropriate properties and effects
- Update the CARD_INFO dictionary with the new card
- (Optional) Edit the code for the DominionEnv itself to implement special rules (see existing examples for Alchemist, Gardens and Vineyard)
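
The first three steps above might look roughly like the following. The names CardId, CardInfo and CARD_INFO come from this README, but every field of CardInfo shown here is an assumption; check the real definitions in dominion.py before copying this. The card costs and effects are those printed on the cards.

```python
from dataclasses import dataclass
from enum import Enum, auto


class CardId(Enum):
    COPPER = auto()
    SMITHY = auto()
    VILLAGE = auto()  # step 1: new enum entry


@dataclass(frozen=True)
class CardInfo:
    """Hypothetical fields; the real CardInfo may be structured differently."""
    name: str
    cost: int
    is_action: bool = False
    coins: int = 0
    plus_cards: int = 0
    plus_actions: int = 0


CARD_INFO = {
    CardId.COPPER: CardInfo("Copper", cost=0, coins=1),
    CardId.SMITHY: CardInfo("Smithy", cost=4, is_action=True, plus_cards=3),
}

# Steps 2 and 3: describe the new card and register it.
CARD_INFO[CardId.VILLAGE] = CardInfo(
    "Village", cost=3, is_action=True, plus_cards=1, plus_actions=2
)
```

Cards whose effects cannot be described by simple fields like these are the ones that need the optional fourth step.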

Creating Custom Strategies

You can create custom strategies by:

- Implementing the Strategy abstract base class
- Overriding the choose_action method to select actions based on your strategy

Alternatively, if your strategy can be expressed as a "buy menu", then passing a new buy menu to BuyMenuStrategy (and perhaps adjusting the action and trashing heuristics in buy_menu_strategy.py) might be sufficient.
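
A custom strategy could be as small as the sketch below, which always picks whichever legal action appears earliest in a fixed preference list. The Strategy class here is a stand-in for the one in strategy.py, the choose_action signature is assumed, and the action names are invented for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Sequence


class Strategy(ABC):
    """Stand-in for the base class in strategy.py (signature assumed)."""

    @abstractmethod
    def choose_action(self, state: Any, legal_actions: Sequence[str]) -> str:
        ...


class PreferenceStrategy(Strategy):
    """Pick the legal action that appears earliest in a fixed preference list."""

    def __init__(self, preferences: Sequence[str]) -> None:
        self._rank = {action: i for i, action in enumerate(preferences)}

    def choose_action(self, state: Any, legal_actions: Sequence[str]) -> str:
        # Actions missing from the list rank last; ties keep the first one seen.
        unknown = len(self._rank)
        return min(legal_actions, key=lambda a: self._rank.get(a, unknown))
```

For example, PreferenceStrategy(["buy_province", "buy_gold", "end_turn"]) chooses "buy_gold" when the legal actions are "end_turn" and "buy_gold".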

Implementing a new Strategy subclass that can execute some of the strategies from Geronimoo's Dominion simulator might be an interesting project.

Requirements

- Python 3.10+
- PyTorch
- NumPy
- Matplotlib

After installing Python, the remaining requirements can be installed with pip install torch numpy matplotlib.

Future Improvements

Potential areas for enhancement:

- Implement more Dominion cards and expansions
- Add a multi-player mode, including self-play for multi-agent learning
- Add more sophisticated exploration mechanisms, to reduce the need for imitation learning
- Experiment with different RL algorithms (PPO, A2C, etc.) and neural network architectures
- Improve the state representation for better learning
- Add full game logging so we can see the played strategies in more detail
- In the single-player case, experiment with mean-variance optimization (i.e. search for strategies that have high expected scores but also low variance of points scored)
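
One simple way to express the mean-variance idea from the last item is to score a strategy by its mean minus a penalty proportional to its variance. This is only a sketch of one possible objective; the function name and the risk_aversion weight are assumptions, and nothing like this exists in the project yet.

```python
from statistics import fmean, pvariance


def mean_variance_score(scores, risk_aversion=0.1):
    """Mean of the scores minus risk_aversion times their population variance."""
    return fmean(scores) - risk_aversion * pvariance(scores)


steady = [30, 30, 30, 30]   # always scores 30
swingy = [10, 50, 10, 50]   # same mean, much higher variance
```

Under this objective the steady strategy is preferred to the swingy one, even though both average 30 points.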

Contact Details

This project was created by Stephen Thompson.

Email: stephen (at) solarflare.org.uk

Website: https://www.solarflare.org.uk/
