SwitchFL: Switch-Centered Flatland Environment for Multi-Agent Reinforcement Learning

Welcome to SwitchFL, a custom multi-agent reinforcement learning (MARL) environment based on Flatland, designed with a novel switch-centered perspective. This environment shifts the focus of control from individual trains to railway switches, introducing unique coordination and planning challenges in a rail network.

Table of Contents

  • Motivation
  • Environment Overview
  • Switch Types
  • Installation
  • Usage Example
  • Observation Space
  • Action Space
  • Rendering
  • Implementation Details
  • Research & Use Cases
  • Contributing
  • License
  • Citation

Motivation

Traditional train-routing environments like Flatland focus on agent-centered control. SwitchFL introduces a novel abstraction by modeling switches as decision points. This perspective is better suited for decentralized control, real-world railway signal systems, and asynchronous agent interactions.

It’s particularly useful for:

  • Studying coordination across control points in transportation systems.
  • Training RL agents in asynchronous, partially observable environments.
  • Benchmarking switch-based vs. agent-based control.

Environment Overview

SwitchFL is built on top of Flatland but wrapped in a PettingZoo-compatible asynchronous multi-agent interface.

Key Features:

  • Asynchronous MARL environment
  • Switch-centered control abstraction
  • Compatible with PettingZoo (AEC) interface
  • Supports Flatland and networkx rendering
  • Includes four switch types of varying complexity as distinct agent types

Switch Types

SwitchFL includes the following switch types:

  1. T-Crossing
    Allows branching paths in a T shape (3-way split). This applies to cases 2 and 6.

  2. Standard Crossing
    Classic railway cross with two intersecting paths (4-way cross). This applies to case 3.

  3. Single Turn Switch
    A switch with one 90° turn and two straight connections. This applies to case 4.

  4. Double Turn Switch
    A complex crossing with two 90° turns, offering multiple routing options. This applies to case 5.

Each switch is treated as an agent with its own observation and decision-making responsibility.

Installation

We opted for Poetry as the package manager:

git clone https://github.com/RobinU434/SwitchFL.git
cd switchfl
poetry install

Ensure you have the following installed:

  • flatland-rl
  • pettingzoo
  • numpy
  • pygame (for rendering)
  • matplotlib (optional, for visual debugging)
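
poetry install resolves these automatically. If you instead prefer pip, the equivalent command (assuming the PyPI package names listed above) is:

pip install flatland-rl pettingzoo numpy pygame matplotlib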

Usage Example

from flatland.envs.rail_env import RailEnv
from flatland.envs.rail_generators import sparse_rail_generator
from flatland.envs.line_generators import sparse_line_generator
from switchfl.switch_env import ASyncSwitchEnv

random_seed = 41
rail_env = RailEnv(
    width=18,
    height=18,
    rail_generator=sparse_rail_generator(
        max_num_cities=5,
        grid_mode=True,
        max_rails_between_cities=1,
        max_rail_pairs_in_city=1,
        seed=random_seed,
    ),
    line_generator=sparse_line_generator(seed=random_seed),
    number_of_agents=2,
)

env = ASyncSwitchEnv(rail_env, render_mode="human")

env.reset(seed=random_seed)
for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()
    
    if termination or truncation:
        break
    
    # Replace with your custom switch policy
    action = env.action_space(agent).sample(info["action_mask"])
    env.render()
    env.step(action)

env.close()

ℹ️ Info

Please note that we deviate slightly from the PettingZoo API: we also return the ID of the next switch a train will reach after a step. This helps RL approaches based on TD learning, which need some notion of the next state; without it, the decision making of a switch agent reduces to a multi-armed bandit (MAB).
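
As a minimal sketch of how this extra information enables TD learning, the snippet below performs a tabular Q-learning update for a switch agent. The info keys "next_switch" and "next_obs" are hypothetical placeholders for however the environment actually exposes the next switch; check the real return values before use.

from collections import defaultdict

# Minimal tabular Q-learning sketch. The keys "next_switch" and "next_obs"
# are assumed placeholders, not the confirmed SwitchFL API.
ALPHA, GAMMA = 0.1, 0.99
q_table = defaultdict(lambda: defaultdict(float))  # q_table[state][action]

def td_update(state, action, reward, next_state):
    # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    next_q = max(q_table[next_state].values(), default=0.0)
    q_table[state][action] += ALPHA * (reward + GAMMA * next_q - q_table[state][action])

# Inside the agent loop, after env.last() and env.step(action):
#   state = (agent, tuple(observation))
#   next_state = (info["next_switch"], tuple(info["next_obs"]))
#   td_update(state, action, reward, next_state)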

Observation Space

Each agent (switch) receives a localized view of its surroundings:

  • Switch state and occupancy
  • Train proximity or scheduling info
  • Track layout around the switch

The environment is equipped with an Observer scaffold, so you can design your own observation space for a switch agent.
By default we use the StandardObserver, which can be found in the source. This observer is discrete and includes:

  • which outgoing ports are blocked by incoming trains
  • where each incoming train wants to go
  • how much delay a train has

If there is no train to extract this information from, the corresponding entries in the observation are -1.
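
As a rough illustration of what a custom observer could look like, consider the sketch below. The observe() signature and the switch/train helpers are assumptions made for illustration, not the actual SwitchFL interface; consult the StandardObserver source for the real one.

import numpy as np

# Hypothetical custom observer; observe() and the helpers below are assumed,
# not the confirmed SwitchFL Observer API.
class NearestTrainObserver:
    """Per port: distance of the closest incoming train, or -1 if none."""

    def observe(self, switch, trains):
        obs = np.full(len(switch.ports), -1.0)   # -1 marks "no train", as in StandardObserver
        for train in trains:
            port = switch.incoming_port(train)   # assumed helper
            dist = train.distance_to(switch)     # assumed helper
            if obs[port] < 0 or dist < obs[port]:
                obs[port] = dist
        return obs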

Action Space

Each switch decides which path to activate. The action space is:

  • Discrete
  • Action-masked (invalid options are masked out via info["action_mask"])

Each discrete action corresponds to piping a train from one port to another, where applicable. This results in a different number of available actions per switch type:

  1. T-Crossing: 4
  2. Standard Crossing: 2
  3. Single Turn Switch: 6
  4. Double Turn Switch: 8

For more details, please refer to the switch definitions or print the actions of a _Switch instance; see also the snippet below.
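
Continuing the usage example above, the sizes can also be inspected at runtime via the PettingZoo accessors (assuming env.agents is populated as usual for an AEC environment):

# Print each switch agent's discrete action space.
env.reset(seed=random_seed)
for agent in env.agents:
    print(agent, env.action_space(agent))  # e.g. Discrete(4) for a T-Crossing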

Rendering

SwitchFL supports human-friendly rendering via Flatland’s renderer:

env = ASyncSwitchEnv(..., render_mode="human")
env.render()

Use this to visually debug the movement of trains and decisions made by switches.

Implementation Details

SwitchFL transforms Flatland's train-centric approach into a switch-centric multi-agent system through several key components:

  • ASyncSwitchEnv: Main environment controller that orchestrates train movement simulation, agent activation, and reward calculation
  • RailNetwork: Manages railway topology, switch-to-switch connections, and train-switch interactions using graph representations
  • Switch Agents: Individual controllers for different switch types (T-Crossing, Standard Crossing, Single/Double Turn) with unique action spaces
  • RailGraph: Converts Flatland's grid-based environment into efficient graph structures for network analysis
  • Observer System: Provides localized state observations including train proximity, port status, and network topology

The environment operates through an asynchronous simulation loop where trains move toward switches, switches become active agents when trains approach, and routing decisions are executed based on agent actions. This creates a distributed control system where coordination emerges through individual switch decisions.
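
As a rough, self-contained toy (explicitly not the ASyncSwitchEnv internals), the control flow can be pictured like this: trains tick forward until one reaches a switch, and only then is that switch asked for a routing decision.

import random

# Toy model of the asynchronous control flow; none of this is SwitchFL code.
trains = {"T1": 3, "T2": 5}  # steps remaining until each train reaches its next switch

while trains:
    for name in list(trains):
        trains[name] -= 1                       # all trains advance one step
        if trains[name] == 0:                   # a train reached a switch:
            action = random.choice([0, 1, 2])   # the switch agent activates and decides
            print(f"switch routes {name} via port {action}")
            del trains[name]                    # train handled; continues onward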

For comprehensive implementation details, architecture diagrams, and technical challenges, please refer to the detailed documentation.

Research & Use Cases

SwitchFL is a great platform for:

  • Investigating coordination under partial observability.
  • Benchmarking asynchronous vs. synchronous control paradigms.
  • Studying traffic bottlenecks and switch prioritization strategies.
  • Applying transformer-based policies or graph RL to transportation systems.

Contributing

We welcome contributions! To contribute:

  1. Fork this repository.
  2. Create a new branch (git checkout -b feature-foo).
  3. Commit your changes (git commit -am 'Add feature foo').
  4. Push to the branch (git push origin feature-foo).
  5. Open a Pull Request.

License

MIT License. See LICENSE for details.

Citation

If you use this code in your research, please cite:

@misc{uhrich2025switchfl,
  title={SwitchFL: Switch-Centered Flatland Environment for Multi-Agent Reinforcement Learning},
  author={Uhrich, Robin and Mussi, Marco and Restelli, Marcello},
  year={2025},
  url={https://github.com/RobinU434/SwitchFL}
}
