Jove

N-Body Astrodynamics Reinforcement Learning Environment in Julia

Jove is a high-performance, physics-accurate reinforcement learning environment for autonomous gravity assist trajectory optimization. A Proximal Policy Optimization (PPO) agent learns to pilot a spacecraft through a live N-Body gravitational simulation, exploiting planetary flybys to reach Jupiter with minimal fuel expenditure.

Architecture

Physics Engine — `src/Physics/`

The gravitational dynamics of five bodies (Sun, Earth, Mars, Jupiter, Spacecraft) are modeled symbolically using ModelingToolkit.jl and integrated with OrdinaryDiffEq.jl.

Stiff ODE Integration: The Rosenbrock23 solver handles the stiff, tightly-coupled N-Body system reliably across the full range of agent-generated control inputs.
Symbolic Softening: A gravitational softening term (+ 1e6) prevents singularities as bodies approach one another.
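
As a rough illustration of what softening does, the sketch below adds a constant to the squared separation before computing the inverse-cube factor. This is an assumption about the form of the term: the source only states that "+ 1e6" is added, not where. The function name `softened_accel`, the keyword names, and the SI units are hypothetical and not taken from `src/Physics/`.

```julia
# Sketch only: assumes the softening constant is added to the squared
# separation. Names and units here are illustrative, not Jove's.
function softened_accel(x, y, xs, ys, m; G = 6.67430e-11, eps2 = 1e6)
    dx, dy = xs - x, ys - y
    r2 = dx^2 + dy^2 + eps2            # softened squared distance, never zero
    inv_r3 = 1 / (r2 * sqrt(r2))       # 1 / r^3 using the softened distance
    return (G * m * dx * inv_r3, G * m * dy * inv_r3)
end

# Coincident bodies: the unsoftened law would divide by zero here.
ax0, ay0 = softened_accel(0.0, 0.0, 0.0, 0.0, 1.989e30)

# At planetary distances the softening is negligible next to r^2.
r = 1.496e11
ax1, ay1 = softened_accel(0.0, 0.0, r, 0.0, 1.989e30)
```

With these values the coincident case returns a finite (zero) acceleration, while the result at 1 AU is indistinguishable from the unsoftened inverse-square law.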

Cache-Safe Continuous Control Injection — `src/Environment/wrapper.jl`

Injecting agent-commanded thrust into a live SciML integrator is non-trivial. Stiff solvers such as Rosenbrock23 default to forward-mode automatic differentiation (AutoForwardDiff) for Jacobian evaluation, which keeps caches of Dual numbers. Mutating MTKParameters mid-integration (via integrator.ps[...] or setp(integrator, ...)) can collide with these caches and raise a MethodError.

Jove resolves this with a two-part strategy:

remake + solve per RL step: Rather than advancing a persistent integrator, each environment step issues a fresh solve call over the [t, t+dt] window. Each solve builds its own cache from scratch, so a parameter change made between steps never meets a cache left over from an earlier solve.
Compiled setp setters: Parameter mutation uses setter functions created once at environment initialization via ModelingToolkit.setp. Calling them on every step costs O(1), with no symbolic dictionary lookups.

using ModelingToolkit, OrdinaryDiffEq

# Compiled once at startup (stored on the environment as env.set_tx / env.set_ty):
set_tx = ModelingToolkit.setp(sys, sys.thrust_x)
set_ty = ModelingToolkit.setp(sys, sys.thrust_y)

# Called every environment step at O(1):
env.set_tx(env.prob, thrust_x)
env.set_ty(env.prob, thrust_y)
# Fresh problem over [t, t+dt], starting from the current state:
mini_prob = remake(env.prob; u0=env.current_u, tspan=(t, t+dt))
sol = solve(mini_prob, Rosenbrock23(); save_everystep=false, ...)

Dynamic State Index Resolution

ModelingToolkit.structural_simplify can reorder state variables, and may eliminate some, so the order of unknowns(sys) is not guaranteed to match the order of declaration. Raw integer offsets (e.g., u[N + i]) can therefore silently read the wrong quantities after simplification. Jove resolves the index of every body's x, y, vx, vy at startup using findfirst over unknowns(sys) and stores these indices explicitly:

x_idx[i] = findfirst(v -> isequal(v, sys.x[i]), unknowns(sys))

All state reads — including collision detection, reward computation, and observation extraction — use these resolved indices exclusively.
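
The lookup-by-identity idea can be tried without ModelingToolkit. In the sketch below, plain Symbols stand in for the symbolic unknowns, and `state_names`, `u`, and `n_bodies` are hypothetical values chosen so that the ordering differs from declaration order. It is an analogy for the pattern, not Jove's actual code.

```julia
# Stand-in sketch: Symbols play the role of the entries of unknowns(sys).
# The ordering is deliberately scrambled, as it might be after simplification.
state_names = [:vx_1, :x_2, :y_1, :x_1, :vy_2, :y_2, :vx_2, :vy_1]
u = [10.0, 2.0, -1.0, 1.0, 20.0, -2.0, 30.0, 40.0]

n_bodies = 2

# Resolve each body's x index once, by name, instead of assuming an offset.
x_idx = [findfirst(==(Symbol("x_", i)), state_names) for i in 1:n_bodies]

# findfirst returns nothing when there is no match, so fail loudly at startup.
@assert all(!isnothing, x_idx)

# Positions read through the resolved indices.
xs = u[x_idx]
```

Checking for `nothing` at initialization is a cheap guard: a missing variable then fails at startup instead of surfacing as an indexing error in the middle of training.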

PPO Agent — `src/Agent/policy.jl`

A custom PPO implementation built directly on Flux.jl and Optimisers.jl, with no dependency on the training internals of ReinforcementLearning.jl. Keeping the update logic decoupled from those internals is intended to keep it stable across library versions.

Actor-Critic network with shared trunk and separate policy/value heads
Clipped surrogate objective (PPO-Clip)
Generalized Advantage Estimation (GAE)
Entropy bonus for sustained exploration
PPOBuffer: Plain Julia Vectors — no dependency on RL.jl trajectory types
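
Two of the components listed above, GAE and the clipped surrogate objective, are small enough to sketch in plain Julia. The code below follows the standard textbook formulations and is not taken from `src/Agent/policy.jl`. The function names `gae` and `ppo_clip_loss`, the keyword defaults, and the convention that `values` carries one extra bootstrap entry are all assumptions. The entropy bonus and advantage normalization are left out.

```julia
# Generalized Advantage Estimation over one rollout.
# `values` has length(rewards) + 1 entries; the last is the bootstrap value.
function gae(rewards, values, dones; gamma = 0.99, lam = 0.95)
    T = length(rewards)
    adv = zeros(Float64, T)
    running = 0.0
    for t in T:-1:1
        mask = dones[t] ? 0.0 : 1.0    # stop bootstrapping at episode ends
        delta = rewards[t] + gamma * values[t + 1] * mask - values[t]
        running = delta + gamma * lam * mask * running
        adv[t] = running
    end
    return adv
end

# Clipped surrogate loss (negated objective, to be minimized).
function ppo_clip_loss(logp_new, logp_old, adv; clip = 0.2)
    ratio = exp.(logp_new .- logp_old)
    unclipped = ratio .* adv
    clipped = clamp.(ratio, 1 - clip, 1 + clip) .* adv
    return -sum(min.(unclipped, clipped)) / length(adv)
end

# Three unit rewards, zero value estimates, no discounting.
adv = gae([1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], [false, false, true];
          gamma = 1.0, lam = 1.0)

# A probability ratio of 2 with positive advantage is clipped to 1 + clip.
loss = ppo_clip_loss([log(2.0)], [0.0], [1.0])
```

Because both functions operate on ordinary vectors, they fit the plain-Vector buffer design described above.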

Repository Structure

Jove/
├── data/
│   └── ephemeris/          # NASA Horizons orbital state vectors
├── scripts/
│   ├── train.jl            # 1M-step custom PPO training loop
│   └── evaluate.jl         # Trajectory evaluation and visualization
├── src/
│   ├── Physics/
│   │   ├── equations.jl    # Symbolic N-Body ODESystem (ModelingToolkit)
│   │   └── constants.jl    # Physical constants
│   ├── Environment/
│   │   ├── wrapper.jl      # GravityAssistEnv: ODE step, control injection
│   │   ├── states.jl       # Observation extraction and normalization
│   │   └── reward.jl       # Reward shaping and termination conditions
│   └── Agent/
│       └── policy.jl       # JovePPOPolicy, PPOBuffer, jove_update!
├── test/                   # Unit tests (energy conservation, RL API)
├── Project.toml
└── LICENSE

Getting Started

Requirements: Julia 1.10+

Installation

julia --project=. -e 'using Pkg; Pkg.instantiate()'

Training

julia --project=. --threads auto scripts/train.jl

The training loop runs for 1,000,000 environment steps. Episode rewards are printed every 100 episodes. A trained agent checkpoint is saved to data/ upon completion.

Key Dependencies

| Package | Role |
| --- | --- |
| `ModelingToolkit.jl` | Symbolic N-Body ODE construction |
| `OrdinaryDiffEq.jl` | Stiff ODE integration (Rosenbrock23) |
| `Flux.jl` | Actor-Critic neural network |
| `Optimisers.jl` | Adam optimizer for PPO updates |
| `ReinforcementLearning.jl` | `AbstractEnv` / `AbstractPolicy` interfaces |
