
# 🚀 Non-Standard Reinforcement Learning for Prioritized Multi-Objective Problems: LunarLander

Authors: Daniel Namaki, Niccolò Settimelli
Course: Symbolic and Evolutionary Artificial Intelligence
Academic Year: 2024/2025 – University of Pisa


## 🧠 Project Overview

This project investigates non-standard reinforcement learning (RL) methods that apply lexicographic reward prioritization to the classic LunarLander-v2 environment. Instead of optimizing a single scalar reward, our agents optimize a vector-valued reward under strict priorities (a concrete sketch follows the list):

  1. ✅ Survival (avoid crashing)
  2. 🎯 Landing quality (upright, centered touchdown)
  3. ⛽ Fuel efficiency
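
To make the lexicographic semantics concrete, here is a minimal sketch of action selection over vector-valued Q-outputs (such as a network like the LexQNetwork in networks/ might produce). The function name, the slack tolerance, and the example values are illustrative assumptions, not the project's exact implementation:

```python
import numpy as np

def lexicographic_action(q_values, slack=1e-3):
    """Select an action by filtering candidates objective by objective.

    q_values has shape (n_objectives, n_actions), rows ordered by priority:
    row 0 = survival, row 1 = landing quality, row 2 = fuel efficiency.
    `slack` is an illustrative tolerance: actions within `slack` of the best
    value on a higher-priority objective remain candidates for the next one.
    """
    candidates = np.arange(q_values.shape[1])
    for obj in range(q_values.shape[0]):
        vals = q_values[obj, candidates]
        candidates = candidates[vals >= vals.max() - slack]
        if len(candidates) == 1:
            break  # a unique winner on a higher priority settles the choice
    return int(candidates[0])

# LunarLander-v2 has 4 discrete actions; 3 objectives x 4 actions:
q = np.array([[0.9, 0.9, 0.2, 0.1],   # survival (highest priority)
              [0.3, 0.7, 0.9, 0.0],   # landing quality
              [0.5, 0.1, 0.8, 0.6]])  # fuel efficiency
print(lexicographic_action(q))  # -> 1 (survival tie broken by landing quality)
```

Lower-priority objectives only break ties among actions that are near-optimal on every higher-priority objective.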

We implement and compare:

  - Potential-Based Survival Shaping (see the sketch after this list)
  - Cone-Aware Survival Shaping
  - Curriculum Learning with Prioritized Replay
  - Standard DQN Baselines
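
Potential-based shaping adds F(s, s') = γΦ(s') − Φ(s) to the environment reward, which is known to preserve optimal policies (Ng et al., 1999). A minimal sketch follows; the particular potential Φ (favoring a slow, upright, stable lander) and the discount value are illustrative assumptions, not the exact Φ used in v_potential_shaping/:

```python
import numpy as np

GAMMA = 0.99  # illustrative discount factor

def survival_potential(state):
    """Hypothetical potential: higher for a slow, upright, stable lander.

    LunarLander-v2 observation: [x, y, vx, vy, angle, angular_velocity,
    left_leg_contact, right_leg_contact].
    """
    _, _, vx, vy, angle, ang_vel, _, _ = state
    return -(np.hypot(vx, vy) + abs(angle) + 0.5 * abs(ang_vel))

def shaped_reward(env_reward, state, next_state, done):
    """r' = r + gamma * Phi(s') - Phi(s); terminal potential is set to 0."""
    phi_next = 0.0 if done else survival_potential(next_state)
    return env_reward + GAMMA * phi_next - survival_potential(state)
```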

## 🗂 Repository Structure

```
2025_SEAI_F01/
├── models/                             # Saved model checkpoints
├── networks/                           # LexQNetwork & standard Q-network code
├── v_cone/                             # Cone-aware shaping agent
├── v_potential_shaping/                # Potential-based shaping agent
├── v_prioritized_curriculum_learning/  # Curriculum + prioritized replay agent (sketch below)
├── v_standard/                         # Standard & prioritized DQN agents
├── requirements.txt                    # Python dependencies
├── doc_seai_f01.pdf                    # Full project report
└── README.md                           # This overview
```
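
For reference, the agent in v_prioritized_curriculum_learning/ pairs curriculum stages with prioritized experience replay. Below is a minimal sketch of proportional prioritized sampling in the style of Schaul et al. (2015); the class name, capacity, and alpha exponent are illustrative assumptions rather than the project's exact code:

```python
import numpy as np
from collections import deque

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay (O(n) sampling, no sum-tree)."""

    def __init__(self, capacity=100_000, alpha=0.6):
        self.buffer = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.alpha = alpha  # how strongly TD error skews sampling (0 = uniform)

    def push(self, transition, td_error=1.0):
        # New transitions get a nonzero priority so they are sampled at least once.
        self.buffer.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        probs = np.asarray(self.priorities, dtype=np.float64)
        probs /= probs.sum()
        idx = np.random.choice(len(self.buffer), size=batch_size, p=probs)
        return [self.buffer[i] for i in idx], idx

    def update_priorities(self, idx, td_errors):
        # Refresh priorities after the learner recomputes TD errors.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = (abs(err) + 1e-6) ** self.alpha
```

A full implementation would typically add a sum-tree for O(log n) sampling and importance-sampling weights to correct the bias that prioritization introduces.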
