
Better documentation would be nice 😄 #2

@xXWarMachineRoXx


a: 0.1
e: 9.251777478947598e-05
g: 0.9

I don't know what these mean. It's just an example, but comments and notes on how to improve the values are missing. I know it's your learning project, but do tell us what you learned, as I too want to do this Q-learning project as my first and want to make the perfect snake 🥇.

On a closer look, I think it's just:

a (α - Alpha): This is the learning rate, denoted by α. It determines how quickly the Q-values are updated based on new experiences. A higher value means that the agent will adjust its Q-values more rapidly in response to new information. A lower value makes the agent more resistant to changing its Q-values based on new experiences.
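To make the role of α concrete, here is a minimal sketch of a single tabular Q-learning update. It is not taken from this repo; the function name `q_update`, the state labels, and the action list are all made up for illustration, and the defaults just mirror the `a` and `g` values from the example above.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q-value."""
    # Best value the agent currently believes it can get from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Target = immediate reward plus discounted estimate of what follows.
    td_target = reward + gamma * best_next
    # alpha controls how far the old estimate moves toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical states and actions, only for demonstration.
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
actions = ["up", "down", "left", "right"]
new_value = q_update(q, "s0", "up", 1.0, "s1", actions)
print(new_value)
```

With α = 0.1 the estimate moves only a tenth of the way toward the target on each update; with α = 1.0 it would jump straight to the target and forget the old estimate.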

e (ε - Epsilon): This is the exploration factor, denoted by ε. It determines the likelihood that the agent will choose a random action instead of following its learned policy. Exploration is important to discover new actions and states, which helps the agent find better policies. A higher ε encourages more exploration, while a lower ε favors exploitation of the current knowledge.
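A common way to use ε is an epsilon-greedy rule, sketched below. This is an assumption about how such a parameter is typically applied, not a copy of the repo's code; `choose_action` and the sample Q-values are hypothetical.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, otherwise greedy."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest current Q-value.
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical Q-values for three actions in some state.
sample_q = [0.1, 0.5, 0.2]
greedy_choice = choose_action(sample_q, epsilon=0.0)  # always the best-known action
random_choice = choose_action(sample_q, epsilon=1.0)  # always a random action
```

Note that the `e` value in the example above is tiny (roughly 0.00009), so under a rule like this the agent would act greedily almost all the time.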

g (γ - Gamma): This is the discount factor, denoted by γ. It determines how much weight the agent gives to future rewards when making decisions. A value of γ close to 1 makes the agent prioritize long-term rewards, while a value close to 0 makes it focus on immediate rewards. It is used to calculate the cumulative discounted future rewards when updating Q-values.

If this were written somewhere in README.md, it would be really helpful.
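To see what discounting does in numbers, here is a small sketch that computes the discounted sum of a reward sequence. The function name `discounted_return` and the reward values are invented for illustration.

```python
def discounted_return(rewards, gamma):
    """Return r0 + gamma*r1 + gamma^2*r2 + ... for a list of rewards."""
    total = 0.0
    # Work backwards so each step discounts everything that comes after it.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: nothing for two steps, then a reward of 10.
rewards = [0.0, 0.0, 10.0]
patient = discounted_return(rewards, gamma=0.9)    # about 8.1
impatient = discounted_return(rewards, gamma=0.5)  # 2.5
```

The same delayed reward is worth much more to the agent with γ = 0.9 than with γ = 0.5, which is why a higher γ pushes it toward long-term planning.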
