Adding Boltzmann Softmax Exploration

Please view the [Contributing Guidelines](https://github.com/Bluejee/MNEST/blob/main/CONTRIBUTING.md) for information on Contributing.

**Is your feature request related to a problem? Please describe.**
The epsilon-greedy method handles several high Q-values poorly. When it exploits, it always picks the single highest Q-value and ignores every other action, even one whose Q-value is nearly as high. As a result, good actions other than the current best are rarely exploited.

**Describe the solution you'd like**
Implement Boltzmann softmax exploration, which lets the Brain pick actions probabilistically. Actions with higher Q-values get higher probabilities, and every action keeps a nonzero chance of being picked.
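
To make the request concrete, here is a minimal sketch of Boltzmann softmax action selection, where an action `a` is picked with probability `exp(Q(a) / T) / sum_b exp(Q(b) / T)`. The function names, the `temperature` parameter, and the plain-list representation of Q-values are assumptions for illustration only, not part of the existing MNEST API.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature=1.0):
    """Return softmax probabilities over actions for the given Q-values."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    max_q = max(q_values)
    weights = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_action(q_values, temperature=1.0, rng=random):
    """Sample an action index in proportion to its softmax probability."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Hypothetical Q-values: two nearly equal high values and one low value.
q = [1.0, 0.9, -1.0]
probs = boltzmann_probabilities(q, temperature=0.5)
action = boltzmann_action(q, temperature=0.5)
```

With these example values, the two high-valued actions both receive substantial probability, while the low-valued action is picked only occasionally. A lower temperature moves the behaviour towards greedy selection, and a higher temperature moves it towards uniform random selection.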


**Additional context**
[Boltzmann Softmax Exploration](https://www.ijcai.org/Proceedings/2020/0276.pdf)

