Why sample action from a gaussian distribution instead of using continuous methods, e.g., DDPG?

Hi developers,

Thank you for releasing the code. 

While reading your code, I find it interesting that you're sampling action from a gaussian distribution and control the mean of that gaussian. [link](https://github.com/nv-tlabs/meta-sim/blob/bd94f1aecdb53b8194a03f2d74757d73d850dd15/models/metasim.py#L87). 

There are some continuous control alternatives that can also be applied to this problem, e.g. PPO, DDPG. Can you please elaborate on why you choose your method and how does it help the meta-sim framework? 

Best

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Why sample action from a gaussian distribution instead of using continuous methods, e.g., DDPG? #4

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Why sample action from a gaussian distribution instead of using continuous methods, e.g., DDPG? #4

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions