No rewards are obtained in the MiniGrid-ObstructedMaze environments #2

@MurrayMa0816

Hi @swan-utokyo, I have also been working on the sparse-reward problem recently, and your DEIR work has been a great inspiration. I plan to build on it in my own research. However, while using your code I ran into the following problems:

  1. In MiniGrid, I can reproduce the results for the various sizes of the MultiRoom and KeyCorridor environments. However, in the MiniGrid-ObstructedMaze environments (1Dlh, 2Dlh, 1Dlhb, 2Dlhb, 1Q, 2Q, and Full), the same code and parameters that work for MultiRoom and KeyCorridor yield no rewards even after more than 5e7 training steps. In Figure 5 of the paper, by contrast, training in the 'Full' environment has already converged by 5e7 steps. How can I reproduce the results in the MiniGrid-ObstructedMaze environments? Do they need different parameter settings from MultiRoom and KeyCorridor?
  2. I noticed that you have also implemented the NovelD algorithm separately. With NovelD I ran into the same issue: I could not obtain any results in the MiniGrid-ObstructedMaze environments (1Dlhb, 2Dlhb, 1Q, 2Q, and Full). I set the hyperparameters according to Appendix A.4 but still could not get any results.
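To make the setup concrete, here is a minimal sketch of how the variants listed above map to environment IDs and how "no rewards" can be checked from logged episode returns. It assumes the `MiniGrid-ObstructedMaze-<variant>-v0` naming used by the MiniGrid registry (the version suffix used by the DEIR code may differ), and the helper names `obstructed_maze_ids` and `success_rate` are hypothetical, not part of the DEIR repository.

```python
# Variants named in this issue. The "-v0" suffix is an assumption based on
# the MiniGrid registry naming; adjust it if your setup uses another version.
VARIANTS = ["1Dlh", "2Dlh", "1Dlhb", "2Dlhb", "1Q", "2Q", "Full"]


def obstructed_maze_ids(variants=VARIANTS, version="v0"):
    """Build IDs such as 'MiniGrid-ObstructedMaze-Full-v0' (hypothetical helper)."""
    return [f"MiniGrid-ObstructedMaze-{v}-{version}" for v in variants]


def success_rate(episode_returns):
    """Fraction of episodes with a positive extrinsic return.

    MiniGrid only gives a positive reward when the goal is reached,
    so a rate of 0.0 corresponds to "no rewards are obtained".
    """
    if not episode_returns:
        return 0.0
    return sum(r > 0 for r in episode_returns) / len(episode_returns)


if __name__ == "__main__":
    for env_id in obstructed_maze_ids():
        print(env_id)
```

In my runs, the equivalent of `success_rate` stays at 0.0 for every ID in this list.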

I would appreciate any guidance or advice you can offer. Thank you very much.
