Zone air temperature fluctuates a lot after 200 episodes

Hello! I trained the RL model using the original code in the baseline Docker setup, with the command:
`python3 -m baselines_energyplus.trpo_mpi.run_energyplus --num-timesteps 10000000`. The model was trained for 200 episodes, saved, and then used for inference with the same IDF file and weather file. During inference, the west zone temperature fluctuates a lot (the annual mean temperature is about 22.4 degrees), as shown in the graph below:
<img width="810" alt="Screenshot 2022-07-27 12 52 45" src="https://user-images.githubusercontent.com/82103414/181304887-86f71995-f755-4a67-a966-3e53091c64b7.png">
Also, the setpoint temperature chosen by the agent's action is almost always at the lowest possible value:
<img width="963" alt="Screenshot 2022-07-27 12 58 05" src="https://user-images.githubusercontent.com/82103414/181305835-bd5e8119-4ad8-46af-902b-564c2de27083.png">
Is this the expected behavior after 200 episodes of training?
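
In case it helps to put numbers on both observations, here is a minimal sketch of how they could be quantified from the inference output. The function name `summarize`, the parameter `setpoint_min`, and the sample values are all hypothetical placeholders, not taken from the repository or from my run; the real lower bound of the setpoint depends on the environment's action space, which I have not verified.

```python
from statistics import mean, pstdev


def summarize(zone_temps, setpoints, setpoint_min, tol=1e-6):
    """Summarize zone temperature spread and how often the setpoint is at its lower bound.

    zone_temps   -- zone air temperatures, one value per time step
    setpoints    -- setpoint temperatures chosen by the agent, one per time step
    setpoint_min -- assumed lower bound of the setpoint (hypothetical parameter)
    """
    # Count the steps where the setpoint equals the lower bound, within a tolerance.
    at_min = sum(1 for s in setpoints if abs(s - setpoint_min) <= tol)
    return {
        "temp_mean": mean(zone_temps),
        "temp_std": pstdev(zone_temps),  # population standard deviation
        "temp_range": max(zone_temps) - min(zone_temps),
        "frac_at_min": at_min / len(setpoints),
    }


# Placeholder values for illustration only, not measurements from my run.
temps = [21.0, 24.0, 20.5, 24.5]
setpoints = [10.0, 10.0, 10.0, 12.0]
stats = summarize(temps, setpoints, setpoint_min=10.0)
print(stats)
```

A large `temp_std` together with a `frac_at_min` close to 1 would describe what the two graphs show: the mean looks reasonable while the zone temperature swings widely and the action stays at one end of its range.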


