Dear authors, thank you for your work.
I ran your implementation and got lower results than the ones reported in the paper. I have some questions:
- Were the results reported in the paper (average ~85.15) selected by test set?
- The average results I got from your implementation:
  - selected by validation set: ~83.37
  - selected by test set: ~84.25
  - with your seed configs turned off and 5 runs with random seeds:
    - selected by validation set: ~82.63
    - selected by test set: ~83.81
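
To be clear about what I mean by the two selection protocols, here is a minimal sketch. It assumes selection happens over per-epoch checkpoints; the helper name `select_by` and the accuracy values are hypothetical, made up for illustration, and are not taken from your code or from my runs.

```python
def select_by(select_scores, test_scores):
    """Return the test score at the epoch where select_scores is highest."""
    best_epoch = max(range(len(select_scores)), key=lambda i: select_scores[i])
    return test_scores[best_epoch]


# Hypothetical per-epoch accuracies, for illustration only.
val_acc = [80.0, 83.0, 82.0]
test_acc = [81.0, 82.5, 84.0]

# "Selected by validation set": report test accuracy at the best validation epoch.
by_val = select_by(val_acc, test_acc)

# "Selected by test set": report the best test accuracy directly (optimistic).
by_test = select_by(test_acc, test_acc)

print(by_val, by_test)
```

Selecting by test set can never give a lower number than selecting by validation set on the same run, which is why I am asking which protocol the paper used.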
Thank you for your contribution!