Questions about mask generation

Hi @thomasverelst 

Congrats, nice work! I have two questions out of curiosity:

1) Forward pass: Why did you choose to sample from a Bernoulli distribution instead of using the Gumbel-softmax? To my knowledge, sampling from a Bernoulli distribution biases the gradient estimate, which could make optimization trickier. I understand that you would not be able to use sparse convolutions during training, but I wonder if there is another reason.

2) Have you tried annealing the temperature parameter to less than 1?
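
To make both questions concrete, here is a minimal pure-Python sketch of the two sampling options for a single mask element. It is not the repository's code: the function names, the two-Gumbel ("Gumbel-sigmoid") form of the binary Gumbel-softmax, and the exponential schedule in `annealed_tau` are all my own assumptions for illustration. It shows that the hard decision fires with probability sigmoid(logit) in both cases, so the temperature only changes how sharp the soft value is, and that soft value is what would carry the gradient in a straight-through setup.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_gumbel(rng):
    # Standard Gumbel(0, 1) noise via inverse transform sampling.
    # The clamp avoids log(0) at the edges of the unit interval.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def bernoulli_mask(logit, rng):
    # Option A: draw the hard decision directly from Bernoulli(sigmoid(logit)).
    return 1 if rng.random() < sigmoid(logit) else 0


def gumbel_sigmoid_mask(logit, tau, rng):
    # Option B: binary Gumbel-softmax. The difference of two Gumbel samples
    # is logistic noise; tau scales the sharpness of the soft value.
    noisy = logit + sample_gumbel(rng) - sample_gumbel(rng)
    soft = sigmoid(noisy / tau)
    hard = 1 if soft > 0.5 else 0  # same as noisy > 0, independent of tau
    return soft, hard


def annealed_tau(step, total_steps, tau_start=1.0, tau_end=0.1):
    # Hypothetical exponential schedule from tau_start down to tau_end.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return tau_start * (tau_end / tau_start) ** frac


def summarize(logit, tau, n, rng):
    # Returns (fraction of hard ones, mean distance of soft value from 0.5).
    ones, sharpness = 0, 0.0
    for _ in range(n):
        soft, hard = gumbel_sigmoid_mask(logit, tau, rng)
        ones += hard
        sharpness += abs(soft - 0.5)
    return ones / n, sharpness / n


if __name__ == "__main__":
    rng = random.Random(0)
    logit, n = 0.5, 20000
    bern = sum(bernoulli_mask(logit, rng) for _ in range(n)) / n
    print("target probability:", sigmoid(logit))
    print("Bernoulli hard rate:", bern)
    for step in (0, 50, 100):
        tau = annealed_tau(step, 100)
        print("tau", tau, "->", summarize(logit, tau, n, rng))
```

With a temperature below 1 the soft values move closer to 0 or 1, so the gap between the soft value used for gradients and the hard value used in the forward pass shrinks, at the cost of larger gradient variance.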

Questions about mask generation #12
