Hello, this is brilliant work, and I would like to use the binary Gumbel-softmax in my own work, but I have run into a problem.
I use the soft mask for the first layer only (the generated mask is simply applied to the features after the first layer), and I see a strange phenomenon: the Gumbel noise seems to influence the training process too much. I plotted only the sparsity loss, and I usually cannot reach the sparsity target I set. Is this the right way to do it?
[Plot: sparsity loss, temp=5.0]
[Plot: sparsity loss, temp=1.0]
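
To make the setup concrete, here is a minimal sketch in plain Python of a binary Gumbel-softmax soft mask and a squared-error sparsity loss. It is not the repository's code: the function names (`sample_gumbel`, `soft_mask`, `mask_stats`, `sparsity_loss`), the constant logit of -1.0, and the squared-error form of the loss are all assumptions chosen for illustration.

```python
import math
import random


def sample_gumbel(rng):
    # Gumbel(0, 1) sample via inverse transform; u is clamped away from 0
    # so that the inner log is always defined.
    u = max(rng.random(), 1e-12)
    return -math.log(-math.log(u))


def soft_mask(logit, temp, rng):
    # Binary Gumbel-softmax: a two-class softmax over (logit, 0) with
    # independent Gumbel noise on each class reduces to a sigmoid of the
    # noisy logit divided by the temperature.
    noise = sample_gumbel(rng) - sample_gumbel(rng)
    return 1.0 / (1.0 + math.exp(-(logit + noise) / temp))


def mask_stats(logits, temp, seed=0):
    # Returns (mean of the soft mask, fraction of entries a hard
    # threshold at 0.5 would keep). The seed is an illustrative default.
    rng = random.Random(seed)
    soft = [soft_mask(l, temp, rng) for l in logits]
    soft_mean = sum(soft) / len(soft)
    hard_frac = sum(s > 0.5 for s in soft) / len(soft)
    return soft_mean, hard_frac


def sparsity_loss(keep_fraction, target):
    # Assumed squared-error penalty between kept fraction and target.
    return (keep_fraction - target) ** 2


if __name__ == "__main__":
    logits = [-1.0] * 10000  # hypothetical mask logits, all equal
    for temp in (5.0, 1.0):
        soft_mean, hard_frac = mask_stats(logits, temp)
        print(temp, soft_mean, hard_frac, sparsity_loss(soft_mean, 0.25))
```

Under these assumptions, the fraction kept by a hard threshold stays near sigmoid(logit) at both temperatures, while the mean of the soft mask moves toward 0.5 as the temperature grows. If the sparsity loss is computed on the soft mask, that gap may be one reason the measured sparsity does not settle at the target.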
