question about the sparsity_target #10

@albertszg

Hello, this is brilliant work, and I want to use the binary Gumbel-softmax in my own work, but I have run into a problem.
I apply the soft mask to the first layer only (the generated mask is applied to the features after the first layer), and I see a strange phenomenon: the Gumbel noise seems to influence the training process too much. I plotted the sparsity loss on its own and found that I usually cannot reach the sparsity target I set. Is this process correct?
temp=5.0
[Screenshot: 微信截图_20211206151218 (WeChat screenshot)]
temp=1.0
later
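
To make the setup concrete, here is a minimal pure-Python sketch of what I understand a binary Gumbel-softmax mask with a sparsity loss to look like. It is not the repository's code: the names `binary_gumbel_softmax` and `sparsity_loss`, the squared-error form of the loss, the target of 0.5, and the random logits are all my own assumptions for illustration.

```python
import math
import random


def sample_gumbel(rng):
    # Gumbel(0, 1) sample via inverse transform; u is clamped away from 0 and 1.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def binary_gumbel_softmax(logits, temp, rng, noise=True):
    # Hypothetical helper: returns (soft_mask, hard_mask) for per-feature logits.
    soft, hard = [], []
    for logit in logits:
        # The difference of two Gumbel samples perturbs the "keep" logit.
        g = sample_gumbel(rng) - sample_gumbel(rng) if noise else 0.0
        noisy = logit + g
        soft.append(sigmoid(noisy / temp))
        # Thresholding the soft value at 0.5 is the same as taking the sign of the noisy logit.
        hard.append(1.0 if noisy > 0 else 0.0)
    return soft, hard


def sparsity_loss(mask, target):
    # Assumed loss: squared gap between the kept fraction and the target.
    kept = sum(mask) / len(mask)
    return (kept - target) ** 2


rng = random.Random(0)
logits = [rng.uniform(-2.0, 2.0) for _ in range(256)]  # made-up logits
for temp in (5.0, 1.0):
    losses = []
    for _ in range(5):
        _, hard_mask = binary_gumbel_softmax(logits, temp, rng)
        losses.append(sparsity_loss(hard_mask, target=0.5))
    print(temp, [round(v, 5) for v in losses])
```

In this sketch the hard mask does not depend on `temp` at all: each entry is kept with probability `sigmoid(logit)`, so the kept fraction changes from draw to draw even when the logits are fixed, and `temp` only controls how close the soft values stay to 0.5. Whether that matches what I see in my plots depends on how the repository actually computes the mask and the loss.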
