
Distillation degrades the accuracy #20

@lippman1125


I added distillation when training ResNet-18, but the top-1 accuracy dropped from 68.150% to 67.364%.
The hyperparameters are as follows:

4 GPUs
epochs: 90
learning_rate: 0.01
momentum: 0.9
weight_decay: 0.0001
mode: step
step_size: 20
gamma: 0.1
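
For reference, the schedule settings above can be expanded into per-epoch learning rates. This is only a sketch: it assumes `mode: step` means the usual step decay (multiply the rate by `gamma` every `step_size` epochs, as PyTorch's `StepLR` does), which the issue does not confirm. The helper name `step_lr` is hypothetical.

```python
def step_lr(epoch, base_lr=0.01, step_size=20, gamma=0.1):
    """Learning rate at a 0-indexed epoch, assuming plain step decay."""
    # The rate is multiplied by gamma once per completed block of step_size epochs.
    return base_lr * gamma ** (epoch // step_size)


# Print the rate at each epoch where it changes during the 90-epoch run.
for start in range(0, 90, 20):
    print(f"from epoch {start:2d}: lr = {step_lr(start):.0e}")
```

Under that assumption the rate is cut at epochs 20, 40, 60 and 80, so the last ten epochs run at roughly 1e-6.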

loss = ce_loss + 300 * distill_loss
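
The loss line above can be written as a small helper, together with a check of how much of the total comes from the weighted distillation term. This is a minimal sketch in plain Python: the function names are hypothetical, and the numbers in the usage lines are placeholders, not values measured in this run.

```python
def total_loss(ce_loss, distill_loss, distill_weight=300.0):
    """Combined objective: cross-entropy plus the weighted distillation term."""
    return ce_loss + distill_weight * distill_loss


def distill_share(ce_loss, distill_loss, distill_weight=300.0):
    """Fraction of the combined loss contributed by the distillation term."""
    weighted = distill_weight * distill_loss
    return weighted / (ce_loss + weighted)


# Placeholder values for illustration only (not taken from the issue).
example_ce = 1.2
example_distill = 0.004
print(total_loss(example_ce, example_distill))
print(distill_share(example_ce, example_distill))
```

Logging this share during training is one way to see whether a weight of 300 lets the distillation term dominate the cross-entropy term, which may be worth ruling out before changing other settings.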
