Thanks for your excellent work.
I have a few questions about the comparison.
Based on the comparison, the main differences between the SRSTE and Bi-Mask training setups are the following:
- SRSTE: FirstConv uses SparsityConv, and training uses "nn.CrossEntropy".
- Bi-Mask: FirstConv uses DenseConv, and training uses "LabelSmooth(0.1)".
We know that these two differences can often lead to significant accuracy gaps.
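To make the loss difference concrete, here is a minimal sketch in plain Python. It assumes that "LabelSmooth(0.1)" follows the usual uniform label-smoothing formulation (one-hot target mixed with a uniform distribution); I have not checked this against either codebase, and the function names below are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def cross_entropy(logits, target, smoothing=0.0):
    """Cross-entropy for one sample with optional uniform label smoothing.

    smoothing=0.0 gives plain cross-entropy.
    smoothing=0.1 is the assumed meaning of LabelSmooth(0.1).
    """
    logp = log_softmax(logits)
    nll = -logp[target]                # loss against the one-hot target
    uniform = -sum(logp) / len(logp)   # loss against a uniform target
    return (1.0 - smoothing) * nll + smoothing * uniform


logits = [2.0, 0.5, -1.0]
plain = cross_entropy(logits, target=0)
smoothed = cross_entropy(logits, target=0, smoothing=0.1)
print(f"plain CE: {plain:.4f}, label-smoothed CE: {smoothed:.4f}")
```

The two objectives give different loss values (and gradients) for the same logits, which is why I would expect the choice of loss alone to shift final accuracy.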
Looking forward to your reply.
Thanks