In the pretraining `get_loss` function, `loss_lm` is reduced with `.mean()`. Because of this, the mean is taken over all positions, so the zero losses at unmasked positions dilute the result as if they were correct predictions. I think we should replace the mean with an explicit numerator / denominator, as in the TensorFlow implementation:

```python
loss_lm = (loss_lm * masked_weights.float()).mean()
```

to

```python
loss_lm_numerator = (loss_lm * masked_weights.float()).sum()
loss_lm_denominator = masked_weights.sum() + 1e-5
loss_lm = loss_lm_numerator / loss_lm_denominator
```

Is this correct?
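To illustrate the dilution effect, here is a toy sketch (plain Python, not the repo's actual tensor code; the function names are made up for this example) comparing a plain mean against a masked mean on a per-token loss vector:

```python
def plain_mean(losses, weights):
    # Divides by ALL positions, so zeroed losses at unmasked
    # positions drag the average down.
    total = sum(l * w for l, w in zip(losses, weights))
    return total / len(losses)

def masked_mean(losses, weights, eps=1e-5):
    # Divides only by the number of masked positions
    # (numerator / denominator with a small epsilon, as in the
    # TensorFlow-style fix proposed above).
    numerator = sum(l * w for l, w in zip(losses, weights))
    denominator = sum(weights) + eps
    return numerator / denominator

# Toy example: 2 of 8 positions are masked, each with loss 4.0.
losses  = [4.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0]
weights = [1,   0,   0,   1,   0,   0,   0,   0]

plain_mean(losses, weights)   # 1.0  -- diluted by the 6 unmasked positions
masked_mean(losses, weights)  # ~4.0 -- average loss over masked tokens only
```

With `.mean()`, adding more unmasked positions lowers the reported loss even though the model's predictions on the masked tokens are unchanged.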