Discrepancy between pre-trained and self trained models #28

@m-shayan73

Hello,

We are running evaluations for the E3 Cadets dataset and have encountered a discrepancy between the paper's results and our own self-trained model.

The provided pre-trained model perfectly matches the results reported in the paper (0.9701 F1-Score). This gives us confidence that our evaluation setup is correct.

However, our self-trained model performs significantly worse: the F1-Score drops from 0.9701 to 0.8972, a relative difference of -7.51%. We generally consider a difference of up to 2% acceptable, but this gap is well beyond that.

OUR RESULTS

| Model | Precision | Recall | F1-Score | % F1 Diff (from Paper) |
|---|---|---|---|---|
| Paper (Baseline) | 0.9440 | 0.9977 | 0.9701 | N/A |
| Pre-trained | 0.9441 | 0.9977 | 0.9701 | 0.00% |
| Own-trained | 0.8151 | 0.9977 | 0.8972 | -7.51% |
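
For reference, the figures above can be re-derived from precision and recall alone. The snippet below is a minimal sketch, assuming the standard F1 definition (harmonic mean of precision and recall) and that the last column is a relative difference. The helper names `f1`, `rel_diff_pct`, and `fp_per_tp` are ours for illustration and are not part of the repository.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def rel_diff_pct(value: float, baseline: float) -> float:
    """Relative difference of value from baseline, in percent."""
    return (value - baseline) / baseline * 100


def fp_per_tp(precision: float) -> float:
    """False positives per true positive, from P = TP / (TP + FP)."""
    return 1 / precision - 1


# Rounded (precision, recall) pairs copied from the results table.
results = {
    "Paper (Baseline)": (0.9440, 0.9977),
    "Pre-trained": (0.9441, 0.9977),
    "Own-trained": (0.8151, 0.9977),
}

for name, (p, r) in results.items():
    # F1 recomputed from rounded inputs can differ from the table in the 4th decimal.
    print(f"{name}: F1={f1(p, r):.4f}, FP per TP={fp_per_tp(p):.3f}")

# Relative F1 difference of the own-trained model, using the reported F1 values.
print(f"Relative F1 diff: {rel_diff_pct(0.8972, 0.9701):.2f}%")
```

Because recall is the same in all three rows, the F1 drop is driven entirely by precision: roughly 0.06 false positives per true positive for the paper's numbers versus roughly 0.23 for our own-trained model.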

Since the pre-trained model reproduces the paper's results, the discrepancy seems to lie in the training process itself. Recall is unchanged at 0.9977, so the entire drop comes from precision (0.9440 to 0.8151). Any guidance on what could cause such a difference?
