Does EvaByte support other evaluation architectures? #4

@EthanLI24

Hi, wonderful and insightful work!

I tried to reproduce your results by evaluating your Hugging Face checkpoint with OpenCompass. I set the batch size to 1 and enabled trust_remote_code, along with the other settings you recommended. However, I could not match your reported results, so I suspect my settings differ from yours somewhere. Have you evaluated the checkpoint with OpenCompass or on other benchmarks? Is there any other experience you can share with us?

Thanks!
