Handle comparison-only answers in structured evaluation #396

Description

@ChathurangiShyalika

Some scenarios require the agent to compare two items, and the agent may answer with the comparison alone without explicitly providing the expected key. We need a clear rule for when to score such an answer as an abstention and when to treat it as a clarification or partial response.

Why

Current evaluation can be ambiguous when an answer is based on a comparison but does not use the exact expected key format.

Goal

Define a consistent policy for comparison-only scenarios so scoring is stable and predictable.
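
As a starting point for discussion, one possible shape for such a policy is sketched below. Everything in it is an assumption made for illustration and not existing project code: the `Verdict` labels, the `classify` function, its arguments, and the list of comparison cue words are all hypothetical. The sketch treats an answer as answered when the expected key is present, as partial when the key is missing but the free text contains a comparison, and as an abstention otherwise.

```python
import re
from enum import Enum


class Verdict(Enum):
    """Hypothetical scoring labels; the real policy may use different ones."""
    ANSWERED = "answered"
    PARTIAL = "partial"
    ABSTENTION = "abstention"


# Assumed cue words that suggest the agent made a comparison.
# This list is illustrative only and would need tuning on real answers.
COMPARISON_CUES = re.compile(
    r"\b(than|versus|vs|compared (?:to|with)|higher|lower|more|less)\b",
    re.IGNORECASE,
)


def classify(structured: dict, free_text: str, expected_key: str) -> Verdict:
    """Classify one agent answer for a comparison scenario.

    structured   -- parsed key/value answer from the agent (may be empty)
    free_text    -- the agent's prose answer (may be empty)
    expected_key -- the key the scenario expects in the structured answer
    """
    # Rule 1: the expected key is present with a value.
    if structured.get(expected_key) is not None:
        return Verdict.ANSWERED
    # Rule 2: no key, but the prose contains a comparison.
    if COMPARISON_CUES.search(free_text or ""):
        return Verdict.PARTIAL
    # Rule 3: neither a key nor a comparison was given.
    return Verdict.ABSTENTION
```

Applying the rules in a fixed order means the same answer always receives the same label, which is the stability the goal asks for. A keyword match is a deliberately simple stand-in here; whatever comparison detection is chosen would take its place in the second rule.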
