fix(m3): make failure banner reachable by selectively suppressing ERR around evaluator invocations

## Background

In `benchmarks/m3/eval.sh`, the `ERR` trap fires on evaluator failure regardless of `set -e`/`set +e` state in bash. Because `cleanup()` unconditionally ends with `exit $exit_code`, the script never reaches `EVAL_EXIT=$?` or the `else` branch after the evaluator invocation. As a result, the "✗ M3 evaluation failed (exit code: ...)" banner is unreachable.

This was identified during review of PR #4 (comment: https://github.com/cuga-project/cuga-eval/pull/4#discussion_r3369504327).

## Functional impact

None — the crash-salvage behavior (`create_bundle` via `ERR`/`EXIT` trap) is already correct. This is a cosmetic issue only.

## Proposed fix

Selectively suppress `ERR` around each `uv run` evaluator invocation so the script falls through to the explicit success/failure branches instead of immediately trapping:

```bash
trap '' ERR
uv run python -m benchmarks.m3.eval_m3 ...
EVAL_EXIT=$?
trap cleanup ERR
```

This change needs careful testing against the crash-salvage path to ensure the bundle is still created correctly on early interrupt/exception.

## Requested by

@haroldship

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(m3): make failure banner reachable by selectively suppressing ERR around evaluator invocations #55

Background

Functional impact

Proposed fix

Requested by

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

fix(m3): make failure banner reachable by selectively suppressing ERR around evaluator invocations #55

Description

Background

Functional impact

Proposed fix

Requested by

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions