[Feature]: Add vakra small training data and make it the default --m3-data source

## Feature Request

Bundle the **vakra small** M3 training data with the repository and make it the default data source when `--m3-data` is passed without a path argument.

## Motivation / Problem

Today, `./benchmarks/m3/eval.sh --m3-data` requires an explicit path to a zip or directory:

```bash
if [[ -z "${2:-}" || "$2" == --* ]]; then
    echo "Error: --m3-data requires a path (zip file or directory)" >&2
    exit 2
fi
```

There is no out-of-the-box default, so every user must know where the M3 data lives before they can run `--m3-data` mode. The comment in `m3_registry_m3_data.yaml` already anticipates a "default zip" invocation (`./benchmarks/m3/eval.sh --m3-data  # default zip`), but `eval.sh` does not implement that fallback yet.

## Use Case

- Developers and CI jobs want to run `./benchmarks/m3/eval.sh --m3-data` against a well-known, representative dataset without specifying an external path every time.
- The vakra small dataset is compact enough to ship alongside the repo and covers the capability domains already listed in `m3_registry_m3_data.yaml`.
- Having a canonical default lowers the barrier to entry for new contributors running M3 evals locally.

## Proposed Solution

1. **Add the vakra small data** — commit (or reference via a `just download-m3-data` task) the vakra small zip/directory under `benchmarks/m3/data/vakra_small/` (or as a tracked zip artifact).
2. **Wire it as the default** — in `eval.sh`, change the `--m3-data` argument parsing so that omitting a path falls back to the bundled vakra small data:

   ```bash
   --m3-data)
       M3_DATA=true
       if [[ -z "${2:-}" || "$2" == --* ]]; then
           # No path supplied — use bundled vakra small data
           M3_DATA_PATH="$SCRIPT_DIR/data/vakra_small"
       else
           M3_DATA_PATH="$2"
           shift
       fi
       shift
       ;;
   ```

3. **Update help text** — document that `--m3-data` (with no path) runs against the bundled vakra small dataset.
4. **Update `m3_registry_m3_data.yaml`** — align the `domains` lists with whatever capabilities and domains vakra small actually contains.
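
Once a default exists, a missing or uncommitted `vakra_small` directory would otherwise surface as a confusing failure further down the pipeline. A minimal sketch of an existence check that could sit next to the argument parsing follows. The helper name `resolve_m3_data_path` and its error message are hypothetical (nothing like it exists in `eval.sh` today), and the default location is the one assumed in step 2:

```bash
# Hypothetical helper, not part of eval.sh today.
# Usage: resolve_m3_data_path <script_dir> [candidate]
# Prints the data path to use; returns 2 if that path does not exist.
resolve_m3_data_path() {
    local script_dir candidate path
    script_dir="$1"
    candidate="${2:-}"

    case "$candidate" in
        ""|--*)
            # No path supplied (or the next token is another flag):
            # fall back to the bundled default assumed in step 2.
            path="$script_dir/data/vakra_small"
            ;;
        *)
            path="$candidate"
            ;;
    esac

    # -e accepts both a directory and a zip file.
    if [ ! -e "$path" ]; then
        echo "Error: M3 data not found at $path" >&2
        return 2
    fi
    printf '%s\n' "$path"
}
```

The parser from step 2 could then call something like `M3_DATA_PATH="$(resolve_m3_data_path "$SCRIPT_DIR" "${2:-}")" || exit 2`, keeping the existing exit code for a bad `--m3-data` invocation. It would still need to decide whether to `shift` the path argument, as in the snippet above.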

## Alternatives Considered

- Keeping the required-path behavior and adding a separate `--m3-data-default` flag. Rejected: adds flag surface with no benefit; the implicit default is cleaner.
- Downloading the data at eval time via `setup_m3.sh`. This works but requires network access and makes the no-arg invocation slower and less reproducible.

## Priority

Medium

## Additional Context

- Related files: `benchmarks/m3/eval.sh`, `benchmarks/m3/config/m3_registry_m3_data.yaml`, `benchmarks/m3/m3_data_loader.py`
- The vakra scoring pipeline (`benchmarks/m3/m3_vakra_score.py`) already consumes the data shape that `M3DataLoader` produces, so no scoring-side changes are expected.
