Conversation
Add scripts to run intervention experiments that inject steps from successful/failed traces into new agent runs to measure knowledge vs reasoning gaps across scientific environments. Pipeline: select tasks (from reports_v2) -> run baseline -> pick traces from baseline -> run intervention conditions -> analyze.
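The five pipeline stages above can be sketched as a small driver. This is a hypothetical sequence, not the actual scripts' interface: only `setup_envs.sh`, `launch_sweep.sh`, `run_intervention.py`, and the `--start-servers`/`--stop-servers`/`--trials` flags come from this PR; everything else is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical pipeline driver; paths and flags beyond setup_envs.sh,
# launch_sweep.sh, run_intervention.py and the sweep flags are assumptions.
set -u

run() {
  # With DRY_RUN=1, print the command instead of executing it, so the
  # stage order can be inspected without live task servers.
  if [ "${DRY_RUN:-0}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

pipeline() {
  run ./scripts/setup_envs.sh                    # one-time venv creation
  run ./scripts/launch_sweep.sh --start-servers  # stateful servers, one per port
  run ./scripts/launch_sweep.sh --trials 1       # baseline runs (smoke test)
  run python scripts/run_intervention.py         # pick traces, run conditions
  run ./scripts/launch_sweep.sh --stop-servers
}
```

`DRY_RUN=1 pipeline` prints the planned command sequence without touching any environment.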
- Each env now has two server ports (react/toolcalling) to allow safe parallel runs; the server is stateful, so concurrent agents would clash
- Add scripts/setup_envs.sh for one-time venv creation (uv for spectra/resistor, micromamba for wetlab due to conda-only reaktoro)
- launch_sweep.sh gains --start-servers/--stop-servers/--server-status
- Resistor env.py uses argparse with --mode single/chained (no path needed)
- Wetlab pyproject.toml updated with corral dep and uv.sources
Replace declare -A (bash 4+) with case-based lookup functions. Tested on macOS bash 3.2.57. Also add generated task_selection.json.
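A minimal sketch of the `case`-based lookup pattern that replaces `declare -A` (associative arrays require bash 4+, while macOS ships bash 3.2). The function name and port numbers here are illustrative, not the values used in the repo:

```shell
# Map env + agent type to a server port without associative arrays,
# so the script runs on bash 3.2. Ports below are made up for illustration.
server_port() {
  local env="$1" agent="$2"
  case "${env}:${agent}" in
    spectra:react)        echo 8001 ;;
    spectra:toolcalling)  echo 8002 ;;
    resistor:react)       echo 8003 ;;
    resistor:toolcalling) echo 8004 ;;
    wetlab:react)         echo 8005 ;;
    wetlab:toolcalling)   echo 8006 ;;
    *) echo "unknown env/agent: ${env}:${agent}" >&2; return 1 ;;
  esac
}
```

The `env:agent` key trick keeps the lookup a single `case`, which is why each env can get two ports (react/toolcalling) without any bash-4 features.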
- setup_envs.sh: upgrade promptstore and install boto3 for Bedrock
- launch_sweep.sh: add --trials flag for smoke testing (e.g. --trials 1)
- run_intervention.py: cap k_values at trials count to avoid validation error
- Verified end-to-end: setup venvs → start servers → launch baselines → reports
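The k_values cap is a one-line guard in run_intervention.py; expressed in shell for illustration (the function and variable names are assumptions, only the cap-at-trials behaviour comes from the PR):

```shell
# Clamp a requested k to the number of trials actually run, mirroring the
# run_intervention.py fix; names here are illustrative.
cap_k() {
  local k="$1" trials="$2"
  if (( k > trials )); then k="$trials"; fi
  echo "$k"
}
```

With `--trials 1`, any larger k collapses to 1 instead of tripping the downstream validation error.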
Updated uv.lock files across all task environments after upgrading promptstore. Added generated prompts/index.json.
@n0w0f can we close this?