feat(lbfgsb): multi-start L-BFGS-B gradient local optimizer + structural catalog #232

Draft
haraldschilly wants to merge 1 commit into master from claude/funny-hamilton-Wlwn8


@haraldschilly
Owner

Summary

Panobbgo's portfolio is rich in derivative-free generators (DE family, PSO, CMA-ES, Nelder-Mead, COBYQA) but had no working gradient-based arm. A fresh --standard --baselines run makes the gap concrete:

  • Every Panobbgo strategy scores 0.0 on Rosenbrock_5D (a smooth, ill-conditioned valley); composite 0.26.
  • scipy's dual_annealing solves it — and that win owes entirely to its own L-BFGS-B local-search step.

Panobbgo already shipped an LBFGSB heuristic that could have closed this gap, but it was effectively dead code: it ran a single descent from the box centre, then went idle for the rest of the budget, and was wired into neither the default strategies nor the self-improvement loop's structural catalog.

This PR rescues it:

  • Rewrites panobbgo/heuristics/lbfgsb.py into a robust multi-start, bound-constrained quasi-Newton optimizer. The worker runs fmin_l_bfgs_b repeatedly: the first descent starts from the box centre (deterministic) and every subsequent descent starts from a fresh uniform-random restart point, so the heuristic uses the whole strategy budget instead of stopping after one convergence. on_restart warm-starts at the Restart analyzer's centre. The subprocess lifecycle is re-modelled on the well-tested COBYQA adapter (spawn, cap=1, graceful SystemExit on a closed pipe). New validated kwargs: max_starts, maxfun, epsilon, seed.
  • Adds LBFGSB to default_structural_catalog() as the 15th add_heuristic candidate (avoid_duplicates=True) — the only gradient-based arm the loop can deploy on valley-shaped problems.
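For readers who want the shape of the multi-start loop without opening the diff, here is a minimal standalone sketch. It is not the code in panobbgo/heuristics/lbfgsb.py: the helper name multi_start_lbfgsb, its signature, and the [-5, 5] box in the demo are assumptions for illustration, and it omits the subprocess and pipe handling entirely. It assumes finite-difference gradients (approx_grad=True), which is what the epsilon kwarg suggests.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b, rosen


def multi_start_lbfgsb(func, bounds, budget, max_starts=None, epsilon=1e-8, seed=0):
    """Hypothetical helper (not Panobbgo's worker): repeated bounded descents."""
    lo, hi = np.asarray(bounds, dtype=float).T
    rng = np.random.default_rng(seed)
    evals = 0

    def counted(x):
        # Count every evaluation, including finite-difference probes.
        nonlocal evals
        evals += 1
        return float(func(x))

    best_x, best_f, starts = None, np.inf, 0
    while evals < budget and (max_starts is None or starts < max_starts):
        # First descent: box centre (deterministic). Later ones: uniform-random.
        x0 = (lo + hi) / 2.0 if starts == 0 else rng.uniform(lo, hi)
        x, f, _info = fmin_l_bfgs_b(
            counted,
            x0,
            approx_grad=True,
            bounds=list(zip(lo, hi)),
            epsilon=epsilon,
            maxfun=budget - evals,  # soft limit: scipy may overshoot slightly
        )
        starts += 1
        if f < best_f:
            best_x, best_f = x.copy(), float(f)
    return best_x, best_f, evals, starts


if __name__ == "__main__":
    # Assumed box; the PR does not state the harness bounds for Rosenbrock_5D.
    box = [(-5.0, 5.0)] * 5
    x, f, evals, starts = multi_start_lbfgsb(rosen, box, budget=200)
    print(f"best f={f:.3g} after {evals} evals in {starts} start(s)")
```

Because maxfun is only checked between iterations, the sketch can spend a few evaluations past the budget; a real worker would need a hard cut-off.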

Evidence (local A/B via the harness, base_seed 42, budget 200)

  • A dedicated LBFGSB strategy solves Rosenbrock_2D and Rosenbrock_5D to func_distance ≈ 3e-11, SR 5/5 — where every default strategy scores 0.0. A standalone scipy check confirms a single centre descent reaches Rosenbrock_5D f < 0.02 in ~210 evals.
  • Negative result recorded: simply adding LBFGSB (or COBYQA) to the existing 5-heuristic Rewarding_Diverse portfolio does not crack Rosenbrock_5D and can regress other problems (e.g. StyblinskiTang) — the bandit splits the budget across 6 arms, so no single descent gets enough evaluations. The value is in dedicated / loop-discovered portfolios, which is exactly what the structural catalog lets the loop search for.

This is why the change is catalog-only and does not touch the default battery: adding a gradient arm to a budget-split portfolio is not an unconditional win, so the loop's accept/reject + bootstrap-CI guard is the right place to decide it per battery.
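A rough back-of-envelope for the budget-split effect noted above. It assumes an even split across arms (the bandit actually reallocates adaptively) and forward-difference gradients costing dim + 1 evaluations each; descent_share is a hypothetical helper, not part of the harness.

```python
def descent_share(budget: int, n_arms: int, dim: int) -> tuple[int, int]:
    """Evals per arm under an even split, and the gradients that share buys."""
    per_arm = budget // n_arms    # even split is an assumption
    evals_per_gradient = dim + 1  # f(x) plus one forward probe per coordinate
    return per_arm, per_arm // evals_per_gradient


per_arm, gradients = descent_share(budget=200, n_arms=6, dim=5)
print(per_arm, gradients)  # 33 5
```

Under these assumptions the gradient arm gets about 33 evaluations, enough for roughly five gradient estimates, against the ~210 evaluations the standalone centre descent needed to reach f < 0.02 on Rosenbrock_5D.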

Evidence form (per AGENTS.md "Agent-driven improve X PRs"): local A/B with the harness; backwards-compatible (composite baseline byte-identical, existing ledgers stay valid); queued for nightly loop validation via the structural catalog.

Backwards compatibility

Strictly safe. LBFGSB is opt-in (not in any default strategy), so the composite baseline on every default battery is byte-identical. The first descent still starts from the box centre exactly as before, so the existing integration tests and the on_new_results penalty-value contract pass unchanged.

Test plan

  • tests/test_heuristic_lbfgsb.py (29) + tests/test_heuristic_lbfgsb_robustness.py (9), rewritten on the COBYQA template: ctor validation, subprocess lifecycle, pipe wiring, restart, fake-pipe worker multi-start / reproducibility / robustness, registration, and a Rosenbrock_5D scipy smoke.
  • tests/test_heuristics_lbfgsb_constraints.py, tests/test_heuristics_integration.py, tests/test_heuristics.py pass unchanged.
  • tests/test_self_improve.py (180) passes with the catalog addition.
  • ruff format --check ., flake8 panobbgo --select=E9,F63,F7,F82, pyright panobbgo all clean.
  • Full pytest suite green in CI.

Docs

Updated heuristics.rst, guide_architecture.rst, guide_benchmarking.rst, guide.rst, AGENTS.md, TODO.md, and planning/SELF_IMPROVEMENT_LOOP.md (§13 entry plus LBFGSB follow-ups: a dedicated default strategy, which needs an ADR; warm-start-from-best; a max_starts catalog rule).

https://claude.ai/code/session_01UQJwKNDp2Grmcufp7iZbdW


Generated by Claude Code

…ral catalog

Rewrite the one-shot, box-centre, restart-blind, unreferenced LBFGSB stub
into a robust multi-start bound-constrained quasi-Newton local optimizer and
add it to default_structural_catalog()'s add_heuristic pool as the only
gradient-based arm the self-improvement loop can deploy.

The benchmark harness shows every Panobbgo strategy scores 0.0 on
Rosenbrock_5D (a smooth ill-conditioned valley), while scipy dual_annealing
solves it via its own L-BFGS-B local search. A dedicated multi-start LBFGSB
strategy now solves Rosenbrock_2D/5D to func_distance ~3e-11 (SR 5/5).
Recorded negative result: adding it to a budget-split portfolio does not
crack Rosenbrock and can regress other problems, so the change is
catalog-only (composite baseline byte-identical, no ADR) and left for the
loop's accept/reject guard to deploy per battery.

The worker runs fmin_l_bfgs_b repeatedly (first descent from the box centre,
subsequent from fresh random restarts) over the full budget instead of going
idle after one convergence; on_restart warm-starts at the Restart analyzer's
centre. Subprocess lifecycle re-modelled on the tested COBYQA adapter.

Tests rewritten (29 + 9) on the COBYQA template; integration tests and the
on_new_results penalty contract pass unchanged. Docs and planning log updated.

https://claude.ai/code/session_01UQJwKNDp2Grmcufp7iZbdW
