feat(lbfgsb): multi-start L-BFGS-B gradient local optimizer + structural catalog by haraldschilly · Pull Request #232 · haraldschilly/panobbgo

haraldschilly · 2026-05-27T05:39:06Z

Summary

Panobbgo's portfolio is rich in derivative-free generators (DE family, PSO, CMA-ES, Nelder-Mead, COBYQA) but had no working gradient-based arm. A fresh --standard --baselines run makes the gap concrete:

Every Panobbgo strategy scores 0.0 on Rosenbrock_5D (a smooth, ill-conditioned valley); composite 0.26.
scipy's dual_annealing solves it — and that win owes entirely to its own L-BFGS-B local-search step.

Panobbgo already shipped an LBFGSB heuristic that could have closed this gap, but it was effectively dead code: it ran a single descent from the box centre, then sat idle for the rest of the budget, and it was wired into neither the default strategies nor the self-improvement loop's structural catalog.

This PR rescues it:

Rewrites panobbgo/heuristics/lbfgsb.py into a robust multi-start bound-constrained quasi-Newton optimizer. The worker runs fmin_l_bfgs_b repeatedly — first descent from the box centre (deterministic), every subsequent descent from a fresh uniform-random restart — using the whole strategy budget instead of stopping after one convergence. on_restart warm-starts at the Restart analyzer's centre. Subprocess lifecycle re-modelled on the well-tested COBYQA adapter (spawn, cap=1, graceful SystemExit-on-closed-pipe). New validated kwargs max_starts / maxfun / epsilon / seed.
Adds LBFGSB to default_structural_catalog() as the 15th add_heuristic candidate (avoid_duplicates=True) — the only gradient-based arm the loop can deploy on valley-shaped problems.
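
The multi-start loop described above can be sketched in a few lines of standalone scipy. This is a minimal illustration under stated assumptions, not the code in panobbgo/heuristics/lbfgsb.py: the function name multi_start_lbfgsb, its return tuple, and the in-process evaluation counter are hypothetical, and the real worker talks to the strategy over a pipe rather than calling the objective directly. Only fmin_l_bfgs_b and the kwarg names max_starts / maxfun / epsilon / seed come from the PR text.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b, rosen


def multi_start_lbfgsb(func, bounds, budget, max_starts=None,
                       maxfun=100, epsilon=1e-8, seed=0):
    """Repeat bounded L-BFGS-B descents until the evaluation budget is spent.

    Hypothetical helper: first start at the box centre, later starts uniform-random.
    """
    lo, hi = np.asarray(bounds, dtype=float).T
    rng = np.random.default_rng(seed)
    used = 0

    def counted(x):
        # Count every objective call, including finite-difference probes.
        nonlocal used
        used += 1
        return func(x)

    best_x, best_f, starts = None, np.inf, 0
    while used < budget and (max_starts is None or starts < max_starts):
        # Deterministic first start, random restarts afterwards.
        x0 = (lo + hi) / 2.0 if starts == 0 else rng.uniform(lo, hi)
        x, f, _ = fmin_l_bfgs_b(
            counted, x0, approx_grad=True, bounds=list(zip(lo, hi)),
            epsilon=epsilon,
            # Soft cap: scipy may overshoot maxfun while finishing a line search.
            maxfun=min(maxfun, budget - used),
        )
        starts += 1
        if f < best_f:
            best_x, best_f = x, float(f)
    return best_x, best_f, used, starts


if __name__ == "__main__":
    # Assumed toy box [-2, 2]^2; the harness's real problem bounds may differ.
    x, f, used, starts = multi_start_lbfgsb(rosen, [(-2.0, 2.0)] * 2, budget=400, seed=42)
    print(f"best f = {f:.3g} after {used} evaluations in {starts} starts")
```

Seeding the restart generator keeps the whole sequence of starts reproducible, which is presumably what the seed kwarg is for; the box-centre first start needs no randomness at all.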

Evidence (local A/B via the harness, base_seed 42, budget 200)

A dedicated LBFGSB strategy solves Rosenbrock_2D and Rosenbrock_5D to func_distance ≈ 3e-11, SR 5/5 — where every default strategy scores 0.0. A standalone scipy check confirms a single centre descent reaches Rosenbrock_5D f < 0.02 in ~210 evals.
Negative result recorded: simply adding LBFGSB (or COBYQA) to the existing 5-heuristic Rewarding_Diverse portfolio does not crack Rosenbrock_5D and can regress other problems (e.g. StyblinskiTang) — the bandit splits the budget across 6 arms, so no single descent gets enough evaluations. The value is in dedicated / loop-discovered portfolios, which is exactly what the structural catalog lets the loop search for.
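
The budget-split argument can be made concrete with a small standalone experiment. The sketch below assumes an even split of the 200-evaluation budget across 6 arms (the bandit's actual allocation is adaptive and will differ) and an assumed box of [-5, 10]^5 for Rosenbrock_5D (the harness's real bounds may differ). It compares one finite-difference descent capped at the per-arm share against one that gets the whole budget; centre_descent is a hypothetical helper, not part of Panobbgo.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b, rosen

DIM = 5
LO, HI = -5.0, 10.0  # assumed box; not taken from the benchmark harness


def centre_descent(maxfun):
    """One finite-difference L-BFGS-B descent from the box centre, counting evaluations."""
    calls = 0

    def counted(x):
        nonlocal calls
        calls += 1
        return rosen(x)

    x0 = np.full(DIM, (LO + HI) / 2.0)
    _, f, _ = fmin_l_bfgs_b(counted, x0, approx_grad=True,
                            bounds=[(LO, HI)] * DIM, maxfun=maxfun)
    return float(f), calls


budget, arms = 200, 6
per_arm = budget // arms        # share of each arm under an (assumed) even split
evals_per_gradient = DIM + 1    # forward differences: f(x) plus one probe per coordinate

f_split, n_split = centre_descent(per_arm)
f_full, n_full = centre_descent(budget)
print(f"even split: {per_arm} evals/arm, about {per_arm // evals_per_gradient} gradient evaluations")
print(f"split run : f = {f_split:.3g} after {n_split} evals")
print(f"full run  : f = {f_full:.3g} after {n_full} evals")
```

With forward differences each gradient costs DIM + 1 objective calls, so a per-arm share of a few dozen evaluations buys only a handful of quasi-Newton steps, which is consistent with the observation that no single descent gets enough evaluations in the 6-arm portfolio.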

This is why the change is catalog-only and does not touch the default battery: adding a gradient arm to a budget-split portfolio is not an unconditional win, so the loop's accept/reject + bootstrap-CI guard is the right place to decide it per battery.

Evidence form (per AGENTS.md "Agent-driven improve X PRs"): local A/B with the harness; backwards-compatible (composite baseline byte-identical, existing ledgers stay valid); queued for nightly loop validation via the structural catalog.

Backwards compatibility

Strictly safe. LBFGSB is opt-in (not in any default strategy), so the composite baseline on every default battery is byte-identical. The first descent still starts from the box centre exactly as before, so the existing integration tests and the on_new_results penalty-value contract pass unchanged.

Test plan

tests/test_heuristic_lbfgsb.py (29) + tests/test_heuristic_lbfgsb_robustness.py (9), rewritten on the COBYQA template: ctor validation, subprocess lifecycle, pipe wiring, restart, fake-pipe worker multi-start / reproducibility / robustness, registration, and a Rosenbrock_5D scipy smoke.
tests/test_heuristics_lbfgsb_constraints.py, tests/test_heuristics_integration.py, tests/test_heuristics.py pass unchanged.
tests/test_self_improve.py (180 tests) passes with the catalog addition.
ruff format --check ., flake8 panobbgo --select=E9,F63,F7,F82, pyright panobbgo all clean.
Full pytest suite green in CI.
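
The fake-pipe style of worker test mentioned above can be illustrated as follows. This is a hedged sketch, not the test code in tests/test_heuristic_lbfgsb.py: FakePipe, worker, and run_until_closed are hypothetical names, and the real worker's message protocol is not shown in this PR description. The idea is to replace the multiprocessing connection with an in-process object that evaluates points directly and then simulates the parent closing the pipe, so the SystemExit-on-closed-pipe path can be exercised without spawning a subprocess.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b, rosen


class FakePipe:
    """Hypothetical in-process stand-in for one end of a multiprocessing Pipe."""

    def __init__(self, func, max_evals):
        self.func = func
        self.max_evals = max_evals
        self.sent = []       # every point the worker asked to have evaluated
        self._reply = None

    def send(self, x):
        # After max_evals requests, behave like a pipe whose other end is closed.
        if len(self.sent) >= self.max_evals:
            raise BrokenPipeError("parent closed the pipe")
        self.sent.append(np.array(x, dtype=float))
        self._reply = float(self.func(x))

    def recv(self):
        if self._reply is None:
            raise EOFError("nothing to receive")
        reply, self._reply = self._reply, None
        return reply


def worker(pipe, lo, hi):
    """Toy worker: one descent from the box centre, evaluating through the pipe."""

    def objective(x):
        try:
            pipe.send(x)
            return pipe.recv()
        except (EOFError, BrokenPipeError):
            # Closed pipe means the strategy is done: exit quietly, no traceback.
            raise SystemExit(0) from None

    x0 = (lo + hi) / 2.0
    return fmin_l_bfgs_b(objective, x0, approx_grad=True, bounds=list(zip(lo, hi)))


def run_until_closed(max_evals=10, dim=5):
    """Drive the worker against a pipe that closes after max_evals requests."""
    lo, hi = np.full(dim, -5.0), np.full(dim, 10.0)  # assumed box
    pipe = FakePipe(rosen, max_evals)
    try:
        worker(pipe, lo, hi)
        exited = False
    except SystemExit:
        exited = True
    return pipe, exited
```

A test built this way can assert that the first requested point is the box centre and that the worker stops with SystemExit as soon as the pipe closes, both without any process start-up cost.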

Docs

heuristics.rst, guide_architecture.rst, guide_benchmarking.rst, guide.rst, AGENTS.md, TODO.md, and planning/SELF_IMPROVEMENT_LOOP.md (§13 entry + LBFGSB follow-ups: dedicated default strategy needing an ADR, warm-start-from-best, max_starts catalog rule).

https://claude.ai/code/session_01UQJwKNDp2Grmcufp7iZbdW

Generated by Claude Code

…ral catalog

Rewrite the one-shot, box-centre, restart-blind, unreferenced LBFGSB stub into a robust multi-start bound-constrained quasi-Newton local optimizer and add it to default_structural_catalog()'s add_heuristic pool as the only gradient-based arm the self-improvement loop can deploy.

The benchmark harness shows every Panobbgo strategy scores 0.0 on Rosenbrock_5D (a smooth ill-conditioned valley), while scipy dual_annealing solves it via its own L-BFGS-B local search. A dedicated multi-start LBFGSB strategy now solves Rosenbrock_2D/5D to func_distance ~3e-11 (SR 5/5). Recorded negative result: adding it to a budget-split portfolio does not crack Rosenbrock and can regress other problems, so the change is catalog-only (composite baseline byte-identical, no ADR) and left for the loop's accept/reject guard to deploy per battery.

The worker runs fmin_l_bfgs_b repeatedly (first descent from the box centre, subsequent from fresh random restarts) over the full budget instead of going idle after one convergence; on_restart warm-starts at the Restart analyzer's centre. Subprocess lifecycle re-modelled on the tested COBYQA adapter.

Tests rewritten (29 + 9) on the COBYQA template; integration tests and the on_new_results penalty contract pass unchanged. Docs and planning log updated.

https://claude.ai/code/session_01UQJwKNDp2Grmcufp7iZbdW

This was referenced May 30, 2026

feat(self_improve): inactivity-guarded eps_accept relaxation #235 (Draft)

codify: Sobol.scramble=False in Rewarding_Diverse (first ledger-driven default change) #236 (Draft)

haraldschilly wants to merge 1 commit into master from claude/funny-hamilton-Wlwn8


2 participants
