feat(nl_shade_lbc): NL-SHADE-LBC adaptive DE (CEC 2022 winner) #233

Draft
haraldschilly wants to merge 1 commit into master from claude/funny-hamilton-1zn3H

@haraldschilly
Owner

Summary

Add NL-SHADE-LBC (Stanovov, Akhmedova & Semenkin, CEC 2022 winner) — the literature frontier of the SHADE / L-SHADE / jSO / NL-SHADE-RSP adaptive Differential Evolution lineage. It is a direct subclass of NLSHADE_RSP that adds Linear Bias Change in the success-history memory update.

The standard L-SHADE / jSO / NL-SHADE-RSP Lehmer mean uses fixed exponents (s²/s¹); NL-SHADE-LBC generalises this to Σ(w·sᵖ) / Σ(w·s^(p−m)) with the order p linearly scheduled across budget progress:

p_F(r)  = (1 − r) · p_F_init  + r · p_F_final    # defaults 3.5 → 1.5
p_CR(r) = (1 − r) · p_CR_init + r · p_CR_final   # defaults 1.0 → 1.5

with constant spread m_lbc = 1.5. These defaults are the literature values from Stanovov et al. (2022), verified against the MetaBox open-source reference implementation. At p = 2, m = 1 the formula recovers the standard L-SHADE Lehmer mean, so both regimes are reachable from the default catalog and the bandit can flip between them.
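The schedule and the generalised mean above can be sketched in a few lines. This is a minimal illustration of the formulas as written in this description, not the code in the PR: the function names `lbc_order` and `lbc_mean` and the sample values are hypothetical, and the clipping and unknown-budget behaviour follow the test plan's wording.

```python
from typing import Optional, Sequence


def lbc_order(progress: Optional[float], p_init: float, p_final: float) -> float:
    """Linearly interpolate the Lehmer order p over budget progress r in [0, 1]."""
    if progress is None:  # budget unknown: stay at the initial order
        return p_init
    r = min(max(progress, 0.0), 1.0)  # clip so progress > 1 cannot overshoot
    return (1.0 - r) * p_init + r * p_final


def lbc_mean(
    values: Sequence[float], weights: Sequence[float], p: float, m: float
) -> float:
    """Generalised Lehmer mean: sum(w * s**p) / sum(w * s**(p - m))."""
    num = sum(w * s ** p for s, w in zip(values, weights))
    den = sum(w * s ** (p - m) for s, w in zip(values, weights))
    return num / den


# Hypothetical successful F values and their weights.
successes = [0.4, 0.6, 0.9]
weights = [0.2, 0.3, 0.5]

p_f_start = lbc_order(0.0, 3.5, 1.5)  # order for F at the start of the budget
p_f_mid = lbc_order(0.5, 3.5, 1.5)    # order for F halfway through
standard = lbc_mean(successes, weights, p=2.0, m=1.0)  # plain Lehmer mean
```

With p = 2 and m = 1 the two sums reduce to Σ(w·s²) / Σ(w·s), which is the fixed-exponent mean the base classes use.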

Why

Closes the NL-SHADE-LBC (CEC 2022 winner) DE-family follow-up in planning/SELF_IMPROVEMENT_LOOP.md. Adds a fifth DE-family arm to the structural catalog, alongside basic DE, L-SHADE (CEC 2014), jSO (CEC 2017), and NL-SHADE-RSP (CEC 2021), so the self-improvement loop's bandit can pick whichever wins on the current battery. This brings panobbgo up to date with the CEC 2022 bound-constrained competition winner.

Backwards compatibility

Strictly safe. NLSHADE_LBC is opt-in:

  • Not added to any _make_quick_strategies / _make_standard_strategies / _make_full_strategies spec, so the composite baseline on every default battery is byte-identical and existing ledgers stay valid.
  • NL-SHADE-RSP / jSO / L-SHADE base classes are untouched — only the LBC subclass overrides _update_memory; a regression test verifies that NLSHADE_RSP._update_memory still produces the standard L-SHADE Lehmer mean output.
  • The structural catalog gains one extra add_heuristic candidate (avoid_duplicates=True).
  • The six new kwarg rules fire only when a spec sets the matching kwarg explicitly.

Impact

Like the prior CEC-DE refinements (NL-SHADE-RSP), NL-SHADE-LBC is a large-budget specialist: at panobbgo's small composite-battery budgets (75–500 evals) the bias-change schedule barely warms up, so the quick-mode signal is expected to be within noise. The value of shipping it today is to give the self-improvement loop a literature-frontier arm that the bandit can select once it has accumulated per-arm reward history.

Evidence form (per AGENTS.md "Agent-driven improve X PRs"): catalog-only addition; backwards-compatible (composite baseline byte-identical, existing ledgers stay valid); queued for nightly loop validation via the structural catalog.

Deviations from the full CEC-2022 paper

For transparency (the Panobbgo norm is literature-faithful ports), two mechanisms are intentionally not ported:

  • The adaptive binomial / exponential crossover blend (the same async-pipeline limitation that motivated omitting it from NL-SHADE-RSP).
  • The repetitive-generation bound-constraint handling (Panobbgo's asynchronous pipeline runs through strategy.constraint_handler and the L-SHADE midpoint-reflection repair instead).

Both are queued as follow-ups in planning/SELF_IMPROVEMENT_LOOP.md.

Test plan

  • tests/test_heuristic_nl_shade_lbc.py (30 tests):
      • construction validation (defaults, custom kwargs, subclass invariant spanning NLSHADE_RSP / JSO / LSHADE, NaN / inf / m_lbc≤0 rejection, inherited NLSHADE_RSP / jSO H≥2 / p_best ordering / k_rank rules);
      • LBC schedule (endpoints, linear midpoint, clipping at progress > 1, fallback to p_init when budget unknown);
      • memory update (no write to anchor bin H-1, pointer advances % (H-1), no-op on empty buffer, F memory clamped to [0,1], LBC formula at progress=0 matches Σ(w·F^3.5)/Σ(w·F^2.0), p=2/m=1 recovers L-SHADE for both F and CR, CR=0 plants terminal sentinel, terminal-bin stays terminal, mixed-zero CR values filtered, zero-delta successes fall back to uniform weights);
      • pipeline (on_start emits NP_init, smoke convergence on a quadratic, restart resets archive and pending);
      • inheritance safety (NLSHADE_RSP _update_memory still produces standard L-SHADE mean);
      • registration (package re-export + __all__, structural catalog, six kwarg dials).
  • tests/test_heuristic_nl_shade_rsp.py (34 tests), tests/test_heuristic_jso.py (36 tests), tests/test_heuristic_lshade.py (99 tests) — all pre-existing DE-family tests pass unchanged.
  • tests/test_self_improve.py (180 tests) — structural catalog and kwarg rule additions don't break anything.
  • Full pytest tests/ (1187 tests) — all pass.
  • ruff format --check, ruff check, flake8 --select=E9,F63,F7,F82, pyright all clean.
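The memory-update invariants listed in the test plan (anchor bin H-1 never written, pointer advancing modulo H-1, no-op on an empty buffer, clamping to [0, 1]) can be illustrated with a small stand-alone sketch. The function `write_memory` and its signature are hypothetical and only mirror the behaviour described above; the actual `_update_memory` in the PR may be structured differently.

```python
from typing import List, Optional, Tuple


def write_memory(
    memory: List[float], pointer: int, new_value: Optional[float]
) -> Tuple[List[float], int]:
    """Write one value into the circular part of a history of size H.

    The last bin (index H-1) is treated as a frozen anchor and never written.
    """
    h = len(memory)
    if h < 2:
        raise ValueError("H must be at least 2: one writable bin plus the anchor")
    if new_value is None:  # empty success buffer: nothing to record
        return list(memory), pointer
    updated = list(memory)
    updated[pointer] = min(max(new_value, 0.0), 1.0)  # clamp to [0, 1]
    return updated, (pointer + 1) % (h - 1)  # wrap before the anchor bin


# H = 4: bins 0..2 are writable, bin 3 is the anchor.
memory, pointer = [0.5, 0.5, 0.5, 0.9], 0
for value in [0.2, 1.7, 0.4, 0.6]:
    memory, pointer = write_memory(memory, pointer, value)
```

After four writes the pointer has wrapped once, the out-of-range value 1.7 has been clamped, and the anchor bin still holds its initial value.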

Docs

Updated heuristics.rst, guide_architecture.rst, guide_benchmarking.rst, guide.rst, TODO.md, and planning/SELF_IMPROVEMENT_LOOP.md (§13 entry + NL-SHADE-LBC follow-ups: categorical LBC regimes, per-CR / per-F sub-regime A/B, adaptive bias bounds from the success history).

https://claude.ai/code/session_01QubVM5JJdGDw2EhtZJFY8n


Generated by Claude Code

Add NLSHADE_LBC, a direct NLSHADE_RSP subclass porting the
Stanovov-Akhmedova-Semenkin (CEC 2022) refinement: **Linear Bias
Change** in the success-history memory update. The standard L-SHADE
/ jSO / NL-SHADE-RSP Lehmer mean uses fixed exponents (s^2/s^1);
NL-SHADE-LBC generalises this to Sigma(w*s^p) / Sigma(w*s^(p-m)) with
the order p linearly scheduled across budget progress
(p_F: 3.5 -> 1.5, p_CR: 1.0 -> 1.5, spread m_lbc = 1.5). At p=2, m=1
the formula recovers the standard L-SHADE Lehmer mean, so both
regimes are reachable from the default catalog and the bandit can
flip between them.

NLSHADE_LBC inherits the entire NL-SHADE-RSP / jSO / L-SHADE
asynchronous pipeline (per-slot pending dict, generation-by-count
book-keeping, archive of replaced parents, jSO frozen anchor memory
bin, weighted current-to-pbest-w/1 mutation, linear p_best schedule,
asymmetric F-cap, NLPSR, RSP r1 selection, randomised adaptive
archive, warm restart) and overrides only _update_memory.

The CR-zero handling preserves the L-SHADE terminal sentinel rule
and filters strict zeros out of the LBC sum (because s^(p-m) with
p<m blows up at s=0).
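A minimal sketch of the CR-zero handling described above, assuming the terminal sentinel is represented as `None` and that the sentinel is planted when every successful CR is zero (the L-SHADE rule). The name `update_cr_bin` and its signature are hypothetical, not the PR's API.

```python
from typing import Optional, Sequence

TERMINAL = None  # hypothetical representation of the terminal sentinel


def update_cr_bin(
    previous: Optional[float],
    cr_values: Sequence[float],
    weights: Sequence[float],
    p: float,
    m: float,
) -> Optional[float]:
    """Return the new CR memory value for one bin."""
    if previous is TERMINAL:  # a terminal bin stays terminal
        return TERMINAL
    if not cr_values:  # no successes: leave the bin unchanged
        return previous
    if max(cr_values) == 0.0:  # every successful CR was zero
        return TERMINAL
    # Drop strict zeros: with p < m the exponent p - m is negative,
    # and 0.0 raised to a negative power is undefined.
    kept = [(s, w) for s, w in zip(cr_values, weights) if s > 0.0]
    num = sum(w * s ** p for s, w in kept)
    den = sum(w * s ** (p - m) for s, w in kept)
    return num / den


# p_CR = 1.0 and m = 1.5 give the negative exponent p - m = -0.5.
mixed = update_cr_bin(0.5, [0.0, 0.25], [0.5, 0.5], p=1.0, m=1.5)
all_zero = update_cr_bin(0.5, [0.0, 0.0], [0.5, 0.5], p=1.0, m=1.5)
```

Without the filter, the mixed case would raise `ZeroDivisionError` in Python, because `0.0 ** -0.5` is not defined.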

Deviations from the full CEC-2022 paper (documented for honesty):
the adaptive binomial / exponential crossover blend (the same
mechanism intentionally not ported from NL-SHADE-RSP) and the
repetitive-generation bound-constraint handling are not ported.
Both are queued as follow-ups.

Registered in the structural catalog as a fifteenth add_heuristic
candidate (avoid_duplicates=True). Six new kwarg rules in
default_catalog (NP_init, p_F_init, p_F_final, p_CR_init, p_CR_final,
m_lbc) so the loop can probe the bias-change schedule.

Opt-in only: not in any default battery, so the composite baseline
stays byte-identical and existing ledgers stay valid. NL-SHADE-RSP /
jSO / L-SHADE base classes are untouched - only the LBC subclass
overrides _update_memory; verified by a regression test that
NLSHADE_RSP._update_memory still produces the standard L-SHADE Lehmer
mean output. Adds the CEC-2022-class arm to the bandit's pool, which
already includes basic DE, L-SHADE (CEC 2014), jSO (CEC 2017), and
NL-SHADE-RSP (CEC 2021).

Like the prior CEC-DE refinements, NL-SHADE-LBC is a large-budget
specialist - at panobbgo's small composite-battery budgets the
bias-change schedule barely warms up, so the quick-mode signal is
expected to be within noise. The value of shipping it today is to
give the self-improvement loop a literature-frontier arm that the
bandit can select once it has accumulated per-arm reward history.

Tests: tests/test_heuristic_nl_shade_lbc.py (30 tests) covering
construction validation, LBC schedule (endpoints, midpoint, clipping,
budget-unknown fallback), memory update (anchor-bin skip, pointer
mod (H-1), Sigma(w*F^3.5)/Sigma(w*F^2.0) at progress=0, p=2/m=1
recovers L-SHADE for both F and CR, CR=0 terminal sentinel,
mixed-zero CR filtered, uniform weights on zero-delta), pipeline
(on_start, smoke convergence, restart resets), inheritance safety
(NLSHADE_RSP unchanged), and registration.
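The "uniform weights on zero-delta" case can be sketched as follows, assuming the usual SHADE-style weighting in which each success is weighted by its share of the total fitness improvement. The function name `success_weights` is hypothetical and the PR's implementation may differ.

```python
from typing import List, Sequence


def success_weights(deltas: Sequence[float]) -> List[float]:
    """Normalise fitness improvements into weights; uniform if all are zero."""
    if not deltas:
        return []
    total = sum(deltas)
    if total <= 0.0:  # no measurable improvement: avoid dividing by zero
        return [1.0 / len(deltas)] * len(deltas)
    return [d / total for d in deltas]


proportional = success_weights([1.0, 3.0])
uniform = success_weights([0.0, 0.0])
```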

Documentation: planning/SELF_IMPROVEMENT_LOOP.md gets a new
2026-05-28 entry and a follow-up idea for the next iteration;
guide.rst, guide_architecture.rst, guide_benchmarking.rst,
heuristics.rst, and TODO.md updated.

https://claude.ai/code/session_01QubVM5JJdGDw2EhtZJFY8n