Conversation
…ng engine
- Add ChaosGrad: parameter-group-aware optimizer (chaos_core/projections/lightweight)
- Gradient centralization, adaptive per-param LR, plateau escape
- Spectral radius clipping, input gradient sentinel
- Pre-built configs: default, aggressive, finetune, large_network, tiny_network
- Add TemporalScheduler: adaptive LR scheduler
- Warmup + cosine decay + loss-aware warm restarts
- Convergence rate tracking, checkpoint support (state_dict)
- Pre-built configs: default, llm, short_experiment, finetune, adaptive
- Enhance RealNetTrainer with backward-compatible integration
- chaos_config/scheduler_config params (opt-in, all defaults preserve legacy)
- use_chaos_grad=True shortcut for fixed-LR observation mode
- Diagnostics: get_diagnostics(), get_input_health(), get_spectral_radius()
- Auto re-init ChaosGrad after neurogenesis (expand)
- Migrate all PoC/experiments to ChaosGrad (scheduler-free, fixed LR)
- Update LIBRARY.md with full documentation
- Update __init__.py exports
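Among the techniques the commit lists for ChaosGrad is gradient centralization. As a rough sketch of that standard technique (illustrative only — this is not RealNet's actual optimizer code), each multi-dimensional gradient is shifted to zero mean per output slice before the update:

```python
import numpy as np

def centralize_gradient(grad: np.ndarray) -> np.ndarray:
    """Gradient centralization: subtract the mean over all axes except
    the first (output) dimension, so each output slice has zero mean."""
    if grad.ndim > 1:
        axes = tuple(range(1, grad.ndim))
        return grad - grad.mean(axis=axes, keepdims=True)
    return grad  # 1-D params (e.g. biases) are left untouched

g = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
gc = centralize_gradient(g)
# each row of gc now sums to (numerically) zero
```

How ChaosGrad combines this with its adaptive per-param LR, plateau escape, and spectral radius clipping is specific to `chaos_optimizer.py`; the sketch only shows the centralization step.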
… residual_mode parameter to RealNet (none, simple, gated)
- none: original behavior (bit-for-bit identical, verified)
- simple: pre-norm residual h=h+f(norm(h)) for gradient flow
- gated: learnable per-neuron gate alpha*h+(1-alpha)*f(norm(h))
- Extract _inject_input() helper for DRY input injection
- Support residual_gate expansion in neurogenesis
- Add RESIDUAL_MODE config to experiment_llm.py
- Update LIBRARY.md, README.md, PoC_STANDARDS.md docs
…d)
"Add residual_mode parameter to RealNet (none, simple, gated)
- none: original behavior (bit-for-bit identical, verified)
- simple: pre-norm residual h=h+f(norm(h)) for gradient flow
- gated: learnable per-neuron gate alpha*h+(1-alpha)*f(norm(h))
- Extract _inject_input() helper for DRY input injection
- Support residual_gate expansion in neurogenesis
- Add RESIDUAL_MODE config to experiment_llm.py
- Update LIBRARY.md, README.md, PoC_STANDARDS.md docs"

This reverts commit 80a0bb4.
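The three residual_mode variants named in the commit message can be sketched in a few lines (illustrative only — `f`, `norm`, and the RMS normalizer below are stand-ins, not RealNet's actual layer code):

```python
import numpy as np

def layer_step(h, f, norm, mode="none", alpha=None):
    """Sketch of the three residual_mode variants from the commit message.

    h     : hidden state, shape (n_neurons,)
    f     : the layer transform
    norm  : normalization function applied pre-residual
    alpha : per-neuron gate, used only when mode='gated'
    """
    if mode == "none":      # original behavior, no residual path
        return f(h)
    if mode == "simple":    # pre-norm residual for gradient flow
        return h + f(norm(h))
    if mode == "gated":     # learnable per-neuron interpolation
        return alpha * h + (1.0 - alpha) * f(norm(h))
    raise ValueError(f"unknown residual_mode: {mode}")

rms = lambda x: x / np.sqrt(np.mean(x**2) + 1e-8)  # stand-in norm
f = lambda x: np.tanh(x)                           # stand-in transform
h = np.array([1.0, -2.0, 3.0])
out = layer_step(h, f, rms, mode="gated", alpha=np.full(3, 0.9))
```

Note that with alpha at 1.0 the gated mode passes `h` through unchanged, which is why a gate initialized near 1 preserves pretrained behavior.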
Contributor
Pull request overview
This PR introduces RealNet-native training components (ChaosGrad optimizer + TemporalScheduler) and refactors RealNetTrainer + various PoC/experiment scripts to use config-driven optimizer/scheduler initialization rather than manual per-script setup.
Changes:
- Added `ChaosGrad`/`ChaosGradConfig` and `TemporalScheduler`/`TemporalSchedulerConfig`, and exposed them via `realnet.__init__`.
- Refactored `RealNetTrainer` to auto-select optimizers/schedulers (including re-init behavior after neurogenesis) and added diagnostic helpers.
- Updated multiple PoC/experiment scripts and library docs to use the new config-based trainer initialization (plus some dataset/hyperparameter changes in the LLM notebook/script).
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| realnet/utils/realstore.py | Extends weight transplantation to support re-initializing new regions via an init_new strategy. |
| realnet/utils/neurogenesis.py | Updates neurogenesis weight expansion initialization for new connections. |
| realnet/training/trainer.py | Adds ChaosGrad/TemporalScheduler integration, optimizer/scheduler selection, and diagnostics. |
| realnet/training/chaos_scheduler.py | New adaptive LR scheduler with warmup/cosine decay/restart logic and presets. |
| realnet/training/chaos_optimizer.py | New ChaosGrad optimizer with parameter grouping, adaptive LR, plateau escape, and diagnostics. |
| realnet/core/network.py | Adds new weight init strategies (micro_quiet, micro_quiet_8bit). |
| realnet/__init__.py | Exports ChaosGrad and TemporalScheduler APIs at package top-level. |
| realnet/LIBRARY.md | Documents ChaosGrad/TemporalScheduler usage via RealNetTrainer and direct usage. |
| RealNET.ipynb | Switches the LLM notebook’s dataset/tokenizer flow to TinyStories and updates naming/paths. |
| README_TR.md | Removes FineWeb-specific wording from an insight bullet. |
| README.md | Removes FineWeb-specific wording from an insight bullet. |
| PoC/experiments/experiment_llm.py | Refactors trainer init to ChaosGrad + trainer-integrated scheduler; also changes dataset + several hyperparameters. |
| PoC/experiments/convergence_stopwatch.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/experiments/convergence_sine_wave.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/experiments/convergence_realnet_as_database.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/experiments/convergence_mnist_tiny.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/experiments/convergence_mnist_scaled.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/experiments/convergence_mnist_revive.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/experiments/convergence_mnist_record.py | Switches optimizer setup to ChaosGradConfig; removes manual scheduler stepping and updates LR reporting source. |
| PoC/experiments/convergence_mnist_embed.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/experiments/convergence_latch.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/experiments/convergence_detective_thinking.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/experiments/convergence_adder.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/convergence_mnist.py | Switches optimizer setup to ChaosGradConfig passed into the trainer. |
| PoC/convergence_identity.py | Switches optimizer setup to ChaosGradConfig.tiny_network passed into the trainer. |
| PoC/convergence_gates.py | Switches optimizer setup to ChaosGradConfig.tiny_network passed into the trainer. |
Comments suppressed due to low confidence (2)
PoC/experiments/experiment_llm.py:18
`TemporalSchedulerConfig` is imported here but never used (the scheduler config is constructed as a plain dict). Consider removing the unused import, or switching to `TemporalSchedulerConfig.*()` presets for consistency with the rest of the refactor.
PoC/experiments/experiment_llm.py:86
This script change switches the tokenizer training dataset from FineWeb to TinyStories (and also adjusts several model/data hyperparameters in this file). That's a significant behavioral change not described in the PR description, which focuses on optimizer/scheduler refactors. Either update the PR description to include the dataset/hyperparameter change, or move these adjustments into a separate PR to keep the scope clear.
Comment on lines +185 to 188
```diff
  " print(f\"📚 Training new {k_size}k BPE Tokenizer from data slice...\")\n",
  " tokenizer = ByteLevelBPETokenizer()\n",
- " dataset_sample = load_dataset(\"HuggingFaceFW/fineweb-edu\", name=\"CC-MAIN-2024-10\", split=\"train\", streaming=True)\n",
+ " dataset_sample = load_dataset(\"roneneldan/TinyStories\", split=\"train\", streaming=True)\n",
  " \n",
```
realnet/training/chaos_scheduler.py (Outdated)

```python
def _cosine_lr(self, step):
    """Calculate cosine decay LR multiplier."""
    effective_step = step - self._cycle_start_step
    effective_max = self.max_steps - self._cycle_start_step
```
realnet/training/chaos_scheduler.py
Outdated
| return self.min_lr_ratio | ||
|
|
||
| # Cosine decay | ||
| decay_ratio = (effective_step - self.warmup_steps) / max(1, effective_max - self.warmup_steps) |
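The scheduler hunks above are fragments; a self-contained sketch of a warmup + cosine-decay multiplier with a movable cycle start may make the shape clearer. The names `warmup_steps`, `min_lr_ratio`, and the `_cycle_start_step` offset come from the excerpts; everything else is an illustrative reconstruction, not the actual chaos_scheduler.py code:

```python
import math

def lr_multiplier(step, warmup_steps, max_steps,
                  min_lr_ratio=0.1, cycle_start_step=0):
    """Warmup + cosine decay LR multiplier. A warm restart simply
    moves cycle_start_step forward, restarting the curve."""
    effective_step = step - cycle_start_step
    effective_max = max_steps - cycle_start_step
    if effective_step < warmup_steps:
        # linear warmup from 0 to 1
        return effective_step / max(1, warmup_steps)
    if effective_step >= effective_max:
        return min_lr_ratio
    # cosine decay from 1.0 down to min_lr_ratio
    decay_ratio = (effective_step - warmup_steps) / max(1, effective_max - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * decay_ratio))
    return min_lr_ratio + (1.0 - min_lr_ratio) * cosine
```

The `max(1, ...)` guards mirror the excerpt's division guard: they keep the multiplier well-defined even when a restart lands so late that the remaining window is shorter than the warmup.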
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…ne shrinking on warm restart
Owner
Author
@copilot what do you think now?
Contributor
@theomgdev I've opened a new pull request, #7, to work on those changes. Once the pull request is ready, I'll request review from you.
This pull request refactors the initialization of the `RealNetTrainer` across multiple proof-of-concept and experiment scripts to standardize optimizer and training configuration. Instead of manually setting the optimizer and loss function, the code now leverages the new `ChaosGradConfig` class to encapsulate optimizer settings and training parameters, resulting in cleaner and more maintainable experiment scripts.

Trainer initialization and configuration refactor:
- Replaced manual optimizer/loss setup in the PoC and experiment scripts (`convergence_gates.py`, `convergence_identity.py`, `convergence_mnist.py`, `convergence_mnist_embed.py`, `convergence_mnist_record.py`, `convergence_mnist_revive.py`, `convergence_mnist_scaled.py`, `convergence_mnist_tiny.py`, `convergence_realnet_as_database.py`, `convergence_sine_wave.py`, `convergence_adder.py`, `convergence_detective_thinking.py`, `convergence_latch.py`) with initialization using `ChaosGradConfig` passed to `RealNetTrainer`. This eliminates direct optimizer assignment and manual weight decay/loss function setup, promoting consistency and reducing boilerplate.
- Updated imports to include `ChaosGradConfig` (and `TemporalSchedulerConfig` where relevant), reflecting the new dependency and usage pattern.
- Selected `ChaosGradConfig` presets (`default`, `aggressive`, `tiny_network`) per script, ensuring consistent and experiment-specific settings.
- Updated `convergence_mnist_record.py` to reference the trainer's scheduler/optimizer, reflecting the new encapsulation.

Overall, these changes make the experiment scripts easier to maintain, more consistent, and less error-prone by centralizing optimizer and training configuration logic.
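As a rough illustration of the config-preset pattern the PR describes: only the class and preset names (`ChaosGradConfig`, `tiny_network`, `aggressive`) come from the PR text; the fields, default values, and trainer call below are guesses, not RealNet's actual API:

```python
from dataclasses import dataclass

@dataclass
class ChaosGradConfig:
    """Stand-in for the real config: a plain dataclass whose
    classmethod presets replace hand-built optimizer setup."""
    lr: float = 1e-3            # field names/values are illustrative
    weight_decay: float = 0.01

    @classmethod
    def tiny_network(cls):
        # hypothetical preset for very small models
        return cls(lr=3e-3, weight_decay=0.0)

    @classmethod
    def aggressive(cls):
        # hypothetical preset trading stability for speed
        return cls(lr=5e-3, weight_decay=0.01)

cfg = ChaosGradConfig.tiny_network()
# trainer = RealNetTrainer(net, chaos_config=cfg)  # per the PR's described API
```

The point of the pattern is that each script picks a named preset instead of repeating optimizer wiring, so a tuning change lands in one place rather than twenty scripts.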