Turn off the ironing out loop by default by tsmbland · Pull Request #1170 · EnergySystemsModellingLab/MUSE2

tsmbland · 2026-03-04T14:02:34Z

Description

Turns off the ironing out loop by default (max_ironing_out_iterations = 1), as requested by @ahawkes, but maintains a regression test for a patched version of the simple model with the loop turned on (iterations = 10, the previous default). To do this I had to modify the patching API to allow patching the example model's model.toml (this will also be useful for #1105)

Turns out for the "simple" example this doesn't have any impact, but for others it does

Fixes #1165

Type of change

Bug fix (non-breaking change to fix an issue)
New feature (non-breaking change to add functionality)
Refactoring (non-breaking, non-functional change to improve maintainability)
Optimization (non-breaking change to speed up the code)
Breaking change (whatever its nature)
Documentation (improve or add documentation)

Key checklist

All tests pass: $ cargo test
The documentation builds and looks OK: $ cargo doc
Update release notes for the latest release if this PR adds a new feature or fixes a bug
present in the previous release

Further checks

Code is commented, particularly in hard-to-understand areas
Tests added that prove fix is effective or that feature works

codecov · 2026-03-04T14:05:27Z

Codecov Report

❌ Patch coverage is 95.10490% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.02%. Comparing base (0eedcfe) to head (a0c9e2d).
⚠️ Report is 21 commits behind head on main.

Files with missing lines	Patch %	Lines
src/simulation/prices.rs	62.50%	3 Missing ⚠️
src/patch.rs	97.22%	2 Missing ⚠️
src/cli/example.rs	85.71%	0 Missing and 1 partial ⚠️
src/example/patches.rs	98.18%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1170      +/-   ##
==========================================
+ Coverage   87.32%   89.02%   +1.69%     
==========================================
  Files          57       57              
  Lines        7726     7838     +112     
  Branches     7726     7838     +112     
==========================================
+ Hits         6747     6978     +231     
+ Misses        673      564     -109     
+ Partials      306      296      -10

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot

Pull request overview

This PR changes the default behavior of the model “ironing out” loop by reducing max_ironing_out_iterations from 10 to 1, and updates expected regression outputs accordingly. It also extends the patched-example mechanism so a patched example can optionally include a model.toml patch (intended to preserve coverage of the loop-enabled configuration via a patched “simple” model).

Changes:

Set the default max_ironing_out_iterations to 1 (and update the input schema default accordingly).
Extend the patched-example registry to include an optional TOML patch, adding a simple_ironing_out patched example that sets iterations back to 10.
Refresh a large set of regression “golden” CSV outputs to match the new default behavior; add new golden outputs for simple_ironing_out.

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/data/two_regions/commodity_prices.csv	Updated expected commodity price outputs under the new default iteration count.
tests/data/two_regions/assets.csv	Updated expected asset outputs under the new default iteration count.
tests/data/two_outputs/commodity_prices.csv	Updated expected commodity price outputs under the new default iteration count.
tests/data/two_outputs/assets.csv	Updated expected asset outputs under the new default iteration count.
tests/data/simple_npv/assets.csv	Updated expected asset outputs for the NPV patched scenario under the new default.
tests/data/simple_ironing_out/commodity_prices.csv	New golden output for the patched “simple” model with ironing-out enabled (iterations=10).
tests/data/simple_ironing_out/commodity_flows.csv	New golden output for the patched “simple” model with ironing-out enabled (iterations=10).
tests/data/simple_ironing_out/assets.csv	New golden output for the patched “simple” model with ironing-out enabled (iterations=10).
tests/data/simple/debug_solver.csv	Updated debug solver expectations reflecting fewer iterations by default.
tests/data/simple/debug_dispatch_assets.csv	Updated debug dispatch expectations reflecting fewer iterations by default.
tests/data/simple/debug_commodity_balance_duals.csv	Updated debug dual expectations reflecting fewer iterations by default.
tests/data/simple/debug_appraisal_results.csv	Updated appraisal debug expectations reflecting fewer iterations by default.
tests/data/muse1_default/commodity_prices.csv	Updated expected commodity price outputs under the new default iteration count.
tests/data/muse1_default/commodity_flows.csv	Updated expected commodity flow outputs under the new default iteration count.
tests/data/muse1_default/assets.csv	Updated expected asset outputs under the new default iteration count.
tests/data/circularity/commodity_prices.csv	Updated expected commodity price outputs under the new default iteration count.
tests/data/circularity/assets.csv	Updated expected asset outputs under the new default iteration count.
src/model/parameters.rs	Changes the default `max_ironing_out_iterations` from 10 to 1.
src/example/patches.rs	Extends patch registry to include optional TOML patch; adds `simple_ironing_out` patch entry.
src/cli/example.rs	Updates CLI patched-example extraction to apply file patches + optional TOML patch.
schemas/input/model.yaml	Updates schema default for `max_ironing_out_iterations` to 1.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copilot · 2026-03-04T14:18:43Z

 fn extract_example(name: &str, patch: bool, dest: &Path) -> Result<()> {
    if patch {
-        let patches = get_patches(name)?;
+        let (file_patches, toml_patch) = get_patches(name)?;


get_patches(name)? returns a reference to a (Vec<FilePatch>, Option<String>), but this code destructures it as if it were an owned tuple. This won’t compile (type mismatch &(…) vs (…)). Either bind the returned reference and access its fields, or change get_patches to return an owned tuple (e.g., cloned) so destructuring works.

Suggested change

let (file_patches, toml_patch) = get_patches(name)?;

let (file_patches, toml_patch) = get_patches(name)?.clone();

Well it does compile...

tsmbland · 2026-03-04T14:24:50Z

@alexdewar Now getting slightly different results on macos which is concerning. I seem to remember something like this has happened before...?

alexdewar · 2026-03-04T16:36:34Z

@alexdewar Now getting slightly different results on macos which is concerning. I seem to remember something like this has happened before...?

These differences are pretty large too:

thread 'two_regions' (11401) panicked at tests/regression.rs:78:5:
The following errors occurred:
  * commodity_flows.csv: line 751:
    + "2045,8,gas,all-year.night,-0.86000992"
    - "2045,8,gas,all-year.night,-1.39307392"
  * commodity_flows.csv: line 752:
    + "2045,8,electricity,all-year.night,0.514976"
    - "2045,8,electricity,all-year.night,0.834176"
  * commodity_flows.csv: line 753:
    + "2045,8,CO2f,all-year.night,47.20784992"
    - "2045,8,CO2f,all-year.night,76.46891392"
  * commodity_flows.csv: line 955:
    + "2045,25,gas,all-year.night,-0.5330640000000001"
    - "2045,25,gas,all-year.night,-0.0"
  * commodity_flows.csv: line 956:
    + "2045,25,electricity,all-year.night,0.3192000000000001"
    - "2045,25,electricity,all-year.night,0.0"
  * commodity_flows.csv: line 957:
    + "2045,25,CO2f,all-year.night,29.261064000000008"
    - "2045,25,CO2f,all-year.night,0.0"

Not sure what's going on here. If you want to compare the macOS results with e.g. the Windows ones, we upload them as test artifacts you can see on the summary page: https://github.com/EnergySystemsModellingLab/MUSE2/actions/runs/22673536177?pr=1170

tsmbland · 2026-03-05T11:24:28Z

The difference concerns electricity supply in the night timeslice in 2045 by two assets (8 and 25):

on mac: 0.32 electricity supply by asset 25, 0.51 by asset 8 (= 0.83 total)
on windows/ubuntu: 0 electricity supply by asset 25, 0.83 by asset 8 (= 0.83 total)

These are both gasCCGT assets in R2, and although the commission year is different, the parameters for these assets are identical. Therefore, as far as the objective is concerned, the two solutions are equally good. It's probably the case for a lot of optimisations that there's no single optimal solution, rather a space of equally good solutions, but 99.9% of the time highs seems to pick the same solution every time, regardless of OS (I think this always ends up being a "corner" solution, i.e. where one or more parameters in the space is at its limit. The two solutions above are different corner solutions where supply by asset 25 is at its upper and lower limit respectively).

I don't know the factors that cause it to choose one optimum over another, but it seems that tiny floating-point differences between platforms can have an impact (perhaps tilting the energy landscape slightly so that one solution becomes ever-so-slightly favourable over another, or affecting pivot decisions in the optimisation path so it lands on a different vertex).

The best solution I can think of is to modify the objective so that we can guarantee a single optimum solution. One idea would be L2 regularisation, adding a small epsilon-scaled term to the objective to minimise the sum of squares of the variables. Generally this favours smaller values over large "extreme" values, which for our purposes means spreading activity over multiple assets rather than allocating all activity to a single asset - probably not a bad thing to aim for in any case. (In this particular example, the mac solution would be the single unique optimum).

In practice I'm not sure how easy this would be. highs-sys has Highs_passHessian which I believe makes this possible, but this isn't exposed in the highs crate we're using. I believe it's also incompatible with integer variables, which we are using in some situations, but I think we could circumvent this with a two-step approach. This is all based on a conversation with AI so take with a pinch of salt.

Apart from that I don't really have any other ideas...

tsmbland · 2026-03-05T11:37:41Z

Annoying that this PR is getting blocked by something completely unrelated! For the time being, do you think it would be ok to turn off this particular regression test until we can get this fixed? What else can we do...?

tsmbland · 2026-03-05T11:39:39Z

Or revert this particular example model back to using the ironing out loop and update it later?

alexdewar · 2026-03-05T12:50:43Z

The difference concerns electricity supply in the night timeslice in 2045 by two assets (8 and 25):

on mac: 0.32 electricity supply by asset 25, 0.51 by asset 8 (= 0.83 total)

on windows/ubuntu: 0 electricity supply by asset 25, 0.83 by asset 8 (= 0.83 total)

These are both gasCCGT assets in R2, and although the commission year is different, the parameters for these assets are identical. Therefore, as far as the objective is concerned, the two solutions are equally good. It's probably the case for a lot of optimisations that there's no single optimal solution, rather a space of equally good solutions, but 99.9% of the time highs seems to pick the same solution every time, regardless of OS (I think this always ends up being a "corner" solution, i.e. where one or more parameters in the space is at its limit. The two solutions above are different corner solutions where supply by asset 25 is at its upper and lower limit respectively).

I don't know the factors that cause it to choose one optimum over another, but it seems that tiny floating-point differences between platforms can have an impact (perhaps tilting the energy landscape slightly so that one solution becomes ever-so-slightly favourable over another, or affecting pivot decisions in the optimisation path so it lands on a different vertex).

This sounds plausible.

The best solution I can think of is to modify the objective so that we can guarantee a single optimum solution. One idea would be L2 regularisation, adding a small epsilon-scaled term to the objective to minimise the sum of squares of the variables. Generally this favours smaller values over large "extreme" values, which for our purposes means spreading activity over multiple assets rather than allocating all activity to a single asset - probably not a bad thing to aim for in any case. (In this particular example, the mac solution would be the single unique optimum).

In practice I'm not sure how easy this would be. highs-sys has Highs_passHessian which I believe makes this possible, but this isn't exposed in the highs crate we're using. I believe it's also incompatible with integer variables, which we are using in some situations, but I think we could circumvent this with a two-step approach. This is all based on a conversation with AI so take with a pinch of salt.

Apart from that I don't really have any other ideas...

This is an interesting idea, but probably quite a lot of work and, as you say, out of scope for this PR anyway.

As an aside, you can actually call the functions in the lower-level highs-sys crate by getting the pointer from your Model struct and passing that in (it'll have to be in an unsafe block), but if we know we want this functionality it would be cleaner to open a PR for the highs crate adding it (I did that to get the objective value thing added).

Annoying that this PR is getting blocked by something completely unrelated! For the time being, do you think it would be ok to turn off this particular regression test until we can get this fixed? What else can we do...?

How about we just skip this test on non-x86 platforms for now?

This should work:

diff --git a/tests/regression.rs b/tests/regression.rs
index 41c933b9..4307bc3e 100644
--- a/tests/regression.rs
+++ b/tests/regression.rs
@@ -26,6 +26,9 @@ define_regression_test!(circularity);
 // Patched examples
 define_regression_test_with_patches!(simple_divisible);
 define_regression_test_with_patches!(simple_npv);
+
+// We get different results on ARM Macs, for reasons that aren't clear
+#[cfg(target_arch = "x86_64")]
 define_regression_test_with_patches!(simple_ironing_out);
 
 // ------  END: regression tests  ------

It would be better if we opened an issue and linked to it in the comment. It would be nice if the examples gave similar results on all the platforms we care about, but it probably shouldn't be a high priority to fix this. But let's keep track of it anyway!

tsmbland · 2026-03-05T13:19:08Z

I'm going to hold this until #1173. That will be another major change to all of the example models, and there's every chance that that will magically fix the issue here. (Or, it could just as likely introduce more issues just like this...)

In any case, this does reveal an underlying issue, so I've opened #1174

make regression test files order-dependent again add pr number to upcoming regenerate regression data

…sion results

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copilot · 2026-03-12T10:38:35Z

@@ -31,7 +31,7 @@ properties:
  max_ironing_out_iterations:
    type: integer
    description: The maximum number of iterations to run the "ironing out" step of agent investment for


The max_ironing_out_iterations schema description sentence is truncated (ends with "for"). Since this PR touches the default here, it’d be good to complete the description so generated docs/help text are clear.

Suggested change

description: The maximum number of iterations to run the "ironing out" step of agent investment for

description: The maximum number of iterations to run the "ironing out" step of agent investment before stopping the process

It's actually not truncated, but maybe Copilot's wording is clearer...

tsmbland · 2026-03-12T10:56:41Z

@Aurashk @alexdewar Tried rebasing this on top of @Aurashk's PR and it seems to have worked! Sort of...

Tests now passing on ubuntu. One of the tests @Aurashk added in #1185 is failing on Windows, but that's a separate problem that needs to be fixed in #1185. Tests still failing on macos, but that's to do with this this PR as explained above.

So together I think we can fix this!

@Aurashk fix file_patch_with_replacement_missing_base_file_fails in add regression tests for different pricing strategies with simple model #1185
I'll try @alexdewar's suggestion here and open up an issue about this
Combine these together and all tests should pass

We'd be sweeping the core problems under the rug a bit, but in the interests of getting things merged I think we have to for now

Aurashk · 2026-03-12T13:53:06Z

file_patch_with_replacement_missing_base_fil

Thanks @tsmbland, I think my latest commit should fix the windows issue in that test.

…ng_out_off

tsmbland · 2026-03-12T14:06:51Z

file_patch_with_replacement_missing_base_fil

Thanks @tsmbland, I think my latest commit should fix the windows issue in that test.

Et voila

tsmbland · 2026-03-12T14:14:11Z

@alexdewar In the interest of getting this and @Aurashk's PR merged I've done what you've suggested and disabled the problematic test on macos, and have opened up #1197 to keep an eye on this

alexdewar

LGTM! Great that you've managed to fix the failing tests.

alexdewar · 2026-03-12T15:03:20Z

-/// A map of file patches, keyed by name
-type PatchMap = BTreeMap<&'static str, Vec<FilePatch>>;
+/// Map of patches keyed by name, with the file patches and an optional TOML patch
+type PatchMap = BTreeMap<&'static str, (Vec<FilePatch>, Option<String>)>;


The Option<String> could actually be an Option<&'static str> (where &'static str means string literal) then you wouldn't need to have .to_string()s.

Part of me is wondering whether this would be clearer if you defined a struct rather than using a tuple, but I'm not sure. Up to you.

One day I hope to understand strings in rust!

alexdewar · 2026-03-12T15:20:37Z

 define_regression_test!(circularity);

+// For this model we get different results on ARM Macs, for reasons that aren't clear
+#[cfg(target_arch = "x86_64")]


It would be even better if we could mark this as "expected to fail", like you can with pytest, but seemingly there's no built-in way to do that with cargo test. I guess we can just check whether it's miraculously started passing at some point.

tsmbland · 2026-03-12T15:44:30Z

LGTM! Great that you've managed to fix the failing tests.

More like sweep under the rug 😂

alexdewar · 2026-03-12T15:48:34Z

LGTM! Great that you've managed to fix the failing tests.

More like sweep under the rug 😂

🙈

Copilot

Pull request overview

Copilot reviewed 37 out of 38 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

+// For this model we get different results on ARM Macs, for reasons that aren't clear
+#[cfg(target_arch = "x86_64")]
+define_regression_test!(two_regions);


@@ -167,13 +218,21 @@ impl FilePatch {

    /// Apply this patch to a base model and return the modified CSV as a string.


@@ -31,7 +31,7 @@ properties:
  max_ironing_out_iterations:
    type: integer
    description: The maximum number of iterations to run the "ironing out" step of agent investment for


tsmbland marked this pull request as ready for review March 4, 2026 14:13

Copilot AI review requested due to automatic review settings March 4, 2026 14:13

Copilot started reviewing on behalf of tsmbland March 4, 2026 14:14 View session

Copilot AI reviewed Mar 4, 2026

View reviewed changes

tsmbland requested review from Aurashk and alexdewar March 4, 2026 14:25

tsmbland mentioned this pull request Mar 5, 2026

Potential for different dispatch results on different platforms #1174

Open

Aurashk and others added 12 commits March 10, 2026 12:53

add regression tests for different pricing strategies with simple model

44cffa6

change all pricing methods to use index map

1a449a4

make regression test files order-dependent again add pr number to upcoming regenerate regression data

make patches with_replacement feature more robust

a1a47e7

make spply with_replacement more robust

048e52e

update patch unit test

4758654

make with_replacement patch easier to use and less error prone

fd9ea49

show the correct csv line number when there is a difference in regres…

63ddd2d

…sion results

Allow toml patching for regression tests

438aabc

Add patch for the simple model with the ironing out loop turned on

a02d102

Change default to 1 iteration (no loop)

3e86126

Fix typo

5c94a60

Add regression test for new model

fd1a6a4

Copilot AI review requested due to automatic review settings March 12, 2026 10:32

tsmbland force-pushed the ironing_out_off branch from 55ef9ae to fd1a6a4 Compare March 12, 2026 10:32

Copilot started reviewing on behalf of tsmbland March 12, 2026 10:32 View session

tsmbland changed the base branch from main to add-regression-tests-for-pricing-strategies March 12, 2026 10:33

Copilot AI reviewed Mar 12, 2026

View reviewed changes

Update results files

12ce6ff

make path testing more robust across os's

58d149d

tsmbland and others added 2 commits March 12, 2026 13:58

Disable test on macs

2e9e9d7

Merge branch 'add-regression-tests-for-pricing-strategies' into ironi…

c62ee35

…ng_out_off

tsmbland mentioned this pull request Mar 12, 2026

Regression test disabled on macos #1197

Open

alexdewar approved these changes Mar 12, 2026

View reviewed changes

Switch to string literal

ab0107a

alexdewar mentioned this pull request Mar 13, 2026

add regression tests for different pricing strategies with simple model #1185

Closed

11 tasks

tsmbland changed the base branch from add-regression-tests-for-pricing-strategies to main March 16, 2026 16:27

Merge branch 'main' into ironing_out_off

8187abd

Copilot AI review requested due to automatic review settings March 16, 2026 16:33

Copilot started reviewing on behalf of tsmbland March 16, 2026 16:34 View session

Copilot AI reviewed Mar 16, 2026

View reviewed changes

Regenerate test data

a0c9e2d

tsmbland merged commit 5389d44 into main Mar 16, 2026
6 checks passed

tsmbland deleted the ironing_out_off branch March 16, 2026 16:58

	let (file_patches, toml_patch) = get_patches(name)?;
	let (file_patches, toml_patch) = get_patches(name)?.clone();

	description: The maximum number of iterations to run the "ironing out" step of agent investment for
	description: The maximum number of iterations to run the "ironing out" step of agent investment before stopping the process

		@@ -167,13 +218,21 @@ impl FilePatch {

		/// Apply this patch to a base model and return the modified CSV as a string.

Conversation

tsmbland commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Key checklist

Further checks

Uh oh!

codecov Bot commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

tsmbland Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tsmbland commented Mar 4, 2026

Uh oh!

alexdewar commented Mar 4, 2026

Uh oh!

tsmbland commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tsmbland commented Mar 5, 2026

Uh oh!

tsmbland commented Mar 5, 2026

Uh oh!

alexdewar commented Mar 5, 2026

Uh oh!

tsmbland commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

alexdewar Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

tsmbland commented Mar 12, 2026

Uh oh!

Aurashk commented Mar 12, 2026

Uh oh!

tsmbland commented Mar 12, 2026

Uh oh!

tsmbland commented Mar 12, 2026

Uh oh!

alexdewar left a comment

Choose a reason for hiding this comment

Uh oh!

alexdewar Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

tsmbland Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

alexdewar Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

tsmbland commented Mar 12, 2026

Uh oh!

alexdewar commented Mar 12, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Reviewers

tsmbland commented Mar 4, 2026 •

edited

Loading

codecov Bot commented Mar 4, 2026 •

edited

Loading

tsmbland commented Mar 5, 2026 •

edited

Loading

tsmbland commented Mar 5, 2026 •

edited

Loading