Skip to content

Add year field to ProcessParameter, create ProcessParameterMap#475

Merged
tsmbland merged 24 commits intomainfrom
annual_field_alternative
Apr 28, 2025
Merged

Add year field to ProcessParameter, create ProcessParameterMap#475
tsmbland merged 24 commits intomainfrom
annual_field_alternative

Conversation

@tsmbland
Copy link
Copy Markdown
Collaborator

@tsmbland tsmbland commented Apr 10, 2025

Description

Creating ProcessParameterMap which hold process parameters for each year. I've added a year column to process_parameters.csv, which can be either "all", a single year or a list of years. This gets read in by deserialize_year to create a YearSelection, and then a ProcessParameterMap is created out of all of the entries in read_process_parameters_from_iter

Comments:

  • I've created a try_insert helper function, which will return an error if attempting to insert data for a key that already exists in the map. For example, if you had year = "all" and year = 2020 rows for a certain process, you'd get an error telling you there's duplicate data for 2020. The annoying thing is that this won't tell you which process this is for, as this information isn't available to try_insert. Not sure if there's a good solution here. Would be nice to use this for other maps as well.
  • One really annoying thing about this is that it complicates making test data as we now need to create a dictionary covering all years necessary for the test. And we need to do this for all tests that use process parameters, so it gets quite repetitive. One thing I liked about Add AnnualField, implement on ProcessParameter #474 was that you could just do AnnualField::Constant and you didn't have to worry about this. In any case, since the test data is becoming more and more complex, I wonder if it's worth thinking about making test fixtures which we can reuse between modules, like we (sort-of) do in MUSE1 (if such a thing is even possible/standard in Rust).
  • Assets should use parameters for their commission year, not the current year, so care is needed to pass the appropriate year when extracting parameters
  • I've moved start_year and end_year over the the main processes table, as these should not vary by year
  • It's no longer checking that parameters are provided for every process. Not sure the best way to do this so I've just left it for now

Partially fixes #493 (also need to add a region dimension)

Type of change

  • Bug fix (non-breaking change to fix an issue)
  • New feature (non-breaking change to add functionality)
  • Refactoring (non-breaking, non-functional change to improve maintainability)
  • Optimization (non-breaking change to speed up the code)
  • Breaking change (whatever its nature)
  • Documentation (improve or add documentation)

Key checklist

  • All tests pass: $ cargo test
  • The documentation builds and looks OK: $ cargo doc

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

@tsmbland tsmbland changed the title Annual field alternative Add year field to ProcessParameter, create ProcessParameterMap (alternative to #474) Apr 14, 2025
@tsmbland tsmbland changed the base branch from main to maps April 16, 2025 15:32
@tsmbland tsmbland force-pushed the annual_field_alternative branch from 8610b55 to be12c26 Compare April 23, 2025 13:34
@tsmbland tsmbland changed the base branch from maps to main April 23, 2025 13:34
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 23, 2025

Codecov Report

Attention: Patch coverage is 83.24873% with 33 lines in your changes missing coverage. Please review.

Project coverage is 94.33%. Comparing base (154e6d7) to head (35b7d49).
Report is 25 commits behind head on main.

Files with missing lines Patch % Lines
src/input/process.rs 82.19% 2 Missing and 11 partials ⚠️
src/year.rs 29.41% 11 Missing and 1 partial ⚠️
src/input/process/parameter.rs 86.79% 2 Missing and 5 partials ⚠️
src/input.rs 88.88% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #475      +/-   ##
==========================================
- Coverage   95.03%   94.33%   -0.70%     
==========================================
  Files          36       37       +1     
  Lines        4893     4751     -142     
  Branches     4893     4751     -142     
==========================================
- Hits         4650     4482     -168     
- Misses        123      138      +15     
- Partials      120      131      +11     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tsmbland tsmbland changed the title Add year field to ProcessParameter, create ProcessParameterMap (alternative to #474) Add year field to ProcessParameter, create ProcessParameterMap Apr 23, 2025
@tsmbland tsmbland marked this pull request as ready for review April 23, 2025 16:21
@tsmbland tsmbland requested a review from alexdewar April 23, 2025 16:21
Copy link
Copy Markdown
Member

@alexdewar alexdewar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! A few suggestions.

Given that Assets need to look up various process parameters for their commission year, I think it would be better to just do that lookup once and store the process parameters as a field in the asset. You could store Rc<ProcessParameter>s in the parameter map to make them cheap to copy (also useful for when there are ranges of years specified).

I don't think we actually need the YearSelection struct as we're not going to be using it outside of the input layer. You could just represent years as strings in the relevant *Raw structs and then have a function to parse them, kind of like I suggested before, e.g.:

pub fn parse_year_str(s: &str, milestone_years: &[u32]) -> Result<Vec<u32>> {
    if s.eq_ignore_ascii_case("all") {
        return Ok(Vec::from_iter(milestone_years.iter().copied()));
    }

    s.split(";")
        .map(|y| {
            let year = y.parse::<u32>()?;
            ensure!(milestone_years.binary_search(&year).is_ok());
            Ok(year)
        })
        .try_collect()
        .context("Invalid year")
}

This would mean you could do parsing and validation (e.g. checking that they're all valid milestone years) in one step.

Comment thread examples/simple/process_parameters.csv
Comment thread src/input/process.rs Outdated
Comment thread src/input/process.rs Outdated
Comment thread src/process.rs Outdated
Comment thread src/input/process/parameter.rs Outdated
Comment thread src/input/process/parameter.rs Outdated
Comment thread src/input/process/parameter.rs Outdated
Comment thread src/utils.rs Outdated
Comment thread src/utils.rs Outdated
@alexdewar
Copy link
Copy Markdown
Member

PS -- I agree with you about test fixtures. I've written a few functions in places which essentially serve as fixtures, but I haven't been very systematic about it.

One approach would be to have a file with these sorts of functions (or constants) in for tests. Alternatively, this crate seems to let you write fixtures in the same kind of way you can with pytest, which is kinda neat. Might be worth looking into.

Co-authored-by: alexdewar <alexdewar@users.noreply.github.com>
@tsmbland
Copy link
Copy Markdown
Collaborator Author

Thanks @alexdewar

I'm going to keep YearSelection for now. I like your suggestion, but if we're going to do this we should do this for RegionSelection as well, so those should be done together (could probably do this on top of #498)

But yeah, I think if we do this for both year and region then life will be simpler

@tsmbland tsmbland requested a review from alexdewar April 28, 2025 10:06
@alexdewar
Copy link
Copy Markdown
Member

I think it would be better to fix it now, unless you're really time pressured or something. I'm keen not to repeat the mistake I made with RegionSelection (which we should also fix -- though we can do that later).

The YearSelection stuff also makes for a more fiddly interface, because users have to handle the All and Some cases separately. If we're mostly just populating maps using years as keys, then we just need something that gives an iterator of years. The caller shouldn't need to care about what form the years are in in the input file.

@tsmbland
Copy link
Copy Markdown
Collaborator Author

Done!

@tsmbland tsmbland mentioned this pull request Apr 28, 2025
10 tasks
Copy link
Copy Markdown
Member

@alexdewar alexdewar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few really minor things, but other than that, I think we're nearly there.

Comment thread src/input/process.rs
.with_context(|| format!("Missing availabilities for process {id}"))?;
process.flows = flows
.remove(id)
.with_context(|| format!("Missing flows for process {id}"))?;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reason an unwrap() was used here (and for the other ones) is because we already check whether all process IDs are covered earlier on, so we know it won't fail. We could always get rid of the other checks.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so actually (at least not that I can find)

I'd probably propose making availabilities optional, although I'll leave for now as this will all change with #363

Comment thread src/utils.rs Outdated
Comment thread src/utils.rs Outdated
pub fn try_insert<K, V>(map: &mut HashMap<K, V>, key: K, value: V) -> Result<()>
where
K: Eq + Hash + std::fmt::Display,
K: Eq + Hash + std::fmt::Display + std::marker::Copy,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think K needs to implement Copy for this. If you do map.insert(key.clone(), value) instead below, then it will also work for non-Copy types.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basic question: what's the difference between copy and clone?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy is essentially a version of clone where you don't actually have to call clone() to get a copy of it, but it only works if you can just copy the underlying bytes to do so. So AgentID can implement Copy (because it's just a u32 under the hood) but e.g. CommodityID can't (if you just copy the raw bytes of an Rc<str> bad things will happen).

Comment thread src/year.rs
Comment thread src/year.rs
);
Ok(year)
})
.collect()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to remember your code also rejected empty sets of years (i.e. the user has just input an empty string), which makes sense. Shall we keep that check? I don't think we want users to be able to provide no years as an input.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. We could either disallow or interpret an empty string to mean all years. I think I'd favour the former for now, but possibly something to revisit

Comment thread src/utils.rs Outdated
@tsmbland tsmbland requested a review from alexdewar April 28, 2025 15:05
@tsmbland tsmbland merged commit d502c32 into main Apr 28, 2025
7 checks passed
@tsmbland tsmbland deleted the annual_field_alternative branch April 28, 2025 18:44
@github-project-automation github-project-automation Bot moved this to ✅ Done in MUSE Apr 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

Add "year" and "region" fields to process_parameters.csv and associated data structures

2 participants