Create an inputs.h5 -> inputs.gdx pipeline to help move processing out of b_inputs.gms #61
…s other region levels even though it isn't passed to GAMS)
wesleyjcole
left a comment
I like how tidy this makes the change.
I did see that the comments are not printed in the b_declare_{}.gms files. Would that be easy to add? I don't know if it's important, but there might be an instance where having comments there would be helpful.
| @@ -884,11 +905,20 @@ def main(reeds_path, inputs_case, agglevel, regions): | |||
| 'cap_cspns': True, | |||
| 'can_imports_capacity': True, | |||
| } | |||
| gamstype = { | |||
| 'pcat': 'set', | |||
I thought that was the case, but it's still used in b_inputs.gms; here's a snippet from that PR branch:
Lines 1179 to 1208 in 06c948a
Why the underscore for aliases and pcat? None of the others use that convention.
Just to set them apart because they are not read directly into GAMS (explained in inputs/sets/README.md)
| @@ -1,5 +1,5 @@ | |||
| o "once through", | |||
Is it feasible to keep comments in the set files? I see you moved these to the ReadMe--it just seems more useful here than there, but keeping it in the file might be a hassle.
I'm not sure if gdxpds supports element-level comments, and it would add a lot of overhead to propagate them to the inputs.h5 file. To me it doesn't seem worth it for something that's only used in two sets. I also feel like README.md is a more suitable place for documentation than raw .csv files.
| - `_aliases.csv`: aliases (extra names for the same set) used in GAMS | ||
| - Aliases of primary sets should be added here | ||
| - Aliases of sets defined in `b_inputs.gms` (e.g., `h`→`hh`) should instead be defined in GAMS after the set definition | ||
| - `_pcat.csv`: prescribed capacity categories | ||
| - The `pcat` set in GAMS (defined in `writecapdat.py`) includes the members of the `i` set; this file includes only the *extra* elements on top of the `i` set |
I see here that these are special-case files. Is that why the underscore is used, as a flag that this won't exactly match the GAMS set?
Yeah that's right; I thought it would be confusing if all the csv files in inputs/sets match the sets used in the model except for these files.
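To make the `_pcat.csv` convention from this thread concrete, here is a minimal sketch of how the full `pcat` set could be assembled from the `i` set plus the extra elements. It is not the logic in `writecapdat.py`; the helper name `build_pcat` and all element names are hypothetical and used only for illustration.

```python
def build_pcat(i_elements, extra_elements):
    """Return the members of i followed by the extras, without duplicates.

    Hypothetical helper (not part of ReEDS). dict.fromkeys drops repeated
    names while preserving the order in which they first appear.
    """
    return list(dict.fromkeys([*i_elements, *extra_elements]))


# Hypothetical set members, for illustration only
i_set = ['tech_a', 'tech_b']
pcat_extras = ['category_x', 'tech_b']  # what a file like _pcat.csv would hold

pcat = build_pcat(i_set, pcat_extras)
print(pcat)
```

Because the file holds only the extras, it does not match the `pcat` set used in the model, which is the mismatch the leading underscore is meant to flag.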
…g and clarify function name in h5_to_gdx.py
I wasn't thinking of |
| $include b_declare_sets.gms | ||
| $include b_declare_parameters.gms | ||
| $gdxin inputs_case%ds%inputs_0.gdx | ||
| $include b_load_sets.gms | ||
| $include b_load_parameters.gms | ||
| $gdxin |
This approach could be simplified by using gams.transfer to write regular (not relaxed) sets, and then loading the whole file with $declareAndLoad. But since it would require an update to environment.yml and only affects the code in two places (these lines and h5_to_gdx.py), my preference is to do it in a followup PR and move ahead with the current approach for now, so we can start adopting the new input formalism sooner rather than later.
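For readers unfamiliar with the declare-then-load pattern in the hunk above, the following is a rough sketch of how the text of the `b_declare_*.gms` and `b_load_*.gms` files could be generated from a symbol table. It is an assumption-laden illustration, not the code in `h5_to_gdx.py`: the function names, the symbol tuples, and the choice of `$load` are all hypothetical. It also shows one way explanatory text (the comments asked about earlier in the review) could be carried into the declarations.

```python
def declare_line(name, gamstype, domain=(), comment=''):
    """Build one GAMS declaration such as: set rb(r) "some text";

    Hypothetical helper. `gamstype` is 'set' or 'parameter'; `domain` is a
    tuple of the sets the symbol is defined over; `comment` becomes the
    quoted explanatory text if it is non-empty.
    """
    dims = f"({','.join(domain)})" if domain else ''
    text = f' "{comment}"' if comment else ''
    return f'{gamstype} {name}{dims}{text};'


def load_line(name):
    """Build the matching load statement (assumes a preceding $gdxin)."""
    return f'$load {name}'


# Hypothetical symbols: (name, GAMS type, domain, explanatory text)
symbols = [
    ('r', 'set', (), 'hypothetical regions'),
    ('rb', 'set', ('r',), ''),
    ('cap', 'parameter', ('i', 'r'), 'hypothetical capacity'),
]

declare_text = '\n'.join(declare_line(*symbol) for symbol in symbols)
load_text = '\n'.join(load_line(symbol[0]) for symbol in symbols)
print(declare_text)
print(load_text)
```

Declaring each symbol with its domain in GAMS before loading is what lets the domain information live in the `.gms` files rather than in the `.gdx` file itself.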
| Returns: | ||
| pd.DataFrame | ||
| """ | ||
| key = Path(name).stem |
Just curious - are there any cases where name would be different from key? Or is this just in case someone passes a relative file path instead of just a name?
Yeah, the idea was just to standardize the input so the user can provide either name or {case}/inputs_case/{name}.csv or {name}.csv. I can't remember if there's a specific place where it's used that second way in the code though. If you think it's unnecessary/confusing to have that flexibility, we could just go with the direct name input instead.
Got it thanks, and no worries I think it's fine as-is
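As a quick illustration of the normalization discussed in this thread, the sketch below shows how `Path(name).stem` maps the three accepted input forms to the same key. The wrapper name `normalize_key` and the example paths are hypothetical; only the `Path(name).stem` call comes from the diff.

```python
from pathlib import Path


def normalize_key(name):
    """Reduce a bare name, a .csv filename, or a longer path to its key.

    Hypothetical wrapper around the `Path(name).stem` call in the diff.
    """
    return Path(name).stem


# Hypothetical inputs covering the three forms mentioned in the thread
examples = ['val_r', 'val_r.csv', 'mycase/inputs_case/val_r.csv']
keys = [normalize_key(example) for example in examples]
print(keys)
```

One caveat of this approach: `stem` strips only the final suffix, so a bare name that itself contains a dot (for example `a.b`) would be shortened to `a`.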
| print(f'{Path(h5path).name}: Wrote {key} from {calling_file}') | ||
| def write_csv_to_h5( |
Could you either rename this to write_csv_to_inputs_h5 or something similar (since it only ever writes to inputs.h5), or add an argument to specify an .h5 filepath? Also "copy" rather than "write" would be a bit clearer to me as the first word of the function name, but I think either's fine
Sounds good, I went with write_csv_to_inputs_h5() because it's not exactly a copy given the dtype/name changes, but I agree it's better to specify inputs_h5 instead of just h5
Co-authored-by: kodiobika <35176195+kodiobika@users.noreply.github.com>
… mcs_sampler.py working
Summary
This PR addresses #38 by creating a new `inputs.h5` container and an `h5_to_gdx.py` script that feeds it into GAMS. This structure helps facilitate the exploration of non-GAMS approaches for ReEDS by letting input-processing calculations be moved out of `b_inputs.gms` without relying on hundreds of new `.csv` files for data transfer.
Technical details
Implementation notes
- … `reeds.io.write_input_to_h5()` and `reeds.io.read_input()`
- … `val_r = pd.read_csv(os.path.join(case,'inputs_case','val_r.csv'), header=None)` is now `val_r = reeds.io.read_input(case, 'r')`
- … `h5_to_gdx.py`, runs as the last `input_processing` script and converts `inputs.h5` to `inputs_0.gdx`
- … `b_declare_(sets|parameters).gms` and `b_load_(sets|parameters).gms` files to facilitate reading the `.gdx` file in `b_inputs.gms` (I couldn't figure out how to get it to work with `$declareAndLoad inputs_case%ds%inputs_0.gdx` alone; it wasn't recognizing subsets for domain checking)
- … `inputs/sets/_aliases.csv`
- … `inputs/sets/README.md`
- … `reeds.io.write_input_to_h5()` when adding new inputs instead of writing them to a `.csv` file and loading explicitly in `b_inputs.gms`
- … `input_processing` as necessary) instead of `b_inputs.gms`
- … `b_inputs.gms`/`e_report.gms` once this PR is merged
Additional changes
- … `sw.csv` -> `wst_surface.csv` to avoid confusion with switches (which are abbreviated in Python as `sw`)
- … `gen_mandate_tech_list.csv` -> `nat_gen_tech_frac.csv` to match its name in GAMS
Issues resolved
#38
Validation, testing, and comparison report(s)
Zero change for the Pacific test case:
results-v20260426_mainM0_Pacific,v20260426_inputsM1_Pacific.pptx
The only changes to `inputs.gdx` (aside from some minor parameter renaming) are due to rounding: diff_inputs-v20260426_mainM0_Pacific-v20260426_inputsM1_Pacific.gdx.zip
Near-zero change (rounding differences only) for the USA_defaults case: results-mainK0_USA_defaults,inputsK0_USA_defaults.pptx
20260522
Double checked, and still only rounding-error differences in the Pacific test case. I also made sure the MonteCarlo_LHS case still works.
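For anyone wanting to repeat a "rounding differences only" check on their own outputs, here is a minimal sketch of one way to flag values that differ by more than a tolerance. It is not the comparison tooling used for this PR; the function name, the tolerances, and the sample values are all assumptions.

```python
import math


def significant_diffs(old, new, rel_tol=1e-6, abs_tol=1e-9):
    """Return the sorted keys whose values differ beyond the tolerances.

    Hypothetical helper. Keys present in only one of the two mappings are
    always reported, since a missing value is not a rounding difference.
    """
    flagged = []
    for key in sorted(set(old) | set(new)):
        if key not in old or key not in new:
            flagged.append(key)
        elif not math.isclose(old[key], new[key], rel_tol=rel_tol, abs_tol=abs_tol):
            flagged.append(key)
    return flagged


# Hypothetical parameter values keyed by (parameter, element)
main_run = {('cap', 'p1'): 1.2345678, ('cap', 'p2'): 10.0, ('cost', 'p1'): 3.0}
branch_run = {('cap', 'p1'): 1.2345679, ('cap', 'p2'): 10.5, ('cost', 'p1'): 3.0}

diffs = significant_diffs(main_run, branch_run)
print(diffs)
```

An empty result would indicate that every shared value agrees to within the chosen tolerances; the right tolerance depends on how many digits the inputs are written with.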
Checklist for author
Details to double-check
General information to guide review
Did you use LLM tools (chatbot or copilot) in the preparation of this PR? If so, describe how
No