Merged
2 changes: 1 addition & 1 deletion docs/hercules_input.md
Original file line number Diff line number Diff line change
@@ -131,7 +131,7 @@ The old format is still supported for backward compatibility but will show a dep
### External Data File Format

The CSV file must contain:
- A `time_utc` column with UTC timestamps in ISO 8601 format
- A `time_utc` column with UTC timestamps in ISO 8601 format. Unlike wind/solar/SCADA/playback inputs (which are treated as period averages reported at the start of each period), external data values are treated as **instantaneous** samples at their timestamps and are upsampled to the simulation time grid via `"instantaneous_to_instantaneous"` (linear interpolation). If you need zero-order-hold (piecewise-constant) behaviour -- e.g. for LMP prices -- pre-process the file to include an extra row at the end of each interval carrying the previous value; see [Achieving zero-order-hold (ZOH) behaviour](timing.md#achieving-zero-order-hold-zoh-behaviour) and the [`generate_locational_marginal_price_dataframe_from_gridstatus`](../hercules/grid/grid_utilities.py) helper.
- One or more data columns with external signals. Column names are arbitrary; any columns present are carried forward and interpolated, but their values must be floats. Note that some controllers and plotting utilities that operate on external signals may require specific column names such as `lmp_rt`, `lmp_da`, or `wind_forecast`.

Example `lmp_data.csv`:
2 changes: 2 additions & 0 deletions docs/output_files.md
@@ -2,6 +2,8 @@

Hercules generates HDF5 output files containing simulation data for analysis and visualization. This page describes the file format, available utilities for reading the data, and how HerculesModel generates these files.

All values in output files represent **instantaneous** quantities at each time step, not period averages. This differs from the convention used by input data files, where timestamps mark the start of a reporting period. See [Time Interpretation](timing.md#time-interpretation-inputs-vs-internal-values) for details on this distinction and the midpoint correction applied during input interpolation.

## File Format

Hercules outputs simulation data in HDF5 (Hierarchical Data Format 5) format.
2 changes: 1 addition & 1 deletion docs/power_playback.md
@@ -32,7 +32,7 @@ power_unit_1:

The input file must contain the following columns:

- `time_utc`: Timestamps in UTC (ISO 8601 format or parseable datetime strings)
- `time_utc`: Timestamps in UTC (ISO 8601 format or parseable datetime strings). Each timestamp marks the **start of a reporting period**; the power value on that row is treated as the period average. See [Time Interpretation](timing.md#time-interpretation-inputs-vs-internal-values) for how Hercules converts these to instantaneous values.
- `power`: Power output in kW

Supported file formats: `.csv`, `.p`, `.pkl` (pickle), `.f`, `.ftr` (feather).
2 changes: 1 addition & 1 deletion docs/solar_pv.md
@@ -12,7 +12,7 @@ Presently only one solar simulator is available

Both models require an input weather file:
1. A CSV file that specifies the weather conditions (e.g. NonAnnualSimulation-sample_data-interpolated-daytime.csv). This file should include:
- timestamp (see [timing](timing.md) for time format requirements)
- timestamp (see [timing](timing.md) for time format requirements). Each `time_utc` timestamp marks the **start of a reporting period**; irradiance and weather values on that row are treated as period averages. See [Time Interpretation](timing.md#time-interpretation-inputs-vs-internal-values) for how Hercules converts these to instantaneous values.
- direct normal irradiance (DNI)
- diffuse horizontal irradiance (DHI)
- global horizontal irradiance (GHI)
116 changes: 115 additions & 1 deletion docs/timing.md
@@ -9,6 +9,105 @@ Timing in Hercules is specified using two complementary representations:
- `time` (float): Simulation time in seconds, where `time=0` corresponds to `starttime_utc`
- `time_utc` (datetime): Absolute UTC timestamp

## Time Interpretation: Inputs vs. Internal Values

### Input files: start-of-period convention

In external data sources such as weather files, SCADA records, and resource
databases, each `time_utc` timestamp marks the **beginning** of a reporting
period and the associated values (irradiance, wind speed, power, etc.)
represent an average or aggregate over that period. For example, an hourly
weather file with a row at `2020-06-15T12:00:00Z` and GHI = 735 W/m² means
that 735 W/m² is the average GHI from 12:00 to 13:00.

### Hercules internal values: instantaneous convention

Inside the simulation, values at a given time step represent **instantaneous**
quantities at that moment. All Hercules output values follow this same
instantaneous convention.

### Interpolation methods

The `interpolate_df` function in `utilities.py` accepts a mandatory
`interpolation_method` parameter that controls how numeric columns are
resampled onto the simulation time grid. Two methods are available:

#### `"averaged_to_instantaneous"` (wind, solar, and similar resource and power signals)

Input values are period averages whose timestamps mark the **start** of each
period. The best single-point estimate of a period-averaged value is at the
**midpoint** of its interval, not the start. For example, the hourly average
from 12:00 to 13:00 is most representative of conditions at 12:30. For data that vary roughly linearly across intervals, this also means that averaging the resampled signal back over the original reporting intervals approximately recovers the original values.

1. Each numeric value is assigned to the midpoint of its input interval
(using `_compute_interval_midpoints`).
2. Linear interpolation is then performed between these midpoints to produce
values at the simulation time steps.

```
Input file (start-of-period):

time_utc value
12:00 100 ← average over [12:00, 13:00)
13:00 200 ← average over [13:00, 14:00)

After midpoint correction:

time value
12:30 100 ← midpoint of [12:00, 13:00)
13:30 200 ← midpoint of [13:00, 14:00)

Querying at 13:00 yields 150 (halfway between midpoints).
```
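The two steps above can be sketched as follows. This is a minimal illustration assuming uniformly spaced input timestamps, not the actual Hercules implementation (`_compute_interval_midpoints` is only paraphrased here by the `dt / 2` shift):

```python
import numpy as np

def averaged_to_instantaneous(t_in, v_in, t_out):
    """Resample start-of-period period averages onto new time points.

    t_in: start-of-period times in seconds, assumed uniformly spaced.
    v_in: period-average values reported at those start times.
    t_out: simulation time points to interpolate onto.
    """
    dt = t_in[1] - t_in[0]      # input reporting period
    t_mid = t_in + dt / 2.0     # shift each value to its interval midpoint
    # Linear interpolation between midpoints (np.interp holds the
    # first/last midpoint value constant beyond the ends).
    return np.interp(t_out, t_mid, v_in)

# Hourly averages: 100 over [12:00, 13:00), 200 over [13:00, 14:00)
t_in = np.array([12.0, 13.0]) * 3600.0
v_in = np.array([100.0, 200.0])
print(averaged_to_instantaneous(t_in, v_in, np.array([13.0 * 3600.0])))  # → [150.]
```

This reproduces the diagram above: the query at 13:00 lands halfway between the 12:30 and 13:30 midpoints and yields 150.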

#### `"instantaneous_to_instantaneous"`

Input values already represent instantaneous measurements at their
timestamps. Standard linear interpolation is performed directly on the
original timestamps with no midpoint shift.
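In toy form, this method is nothing more than plain linear interpolation on the original timestamps (illustrative sketch, not the Hercules code path):

```python
import numpy as np

# Instantaneous samples: 100 at 12:00, 200 at 13:00 (times in seconds).
t_in = np.array([12.0, 13.0]) * 3600.0
v_in = np.array([100.0, 200.0])

# No midpoint shift: querying at 12:30 splits the difference directly.
print(np.interp(12.5 * 3600.0, t_in, v_in))  # → 150.0
```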

---

In both methods, datetime columns (e.g. `time_utc`) are linearly
interpolated on the raw timestamps without any shift, because they are
instantaneous coordinate mappings between simulation time and wall-clock
time, not period-averaged measurements.

#### Achieving zero-order-hold (ZOH) behaviour

`interpolate_df` does not provide a dedicated zero-order-hold mode. If you
need step/piecewise-constant values -- for example, LMP prices that
should be held constant across each reporting interval -- pre-process your
input data to include an additional row at the end of each interval that
carries the same value as the start-of-interval row, and then use
`"instantaneous_to_instantaneous"`. Linear interpolation between each pair
of identical endpoints reproduces the ZOH shape.

```
Original data (start-of-interval only):

time_utc value
12:00 100
13:00 200

After inserting end-of-interval rows (just before the next start):

time_utc value
12:00 100
12:59:59 100 ← added endpoint
13:00 200
13:59:59 200 ← added endpoint

Querying at 12:30 with "instantaneous_to_instantaneous" yields 100.
Querying at 13:00 yields 200.
```
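A pandas sketch of this endpoint-insertion pattern, for a one-second simulation step. The helper name `insert_zoh_endpoints` and the `lmp_rt` column are hypothetical; the real worked example lives in `hercules/grid/grid_utilities.py`:

```python
import pandas as pd

def insert_zoh_endpoints(df, dt_seconds):
    """Duplicate each row at (start + dt - 1 s) so that linear interpolation
    between the pair of identical values reproduces a zero-order hold."""
    end_rows = df.copy()
    end_rows["time_utc"] = end_rows["time_utc"] + pd.Timedelta(seconds=dt_seconds - 1)
    return (
        pd.concat([df, end_rows])
        .sort_values("time_utc")
        .reset_index(drop=True)
    )

df = pd.DataFrame({
    "time_utc": pd.to_datetime(["2020-06-15T12:00:00Z", "2020-06-15T13:00:00Z"]),
    "lmp_rt": [100.0, 200.0],
})
df_zoh = insert_zoh_endpoints(df, dt_seconds=3600)
# time_utc rows are now 12:00:00, 12:59:59, 13:00:00, 13:59:59,
# with values 100, 100, 200, 200 -- the step shape from the diagram above.
```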

See
[`generate_locational_marginal_price_dataframe_from_gridstatus`](../hercules/grid/grid_utilities.py)
in `hercules/grid/grid_utilities.py` for a worked example of this
endpoint-insertion pattern (it shifts a copy of the data by `dt - 1` seconds
and merges it back in before handing the frame to Hercules).

## Input Requirements

All Hercules input files must specify start and end times using UTC datetime strings:
@@ -113,7 +212,20 @@ For the example above, `endtime` would be 3600.0 seconds.

### Wind and Solar Input Data

Both wind and solar input CSV/Feather/Parquet files must contain a `time_utc` column with UTC timestamps:
Both wind and solar input CSV/Feather/Parquet files must contain a `time_utc` column with UTC timestamps. Each `time_utc` value marks the **start of a reporting period**; the data values on that row are treated as period averages. These are interpolated with `"averaged_to_instantaneous"`. See [Interpolation methods](#interpolation-methods) above for details.

### External Data (LMP, etc.)

External data files loaded via `_read_external_data_file` are upsampled onto
the simulation time grid with `"instantaneous_to_instantaneous"` (linear
interpolation between the supplied timestamps). If you want zero-order-hold
(piecewise-constant) behaviour for signals like LMP prices, pre-process the
file to include end-of-interval rows that repeat the previous value as
described in [Achieving zero-order-hold (ZOH) behaviour](#achieving-zero-order-hold-zoh-behaviour).
The helper
[`generate_locational_marginal_price_dataframe_from_gridstatus`](../hercules/grid/grid_utilities.py)
in `hercules/grid/grid_utilities.py` is a concrete example of adding those
endpoint rows for LMP data.

```text
time_utc,wd_mean,ws_000,ws_001,ws_002
@@ -145,6 +257,8 @@ Key Points:

## Output Files

All values in Hercules output files represent **instantaneous** quantities at each time step, not period averages. See [Time Interpretation](#time-interpretation-inputs-vs-internal-values) for the distinction from input files.

Hercules output HDF5 files store:

- `time` array: Simulation time points (seconds from t=0)
2 changes: 1 addition & 1 deletion docs/wind.md
@@ -54,7 +54,7 @@ Required parameters for WindFarmSCADAPower:
**SCADA File Format:**

The SCADA file must contain the following columns:
- `time_utc`: Timestamps in UTC (ISO 8601 format or parseable datetime strings)
- `time_utc`: Timestamps in UTC (ISO 8601 format or parseable datetime strings). Each timestamp marks the **start of a reporting period**; values on that row are treated as period averages. See [Time Interpretation](timing.md#time-interpretation-inputs-vs-internal-values) for how Hercules converts these to instantaneous values.
- `wd_mean`: Mean wind direction in degrees
- `pow_###`: Power output for each turbine (e.g., `pow_000`, `pow_001`, `pow_002`)

25 changes: 20 additions & 5 deletions hercules/hercules_model.py
@@ -172,10 +172,23 @@ def _read_external_data_file(self, filename):
"""
Read and interpolate external data from a CSV, feather, or pickle file.

This method reads external data from the specified file (CSV, feather, or pickle)
and interpolates it according to the simulation time steps. The external data must
include a 'time_utc' column which will be converted to simulation time.
The interpolated data is stored in self.external_signals_all.
This method reads external data from the specified file (CSV, feather, or
pickle) and upsamples it onto the simulation time grid using
``"instantaneous_to_instantaneous"`` (linear interpolation between the
values at the supplied timestamps).

If zero-order-hold (piecewise-constant / step) behavior is desired --
for example, LMP prices that should be held constant across each
reporting interval -- the external data file must be pre-processed to
include an additional row at the end of each interval carrying the
same value. Linear interpolation between each pair of identical
endpoints then reproduces the ZOH shape. See
``hercules.grid.grid_utilities.generate_locational_marginal_price_dataframe_from_gridstatus``
for a worked example of this endpoint-insertion pattern.

The external data must include a ``time_utc`` column which will be
converted to simulation time. The interpolated data is stored in
``self.external_signals_all``.

Args:
filename (str): Path to the file containing external data. Supported formats:
@@ -216,7 +229,9 @@ def _read_external_data_file(self, filename):
)

# Interpolate using the utility function
df_interpolated = interpolate_df(df_ext, new_times)
df_interpolated = interpolate_df(
df_ext, new_times, interpolation_method="instantaneous_to_instantaneous"
)

**Collaborator:** I'm not sure we should hard-code this. It makes sense that the LMP prices should use this type of interpolation, but this is also how we input power reference signals. Should those also use the zoh_to_instantaneous method?

**Collaborator (author):** I agree this is tricky. I think to date in Hercules we haven't explicitly tracked LMP prices; these are just external signals that end up having signal names that Hycon expects. One idea is to add to the Hercules input a dictionary that says how different external channels should be upsampled, with a default to this for backwards compatibility -- or not, force it to be explicit and take one more breaking change? @misi9170 any thoughts here?

# Convert interpolated DataFrame to dictionary format
for col in df_interpolated.columns:
4 changes: 3 additions & 1 deletion hercules/plant_components/power_playback.py
@@ -122,7 +122,9 @@ def __init__(self, h_dict, component_name):

# Interpolate df_scada on to the time steps
time_steps_all = np.arange(self.starttime, self.endtime, self.dt, dtype=hercules_float_type)
df_scada = interpolate_df(df_scada, time_steps_all)
df_scada = interpolate_df(
df_scada, time_steps_all, interpolation_method="averaged_to_instantaneous"
)

# Confirm that there is a column called "power"
if "power" not in df_scada.columns:
4 changes: 3 additions & 1 deletion hercules/plant_components/solar_pysam_base.py
@@ -126,7 +126,9 @@ def _load_solar_data(self, h_dict):

# Interpolate df_solar on to the time steps
time_steps_all = np.arange(self.starttime, self.endtime, self.dt, dtype=hercules_float_type)
df_solar = interpolate_df(df_solar, time_steps_all)
df_solar = interpolate_df(
df_solar, time_steps_all, interpolation_method="averaged_to_instantaneous"
)

# Can now save the input data as simple columns
self.year_array = df_solar["time_utc"].dt.year.values
4 changes: 3 additions & 1 deletion hercules/plant_components/wind_farm.py
@@ -188,7 +188,9 @@ def __init__(self, h_dict, component_name):

# Interpolate df_wi on to the time steps
time_steps_all = np.arange(self.starttime, self.endtime, self.dt, dtype=hercules_float_type)
df_wi = interpolate_df(df_wi, time_steps_all)
df_wi = interpolate_df(
df_wi, time_steps_all, interpolation_method="averaged_to_instantaneous"
)

# INITIALIZE FLORIS BASED ON WAKE MODEL
if self.wake_method == "precomputed":
4 changes: 3 additions & 1 deletion hercules/plant_components/wind_farm_scada_power.py
@@ -128,7 +128,9 @@ def __init__(self, h_dict, component_name):

# Interpolate df_scada on to the time steps
time_steps_all = np.arange(self.starttime, self.endtime, self.dt, dtype=hercules_float_type)
df_scada = interpolate_df(df_scada, time_steps_all)
df_scada = interpolate_df(
df_scada, time_steps_all, interpolation_method="averaged_to_instantaneous"
)

# Get a list of power columns and infer number of turbines
self.power_columns = sorted([col for col in df_scada.columns if col.startswith("pow_")])