Dev flow control anti windup#107
Draft
lnagel wants to merge 11 commits into
Correct the PID integral at observation period boundaries by comparing actual valve delivery against commanded duty cycle. This prevents integral windup when the duty cycle is too small for the minimum run time, avoiding excessive overshoot when demand later rises. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use a period-start snapshot (commanded_duty_cycle) instead of the end-of-period PID duty cycle as u_commanded. This prevents sign errors when the duty cycle drifts during the observation period — e.g., when the PI outputs 15% at period start but drops to 6% as the room warms. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update data flow diagram to reflect back-calculation step during period transition. Fix pid.py docstring that still referenced "period end" after the snapshot refactor. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The single period-start snapshot was too coarse — the PID drifts during the period, and when a valve delivers its quota and closes, the stale snapshot can cause false corrections. Replace with paired fields (last_action_at, last_requested_duration) bumped forward at PWM convergence points (valve open, valve close, period start). These fields are also persisted, fixing incorrect corrections after restart. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The harness was missing the mid-period convergence point tracking that the coordinator performs after evaluate(). This ensures TURN_ON/TURN_OFF events update last_requested_duration for accurate back-calculation at period boundaries. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- assert_integral_converged helper: checks integral average stays below a threshold after settling time
- test_low_kp_integral_converges: kp=10 at outdoor=19°C, verifies integral settles proportional to steady-state duty (~3.73%)
- test_demand_transition_bounded_overshoot: outdoor 20→0°C cold snap, verifies temperature stays within setpoint ± 2°C
- Strengthen test_sustained_under_delivery with integral level check (max_value=15 vs theoretical duty ~2.5%)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add per-zone nominal flow rate configuration and two flow-rate constraints to the heating controller:
- Flow constraint (max): Limits how many zones can be ON simultaneously by capping aggregate flow rate. TURN_ON candidates are prioritized by remaining quota (front-loading high-demand zones). Zones already ON are never preempted, and a single zone is never starved.
- Flow minimum gating: Suppresses the boiler heat request when aggregate flow from active zones is below a configured minimum threshold (latent heat mode), allowing residual heat to be used before firing the boiler.
Both thresholds are optional and configurable via a new "Flow Scheduling" options menu step. Per-zone nominal flow rate is configurable in zone entity settings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace string literals with CONF_NOMINAL_FLOW_RATE, CONF_OPTIMAL_FLOW_RATE_MIN, and CONF_OPTIMAL_FLOW_RATE_MAX constants in config_flow.py and coordinator.py for consistency and typo prevention. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Zones wanting TURN_ON are now demoted to STAY_OFF when total prospective flow (STAY_ON + admitted TURN_ON) falls below optimal_flow_rate_min. This prevents opening valves when the boiler won't fire due to insufficient flow. Back-calculation naturally corrects the integral since used_duration=0 when the valve stays closed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cover edge cases: zones without nominal_flow_rate passing through unconstrained, single-zone never-starve path, and min-flow enforcement (demote, preserve STAY_ON, sufficient flow). Brings scheduler.py diff coverage to 100%. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #107 +/- ##
==========================================
- Coverage 96.61% 95.38% -1.24%
==========================================
Files 20 21 +1
Lines 1715 1842 +127
Branches 257 297 +40
==========================================
+ Hits 1657 1757 +100
- Misses 36 59 +23
- Partials 22 26 +4
Branch Summary: Flow-Rate Scheduling + Back-Calculation Anti-Windup
Problem Statement
A multi-zone UFH system controlled by PWM (pulse-width modulation of valve open time) faces two interacting constraints:
Flow-rate limits: The boiler/heat pump has a minimum and maximum aggregate flow rate. Too many zones open simultaneously exceeds the heat source capacity; too few zones open means insufficient flow for the heat source to fire at all.
PID integral windup under contention: When zones compete for limited flow slots, some zones are deferred — their valves stay closed even though the PID controller commands heating. The PID integral accumulates error during deferral. Without correction, the integral ratchets to its clamp (100%), causing aggressive overshoot when the zone finally gets its turn.
These two features must work together: flow scheduling creates the contention, and back-calculation anti-windup prevents the PID from misbehaving because of it.
Feature 1: Back-Calculation Anti-Windup
Mechanism
At each observation period boundary (every 2 hours by default), the controller compares what the PID commanded against what was actually delivered:
u_commanded: duty cycle derived from last_requested_duration / observation_period. last_requested_duration is captured at convergence points (the valve state transitions TURN_ON and TURN_OFF, plus period boundaries) so that it reflects the duty cycle at the moment the PWM decision was last re-evaluated.
u_actual: used_duration / observation_period. used_duration accumulates real seconds of flow (valve open and flow confirmed). When a supply temperature sensor is available, accumulation is weighted by the supply coefficient (actual supply temp / target supply temp), so partial-heat delivery counts proportionally.
The correction uses the standard back-calculation formula (Åström & Hägglund), with duty cycles expressed in percent:
integral += Kt × (u_actual − u_commanded) × dt, where Kt = Ki / Kp and dt = observation_period
When u_actual < u_commanded (zone was deferred or partially served), the correction is negative, pulling the integral down. When u_actual > u_commanded (rare: the zone got more than requested), the correction is positive.
Why convergence-point tracking?
The naive approach, comparing used_duration against the current PID duty cycle, fails because the PID output changes continuously. By the time the period ends, the current duty cycle no longer reflects what was commanded when the scheduling decision was made. Convergence-point tracking snapshots requested_duration at the moment the controller last made a meaningful decision about the zone, giving a stable reference for comparison.
Design question for review
Tracking gain Kt = Ki/Kp: This is the textbook choice for continuous PI controllers (Kt = 1/Ti where Ti = Kp/Ki). However, this system is not continuous: it is a sampled PWM system where corrections happen at discrete 2-hour boundaries. The effective correction per period is:
correction = (Ki / Kp) × observation_period × (u_actual − u_commanded)
With default Ki=0.001, Kp=50, observation_period=7200s:
(Ki / Kp) × observation_period = (0.001 / 50) × 7200 = 0.144 integral units per percentage point of mismatch
So a 100% mismatch (u_actual=0, u_commanded=100) produces a correction of -14.4 per period. If the mismatch persists at that level, it takes roughly 7 periods (14 hours) to unwind a fully saturated integral (100 / 14.4 ≈ 6.9).
Is this correction speed appropriate? Too fast risks oscillation (zone gets deferred → integral drops → zone loses priority → gets deferred again). Too slow means the integral stays elevated for many hours after a zone is starved.
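The back-calculation step and the arithmetic in this section can be sketched as follows. This is a minimal illustration, not the code in pid.py: the `PeriodObservation` container and the function names are hypothetical, the clamp to [0, 100] is taken from the integral range mentioned in this description, and the supply coefficient is left unclamped because the description does not say whether it is capped at 1.

```python
from dataclasses import dataclass


@dataclass
class PeriodObservation:
    """Hypothetical snapshot of one observation period (names are illustrative)."""
    last_requested_duration: float      # seconds requested at the last convergence point
    used_duration: float                # seconds of confirmed, optionally weighted, flow
    observation_period: float = 7200.0  # seconds (2 hours by default)


def supply_coefficient(actual_supply_temp: float, target_supply_temp: float) -> float:
    """Linear weighting from the description: actual / target supply temperature."""
    return actual_supply_temp / target_supply_temp


def back_calculation_correction(obs: PeriodObservation, kp: float, ki: float) -> float:
    """Return Kt * (u_actual - u_commanded) * dt with duty cycles in percent."""
    u_commanded = 100.0 * obs.last_requested_duration / obs.observation_period
    u_actual = 100.0 * obs.used_duration / obs.observation_period
    kt = ki / kp  # textbook tracking gain, Kt = 1 / Ti
    return kt * (u_actual - u_commanded) * obs.observation_period


def apply_correction(integral: float, correction: float,
                     lo: float = 0.0, hi: float = 100.0) -> float:
    """Add the correction and keep the integral inside its clamp range."""
    return min(hi, max(lo, integral + correction))


# A zone that requested the whole period but was fully deferred.
starved = PeriodObservation(last_requested_duration=7200.0, used_duration=0.0)
full_mismatch = back_calculation_correction(starved, kp=50.0, ki=0.001)
periods_to_unwind = 100.0 / abs(full_mismatch)  # about 6.9 periods at 2 hours each
```

With the default gains the full-mismatch correction comes out at about -14.4 integral units, matching the figure quoted above, and `supply_coefficient(30.0, 40.0)` gives the 0.75 weighting used in the open questions.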
Feature 2: Flow-Rate Scheduling
Max-flow enforcement
When aggregate flow from zones wanting to open would exceed optimal_flow_rate_max, TURN_ON candidates are admitted by priority (remaining quota descending, so zones needing the most time get first access). Excess candidates are demoted to STAY_OFF. Already-running zones (STAY_ON) are never preempted mid-run.
Never-starve rule: If a single zone is the only candidate and no other zones are running (committed_flow = 0), it is admitted regardless of whether its flow exceeds max. This prevents a high-flow zone from being permanently starved. The boiler should still be able to modulate for a single circuit.
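A rough sketch of the max-flow admission described above, for reviewers who want something concrete to reason about. It is not the implementation in scheduler.py: the `Candidate` type and `admit_by_max_flow` are hypothetical names, and two behaviours are assumptions on my part, namely that the loop keeps scanning after a rejection (so a smaller zone further down the list may still fit) and that zones without a nominal flow rate are admitted without counting toward the total.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    """Hypothetical TURN_ON candidate; field names are illustrative."""
    zone_id: str
    remaining_quota: float              # seconds still owed in this period
    nominal_flow_rate: Optional[float]  # L/min, None when not configured


def admit_by_max_flow(candidates: List[Candidate], committed_flow: float,
                      max_flow: float) -> Tuple[List[str], List[str]]:
    """Return (admitted, demoted) zone ids under the aggregate flow cap."""
    admitted: List[str] = []
    demoted: List[str] = []
    total = committed_flow  # flow already taken by STAY_ON zones
    # Never-starve: a lone candidate with nothing else running is always admitted.
    never_starve = committed_flow == 0 and len(candidates) == 1
    # Highest remaining quota first.
    for cand in sorted(candidates, key=lambda c: c.remaining_quota, reverse=True):
        flow = cand.nominal_flow_rate
        if flow is None:
            admitted.append(cand.zone_id)  # assumption: unconstrained pass-through
        elif never_starve or total + flow <= max_flow:
            admitted.append(cand.zone_id)
            total += flow
        else:
            demoted.append(cand.zone_id)   # becomes STAY_OFF
    return admitted, demoted


# Five zones at 2 L/min each with max_flow = 6 L/min, as in the flow control tests.
quotas = {"z1": 1800, "z2": 600, "z3": 3600, "z4": 300, "z5": 1200}
zones = [Candidate(z, q, 2.0) for z, q in quotas.items()]
admitted, demoted = admit_by_max_flow(zones, committed_flow=0.0, max_flow=6.0)

# A single 8 L/min zone: admitted when alone, demoted when 2 L/min is already committed.
big = [Candidate("hall", 900, 8.0)]
alone = admit_by_max_flow(big, committed_flow=0.0, max_flow=6.0)
contended = admit_by_max_flow(big, committed_flow=2.0, max_flow=6.0)
```

In the five-zone example the three zones with the largest remaining quota are admitted and the other two are demoted, which matches the "allows ≤ 3 zones" figure in the test configuration.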
Min-flow enforcement (new in this branch)
After max-flow admission, if total prospective flow (STAY_ON + admitted TURN_ON zones) is below optimal_flow_rate_min, all TURN_ON candidates are demoted to STAY_OFF. The rationale: with insufficient flow, the boiler/heat pump won't fire. Opening valves just circulates cold water through the buffer/pipes, wasting pump energy and providing no useful heating.
STAY_ON zones are not demoted, because closing a running valve mid-cycle would cause unnecessary wear and transient behavior.
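The min-flow check can be sketched in the same spirit. Again this is an illustration under assumptions rather than the branch's code: `enforce_min_flow` is a hypothetical helper that only sees the flow already committed by STAY_ON zones and the candidates that survived max-flow admission, and it treats a prospective flow exactly equal to the minimum as sufficient.

```python
from typing import Dict


def enforce_min_flow(stay_on_flow: float, admitted_flows: Dict[str, float],
                     min_flow: float) -> Dict[str, str]:
    """Map each admitted TURN_ON candidate to its final action.

    stay_on_flow   -- aggregate L/min of zones that are already running
    admitted_flows -- zone id -> nominal L/min for candidates admitted by max-flow
    """
    prospective = stay_on_flow + sum(admitted_flows.values())
    # All candidates share one outcome: either the total clears the minimum or none open.
    action = "TURN_ON" if prospective >= min_flow else "STAY_OFF"
    return {zone_id: action for zone_id in admitted_flows}


# One 2 L/min zone with min_flow = 4 L/min: deferred when alone,
# admitted once another 2 L/min zone is already running.
lonely = enforce_min_flow(0.0, {"z1": 2.0}, min_flow=4.0)
joined = enforce_min_flow(2.0, {"z1": 2.0}, min_flow=4.0)
```

Because the helper never receives the STAY_ON zones themselves, it cannot demote them, which mirrors the rule that running valves are left alone.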
Interaction with back-calculation
When min-flow enforcement defers a zone:
- used_duration = 0
- u_actual = 0 vs u_commanded > 0
Without min-flow enforcement, the zone's valve would open but the boiler wouldn't fire (heat_request suppressed). The zone would accumulate used_duration from valve-open time, making u_actual ≈ u_commanded, and back-calculation would see no mismatch. The integral would ratchet to 100% unchecked.
Design question for review
Should min-flow ever be overridden? Currently, if only 1 zone has demand and the other 4 are satisfied, that zone is deferred indefinitely, until another zone develops demand. In a real installation:
Room Thermal Model
Simulations use a lumped-capacitance model per EN ISO 13790:
Three room archetypes (per EN 12831 / EN 1264):
The simulation harness models valve ramp (180s open, 90s close) and flow detection (valve position > 85% threshold).
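For readers unfamiliar with lumped-capacitance models, here is a generic first-order sketch of the idea: a single thermal mass C loses heat to the outdoors through a conductance UA and gains heat from the floor loop, so dT/dt = (Q − UA·(T − T_out)) / C. The function name and every parameter value below are assumptions chosen for illustration (they give a 20-hour time constant); the archetype parameters and the exact equation used by the simulation harness are not reproduced here.

```python
def step_room_temperature(temp_c: float, outdoor_c: float, heat_w: float,
                          ua_w_per_k: float, c_j_per_k: float,
                          dt_s: float = 60.0) -> float:
    """Advance the room temperature by one forward-Euler step of dt_s seconds."""
    d_temp_per_s = (heat_w - ua_w_per_k * (temp_c - outdoor_c)) / c_j_per_k
    return temp_c + d_temp_per_s * dt_s


# Illustrative values only: UA = 50 W/K and C = 3.6 MJ/K give tau = C / UA = 72000 s (20 h).
UA = 50.0
C = 3.6e6
TAU_S = C / UA

# Heating off, room starts at 20 degC with 0 degC outdoors; simulate one time constant
# using the 60-second step mentioned in the test scenarios.
temp = 20.0
for _ in range(int(TAU_S / 60.0)):
    temp = step_room_temperature(temp, outdoor_c=0.0, heat_w=0.0,
                                 ua_w_per_k=UA, c_j_per_k=C)

# With 1000 W of heat input the steady state is T_out + Q / UA = 20 degC,
# so a room already at 20 degC does not move.
held = step_room_temperature(20.0, outdoor_c=0.0, heat_w=1000.0,
                             ua_w_per_k=UA, c_j_per_k=C)
```

After one time constant without heating, the room has lost roughly 63% of its initial temperature difference to the outdoors, which is the usual sanity check for a first-order model.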
Test Scenarios
All simulations use 60-second time steps and 2-hour observation periods unless noted.
Anti-Windup Tests (test_anti_windup.py)
- test_integral_clamps_at_max
- test_integral_clamped_at_zero_above_setpoint
- test_integral_recovers_from_clamp
- test_under_delivery_correction
- test_over_delivery_tolerance
- test_sustained_under_delivery
- test_low_kp_integral_converges
- test_demand_transition_bounded_overshoot
Flow Control Tests (test_flow_control.py)
All flow control tests use 5 zones × 2 L/min each, min_flow=4 L/min (requires ≥ 2 zones), max_flow=6 L/min (allows ≤ 3 zones), well-insulated room archetype, Kp=30, Ki=0.001.
- test_flow_limited_zones_reach_setpoint
- test_flow_limited_integral_stays_reasonable
- test_flow_limited_fair_allocation
- test_heat_request_requires_min_flow
- test_single_demand_zone_deferred_by_min_flow
Unit Tests (test_scheduler.py)
Six unit tests covering edge cases in apply_flow_constraint:
- Zone without nominal_flow_rate passes through unconstrained (max-flow)
Open Questions for Heating Engineer Review
Min-flow threshold vs boiler protection: Most modern boilers have a built-in minimum flow switch or bypass valve. Should the controller's min-flow enforcement be considered a "soft" optimization (save pump energy) or a "hard" safety requirement (prevent boiler lockout)? This affects whether the never-starve rule should apply to min-flow as well.
Single-zone deferral duration: When only one zone has demand, it is deferred indefinitely. In practice, how long is acceptable for a zone to wait? Would a 2-hour timeout (one observation period) be reasonable before overriding the min-flow constraint?
Correction speed: The back-calculation corrects ~14.4 integral units per 2-hour period for a full mismatch. With integral range [0, 100], this means about 14 hours to fully unwind a saturated integral. Is this response speed appropriate for residential UFH with its long thermal time constants (13–60 hours)?
Supply coefficient weighting: When a supply temperature sensor is configured, used_duration is weighted by actual_supply_temp / target_supply_temp. This means a zone receiving 30°C water when the target is 40°C accumulates at 75% rate. Is this linear approximation reasonable for UFH heat transfer, or should it be non-linear (e.g., accounting for floor surface resistance)?
Valve ramp in flow calculation: Flow is detected when valve position exceeds 85%. The simulation models valve ramp (180s to open, 90s to close), but the flow constraint uses nominal_flow_rate as a binary value: a zone either contributes its full flow or zero. In practice, partially-open valves have reduced flow. Does this simplification matter for the scheduling decisions, or is the binary model adequate?
Observation period alignment: Observation periods are aligned to midnight (00:00, 02:00, 04:00...). This means the first period after startup may be short. Back-calculation uses the full observation_period as dt, regardless of the actual elapsed time. Should this be the actual period duration instead?
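To put numbers on the last question, the sketch below aligns a startup time to the midnight-based 2-hour grid and compares the gain factor Kt × dt for the nominal period against the time actually remaining. The helper names are hypothetical, naive local datetimes are assumed (no time zone or DST handling), and the gains are the defaults quoted earlier in this description.

```python
from datetime import datetime, timedelta


def period_start(now: datetime, period_s: int = 7200) -> datetime:
    """Start of the midnight-aligned observation period that contains `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = (now - midnight).total_seconds()
    return midnight + timedelta(seconds=(elapsed // period_s) * period_s)


def seconds_until_boundary(startup: datetime, period_s: int = 7200) -> float:
    """Length of the truncated first period after a startup at `startup`."""
    boundary = period_start(startup, period_s) + timedelta(seconds=period_s)
    return (boundary - startup).total_seconds()


startup = datetime(2024, 1, 15, 9, 30)     # illustrative startup time
first_start = period_start(startup)         # 08:00 on the same day
first_len = seconds_until_boundary(startup)  # 30 minutes left until 10:00

kt = 0.001 / 50.0                # default Ki / Kp
gain_nominal = kt * 7200.0       # factor applied when dt is the full period
gain_elapsed = kt * first_len    # factor if dt were the actual elapsed time
```

For a startup at 09:30 the first period covers only 30 of its 120 minutes, so using the full period as dt applies four times the gain that the elapsed time alone would justify.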