feat(#38): multi-manufacturer CPAP import via open-cpap-parser #44

Draft
camden-bock wants to merge 30 commits into joshuamyers-dev:main from camden-bock:feat/open-cpap-parser

Conversation

@camden-bock
Contributor

@camden-bock camden-bock commented May 21, 2026

Closes #38, closes #19.

Prerequisites: This branch is based on feat/licensing (PR #42, GPLv3 adoption) and requires PR #43 (PyJWT dep fix) to also merge before this can land on main.


What this is

Full implementation of multi-manufacturer CPAP import via cpap-parser. Adds manufacturer detection, per-session machine equipment registration, and provenance tracking (manufacturer, data_source, parser_validated) for all import paths.

What's included

  • importer/cpap_parser_import.py — detect + import via cpap-parser (create_parser(), map_directory_to_sleeplab())
  • importer/import_sessions.py — routing hook: cpap-parser path when detect_open_cpap_layout() returns True, else native ResMed path
  • importer/db.py — find_or_create_machine_equipment(), update_session_machine_equipment(), INT16 sentinel clamp for timeseries metrics
  • migrations/013_add_manufacturer_and_source.sql — manufacturer, data_source, parser_validated on sessions; device_serial, parser_validated on user_equipment; machine_equipment_id FK on sessions; 'machine' equipment type
  • api/models.py + api/routers/equipment.py — machine equipment type, device_serial/parser_validated fields, session_id param on /equipment/inferred for per-session machine lookup
  • frontend/ — machine card in SessionDetail, parser warning banner with link to upstream validation issues
  • Dockerfile — multi-stage build: Rust/maturin in builder stage, pre-built wheel copied to slim runtime

Bug fixes (post-initial-review)

  • Events not stored: cpap-parser events use a session_start tagged at block level, not the summary start_datetime. Redesigned event grouping to key by folder_date and normalize onset relative to summary CSL start. All 263 Löwenstein events now stored correctly.
  • SpO2/pulse showing 0 instead of NULL: All-zero SpO2/pulse from machines without an oximeter (e.g. Löwenstein Eyra) were passing the 0 <= v <= 100 range check. Fixed sentinel to 0 < v <= 100; also treats 0 as None in the timeseries extractor.
  • Summary stats unpopulated (AHI, avg_leak, avg_resp_rate, avg_tidal_vol, avg_min_vent): Löwenstein firmware summary doesn't include these fields. Added derivation from timeseries rows (metrics_by_date) and event counts (event_counts_by_date) when the firmware-provided values are absent.
  • Chart time axis showing wrong timezone: MetricsChartSplit.tsx was calling toLocaleTimeString without a timeZone option, ignoring the server-configured DISPLAY_TZ. Fixed to pass getDisplayTz() so chart tick labels reflect the configured display timezone (e.g. Europe/Prague) rather than the browser's local time.
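The SpO2/pulse range check can be illustrated with a minimal sketch. `clean_spo2` is a hypothetical helper name, not the function in db.py; it only shows the half-open range (0, 100] described in the bullet above, under the assumption that a reading of exactly 0 means no oximeter was attached.

```python
from typing import Optional


def clean_spo2(value: Optional[float]) -> Optional[float]:
    """Return a plausible SpO2 percentage, or None for sentinel/invalid values.

    Hypothetical helper: keeps values in the half-open range (0, 100].
    A reading of exactly 0 is treated as "no oximeter attached".
    """
    if value is None:
        return None
    return value if 0 < value <= 100 else None


samples = [0.0, 97.0, 100.0, 32767.0, None]
cleaned = [clean_spo2(v) for v in samples]
print(cleaned)  # [None, 97.0, 100.0, None, None]
```

With the earlier closed range `0 <= v <= 100`, the first sample would have been stored as a real 0% reading.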

Known upstream issues

Löwenstein Eyra timeseries channels (leak, resp_rate, tidal_vol) contain misidentified flow-rate data from the Prisma Line binary parser. mask_pressure is correct. Filed upstream: open-cpap/cpap-parser#27. Workaround: physiological range clamping in db.py nulls out out-of-range values before DB insertion.

ResMed adapter bugs found during validation: open-cpap/cpap-parser#28. See ResMed validation section below.


Integration testing (branch integration/test-prs-50-54)

Integration-tested against PRs #50, #51, #52, #53, and #54 in a combined branch (integration/test-prs-50-54). All five PRs plus #44 were merged and validated against a fresh Postgres + app container. The chart x-axis unification work (unified [xMin, xMax] domain across Night Metrics, Oximetry, and Wearable Sleep Stage charts) was developed during this integration pass and lives on the integration branch — it is not part of this PR but is a direct outcome of the combined testing.

Conflicts resolved:

| File | Conflict | Resolution |
|---|---|---|
| frontend/src/App.tsx | #54 Equipment nav link + old theme toggle vs #53's redesigned mobile header | Kept Equipment link, adopted #53's mobile-responsive nav classes, dropped old nav-level theme toggle (moved to header by #53) |
| importer/import_sessions.py (imports) | #52 adds parse_brp/replace_session_waveform/get_session_db_id; #44 adds find_or_create_machine_equipment/session_exists/update_session_machine_equipment | Merged all imports from both sides |
| importer/import_sessions.py (function) | #52 adds backfill_waveform_for_block; #44 uses str \| None type syntax | Kept backfill_waveform_for_block, used str \| None |
| importer/db.py | #52 adds replace_session_waveform; #44 adds replace_session_metrics_cpap/machine equipment helpers | Kept all functions from both sides |
| api/models.py | #52 adds WaveformResponse/EventWindowResponse; #44 adds "machine" to EquipmentType | Kept both |
| frontend/src/components/EquipmentCatalog.tsx | Trivial: replacementDays vs DEFAULT_REPLACEMENT_DAYS[t] ?? null | Took ?? null form |

Migration renumbering: #44 (013_add_manufacturer_and_source.sql) and #52 (013_add_session_waveform.sql) both claim migration number 013 from the same main base. In the integration branch, #52's migrations were renumbered:

  • 013_add_session_waveform.sql → 014_add_session_waveform.sql
  • 014_add_event_inspector_indexes.sql → 015_add_event_inspector_indexes.sql

⚠️ Merge order dependency: If #44 and #52 land in the same batch, whoever merges second must renumber. The safe ordering is: #44 first (establishes 013), then #52 with its files renumbered to 014/015.
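A collision like this can be caught before merging with a small check. The sketch below is not part of this PR; `find_migration_collisions` is a hypothetical helper that assumes migration files are named with a numeric prefix followed by an underscore, as the files above are.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def find_migration_collisions(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Group migration files by numeric prefix and return prefixes used more than once."""
    by_prefix: Dict[str, List[str]] = defaultdict(list)
    for name in filenames:
        match = re.match(r"(\d+)_", name)
        if match:
            by_prefix[match.group(1)].append(name)
    return {p: sorted(names) for p, names in by_prefix.items() if len(names) > 1}


combined = [
    "013_add_manufacturer_and_source.sql",  # from #44
    "013_add_session_waveform.sql",         # from #52, before renumbering
    "014_add_event_inspector_indexes.sql",  # from #52, before renumbering
]
collisions = find_migration_collisions(combined)
print(collisions)
```

Running it over the directory listing of migrations/ in a combined branch would flag the duplicate 013 prefix before the second PR lands.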

Integration test results:

6 sessions imported across 6 nights (Löwenstein Eyra, cpap-parser path)
data_source='open_cpap_parser', manufacturer='Löwenstein Medical', parser_validated=true
AHI: 3.3–15.4, avg_leak: 40–114 L/min
Events: 263 total (64 CentralApnea, 186 Hypopnea, 5 ObstructiveApnea, 8 ClearAirwayApnea)
SpO2 rows: 0 (no oximeter, correctly NULL)
Machine equipment: 1 row, device_serial=30167534, parser_validated=true
Migrations 013/014/015 all applied cleanly
All API endpoints: sessions, equipment, stats/overview, config — 200 OK
51 backend tests pass, TypeScript clean

ResMed + cpap-parser validation

Routing ResMed through cpap-parser is explicitly out of scope for this PR. detect_open_cpap_layout() correctly falls through to the native EDF path for ResMed SD cards, and MANUFACTURER_VALIDATED["ResMed"] = False is set to reflect unvalidated status. This section documents why that flag is correct and what must be fixed upstream before it can become True.

The issues below are scoped to the ResMed adapter's dependency on cpap-py. They do not affect the Löwenstein adapter (native Rust) or other manufacturers currently implemented in cpap-parser.

Validation methodology

Side-by-side import of the same ResMed AirSense 10 DATALOG (72 nights, 2026-02-03 – 2026-05-15) into two user accounts using native path vs. forced cpap-parser path. Compared against OSCAR v1.7.x summary export as ground truth.

Full validation report: sleepData/resmed-cpap-parser-validation-2026-05-25.md

Findings

Bug 1 — Metrics timestamps anchored to Unix epoch (BLOCKER)

timestamps_low from cpap-py contains relative offsets (elapsed seconds since session start), not absolute timestamps. The current integration stores them directly, resulting in all metrics rows timestamped at 1970-01-01T00:00:00Z. Charts are completely unusable.

Second-order impact on PR #52 (Event Inspector): the window query WHERE sm.ts >= event_datetime - N seconds returns zero rows because the 56-year gap between metric timestamps (epoch) and event timestamps (wall-clock) means no metrics ever fall in the event window. This must be fixed before the Event Inspector can work for any cpap-parser-sourced ResMed session.

The fix is in cpap_parser_import.py _extract_metrics_from_timeseries: add session.start_time.timestamp() to each ts offset. Filed upstream: open-cpap/cpap-parser#28.
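The conversion itself is small. The following is a sketch rather than the code in _extract_metrics_from_timeseries: `offsets_to_absolute` is a hypothetical name, and it assumes the session start is available as a timezone-aware datetime.

```python
from datetime import datetime, timezone
from typing import List


def offsets_to_absolute(session_start: datetime, offsets_sec: List[float]) -> List[datetime]:
    """Convert elapsed-seconds offsets into absolute UTC datetimes."""
    if session_start.tzinfo is None:
        raise ValueError("session_start must be timezone-aware")
    base = session_start.timestamp()
    return [datetime.fromtimestamp(base + off, tz=timezone.utc) for off in offsets_sec]


start = datetime(2026, 5, 15, 22, 30, tzinfo=timezone.utc)
stamps = offsets_to_absolute(start, [0.0, 2.0, 4.0])
print(stamps[0].isoformat())  # 2026-05-15T22:30:00+00:00

# Without the base offset, an offset of 0 seconds lands on the Unix epoch:
broken = datetime.fromtimestamp(0.0, tz=timezone.utc)
print(broken.isoformat())  # 1970-01-01T00:00:00+00:00
```

The second print shows the failure mode described in Bug 1: treating an offset as an absolute timestamp places the row in 1970, far outside any event window.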

Bug 2 — Duration accuracy (BLOCKER)

cpap-parser reads duration from STR.edf daily summaries; native reads from PLD EDF num_records × duration_per_record. Median offset vs OSCAR: +60s; worst case: +30,060s (8.4 hours) on a night with a fragmented session.

Bug 3 — Device serial returns "Unknown"

directory.machine.serial_number returns the string "Unknown" for all ResMed imports. Native importer reads the correct serial from EDF signal headers.

Bug 4 — Ghost sessions from STR.edf history

cpap-parser returns 439 daily_summaries (full STR.edf device history, 2025-02-04 onward) while only 119 DATALOG EDF session blocks exist. The 346 summary-only dates would create empty sessions in the DB — correct AHI/duration from STR.edf but no metrics, events, or waveform. This is a ResMed-specific architectural issue; Löwenstein's native Rust adapter exposes exactly as many summaries as actual session blocks (no ghost sessions).

What's correct

AHI accuracy is roughly equivalent between native and cpap-parser (88% vs 87% of nights within ±0.1 of OSCAR). The sensor data itself is parsed correctly — the issues are timestamp handling, metadata extraction, and the STR.edf/DATALOG boundary mismatch.

Path forward

After upstream bugs 1–3 are fixed in cpap-parser:

  1. Re-run the validation script against a fresh import
  2. If AHI, pressure, and event counts all pass, set MANUFACTURER_VALIDATED["ResMed"] = True
  3. Filter ghost sessions by intersecting daily_summaries with dates that have session blocks before upserting (SleepLab-side guard)
  4. Flip the routing hook in detect_open_cpap_layout() to return True for ResMed — and remove the native EDF path when confidence is high enough
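Step 3 can be sketched as a set intersection. This is an illustration under stated assumptions, not the planned implementation: `filter_ghost_summaries` is a hypothetical name, each summary is assumed to be a dict with a "date" key, and matching on calendar date is a simplification (blocks that cross midnight would need the importer's own night-assignment logic).

```python
from datetime import date, datetime
from typing import Dict, Iterable, List


def filter_ghost_summaries(
    daily_summaries: Iterable[Dict],
    block_starts: Iterable[datetime],
) -> List[Dict]:
    """Keep only summaries whose date has at least one recorded session block."""
    dates_with_blocks = {start.date() for start in block_starts}
    return [s for s in daily_summaries if s["date"] in dates_with_blocks]


summaries = [
    {"date": date(2026, 2, 3)},
    {"date": date(2026, 2, 4)},  # summary only, no DATALOG block
    {"date": date(2026, 2, 5)},
]
blocks = [datetime(2026, 2, 3, 22, 10), datetime(2026, 2, 5, 23, 1)]
kept = filter_ghost_summaries(summaries, blocks)
print([s["date"].isoformat() for s in kept])  # ['2026-02-03', '2026-02-05']
```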

Notes for the reviewer (Joshua)

A few design decisions worth calling out explicitly:

1. cpap-parser pinned to main tarball, not a tagged release

requirements.txt references cpap-parser @ https://gitlab.com/open-cpap/cpap-parser/-/archive/main/cpap-parser-main.tar.gz. There are no PyPI releases or version tags. This means the pin is reproducible only by hash, not by version. uv.lock captures the exact hash at lock time, so builds are deterministic until someone runs uv lock --upgrade. Worth tracking upstream whether tagged releases are planned.

2. parser_validated is a SleepLab-layer flag, not upstream metadata

MANUFACTURER_VALIDATED in cpap_parser_import.py is a dict SleepLab maintains. Currently {"Lowenstein": True, "ResMed": False}. When cpap-parser fixes the ResMed bugs, SleepLab must update this dict and re-run validation — cpap-parser itself does not expose a "this manufacturer is validated" signal. The flag drives both the warning banner in SessionDetail and the parser_validated column on sessions and user_equipment rows.
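As a rough illustration of how such a registry drives the flag, the sketch below uses the dict contents quoted above. The helper names are hypothetical, and defaulting unknown manufacturers to unvalidated is an assumption, not confirmed behaviour of cpap_parser_import.py.

```python
from typing import Dict

# Registry contents as described above; the default for unknown keys is an assumption.
MANUFACTURER_VALIDATED: Dict[str, bool] = {"Lowenstein": True, "ResMed": False}


def is_parser_validated(manufacturer_key: str) -> bool:
    """Return the validation flag for a manufacturer key, defaulting to False."""
    return MANUFACTURER_VALIDATED.get(manufacturer_key, False)


def show_warning_banner(manufacturer_key: str) -> bool:
    """The banner is shown exactly when the parser output is not validated."""
    return not is_parser_validated(manufacturer_key)


print(is_parser_validated("Lowenstein"), show_warning_banner("ResMed"))  # True True
```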

3. Machine equipment deduplication uses a partial unique index

find_or_create_machine_equipment() does an upsert against (user_id, equipment_type, device_serial) with a partial unique index (WHERE equipment_type = 'machine'). If a device has no serial (e.g. ResMed returning "Unknown"), the serial is stored as NULL. Postgres unique indexes treat NULLs as distinct by default, so the index never reports a conflict for those rows, and two imports from the same machine with no serial will create two machine equipment rows. This is intentional: we can't deduplicate what we can't identify. If upstream fixes Bug 3, this self-corrects.
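The NULL behaviour can be reproduced in isolation. The sketch below uses an in-memory SQLite database as a stand-in for Postgres (both treat NULLs as distinct in unique indexes by default); the table is a cut-down assumption with only the columns named above, and the index name is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE user_equipment (
           id INTEGER PRIMARY KEY,
           user_id TEXT NOT NULL,
           equipment_type TEXT NOT NULL,
           device_serial TEXT
       )"""
)
# Partial unique index: only rows with equipment_type = 'machine' are constrained.
conn.execute(
    """CREATE UNIQUE INDEX uq_machine_serial
       ON user_equipment (user_id, equipment_type, device_serial)
       WHERE equipment_type = 'machine'"""
)


def insert_machine(serial):
    """Insert a machine row; return True if stored, False if rejected as a duplicate."""
    try:
        conn.execute(
            "INSERT INTO user_equipment (user_id, equipment_type, device_serial) "
            "VALUES (?, 'machine', ?)",
            ("user-1", serial),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    insert_machine("30167534"),  # first import: stored
    insert_machine("30167534"),  # same serial: rejected by the index
    insert_machine(None),        # no serial: stored
    insert_machine(None),        # no serial again: also stored (NULLs are distinct)
]
row_count = conn.execute("SELECT COUNT(*) FROM user_equipment").fetchone()[0]
print(results, row_count)  # [True, False, True, True] 3
```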

4. MACHINE_TZ env var is required for correct EDF header timestamps

cpap-parser reads EDF timestamp fields as local time but stores them naively (no tzinfo). MACHINE_TZ (defaulting to UTC) is passed as context to create_parser() to shift them to UTC before storage. If this var is wrong, all session start/end times will be off by the timezone offset. This is the same issue the native ResMed importer has always had — it's not new here, but it's worth knowing that a wrong MACHINE_TZ will silently produce incorrect session timestamps for both paths.

5. INT16 sentinel clamping in db.py

cpap-parser occasionally emits out-of-range timeseries values from the Löwenstein Prisma Line binary parser (see upstream issue #27). replace_session_metrics_cpap() clamps values against physiological ranges before insertion rather than failing loudly. The ranges are conservative — anything outside them is almost certainly a parser artifact, not real physiology. If upstream fixes the Prisma parser, these values will simply stop appearing and the clamping becomes a no-op.
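A minimal sketch of this kind of guard follows. The bounds in `EXAMPLE_RANGES` are illustrative placeholders, not the ranges used in db.py, and `null_out_of_range` is a hypothetical name; the point is only that out-of-range values become NULL instead of raising.

```python
from typing import Dict, Optional, Tuple

# Illustrative bounds only; the actual ranges in db.py are not shown in this PR description.
EXAMPLE_RANGES: Dict[str, Tuple[float, float]] = {
    "mask_pressure": (0.0, 40.0),  # cmH2O
    "leak": (0.0, 200.0),          # L/min
    "resp_rate": (0.0, 60.0),      # breaths/min
}


def null_out_of_range(row: Dict[str, Optional[float]]) -> Dict[str, Optional[float]]:
    """Return a copy of row where values outside their channel's range become None."""
    cleaned = dict(row)
    for channel, (low, high) in EXAMPLE_RANGES.items():
        value = cleaned.get(channel)
        if value is not None and not (low <= value <= high):
            cleaned[channel] = None
    return cleaned


row = {"ts": 1700000000.0, "mask_pressure": 10.2, "leak": 32767.0, "resp_rate": -32767.0}
print(null_out_of_range(row))
# {'ts': 1700000000.0, 'mask_pressure': 10.2, 'leak': None, 'resp_rate': None}
```

Because in-range values pass through unchanged, the guard has no effect once the parser stops emitting sentinel values.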

6. Dockerfile build size

The multi-stage build compiles cpap-parser's Rust extension (maturin build) in a full Rust+Python builder image and copies only the resulting wheel into the slim runtime image. The Rust toolchain is ~700 MB; the final image stays under 400 MB. If cpap-parser eventually ships a pre-built wheel to PyPI, the builder stage can be dropped entirely.


Test plan

  • Migration 013: all new columns + partial unique index present, app starts cleanly
  • ResMed native import: sessions import, data_source='resmed_native', parser_validated=true, machine_equipment_id set
  • Equipment API: /equipment/ and /equipment/inferred?session_id= return correct machine per session
  • Sessions API: parser_validated + manufacturer in list and detail responses
  • Session detail UI: machine card shows correct device per session (ResMed vs Löwenstein)
  • Parser warning banner: ⚠ with link to upstream validation issues on parser_validated=false sessions
  • cpap-parser path: Löwenstein Eyra import succeeds, data_source='open_cpap_parser', machine equipment registered
  • Events stored correctly: all cpap-parser events normalized and written to session_events
  • SpO2/pulse: all-zero oximeter data (no oximeter attached) correctly stored as NULL
  • Summary stats: AHI, avg_leak, avg_resp_rate, avg_tidal_vol, avg_min_vent derived from timeseries when absent from firmware summary
  • Chart timezone: time-axis labels respect DISPLAY_TZ (e.g. Europe/Prague)
  • Integration tested with PRs #50 (fix: handle timezone for imported metric timestamps), #51 (enhancement: Add equipment page to main navigation), #52 (feat: add event inspector waveform view), #53 (enhancement: general UI and mobile web app modifications/cleanup), and #54 (feat: add advanced trends overview) — all conflicts resolved, migrations renumbered
  • ResMed via cpap-parser: validated and explicitly out of scope — 4 upstream bugs block adoption (see above)
  • NOTICE.md updated: cpap-parser (GPL-3.0-only) and cpap-py (MIT)

🤖 Generated with Claude Code

…bs, migration, deps

Adds the dependency, migration, and documented implementation skeleton
for wiring open-cpap-parser into SleepLab's import infrastructure.
No logic is implemented yet; this commit establishes the interfaces,
cross-references, and plan for collaborative completion.

Changes:
- importer/open_cpap_import.py  — new stub module; documents
  detect_open_cpap_layout(), run_open_cpap_import(), and the
  MANUFACTURER_VALIDATED registry with full docstrings and
  TODO(open-cpap-parser) markers
- importer/import_sessions.py   — module docstring updated with routing
  hook pattern; commented import stubs; TODO block in run_local_import()
- importer/db.py                — upsert_session() annotated with the
  three new columns from migration 013 (manufacturer, data_source,
  parser_validated)
- migrations/013_add_manufacturer_and_source.sql — adds manufacturer
  provenance columns; defaults existing rows to 'resmed_native'/true
- pyproject.toml / requirements.txt / api/requirements.txt — add
  open-cpap-parser git direct reference (GPLv3; adopted in PR joshuamyers-dev#42)
- docs/dev/open-cpap-parser-design.md         — architecture design doc
- docs/dev/open-cpap-parser-implementation-plan.md — step-by-step plan
  cross-referencing open-cpap-parser#14 (upstream schema additions)

Blocked on open-cpap/open-cpap-parser#14 for: start_time, spo2_avg,
spo2_min, arousal_count, has_spo2 — all fields propagate automatically
once the upstream schema additions merge.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@camden-bock
Contributor Author

@joshuamyers-dev — tagging you directly since this PR spans both sides of the integration. On the parser side, open-cpap-parser#14 is the main blocker (start_time, SpO2 summary fields, arousal_count). On the SleepLab side, I'd appreciate your review on the adapter wiring in importer/open_cpap_import.py and the routing hook in import_sessions.py — especially the detection order (open-cpap-parser first, native ResMed EDF fallback) and whether MANUFACTURER_VALIDATED should be driven differently.

- Add open-cpap-parser (GPL-3.0-or-later) to the Python runtime table
  (already noted in the SleepyHead/OSCAR prose section at the top)
- Replace python-jose + ecdsa entries with PyJWT (MIT) to match the
  dependency change made in PR joshuamyers-dev#43

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@camden-bock
Contributor Author

Implementation recommendations — cpap-parser adapter wiring

Following a full review of both codebases, here are the concrete gaps and recommended steps to wire open_cpap_import.py into the existing importer infrastructure.


1. Migration needed before anything else

db.py:upsert_session() already references therapy_mode, mask_type, humidity_level, temperature_c in its INSERT, but schema.sql does not have these columns yet. Run a migration before testing the importer end-to-end:

-- 013_add_cpap_parser_columns.sql
ALTER TABLE sessions
  ADD COLUMN IF NOT EXISTS therapy_mode    TEXT,
  ADD COLUMN IF NOT EXISTS mask_type       TEXT,
  ADD COLUMN IF NOT EXISTS humidity_level  SMALLINT,
  ADD COLUMN IF NOT EXISTS temperature_c   NUMERIC(4,1),
  ADD COLUMN IF NOT EXISTS manufacturer    TEXT,
  ADD COLUMN IF NOT EXISTS parser_validated BOOLEAN DEFAULT TRUE;

-- arousal_count must be nullable for manufacturers that don't report it
ALTER TABLE sessions ALTER COLUMN arousal_count DROP NOT NULL;
ALTER TABLE sessions ALTER COLUMN arousal_count SET DEFAULT NULL;

2. SpO2 column name mismatch

schema.sql has avg_spo2 / min_spo2. The cpap-parser adapter outputs spo2_avg / spo2_min. Neither name is in the current upsert_session() INSERT. The importer bridge needs to rename these keys before calling upsert_session:

session_dict["avg_spo2"] = session_dict.pop("spo2_avg", None)
session_dict["min_spo2"] = session_dict.pop("spo2_min", None)

And upsert_session() needs avg_spo2 / min_spo2 added to its INSERT and ON CONFLICT UPDATE clauses (they are currently missing from db.py entirely).


3. MACHINE_TZ must be applied before upsert

map_directory_to_sleeplab() returns naive datetimes for start_datetime and pld_start_datetime (machine-local time, no tz). Postgres TIMESTAMPTZ columns need a timezone. Apply the same _localize() used by the ResMed importer:

from import_sessions import _localize  # or inline the same logic

session_dict["start_datetime"]     = _localize(session_dict["start_datetime"])
session_dict["pld_start_datetime"] = _localize(session_dict["pld_start_datetime"])
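For readers without the importer source at hand, a localization helper of this kind might look like the sketch below. It is a hypothetical stand-in, not the actual _localize from import_sessions.py; it assumes MACHINE_TZ holds an IANA zone name and that the result should be converted to UTC.

```python
import os
from datetime import datetime, timedelta, timezone, tzinfo
from zoneinfo import ZoneInfo


def machine_tz() -> tzinfo:
    """Read MACHINE_TZ from the environment, defaulting to UTC."""
    name = os.environ.get("MACHINE_TZ", "UTC")
    return timezone.utc if name == "UTC" else ZoneInfo(name)


def localize(naive: datetime, tz: tzinfo) -> datetime:
    """Interpret a naive machine-local datetime in tz and convert it to UTC."""
    if naive.tzinfo is not None:
        return naive.astimezone(timezone.utc)
    return naive.replace(tzinfo=tz).astimezone(timezone.utc)


# A fixed +02:00 offset is used here so the example does not depend on tz database files.
cest = timezone(timedelta(hours=2))
print(localize(datetime(2026, 5, 17, 23, 4), cest).isoformat())
# 2026-05-17T21:04:00+00:00
```

In the importer, the call would be `localize(value, machine_tz())`, so a wrong MACHINE_TZ shifts every stored timestamp by the zone's offset.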

4. Event format needs an adapter shim

replace_session_events(conn, session_db_id, events, csl_start) expects events as (onset, duration, event_type) 3-tuples and a separate csl_start for computing absolute timestamps.

map_sessions_to_events() returns (event_type, onset_sec, duration_sec, session_start) 4-tuples (field order is different; session_start is embedded). Simplest fix — add a shim in open_cpap_import.py:

def _adapt_events(cpap_events):
    # cpap-parser: (event_type, onset_sec, duration_sec, session_start)
    # db.py: (onset, duration, event_type) + separate csl_start
    by_start = {}
    for event_type, onset, duration, session_start in cpap_events:
        by_start.setdefault(session_start, []).append((onset, duration, event_type))
    return by_start  # keyed by session_start datetime

Then call replace_session_events(conn, db_id, events, csl_start) per group.


5. Metrics format needs a new DB helper

replace_session_metrics() is EDF-specific — it takes an EDF header and a channels dict. map_timeseries_to_metrics() returns a flat list of dicts with Unix epoch float timestamps:

{"ts": 1700000000.0, "mask_pressure": 10.2, "leak": 4.1, ...}

Recommended: add a new replace_session_metrics_cpap(conn, session_db_id, rows) to db.py:

import psycopg2.extras
from datetime import datetime, timezone

def replace_session_metrics_cpap(conn, session_db_id: int, rows: list[dict]):
    """Replace all metric rows for a session with cpap-parser timeseries rows."""
    with conn.cursor() as cur:
        # Clear existing rows first so a re-import does not duplicate metrics.
        cur.execute("DELETE FROM session_metrics WHERE session_id = %s", (session_db_id,))
    if not rows:
        return
    data = [
        (
            session_db_id,
            # r["ts"] is a Unix epoch float; store it as a tz-aware UTC datetime.
            datetime.fromtimestamp(r["ts"], tz=timezone.utc),
            r.get("mask_pressure"), r.get("leak"), r.get("resp_rate"),
            r.get("tidal_vol"), r.get("min_vent"), r.get("snore"), r.get("flow_lim"),
        )
        for r in rows
    ]
    sql = """INSERT INTO session_metrics
        (session_id, ts, mask_pressure, leak, resp_rate, tidal_vol, min_vent, snore, flow_lim)
        VALUES %s"""
    with conn.cursor() as cur:
        psycopg2.extras.execute_values(cur, sql, data, page_size=5000)

6. Validation status — suggested storage

map_directory_to_sleeplab() puts validation info in session["meta"]:

session["meta"] == {"validation_status": "validated"|"needs_validation"|"unimplemented",
                     "validation_notes": "..."}

Recommend mapping "validated" → parser_validated = TRUE, everything else → FALSE, and logging validation_notes at import time so operators know which devices have unverified output. The manufacturer column can be filled from directory.machine.series.


7. Minimal working run_open_cpap_import sketch

Putting it together:

def run_open_cpap_import(user_id: str, device_path: str, include_timeseries: bool = False) -> dict:
    from cpap_parser import UniversalCPAPParser, map_directory_to_sleeplab
    from import_sessions import _localize

    directory = UniversalCPAPParser().parse(device_path, include_timeseries=include_timeseries)
    result    = map_directory_to_sleeplab(directory, user_id)

    conn = get_conn()
    stats = {"imported": 0, "errors": 0}
    try:
        for session_dict in result["sessions"]:
            meta = session_dict.pop("meta", {})
            # rename SpO2 keys to match schema columns
            session_dict["avg_spo2"] = session_dict.pop("spo2_avg", None)
            session_dict["min_spo2"] = session_dict.pop("spo2_min", None)
            # localize naive datetimes
            session_dict["start_datetime"]     = _localize(session_dict["start_datetime"])
            session_dict["pld_start_datetime"] = _localize(session_dict["pld_start_datetime"])
            # store validation status
            session_dict["parser_validated"] = (meta.get("validation_status") == "validated")
            session_dict["manufacturer"]     = directory.machine.series or None

            try:
                db_id = upsert_session(conn, session_dict)
                # events
                events_for_date = [
                    (onset, dur, etype)
                    for etype, onset, dur, _ in result["events"]
                ]
                replace_session_events(conn, db_id, events_for_date, session_dict["start_datetime"])
                # metrics (if timeseries requested)
                if include_timeseries:
                    replace_session_metrics_cpap(conn, db_id, result["metrics"])
                conn.commit()
                stats["imported"] += 1
            except Exception as e:
                conn.rollback()
                print(f"  ERROR {session_dict.get('session_id')}: {e}")
                stats["errors"] += 1
    finally:
        conn.close()
    return stats

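Note: result["events"] is a flat list in which each tuple carries its own session_start, so the sketch above writes every event to every session. Events must be grouped by date/session block before the replace_session_events call, and that grouping needs care if a night has multiple blocks.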
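One way to handle the multi-block case is to assign each block to the latest nightly summary that starts at or before it, then re-express event onsets relative to that night's start. The sketch below is an illustration, not the code in this PR: the function names and sample events are made up, and it assumes each summary start marks the beginning of its night.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# (event_type, onset_sec, duration_sec, block_start), the 4-tuple order described in section 4
Event = Tuple[str, float, float, datetime]


def night_for_block(summary_starts: List[datetime], block_start: datetime) -> Optional[datetime]:
    """Return the latest summary start <= block_start, or None if the block precedes all nights.

    summary_starts must be sorted ascending.
    """
    idx = bisect_right(summary_starts, block_start)
    return summary_starts[idx - 1] if idx else None


def regroup_events(
    summary_starts: List[datetime], events: List[Event]
) -> Dict[datetime, List[Tuple[float, float, str]]]:
    """Group events per night, with onsets re-expressed relative to the night's start."""
    starts = sorted(summary_starts)
    grouped: Dict[datetime, List[Tuple[float, float, str]]] = defaultdict(list)
    for event_type, onset, duration, block_start in events:
        night = night_for_block(starts, block_start)
        if night is None:
            continue  # block precedes every known night; skipped in this sketch
        shift = (block_start - night).total_seconds()
        grouped[night].append((onset + shift, duration, event_type))
    return dict(grouped)


nights = [datetime(2026, 5, 16, 23, 4), datetime(2026, 5, 17, 22, 50)]
events = [
    ("Hypopnea", 120.0, 14.0, datetime(2026, 5, 16, 23, 4)),     # first block of night 1
    ("CentralApnea", 30.0, 11.0, datetime(2026, 5, 17, 1, 53)),  # later block, same night
    ("Hypopnea", 600.0, 12.0, datetime(2026, 5, 17, 22, 50)),    # night 2
]
grouped = regroup_events(nights, events)
print(grouped[nights[0]])
# [(120.0, 14.0, 'Hypopnea'), (10170.0, 11.0, 'CentralApnea')]
```

Each group is already in the (onset, duration, event_type) order that replace_session_events expects, with the night's start serving as csl_start.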


8. Install

Package is now named cpap-parser (renamed from open-cpap-parser). Source install while CI wheels are pending:

pip install git+https://gitlab.com/open-cpap/cpap-parser.git

Once GitLab CI minutes reset, wheels will be published to the registry and can be referenced by version tag.

camden-bock and others added 23 commits May 23, 2026 16:26
…add cpap-parser metric/spo2 helpers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace stubs in open_cpap_import.py with working implementations of
detect_open_cpap_layout() and run_open_cpap_import(), updating imports
from open_cpap_parser to cpap_parser, renaming spo2_avg/spo2_min to
avg_spo2/min_spo2, localizing naive datetimes via MACHINE_TZ, and
populating the MANUFACTURER_VALIDATED registry. Covered by 6 new tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… event key-lookup bug; add tz and event tests

- git mv open_cpap_import.py → cpap_parser_import.py (and test file)
- Fix stale docstring: parser_validated is True for Lowenstein, False for ResMed
- Fix event key-lookup: localize session_start key in events_by_start so lookup
  matches the localized start_datetime (naive != tz-aware dict key mismatch)
- Add test_run_import_localizes_naive_datetime
- Add test_run_import_events_regrouped

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…to native ResMed path

- Update import reference to cpap_parser_import
- Add manufacturer/data_source/parser_validated/avg_spo2/min_spo2 to
  session_data in import_folder() so upsert_session() gets all required keys

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…I; fix CLI routing; support non-ResMed uploads

- import_sessions.py main(): detect cpap-parser layout before ResMed folder loop,
  so browser uploads (subprocess path) route to run_open_cpap_import correctly
- api/models.py: add parser_validated (bool, default True) and manufacturer to SessionSummary
- api/routers/sessions.py: BOOL_AND(parser_validated) + manufacturer in list/detail SQL
- api/client.ts: add parser_validated and manufacturer to SessionSummary TS interface
- pages/SessionDetail.tsx: warning banner when parser_validated is false
- pages/Import.tsx: collect all files (not just .edf) for non-ResMed SD cards;
  update description to be device-agnostic; note about metric accuracy on upload complete

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sion FK

Adds 'machine' equipment type to user_equipment (device_serial, parser_validated
columns; start_date made nullable), a partial unique index on (user_id, device_serial)
for machine records, and machine_equipment_id UUID FK on sessions. Also syncs
schema.sql with migrations 009/013 (therapy_mode, mask_type, humidity_level,
temperature_c, equipment FKs).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…hine_equipment

Auto-creates a 'machine' user_equipment record keyed by device_serial on first import,
updates parser_validated on re-import, and links the session row via machine_equipment_id FK.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both the native ResMed EDF path (import_sessions.py) and the cpap-parser path
(cpap_parser_import.py) now call find_or_create_machine_equipment after each
upsert_session and set machine_equipment_id on the session row.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…alidated

Extends EquipmentType, EquipmentResponse, and InferredEquipment to include 'machine'.
Equipment router now selects device_serial/parser_validated in all queries and handles
machine records in the /inferred endpoint (ordered by updated_at rather than start_date).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…catalog

Adds 'machine' to Equipment/InferredEquipment types, renders a read-only CPAP Machine
section in EquipmentCatalog (showing manufacturer, serial, and parser_validated badge),
and adds machine to the equipment grid in SessionDetail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tion

Covers find_or_create_machine_equipment (create/find/noop paths),
update_session_machine_equipment, and the three machine-linking
behaviours in run_open_cpap_import (called with correct args,
skipped when find returns None, skipped for existing sessions).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…a cpap-parser

- Detect Lowenstein/non-ResMed SD card layouts with create_parser() (fixes
  UniversalCPAPParser() having no registered adapters)
- Import timeseries metrics with include_timeseries=True; filter per-session
  by timestamp range (was writing full dataset to every session)
- Clamp INT16 sentinel values (±32767) for SpO2 and timeseries channels
- Fix datetime.UTC → timezone.utc in db.py metrics insertion
- Wire find_or_create_machine_equipment + update_session_machine_equipment
  into both native ResMed and cpap-parser import paths
- Add session_id param to /equipment/inferred for per-session machine lookup
- Fix duplicate "CPAP Machine" label in SessionDetail when brand/model absent
- Parser warning banner links to upstream validation issues
- Multi-stage Dockerfile: Rust/maturin in builder stage, wheel copied to runtime
- Update test mocks to patch cpap_parser.core instead of cpap_parser.adapters.base

Known upstream: Löwenstein Eyra timeseries channel misidentification filed at
gitlab.com/open-cpap/cpap-parser/-/work_items/27

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…k nights

Time-range filtering against daily summary start+duration missed overnight
sessions that start before midnight (e.g. 23:04) when the summary's start_datetime
was the tail end at 05:28. Replace with date-based grouping using
map_timeseries_to_metrics/map_timeseries_to_spo2 per CPAPSession block,
keyed on start_time.date() to match cpap-parser's own sessions_by_date logic.

Update tests to mock cpap_parser.adapters.sleeplab_output and unpack
the new 5-value return from _mock_cpap_parser.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…y-start boundaries

The previous date bucketing used s.start_time.date() (UTC), which misassigns
tail-end blocks that cross midnight UTC (e.g. a session starting 01:53 UTC
that belongs to the prior night). Replace with a binary-search approach that
mirrors cpap-parser's own grouping: each block maps to the daily summary
whose start_datetime is the latest one ≤ the block's start. This recovers
~96k dropped metric rows for the May 17 night in the Lowenstein sample data.

Also update test mocks to use real timezone-aware datetimes for start_time
so the .replace(tzinfo=None) comparison in _folder_for_block works correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…wenstein

- Events: regroup by folder_date instead of session_start so sub-blocks
  (block[2], block[3]...) contribute events to the right session; normalize
  onset relative to the summary csl_start
- SpO2: treat 0.0 as null (no pulse-ox attached); fix clamp from [0,100] to
  (0,100] so device-zeroed values are not stored as real data
- Summary fields: derive avg_leak, avg_resp_rate, avg_tidal_vol, avg_min_vent
  from timeseries when the device summary omits them (Lowenstein firmware)
- AHI: compute from parsed event counts when total_ahi_events=0
- db.py: skip all-null spo2 rows in replace_session_spo2_cpap
- Tests: add 4 new targeted tests covering each fix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Chart tick labels and tooltips now respect the configured display_tz
(from /config) so sessions recorded in non-local timezones (e.g.
Europe/Prague) show the correct local time instead of browser-local time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
camden-bock and others added 2 commits May 25, 2026 12:25
cpap-py is a direct runtime dependency (pulled in by cpap-parser as its
ResMed adapter backend). MIT license.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SleepLab incorporates OSCAR-derived features via cpap-parser, so the
prior disclaimer ("does not implement any direct derivative") was
inaccurate. Replaced with neutral acknowledgment of the lineage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@camden-bock
Contributor Author

@kilo-WATT — requesting your review on the chart/plotting design decisions in this PR and the integration branch. Summary below.


Chart changes in this PR (feat/open-cpap-parser)

MetricsChartSplit.tsx — timezone fix

The fmtTs formatter was calling toLocaleTimeString without a timeZone option, which silently ignored the server-configured DISPLAY_TZ and rendered tick labels in the browser's local time. Fixed to pass getDisplayTz():

// before
new Date(ts).toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })

// after
new Date(ts).toLocaleTimeString([], { hour: '2-digit', minute: '2-digit', timeZone: getDisplayTz() })

This is the only chart change in this PR itself. The rest arose during integration testing.


Chart changes developed during integration testing (integration/test-prs-50-54)

These are not part of this PR's diff but were written to validate how the cpap-parser data (which adds wearable sleep stage data and multi-source oximetry) interacts with the existing charts alongside PRs #50–54. They will need to land in their own PR or be pulled into the integration merge.

frontend/src/lib/chartBounds.ts (new file)

export function computeChartBounds(
  metrics: MetricsResponse | null,
  spo2: SpO2Response | null,
  wearable: WearableData | null,
): [number, number] | null

Unions all timestamp arrays from CPAP metrics, oximetry (CPAP + wearable), and wearable sleep stages/HR/SpO2 into a single [xMin, xMax] millisecond domain. Returns null when no data is loaded yet.

Decision: All three data sources are included in the union. The wearable data (stages, HR, SpO2) often starts slightly before or after the CPAP session — including it means the night view encompasses everything the user wore/used, which matches the mental model of "show me my full night."

MetricsChartSplit.tsx: xDomain prop

Accepts xDomain?: [number, number]. When provided, tick generation uses domainMin/domainMax instead of the data's own min/max:

const domainMin = xDomain ? xDomain[0] : dataMin
const domainMax = xDomain ? xDomain[1] : dataMax
const firstTick = Math.ceil(domainMin / TICK_INTERVAL_MS) * TICK_INTERVAL_MS
for (let t = firstTick; t <= domainMax; t += TICK_INTERVAL_MS) xTicks.push(t)

Decision: Fixed 30-minute tick interval. If the unified domain extends beyond the CPAP metrics data (e.g. wearable stages start 20 minutes before therapy), the axis correctly shows the pre-therapy region as empty — no data there, blank chart. This felt more honest than clipping the axis to where data exists.

SpO2Chart.tsx — numeric timestamps + xDomain prop

Two changes:

  1. The byTs merge object now keys on numeric ms (new Date(ts).getTime()) instead of raw ISO strings. This changes deduplication: two events at the same millisecond from different sources now merge into one row; previously they merged only if their ISO strings were byte-for-byte identical.
  2. Both sub-charts (SpO₂ and Pulse/HR) accept xDomain and set type="number" scale="time" on XAxis, which is required for Recharts to correctly interpolate the numeric ms domain.

WearableSleepStageChart.tsx — same pattern

ts field changed from new Date(timestamp).toISOString() to new Date(timestamp).getTime() (numeric ms). XAxis: type="number" scale="time". Added xDomain prop.

SessionDetail.tsx — domain computation + prop threading

const xDomain = useMemo(
  () => computeChartBounds(metrics, spo2, wearableData),
  [metrics, spo2, wearableData],
)

Passed as xDomain={xDomain ?? undefined} to MetricsChartSplit, SpO2Chart, and WearableSleepStageChart.


What was intentionally excluded

EventTimeline (PR #52 — the scrollable event inspector bar) was not unified with the shared x-axis. It uses CSS left: N% positioning relative to session start/end from the CPAP session's own timestamps, not Recharts. Aligning it with the unified domain would require either converting it to a Recharts chart or recomputing the percentages from xDomain rather than the session object's start/end. Left out of scope for now — but worth discussing whether you want it aligned.


Open design questions for your review

  1. EventTimeline alignment — should the event bar's x-axis align with the unified night domain (so wearable pre/post-therapy time shows as empty space on the event bar too), or is it fine scoped to the CPAP session window only?

  2. byTs keying in SpO2Chart — switching from ISO string keys to numeric ms keys changes how sub-millisecond duplicates merge. In practice CPAP samples are at 1-second resolution and wearable at ~4-second, so collisions are unlikely, but worth a sanity check.

  3. Fallback behavior during load: xDomain is null until all three data sources resolve. During that window each chart independently uses its own ['dataMin', 'dataMax']. Alternative: show a skeleton/loading state and hold rendering until xDomain is non-null. Current behavior avoids a visible loading flash but means the axes briefly rescale when the last source loads.

  4. 30-minute tick interval — works well for a full 6–9 hour night. For short naps or very long nights it may produce too few or too many ticks. Would an adaptive interval (based on domainMax - domainMin) be preferable?

Thanks for the look!

@joshuamyers-dev
Owner

joshuamyers-dev commented May 26, 2026

@camden-bock I think we can work around parts of open-cpap-parser#14 on the SleepLab side, with a couple of limits.

Reasonable temporary workarounds:

  1. start_time
    If the parser exposes block/session-level CPAPSession.start_time, SleepLab can use that to recover a better start_datetime and stable session grouping. The current PR is already doing some of this by matching block starts back to folder_date. I’m comfortable with that as a bridge, but I’d still prefer upstream to expose summary start_time directly so session identity is not inferred.

  2. SpO2 summary fields
    We can derive avg_spo2, min_spo2, and has_spo2 from parsed SpO2 timeseries:

    • ignore None
    • ignore 0
    • ignore out-of-range/sentinel values
    • set has_spo2 = true only if there are valid samples

That seems safe enough because it is derived from actual samples, not guessed.

  3. arousal_count
    We can derive this from parsed events if the parser emits a recognisable arousal/RERA/respiratory-effort event type. If those event names are not consistent across manufacturers, I’d rather leave arousal_count = null than invent a mapping too early.

What wouldn't be ideal:

  • Synthesising start_time from folder_date midnight except as a visible degraded fallback.
  • Marking SpO2 as present just because a channel exists; all-zero oximetry should remain “not present”.
  • Treating derived fields as equivalent to upstream-provided summary fields unless we record/understand the derivation.

But open-cpap-parser#14 should still remain the real fix, especially for stable session identity. What do you think?

@joshuamyers-dev
Owner

@camden-bock Adapter is looking good - few things:

1. ResMed can still hit the wrong parser path

References: import_sessions.py:274-278, 334-339

Right now the router tries detect_open_cpap_layout() first, and if open-cpap-parser recognises a ResMed SD card, it takes over, bypassing the native EDF importer completely. So MANUFACTURER_VALIDATED["ResMed"] = False only tags the data as unvalidated after the fact. It doesn't actually stop the bad path from running.

Routing and validation need to be separate decisions. Keep open-cpap-parser first for non-native manufacturers, but explicitly bail on ResMed in the routing probe until that path is validated. The parser_validated=false flag is fine for UI warnings, it just shouldn't be the safety net.


2. The probe only catches one error type

Reference: cpap_parser_import.py:146-150

detect_open_cpap_layout() runs a full parse() call and only falls back on UnsupportedDirectoryError. If the parser recognises the directory but throws something else, the native ResMed fallback won't run. That's a fragile probe, especially while this integration is new.

Either make the probe lighter (manufacturer-only if the parser supports it), or catch broader parser failures and fall back to native where needed. I wouldn't silently swallow errors for non-ResMed, but for the ResMed-native path specifically, a probe regression shouldn't break existing imports.
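One way to express the second option is sketched below. Everything here is an assumption for illustration: UnsupportedDirectoryError is a local stand-in for the parser's exception, and detect and native_can_handle are hypothetical callables injected by the caller rather than real cpap-parser or SleepLab functions.

```python
import logging

log = logging.getLogger("importer.routing")


class UnsupportedDirectoryError(Exception):
    """Stand-in for the parser's 'layout not recognised' error."""


def probe_layout(path, detect, native_can_handle):
    """Return True when the parser path should be used for `path`.

    An unrecognised layout returns False. Any other parser failure falls back
    to the native importer only when the native importer can handle the
    directory; otherwise the error propagates instead of being swallowed.
    """
    try:
        return bool(detect(path))
    except UnsupportedDirectoryError:
        return False
    except Exception:
        if native_can_handle(path):
            log.exception("parser probe failed for %s; using native importer", path)
            return False
        raise
```

The fallback is logged rather than silent, so a probe regression stays visible without breaking existing ResMed imports.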


3. MANUFACTURER_VALIDATED will get messy fast

References: cpap_parser_import.py:318-325, 410-413

A single boolean per manufacturer is fine as a starting guardrail, but it's already conflating two things:

  • "Are we allowed to route this through the parser?"
  • "Should we show a validation warning?"

Those will diverge. Validation really depends on manufacturer, parser version, device family, and which data surfaces are covered (summaries, events, SpO2, waveforms). Keep the dict for now, but treat it as an explicit policy table with separate fields like route_enabled, summary_validated, spo2_validated, etc. At minimum, split routing from the warning flag before this gets harder to untangle.
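A minimal sketch of what such a policy table might look like. The class name, the show_warning rule, and the per-manufacturer values are illustrative assumptions, not the project's real validation state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParserPolicy:
    route_enabled: bool = False       # may we route imports through the parser?
    summary_validated: bool = False
    spo2_validated: bool = False
    events_validated: bool = False
    waveform_validated: bool = False

    @property
    def show_warning(self):
        """Warn while any data surface is still unvalidated."""
        return not (
            self.summary_validated
            and self.spo2_validated
            and self.events_validated
            and self.waveform_validated
        )


# Illustrative values only.
POLICY = {
    "ResMed": ParserPolicy(route_enabled=False),
    "Lowenstein": ParserPolicy(route_enabled=True),
}


def policy_for(manufacturer):
    """Unknown manufacturers get the most conservative policy."""
    return POLICY.get(manufacturer, ParserPolicy())
```

Keeping route_enabled separate from the *_validated fields lets routing and the UI warning change independently.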


4. The SpO2 workaround for open-cpap-parser#14 is incomplete

References: cpap_parser_import.py:301-310, 383-384

The code sanitises avg_spo2 and min_spo2, and zeroed samples get dropped. But has_spo2 still comes from the upstream adapter dict, so you can end up with:

  • has_spo2=true when all samples were bad, or
  • has_spo2=false when valid samples actually exist in spo2_by_date

Derive has_spo2 from the sanitised rows instead:

valid_spo2_rows = [
    r for r in spo2_by_date.get(folder_date, [])
    if r.get("spo2") is not None or r.get("pulse") is not None
]
session_dict["has_spo2"] = bool(valid_spo2_rows)

Then derive avg_spo2/min_spo2 from those rows if upstream didn't provide them.
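A possible shape for that derivation, as a sketch only. The function name and the upstream argument are hypothetical, rows are assumed to be dicts with an optional "spo2" key, and unlike the snippet above this version counts only SpO2 values (not pulse) as evidence of oximetry.

```python
from statistics import mean


def derive_spo2_summary(rows, upstream=None):
    """Fill has_spo2, avg_spo2 and min_spo2 from sanitised SpO2 rows.

    None, zero, and values above 100 are treated as invalid. Upstream avg/min
    values, when present, are kept rather than overwritten.
    """
    upstream = upstream or {}
    valid = [
        r["spo2"] for r in rows
        if r.get("spo2") is not None and 0 < r["spo2"] <= 100
    ]
    summary = {"has_spo2": bool(valid), "avg_spo2": None, "min_spo2": None}
    if valid:
        avg = upstream.get("avg_spo2")
        low = upstream.get("min_spo2")
        summary["avg_spo2"] = avg if avg is not None else round(mean(valid), 1)
        summary["min_spo2"] = low if low is not None else min(valid)
    return summary


rows = [{"spo2": None}, {"spo2": 0.0}, {"spo2": 96.0}, {"spo2": 94.0},
        {"spo2": 255.0}, {"spo2": 95.0}]
summary = derive_spo2_summary(rows)
```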


5. arousal_count isn't being derived from events

Reference: cpap_parser_import.py:330-346

The PR already derives AHI counts from parsed events when summary values are missing, which is a good workaround. But arousal_count isn't getting the same treatment. If open-cpap-parser emits a stable arousal/RERA event type, worth deriving it at the same time. If event naming isn't consistent across manufacturers, leave it NULL. Either way, the current logic should make that choice explicitly rather than just not doing it.
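Making that choice explicit could look something like the sketch below. The event labels in AROUSAL_EVENT_TYPES are hypothetical (the parser's real event names may differ), as are the function name and the shape of the event dicts.

```python
# Hypothetical labels, compared case-insensitively.
AROUSAL_EVENT_TYPES = frozenset({"arousal", "rera"})


def derive_arousal_count(events, known_types):
    """Count arousal-type events, or return None when no mapping exists.

    `events` is a list of dicts with a "type" key. `known_types` is the set of
    event types this manufacturer's adapter is known to emit. If none of them
    is an arousal type, the result is None (stored as NULL), which is distinct
    from a genuine count of zero.
    """
    known = {str(t).lower() for t in known_types}
    if not (AROUSAL_EVENT_TYPES & known):
        return None
    return sum(
        1 for e in events
        if str(e.get("type", "")).lower() in AROUSAL_EVENT_TYPES
    )
```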


6. --folder skips the parser routing entirely

Reference: import_sessions.py:322-332

The --folder CLI path calls import_folder() directly before the open-cpap-parser routing block, so it can't import non-ResMed layouts. Fine if --folder is intentionally native-ResMed-only, but that should be documented. Otherwise it needs equivalent parser routing, or should reject unsupported use clearly.


@camden-bock
Contributor Author

Implementation plans addressing your review have been posted on the tracking issue:

  • Plan A (SleepLab-side changes): issue #38 comment
    9 tasks: ResMed routing guard, broader probe error handling, --folder routing, two-tier validation model, SpO2 derivation, arousal_count derivation, and open-cpap-parser#14 workarounds.

  • Plan B (Parser-side validation metadata): issue #38 comment + cpap-parser#29
    New get_validation_status() API exposing summary_validated, spo2_validated, events_validated, waveform_validated per manufacturer. route_enabled stays on the SleepLab side.

@camden-bock
Contributor Author

> 3. MANUFACTURER_VALIDATED will get messy fast
>
> References: cpap_parser_import.py:318-325, 410-413
>
> A single boolean per manufacturer is fine as a starting guardrail, but it's already conflating two things:
>
> * "Are we allowed to route this through the parser?"
>
> * "Should we show a validation warning?"
>
> Those will diverge. Validation really depends on manufacturer, parser version, device family, and which data surfaces are covered (summaries, events, SpO2, waveforms). Keep the dict for now, but treat it as an explicit policy table with separate fields like route_enabled, summary_validated, spo2_validated, etc. At minimum, split routing from the warning flag before this gets harder to untangle.

I was thinking about this, I'm wondering if it would make the most sense to put the parser validation status as data from the parser itself? Then sleeplab isn't dependent on maintaining that metadata, and it is dynamically provided. The route_enabled would live sleeplab side, essentially deciding when the parser path is preferred or sufficiently validated to include (can make judgments about how much a 'best guess' approach is worth using when we don't have files to test on).

If we went for this solution it would need either (1) a back and forth with the parser, fetching validation status after the file structure is recognized, then fetching parsed data or (2) waste some resources parsing the data, and decide when the manufacturer/model comes back whether the parser is worth using.

My intention is to keep parsers modular, so that parser development work is useful to projects outside sleeplab, and to make it easy to provide a quick method for adding new parsers for new equipment or swapping parsers out in case of stack regression/decay.

@camden-bock
Contributor Author

camden-bock commented May 26, 2026

cpap-parser upstream changes that affect this PR (ref: open-cpap/cpap-parser !8 + !9, both targeting develop, awaiting validation against real data samples before merge to main)


🚨 Bug 1 (timestamps) — BREAKING CHANGE, action required before updating cpap-parser

MR !8 fixes the timestamp bug upstream: timestamps and timestamps_low in TimeSeriesData are now absolute UTC epoch seconds, not relative offsets. Once !8 is in your pinned version, the workaround in cpap_parser_import.py::_extract_metrics_from_timeseries that adds session.start_time.timestamp() must be removed. Keeping it after the upstream fix would double-count the epoch and push timestamps roughly 1.8 billion seconds (about 56 years) into the future.

Safe update sequence:

  1. Remove the session.start_time.timestamp() + ts_offset workaround from _extract_metrics_from_timeseries
  2. Then bump the cpap-parser pin to include !8
  3. Re-run the timestamp sanity check: assert metrics_rows[0]["ts"] > 1_700_000_000
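The sanity check in step 3 can be extended to catch both failure modes of a mis-ordered update. This is a sketch under assumptions: the function name is hypothetical, rows are assumed to carry epoch seconds under "ts", and the upper bound is supplied by the caller.

```python
MIN_PLAUSIBLE_EPOCH = 1_700_000_000  # same threshold as the assert in step 3


def check_metric_timestamps(metrics_rows, now_epoch):
    """Raise ValueError if any row's "ts" is not a plausible absolute epoch.

    Values at or below the threshold look like relative offsets (workaround
    removed but pin not bumped). Values after `now_epoch` look like the
    session start was added twice (pin bumped, workaround still present).
    """
    for row in metrics_rows:
        ts = row["ts"]
        if ts <= MIN_PLAUSIBLE_EPOCH:
            raise ValueError(f"ts={ts} looks like a relative offset")
        if ts > now_epoch:
            raise ValueError(f"ts={ts} is in the future; epoch added twice?")


session_start = 1_779_000_000  # illustrative session start, epoch seconds
check_metric_timestamps([{"ts": session_start + 30}], now_epoch=session_start + 86_400)
```

In real use, now_epoch would typically come from time.time() at import time.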

✅ Bug 2 (duration accuracy) — fixed upstream

usage_hours is now derived from EDF session span (num_records × duration_per_record) instead of STR.edf mask-on time. The 8.4-hour outlier nights were ghost days (no EDF) — also fixed (see below). After !8, session_dict["duration_seconds"] from cpap-parser will match OSCAR.


✅ Bug 3 (serial "Unknown") — fixed upstream

Added a direct JSON fallback in _load_machine_info that walks known AirSense 10/11 nesting paths. Device serials should now populate for both AirSense 10 (.tgt) and AirSense 11 (.json). The machine equipment deduplication NULL-serial case will be rare after !8.


✅ Bug 4 (ghost sessions) — fixed upstream

daily_summaries is now filtered to dates that have at least one EDF block. The 320+ STR.edf history entries with no corresponding DATALOG file are gone. Any SleepLab-side ghost-session guard is now redundant (but harmless as a no-op).


🆕 Issue #29: get_validation_status() now available

MR !9 adds from cpap_parser import get_validation_status returning a dict of ManufacturerValidation(summary_validated, spo2_validated, events_validated, waveform_validated). Initial entries:

Manufacturer summary spo2 events waveform
Löwenstein
ResMed ✅ (after !8)

This keeps route_enabled on the SleepLab side while sourcing the *_validated signals from the parser. When !9 lands, MANUFACTURER_VALIDATED["Lowenstein"] and MANUFACTURER_VALIDATED["ResMed"] (the summary_validated component) can be replaced with get_validation_status()[manufacturer].summary_validated.


Both MRs are on develop and require validation against real device data samples before merging to main. Tracking: open-cpap/cpap-parser!8 (bugfix) and open-cpap/cpap-parser!9 (validation metadata).

@joshuamyers-dev
Owner

> I was thinking about this, I'm wondering if it would make the most sense to put the parser validation status as data from the parser itself? Then sleeplab isn't dependent on maintaining that metadata, and it is dynamically provided. The route_enabled would live sleeplab side, essentially deciding when the parser path is preferred or sufficiently validated to include (can make judgments about how much a 'best guess' approach is worth using when we don't have files to test on).
>
> If we went for this solution it would need either (1) a back and forth with the parser, fetching validation status after the file structure is recognized, then fetching parsed data or (2) waste some resources parsing the data, and decide when the manufacturer/model comes back whether the parser is worth using.
>
> My intention is to keep parsers modular, so that parser development work is useful to projects outside sleeplab, and to make it easy to provide a quick method for adding new parsers for new equipment or swapping parsers out in case of stack regression/decay.

Yep, I think that split makes sense.

The parser should probably own the validation metadata for its own output: summaries, SpO2, events, waveforms, parser version, and device family or model where it can recognise that. That keeps the parser useful outside SleepLab, and avoids SleepLab having to maintain parser-specific knowledge that will drift over time.

SleepLab can keep the app-level policy decision: route_enabled, meaning "should we use this parser path for this directory right now?" For ResMed, that lets us stay conservative while the native EDF importer is still the safer path. For other manufacturers, where the parser may be the only practical option, we can make a different call.

I would prefer option 1 if the parser can support it. A light recognition step would be ideal: identify the manufacturer/model, return the validation metadata, then let SleepLab decide whether to parse, fall back, or reject clearly. Only after that would we run the full import.

Option 2 is workable, but I would rather not make "parse everything, then decide whether we trust it" the normal flow. That feels like the wrong default while we are trying to protect the native ResMed path.

So I think the split is:

  • parser package: directory recognition and validation metadata
  • SleepLab: routing policy, native fallback behaviour, and import/UI warnings
  • imported sessions: the validation state that applied at import time

That keeps the parser modular, which I agree is the right direction, while still letting SleepLab make cautious routing decisions.
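The recognise-then-decide flow (option 1) might be sketched like this. All names are hypothetical: manufacturer stands for the result of a light recognition step, route_enabled for SleepLab's own policy map, and the choice to send unrecognised directories to the native importer is an assumption that mirrors the current default.

```python
from enum import Enum


class Route(Enum):
    PARSER = "parser"
    NATIVE = "native"
    REJECT = "reject"


def decide_route(manufacturer, route_enabled, native_manufacturers=("ResMed",)):
    """Pick an import path after recognition and before any full parse.

    `manufacturer` is None when the parser did not recognise the directory.
    `route_enabled` maps manufacturer name to SleepLab's routing policy.
    """
    if manufacturer is None:
        return Route.NATIVE  # unrecognised: let the native importer try
    if route_enabled.get(manufacturer, False):
        return Route.PARSER
    if manufacturer in native_manufacturers:
        return Route.NATIVE  # recognised, parser route disabled, native exists
    return Route.REJECT      # recognised, but no trusted path yet


route = decide_route("ResMed", {"ResMed": False, "Lowenstein": True})
```

The validation state returned by the parser would then be stored with the imported session, separately from this routing decision.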

@camden-bock
Contributor Author

camden-bock commented May 27, 2026 via email


Successfully merging this pull request may close these issues.

  • feat: multi-manufacturer CPAP import via open-cpap-parser (extends #19)
  • feat: support for lowenstein EDF files