feat(#38): multi-manufacturer CPAP import via open-cpap-parser #44
camden-bock wants to merge 30 commits into
…bs, migration, deps
Adds the dependency, migration, and documented implementation skeleton for wiring open-cpap-parser into SleepLab's import infrastructure. No logic is implemented yet; this commit establishes the interfaces, cross-references, and plan for collaborative completion.
Changes:
- importer/open_cpap_import.py: new stub module; documents detect_open_cpap_layout(), run_open_cpap_import(), and the MANUFACTURER_VALIDATED registry with full docstrings and TODO(open-cpap-parser) markers
- importer/import_sessions.py: module docstring updated with routing hook pattern; commented import stubs; TODO block in run_local_import()
- importer/db.py: upsert_session() annotated with the three new columns from migration 013 (manufacturer, data_source, parser_validated)
- migrations/013_add_manufacturer_and_source.sql: adds manufacturer provenance columns; defaults existing rows to 'resmed_native'/true
- pyproject.toml / requirements.txt / api/requirements.txt: add open-cpap-parser git direct reference (GPLv3; adopted in PR joshuamyers-dev#42)
- docs/dev/open-cpap-parser-design.md: architecture design doc
- docs/dev/open-cpap-parser-implementation-plan.md: step-by-step plan cross-referencing open-cpap-parser#14 (upstream schema additions)
Blocked on open-cpap/open-cpap-parser#14 for: start_time, spo2_avg, spo2_min, arousal_count, has_spo2. All of these fields propagate automatically once the upstream schema additions merge.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@joshuamyers-dev: tagging you directly since this PR spans both sides of the integration. On the parser side, open-cpap-parser#14 is the main blocker (start_time, SpO2 summary fields, arousal_count). On the SleepLab side, I'd appreciate your review on the adapter wiring in
- Add open-cpap-parser (GPL-3.0-or-later) to the Python runtime table (already noted in the SleepyHead/OSCAR prose section at the top)
- Replace python-jose + ecdsa entries with PyJWT (MIT) to match the dependency change made in PR joshuamyers-dev#43
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ser to github.com/open-cpap/cpap-parser
Implementation recommendations: cpap-parser adapter wiring
Following a full review of both codebases, here are the concrete gaps and recommended steps to wire
1. Migration needed before anything else
-- 013_add_cpap_parser_columns.sql
ALTER TABLE sessions
ADD COLUMN IF NOT EXISTS therapy_mode TEXT,
ADD COLUMN IF NOT EXISTS mask_type TEXT,
ADD COLUMN IF NOT EXISTS humidity_level SMALLINT,
ADD COLUMN IF NOT EXISTS temperature_c NUMERIC(4,1),
ADD COLUMN IF NOT EXISTS manufacturer TEXT,
ADD COLUMN IF NOT EXISTS parser_validated BOOLEAN DEFAULT TRUE;
-- arousal_count must be nullable for manufacturers that don't report it
ALTER TABLE sessions ALTER COLUMN arousal_count DROP NOT NULL;
ALTER TABLE sessions ALTER COLUMN arousal_count SET DEFAULT NULL;
2. SpO2 column name mismatch
session_dict["avg_spo2"] = session_dict.pop("spo2_avg", None)
session_dict["min_spo2"] = session_dict.pop("spo2_min", None)
And
3. MACHINE_TZ must be applied before upsert
from import_sessions import _localize # or inline the same logic
session_dict["start_datetime"] = _localize(session_dict["start_datetime"])
session_dict["pld_start_datetime"] = _localize(session_dict["pld_start_datetime"])
4. Event format needs an adapter shim
def _adapt_events(cpap_events):
    # cpap-parser: (event_type, onset_sec, duration_sec, session_start)
    # db.py: (onset, duration, event_type) + separate csl_start
    by_start = {}
    for event_type, onset, duration, session_start in cpap_events:
        by_start.setdefault(session_start, []).append((onset, duration, event_type))
    return by_start  # keyed by session_start datetime
Then call
5. Metrics format needs a new DB helper
{"ts": 1700000000.0, "mask_pressure": 10.2, "leak": 4.1, ...}
Recommended: add a new
from datetime import datetime, timezone
import psycopg2.extras
def replace_session_metrics_cpap(conn, session_db_id: int, rows: list[dict]):
    """Replace all session_metrics rows for one session with cpap-parser rows."""
    with conn.cursor() as cur:
        # Clear existing rows first so a re-import never duplicates metrics.
        cur.execute("DELETE FROM session_metrics WHERE session_id = %s", (session_db_id,))
    if not rows:
        return
    data = [
        (
            session_db_id,
            # cpap-parser reports "ts" as epoch seconds; store it as aware UTC.
            datetime.fromtimestamp(r["ts"], tz=timezone.utc),
            r.get("mask_pressure"), r.get("leak"), r.get("resp_rate"),
            r.get("tidal_vol"), r.get("min_vent"), r.get("snore"), r.get("flow_lim"),
        )
        for r in rows
    ]
    sql = """INSERT INTO session_metrics
        (session_id, ts, mask_pressure, leak, resp_rate, tidal_vol, min_vent, snore, flow_lim)
        VALUES %s"""
    with conn.cursor() as cur:
        psycopg2.extras.execute_values(cur, sql, data, page_size=5000)
6. Validation status: suggested storage
session["meta"] == {"validation_status": "validated"|"needs_validation"|"unimplemented",
                    "validation_notes": "..."}
Recommend mapping
7. Minimal working
…add cpap-parser metric/spo2 helpers Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace stubs in open_cpap_import.py with working implementations of detect_open_cpap_layout() and run_open_cpap_import(), updating imports from open_cpap_parser to cpap_parser, renaming spo2_avg/spo2_min to avg_spo2/min_spo2, localizing naive datetimes via MACHINE_TZ, and populating the MANUFACTURER_VALIDATED registry. Covered by 6 new tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nd uv error for SHA256.
… event key-lookup bug; add tz and event tests
- git mv open_cpap_import.py → cpap_parser_import.py (and test file)
- Fix stale docstring: parser_validated is True for Lowenstein, False for ResMed
- Fix event key-lookup: localize session_start key in events_by_start so lookup matches the localized start_datetime (naive != tz-aware dict key mismatch)
- Add test_run_import_localizes_naive_datetime
- Add test_run_import_events_regrouped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
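The key-lookup bug described in this commit can be reproduced in a few lines. The sketch below is illustrative only: `localize` and `regroup_events` are hypothetical names (not the PR's actual helpers), and a fixed +02:00 offset stands in for the configured `MACHINE_TZ`.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +02:00 offset stands in for the configured MACHINE_TZ.
# Real code would more likely build the zone from its name (e.g. "Europe/Prague").
MACHINE_TZ = timezone(timedelta(hours=2))


def localize(dt):
    """Attach MACHINE_TZ to a naive datetime; return aware datetimes unchanged."""
    return dt if dt.tzinfo is not None else dt.replace(tzinfo=MACHINE_TZ)


def regroup_events(cpap_events):
    """Group (event_type, onset, duration, session_start) tuples by localized start."""
    by_start = {}
    for event_type, onset, duration, session_start in cpap_events:
        key = localize(session_start)  # same transformation applied to start_datetime
        by_start.setdefault(key, []).append((onset, duration, event_type))
    return by_start


naive_start = datetime(2026, 5, 17, 23, 4)
events_by_start = regroup_events([("Obstructive", 120.0, 14.0, naive_start)])

# Looking up with the localized key finds the events; the naive key does not,
# because a naive datetime never compares equal to an aware one.
found = events_by_start.get(localize(naive_start), [])
missed = events_by_start.get(naive_start, [])
```

The design point is that the dict key and the lookup value must go through the same localization step; localizing only one side silently drops every event.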
…to native ResMed path
- Update import reference to cpap_parser_import
- Add manufacturer/data_source/parser_validated/avg_spo2/min_spo2 to session_data in import_folder() so upsert_session() gets all required keys
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…I; fix CLI routing; support non-ResMed uploads
- import_sessions.py main(): detect cpap-parser layout before ResMed folder loop, so browser uploads (subprocess path) route to run_open_cpap_import correctly
- api/models.py: add parser_validated (bool, default True) and manufacturer to SessionSummary
- api/routers/sessions.py: BOOL_AND(parser_validated) + manufacturer in list/detail SQL
- api/client.ts: add parser_validated and manufacturer to SessionSummary TS interface
- pages/SessionDetail.tsx: warning banner when parser_validated is false
- pages/Import.tsx: collect all files (not just .edf) for non-ResMed SD cards; update description to be device-agnostic; note about metric accuracy on upload complete
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sion FK Adds 'machine' equipment type to user_equipment (device_serial, parser_validated columns; start_date made nullable), a partial unique index on (user_id, device_serial) for machine records, and machine_equipment_id UUID FK on sessions. Also syncs schema.sql with migrations 009/013 (therapy_mode, mask_type, humidity_level, temperature_c, equipment FKs). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…hine_equipment Auto-creates a 'machine' user_equipment record keyed by device_serial on first import, updates parser_validated on re-import, and links the session row via machine_equipment_id FK. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both the native ResMed EDF path (import_sessions.py) and the cpap-parser path (cpap_parser_import.py) now call find_or_create_machine_equipment after each upsert_session and set machine_equipment_id on the session row. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…alidated Extends EquipmentType, EquipmentResponse, and InferredEquipment to include 'machine'. Equipment router now selects device_serial/parser_validated in all queries and handles machine records in the /inferred endpoint (ordered by updated_at rather than start_date). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…catalog Adds 'machine' to Equipment/InferredEquipment types, renders a read-only CPAP Machine section in EquipmentCatalog (showing manufacturer, serial, and parser_validated badge), and adds machine to the equipment grid in SessionDetail. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tion Covers find_or_create_machine_equipment (create/find/noop paths), update_session_machine_equipment, and the three machine-linking behaviours in run_open_cpap_import (called with correct args, skipped when find returns None, skipped for existing sessions). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…a cpap-parser
- Detect Lowenstein/non-ResMed SD card layouts with create_parser() (fixes UniversalCPAPParser() having no registered adapters)
- Import timeseries metrics with include_timeseries=True; filter per-session by timestamp range (was writing full dataset to every session)
- Clamp INT16 sentinel values (±32767) for SpO2 and timeseries channels
- Fix datetime.UTC → timezone.utc in db.py metrics insertion
- Wire find_or_create_machine_equipment + update_session_machine_equipment into both native ResMed and cpap-parser import paths
- Add session_id param to /equipment/inferred for per-session machine lookup
- Fix duplicate "CPAP Machine" label in SessionDetail when brand/model absent
- Parser warning banner links to upstream validation issues
- Multi-stage Dockerfile: Rust/maturin in builder stage, wheel copied to runtime
- Update test mocks to patch cpap_parser.core instead of cpap_parser.adapters.base
Known upstream: Löwenstein Eyra timeseries channel misidentification filed at gitlab.com/open-cpap/cpap-parser/-/work_items/27
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…k nights Time-range filtering against daily summary start+duration missed overnight sessions that start before midnight (e.g. 23:04) when the summary's start_datetime was the tail end at 05:28. Replace with date-based grouping using map_timeseries_to_metrics/map_timeseries_to_spo2 per CPAPSession block, keyed on start_time.date() to match cpap-parser's own sessions_by_date logic. Update tests to mock cpap_parser.adapters.sleeplab_output and unpack the new 5-value return from _mock_cpap_parser. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…y-start boundaries The previous date bucketing used s.start_time.date() (UTC), which misassigns tail-end blocks that cross midnight UTC (e.g. a session starting 01:53 UTC that belongs to the prior night). Replace with a binary-search approach that mirrors cpap-parser's own grouping: each block maps to the daily summary whose start_datetime is the latest one ≤ the block's start. This recovers ~96k dropped metric rows for the May 17 night in the Lowenstein sample data. Also update test mocks to use real timezone-aware datetimes for start_time so the .replace(tzinfo=None) comparison in _folder_for_block works correctly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
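The "latest summary start ≤ block start" rule from this commit maps naturally onto `bisect`. The following is a minimal sketch under stated assumptions: `summary_index_for_block` is a hypothetical name (it is not the PR's `_folder_for_block`), the summary starts are already sorted ascending, and all datetimes are either naive or all aware.

```python
from bisect import bisect_right
from datetime import datetime


def summary_index_for_block(summary_starts, block_start):
    """Return the index of the latest summary start <= block_start, or None.

    summary_starts must be sorted ascending.
    """
    i = bisect_right(summary_starts, block_start)
    return i - 1 if i else None


summary_starts = [
    datetime(2026, 5, 16, 22, 40),
    datetime(2026, 5, 17, 23, 4),
]

tail_block = datetime(2026, 5, 18, 1, 53)   # after midnight, belongs to the May 17 night
early_block = datetime(2026, 5, 16, 21, 0)  # earlier than every summary start

tail_index = summary_index_for_block(summary_starts, tail_block)
early_index = summary_index_for_block(summary_starts, early_block)
```

Unlike bucketing on `start_time.date()`, this lookup never depends on which side of midnight a block falls; a block with no earlier summary returns `None` so the caller can decide whether to skip or log it.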
…eeds more testing
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…wenstein
- Events: regroup by folder_date instead of session_start so sub-blocks (block[2], block[3]...) contribute events to the right session; normalize onset relative to the summary csl_start
- SpO2: treat 0.0 as null (no pulse-ox attached); fix clamp from [0,100] to (0,100] so device-zeroed values are not stored as real data
- Summary fields: derive avg_leak, avg_resp_rate, avg_tidal_vol, avg_min_vent from timeseries when the device summary omits them (Lowenstein firmware)
- AHI: compute from parsed event counts when total_ahi_events=0
- db.py: skip all-null spo2 rows in replace_session_spo2_cpap
- Tests: add 4 new targeted tests covering each fix
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
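The SpO2 part of this commit (half-open range `(0, 100]`, zero treated as missing, all-null rows skipped) could look roughly like the sketch below. The helper names are hypothetical, and the row shape (`spo2` and `pulse` keys) is an assumption based on the keys used elsewhere in this thread.

```python
def clean_spo2(value):
    """Return value when it lies in (0, 100], otherwise None."""
    if value is None:
        return None
    return value if 0 < value <= 100 else None


def clean_spo2_rows(rows):
    """Sanitise spo2 per row and drop rows that carry neither spo2 nor pulse."""
    cleaned = []
    for row in rows:
        row = {**row, "spo2": clean_spo2(row.get("spo2"))}
        if row["spo2"] is not None or row.get("pulse") is not None:
            cleaned.append(row)
    return cleaned


rows = [
    {"ts": 0.0, "spo2": 96.0, "pulse": 58},
    {"ts": 1.0, "spo2": 0.0, "pulse": None},    # device-zeroed: no pulse-ox attached
    {"ts": 2.0, "spo2": 32767, "pulse": None},  # INT16 sentinel
]
kept = clean_spo2_rows(rows)
```

Excluding zero at the lower bound is the important detail: a closed `[0, 100]` check accepts the device's placeholder zero as a real reading.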
Chart tick labels and tooltips now respect the configured display_tz (from /config) so sessions recorded in non-local timezones (e.g. Europe/Prague) show the correct local time instead of browser-local time. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cpap-py is a direct runtime dependency (pulled in by cpap-parser as its ResMed adapter backend). MIT license. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SleepLab incorporates OSCAR-derived features via cpap-parser, so the
prior disclaimer ("does not implement any direct derivative") was
inaccurate. Replaced with neutral acknowledgment of the lineage.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kilo-WATT — requesting your review on the chart/plotting design decisions in this PR and the integration branch. Summary below. Chart changes in this PR (
@camden-bock I think we can work around parts of open-cpap-parser#14 on the SleepLab side, with a couple of limits.
Reasonable temporary workarounds:
That seems safe enough because it is derived from actual samples, not guessed.
What wouldn't be ideal:
But open-cpap-parser#14 should still remain the real fix, especially for stable session identity. What do you think?
@camden-bock Adapter is looking good, a few things:
1. ResMed can still hit the wrong parser path
References:
Right now the router tries
Routing and validation need to be separate decisions. Keep open-cpap-parser first for non-native manufacturers, but explicitly bail on ResMed in the routing probe until that path is validated. The
2. The probe only catches one error type
Reference:
Either make the probe lighter (manufacturer-only if the parser supports it), or catch broader parser failures and fall back to native where needed. I wouldn't silently swallow errors for non-ResMed, but for the ResMed-native path specifically, a probe regression shouldn't break existing imports.
3.
References:
A single boolean per manufacturer is fine as a starting guardrail, but it's already conflating two things:
Those will diverge. Validation really depends on manufacturer, parser version, device family, and which data surfaces are covered (summaries, events, SpO2, waveforms). Keep the dict for now, but treat it as an explicit policy table with separate fields like
4. The SpO2 workaround for open-cpap-parser#14 is incomplete
References:
The code sanitises
Derive
valid_spo2_rows = [
    r for r in spo2_by_date.get(folder_date, [])
    if r.get("spo2") is not None or r.get("pulse") is not None
]
session_dict["has_spo2"] = bool(valid_spo2_rows)
Then derive
5.
Reference:
The PR already derives AHI counts from parsed events when summary values are missing, which is a good workaround. But
6.
Reference:
The
Implementation plans addressing your review have been posted on the tracking issue:
I was thinking about this, and I'm wondering if it would make the most sense to have the parser itself provide the validation status as data. Then SleepLab isn't dependent on maintaining that metadata, and it is dynamically provided. The `route_enabled` flag would live on the SleepLab side, essentially deciding when the parser path is preferred or sufficiently validated to include (it can make judgments about how much a 'best guess' approach is worth using when we don't have files to test on).
If we went for this solution, it would need either (1) a back and forth with the parser, fetching validation status after the file structure is recognized and then fetching parsed data, or (2) wasting some resources parsing the data and deciding, once the manufacturer/model comes back, whether the parser is worth using.
My intention is to keep parsers modular, so that parser development work is useful to projects outside SleepLab, and to make it easy to add new parsers for new equipment or to swap parsers out in case of stack regression/decay.
cpap-parser upstream changes that affect this PR (ref: open-cpap/cpap-parser !8 + !9, both targeting
🚨 Bug 1 (timestamps): BREAKING CHANGE, action required before updating cpap-parser
MR !8 fixes the timestamp bug upstream:
Safe update sequence:
✅ Bug 2 (duration accuracy): fixed upstream
✅ Bug 3 (serial "Unknown"): fixed upstream
Added a direct JSON fallback in
✅ Bug 4 (ghost sessions): fixed upstream
🆕 Issue #29
| Manufacturer | summary | spo2 | events | waveform |
|---|---|---|---|---|
| Löwenstein | ✅ | ✅ | ✅ | ❌ |
| ResMed | ✅ (after !8) | ❌ | ❌ | ❌ |
This keeps route_enabled on the SleepLab side while sourcing the *_validated signals from the parser. When !9 lands, MANUFACTURER_VALIDATED["Lowenstein"] and MANUFACTURER_VALIDATED["ResMed"] (the summary_validated component) can be replaced with get_validation_status()[manufacturer].summary_validated.
Both MRs are on develop and require validation against real device data samples before merging to main. Tracking: open-cpap/cpap-parser!8 (bugfix) and open-cpap/cpap-parser!9 (validation metadata).
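To make the proposed split concrete, here is a minimal sketch of how SleepLab-side `route_enabled` policy might sit next to parser-sourced validation flags. Everything except `summary_validated` and `route_enabled` is an assumption: the other field names, the `PARSER_VALIDATION` stand-in (a hard-coded copy of the table above in place of a real `get_validation_status()` call), and the `session_flags` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurfaceValidation:
    """Per-surface validation flags as the parser might report them (assumed shape)."""
    summary_validated: bool = False
    spo2_validated: bool = False
    events_validated: bool = False
    waveform_validated: bool = False


# Stand-in for parser-provided metadata; values copied from the table above.
PARSER_VALIDATION = {
    "Lowenstein": SurfaceValidation(
        summary_validated=True, spo2_validated=True, events_validated=True
    ),
    # Summary is only validated once MR !8 is in the installed parser version.
    "ResMed": SurfaceValidation(summary_validated=True),
}

# SleepLab-owned policy: which manufacturers may use the parser path at all.
ROUTE_ENABLED = {"Lowenstein": True, "ResMed": False}


def session_flags(manufacturer):
    """Combine app-level routing policy with parser-reported validation."""
    status = PARSER_VALIDATION.get(manufacturer, SurfaceValidation())
    return {
        "route_enabled": ROUTE_ENABLED.get(manufacturer, False),
        "parser_validated": status.summary_validated,
    }
```

Unknown manufacturers default to disabled and unvalidated, so a newly recognised device family cannot be routed through the parser until someone opts it in.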
Yep, I think that split makes sense.
The parser should probably own the validation metadata for its own output: summaries, SpO2, events, waveforms, parser version, and device family or model where it can recognise that. That keeps the parser useful outside SleepLab, and avoids SleepLab having to maintain parser-specific knowledge that will drift over time.
SleepLab can keep the app-level policy decision: `route_enabled`, meaning "should we use this parser path for this directory right now?" For ResMed, that lets us stay conservative while the native EDF importer is still the safer path. For other manufacturers, where the parser may be the only practical option, we can make a different call.
I would prefer option 1 if the parser can support it. A light recognition step would be ideal: identify the manufacturer/model, return the validation metadata, then let SleepLab decide whether to parse, fall back, or reject clearly. Only after that would we run the full import.
Option 2 is workable, but I would rather not make "parse everything, then decide whether we trust it" the normal flow. That feels like the wrong default while we are trying to protect the native ResMed path.
So I think the split is:
- parser package: directory recognition and validation metadata
- SleepLab: routing policy, native fallback behaviour, and import/UI warnings
- imported sessions: the validation state that applied at import time
That keeps the parser modular, which I agree is the right direction, while still letting SleepLab make cautious routing decisions.
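Option 1 (recognise first, then decide, then parse) can be sketched as a small decision function. This is a sketch under assumptions, not an agreed design: `choose_route`, the `Route` enum, and the `recognize` callable are hypothetical, and treating a failed probe as "fall back to native" is one possible reading of the earlier review point about probe regressions.

```python
from enum import Enum
from typing import Callable, Optional


class Route(Enum):
    PARSER = "parser"
    NATIVE = "native"
    REJECT = "reject"


def choose_route(
    path: str,
    recognize: Callable[[str], Optional[str]],
    route_enabled: dict,
    native_manufacturers: frozenset = frozenset({"ResMed"}),
) -> Route:
    """Decide how to import a directory before any full parse runs."""
    try:
        manufacturer = recognize(path)  # assumed to be a cheap, metadata-only probe
    except Exception:
        # A broken probe should not take the existing native import path down with it.
        return Route.NATIVE
    if manufacturer is None:
        return Route.REJECT
    if route_enabled.get(manufacturer, False):
        return Route.PARSER
    if manufacturer in native_manufacturers:
        return Route.NATIVE
    return Route.REJECT


policy = {"Lowenstein": True, "ResMed": False}
resmed_route = choose_route("/sd", lambda p: "ResMed", policy)
lowenstein_route = choose_route("/sd", lambda p: "Lowenstein", policy)
```

Keeping the decision in one pure function makes the policy easy to test without touching either importer.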
Sounds good - I'll patch the parser in that direction.
Closes #38, closes #19.
Prerequisites: This branch is based on `feat/licensing` (PR #42, GPLv3 adoption) and requires PR #43 (PyJWT dep fix) to also merge before this can land on main.
What this is
Full implementation of multi-manufacturer CPAP import via cpap-parser. Adds manufacturer detection, per-session machine equipment registration, and provenance tracking (`manufacturer`, `data_source`, `parser_validated`) for all import paths.
What's included
- `importer/cpap_parser_import.py`: detect + import via cpap-parser (`create_parser()`, `map_directory_to_sleeplab()`)
- `importer/import_sessions.py`: routing hook; cpap-parser path when `detect_open_cpap_layout()` returns True, else native ResMed path
- `importer/db.py`: `find_or_create_machine_equipment()`, `update_session_machine_equipment()`, INT16 sentinel clamp for timeseries metrics
- `migrations/013_add_manufacturer_and_source.sql`: `manufacturer`, `data_source`, `parser_validated` on sessions; `device_serial`, `parser_validated` on user_equipment; `machine_equipment_id` FK on sessions; 'machine' equipment type
- `api/models.py` + `api/routers/equipment.py`: machine equipment type, `device_serial`/`parser_validated` fields, `session_id` param on `/equipment/inferred` for per-session machine lookup
- `frontend/`: machine card in SessionDetail, parser warning banner with link to upstream validation issues
- `Dockerfile`: multi-stage build; Rust/maturin in builder stage, pre-built wheel copied to slim runtime
Bug fixes (post-initial-review)
- … `session_start` tagged at block level, not the summary `start_datetime`. Redesigned event grouping to key by `folder_date` and normalize onset relative to summary CSL start. All 263 Löwenstein events now stored correctly.
- … `0 <= v <= 100` range check. Fixed sentinel to `0 < v <= 100`; also treats `0` as `None` in the timeseries extractor.
- … `metrics_by_date`) and event counts (`event_counts_by_date`) when the firmware-provided values are absent.
- … `MetricsChartSplit.tsx` was calling `toLocaleTimeString` without a `timeZone` option, ignoring the server-configured `DISPLAY_TZ`. Fixed to pass `getDisplayTz()` so chart tick labels reflect the user's local timezone (e.g. `Europe/Prague`).
Known upstream issues
Löwenstein Eyra timeseries channels (`leak`, `resp_rate`, `tidal_vol`) contain misidentified flow-rate data from the Prisma Line binary parser. `mask_pressure` is correct. Filed upstream: open-cpap/cpap-parser#27. Workaround: physiological range clamping in `db.py` nulls out out-of-range values before DB insertion.
ResMed adapter bugs found during validation: open-cpap/cpap-parser#28. See ResMed validation section below.
Integration testing (branch `integration/test-prs-50-54`)
Integration-tested against PRs #50, #51, #52, #53, and #54 in a combined branch (`integration/test-prs-50-54`). All five PRs plus #44 were merged and validated against a fresh Postgres + app container. The chart x-axis unification work (unified `[xMin, xMax]` domain across Night Metrics, Oximetry, and Wearable Sleep Stage charts) was developed during this integration pass and lives on the integration branch; it is not part of this PR but is a direct outcome of the combined testing.
Conflicts resolved:
- `frontend/src/App.tsx`
- `importer/import_sessions.py` (imports): … `parse_brp`/`replace_session_waveform`/`get_session_db_id`; #44 adds `find_or_create_machine_equipment`/`session_exists`/`update_session_machine_equipment`
- `importer/import_sessions.py` (function): … `backfill_waveform_for_block`; #44 uses `str | None` type syntax; … `backfill_waveform_for_block`, used `str | None`
- `importer/db.py`: … `replace_session_waveform`; #44 adds `replace_session_metrics_cpap`/machine equipment helpers
- `api/models.py`: … `WaveformResponse`/`EventWindowResponse`; #44 adds `"machine"` to `EquipmentType`
- `frontend/src/components/EquipmentCatalog.tsx`: … `replacementDays` vs `DEFAULT_REPLACEMENT_DAYS[t] ?? null`; … `?? null` form
Migration renumbering: #44 (`013_add_manufacturer_and_source.sql`) and #52 (`013_add_session_waveform.sql`) both claim migration number 013 from the same `main` base. In the integration branch, #52's migrations were renumbered:
- `013_add_session_waveform.sql` → `014_add_session_waveform.sql`
- `014_add_event_inspector_indexes.sql` → `015_add_event_inspector_indexes.sql`
Integration test results:
ResMed + cpap-parser validation
Routing ResMed through cpap-parser is explicitly out of scope for this PR. `detect_open_cpap_layout()` correctly falls through to the native EDF path for ResMed SD cards, and `MANUFACTURER_VALIDATED["ResMed"] = False` is set to reflect unvalidated status. This section documents why that flag is correct and what must be fixed upstream before it can become `True`.
The issues below are scoped to the ResMed adapter's dependency on `cpap-py`. They do not affect the Löwenstein adapter (native Rust) or other manufacturers currently implemented in cpap-parser.
Validation methodology
Side-by-side import of the same ResMed AirSense 10 DATALOG (72 nights, 2026-02-03 – 2026-05-15) into two user accounts using native path vs. forced cpap-parser path. Compared against OSCAR v1.7.x summary export as ground truth.
Full validation report: `sleepData/resmed-cpap-parser-validation-2026-05-25.md`
Findings
Bug 1: Metrics timestamps anchored to Unix epoch (BLOCKER)
`timestamps_low` from cpap-py contains relative offsets (elapsed seconds since session start), not absolute timestamps. The current integration stores them directly, resulting in all metrics rows timestamped at `1970-01-01T00:00:00Z`. Charts are completely unusable.
Second-order impact on PR #52 (Event Inspector): the window query `WHERE sm.ts >= event_datetime - N seconds` returns zero rows because the 56-year gap between metric timestamps (epoch) and event timestamps (wall-clock) means no metrics ever fall in the event window. This must be fixed before the Event Inspector can work for any cpap-parser-sourced ResMed session.
The fix is in `cpap_parser_import.py` `_extract_metrics_from_timeseries`: add `session.start_time.timestamp()` to each `ts` offset. Filed upstream: open-cpap/cpap-parser#28.
Bug 2: Duration accuracy (BLOCKER)
cpap-parser reads duration from `STR.edf` daily summaries; native reads from PLD EDF `num_records × duration_per_record`. Median offset vs OSCAR: +60s; worst case: +30,060s (8.4 hours) on a night with a fragmented session.
Bug 3: Device serial returns "Unknown"
`directory.machine.serial_number` returns the string `"Unknown"` for all ResMed imports. Native importer reads the correct serial from EDF signal headers.
Bug 4: Ghost sessions from STR.edf history
cpap-parser returns 439 `daily_summaries` (full STR.edf device history, 2025-02-04 onward) while only 119 DATALOG EDF session blocks exist. The 346 summary-only dates would create empty sessions in the DB: correct AHI/duration from STR.edf but no metrics, events, or waveform. This is a ResMed-specific architectural issue; Löwenstein's native Rust adapter exposes exactly as many summaries as actual session blocks (no ghost sessions).
What's correct
AHI accuracy is roughly equivalent between native and cpap-parser (88% vs 87% of nights within ±0.1 of OSCAR). The sensor data itself is parsed correctly — the issues are timestamp handling, metadata extraction, and the STR.edf/DATALOG boundary mismatch.
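The timestamp-handling issue (Bug 1 above) comes down to adding the session start to each relative offset before storage. A minimal sketch, assuming the offsets are elapsed seconds and the session start is timezone-aware; `absolute_timestamps` is a hypothetical helper, not the body of `_extract_metrics_from_timeseries`:

```python
from datetime import datetime, timezone


def absolute_timestamps(session_start, offsets_sec):
    """Convert elapsed-seconds offsets into absolute, timezone-aware UTC datetimes."""
    if session_start.tzinfo is None:
        # A naive start would be interpreted in the host's local zone by .timestamp().
        raise ValueError("session_start must be timezone-aware")
    base = session_start.timestamp()
    return [datetime.fromtimestamp(base + off, tz=timezone.utc) for off in offsets_sec]


start = datetime(2026, 2, 3, 22, 15, tzinfo=timezone.utc)
stamps = absolute_timestamps(start, [0.0, 2.0, 4.0])

# Storing the raw offsets instead lands every row at the Unix epoch.
buggy_first = datetime.fromtimestamp(0.0, tz=timezone.utc)
```

Rejecting naive start times keeps the conversion independent of the server's local timezone, which matters because `MACHINE_TZ` localization is a separate step.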
Path forward
After upstream bugs 1–3 are fixed in cpap-parser:
- … `MANUFACTURER_VALIDATED["ResMed"] = True`
- … `daily_summaries` with dates that have session blocks before upserting (SleepLab-side guard)
- … `detect_open_cpap_layout()` to return `True` for ResMed, and remove the native EDF path when confidence is high enough
Notes for the reviewer (Joshua)
A few design decisions worth calling out explicitly:
1. cpap-parser pinned to `main` tarball, not a tagged release
`requirements.txt` references `cpap-parser @ https://gitlab.com/open-cpap/cpap-parser/-/archive/main/cpap-parser-main.tar.gz`. There are no PyPI releases or version tags. This means the pin is reproducible only by hash, not by version. `uv.lock` captures the exact hash at lock time, so builds are deterministic until someone runs `uv lock --upgrade`. Worth tracking upstream whether tagged releases are planned.
2. `parser_validated` is a SleepLab-layer flag, not upstream metadata
`MANUFACTURER_VALIDATED` in `cpap_parser_import.py` is a dict SleepLab maintains. Currently `{"Lowenstein": True, "ResMed": False}`. When cpap-parser fixes the ResMed bugs, SleepLab must update this dict and re-run validation; cpap-parser itself does not expose a "this manufacturer is validated" signal. The flag drives both the warning banner in SessionDetail and the `parser_validated` column on sessions and user_equipment rows.
3. Machine equipment deduplication uses a partial unique index
`find_or_create_machine_equipment()` does an upsert against `(user_id, equipment_type, device_serial)` with a partial unique index (`WHERE equipment_type = 'machine'`). If a device has no serial (e.g. ResMed returning `"Unknown"`), the serial is stored as `NULL` and the partial index doesn't apply, so two imports from the same machine with no serial will create two machine equipment rows. This is intentional: we can't deduplicate what we can't identify. If upstream fixes Bug 3, this self-corrects.
4. `MACHINE_TZ` env var is required for correct EDF header timestamps
cpap-parser reads EDF timestamp fields as local time but stores them naively (no tzinfo). `MACHINE_TZ` (defaulting to `UTC`) is passed as context to `create_parser()` to shift them to UTC before storage. If this var is wrong, all session start/end times will be off by the timezone offset. This is the same issue the native ResMed importer has always had. It's not new here, but it's worth knowing that a wrong `MACHINE_TZ` will silently produce incorrect session timestamps for both paths.
5. INT16 sentinel clamping in `db.py`
cpap-parser occasionally emits out-of-range timeseries values from the Löwenstein Prisma Line binary parser (see upstream issue #27). `replace_session_metrics_cpap()` clamps values against physiological ranges before insertion rather than failing loudly. The ranges are conservative: anything outside them is almost certainly a parser artifact, not real physiology. If upstream fixes the Prisma parser, these values will simply stop appearing and the clamping becomes a no-op.
6. Dockerfile build size
The multi-stage build compiles cpap-parser's Rust extension (`maturin build`) in a full Rust+Python builder image and copies only the resulting wheel into the slim runtime image. The Rust toolchain is ~700 MB; the final image stays under 400 MB. If cpap-parser eventually ships a pre-built wheel to PyPI, the builder stage can be dropped entirely.
Test plan
- … `data_source='resmed_native'`, `parser_validated=true`, `machine_equipment_id` set
- … `/equipment/` and `/equipment/inferred?session_id=` return correct machine per session
- … `parser_validated` + `manufacturer` in list and detail responses
- … `parser_validated=false` sessions
- … `data_source='open_cpap_parser'`, machine equipment registered
- … `session_events`
- … `DISPLAY_TZ` (e.g. Europe/Prague)
🤖 Generated with Claude Code