Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
093ec76
feat: add cursor overlay pipeline
EtienneLescot Mar 15, 2026
7a9597a
feat: add unified native bridge foundation
EtienneLescot Mar 16, 2026
e85a2a3
feat: add windows native cursor capture and rendering
EtienneLescot Mar 16, 2026
96e2829
feat: add cursor overlay pipeline for high-fidelity cursor recording …
EtienneLescot Mar 26, 2026
c8bf3f7
fix: restore cursor pipeline build after rebase
EtienneLescot May 3, 2026
73960d9
feat: add windows cursor preview diagnostics
EtienneLescot May 5, 2026
e48dd9c
feat: complete windows cursor assets
EtienneLescot May 5, 2026
f6ef620
fix: avoid unsupported display media min constraint
EtienneLescot May 5, 2026
58a1a79
fix: align native cursor preview and export
EtienneLescot May 5, 2026
e414f6a
fix: restore native cursor preview and export
EtienneLescot May 5, 2026
7166160
feat: add native Windows recorder helper
EtienneLescot May 5, 2026
1b285c2
feat: add native Windows microphone capture
EtienneLescot May 5, 2026
793380a
fix: align native mixed audio timeline
EtienneLescot May 5, 2026
c7c4fea
feat: add native Windows window capture
EtienneLescot May 5, 2026
67fa0aa
feat: add native Windows webcam composition
EtienneLescot May 5, 2026
779dee5
fix: honor selected native Windows webcam
EtienneLescot May 5, 2026
6d859cc
fix: support DirectShow virtual webcams
EtienneLescot May 5, 2026
c13e8a1
fix: skip black webcam warmup frames
EtienneLescot May 5, 2026
dcb402f
fix: gate Windows cursor settings
EtienneLescot May 5, 2026
d310e91
test: add Windows native checklist smoke test
EtienneLescot May 5, 2026
8a1d7e4
fix: resolve selected Windows microphone
EtienneLescot May 5, 2026
d076a9b
fix: address native capture review feedback
EtienneLescot May 5, 2026
1c92110
feat: apply native cursor visual effects
EtienneLescot May 5, 2026
0f26eac
fix: capture quick native cursor clicks
EtienneLescot May 5, 2026
f661a39
fix: record native cursor click events
EtienneLescot May 5, 2026
657b4a7
docs: backlog native cursor click bounce
EtienneLescot May 5, 2026
159b24e
test: harden Windows cursor diagnostic
EtienneLescot May 5, 2026
f609611
fix: address native cursor review findings
EtienneLescot May 5, 2026
eab205a
fix: harden native recorder review paths
EtienneLescot May 5, 2026
5b7f380
fix: make native cursor click bounce visible
EtienneLescot May 5, 2026
bdbb67c
fix: preserve native cursor click interactions
EtienneLescot May 5, 2026
249658d
fix: address maintainer platform regressions
EtienneLescot May 6, 2026
6e8932a
fix: restore source copy export fast path
EtienneLescot May 6, 2026
fbcbf8f
fix: guard source copy while native cursor data loads
EtienneLescot May 6, 2026
f802395
fix: preserve cursor and audio in exports
EtienneLescot May 6, 2026
3d79a09
fix: downmix multichannel export audio
EtienneLescot May 6, 2026
4420f5b
feat: add Windows cursor capture mode
EtienneLescot May 6, 2026
cce35ae
fix: preserve Windows system audio on export
EtienneLescot May 6, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
97 changes: 52 additions & 45 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,45 +1,52 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*

node_modules
dist
dist-electron
dist-ssr
*.local
.env

# Editor directories and files
.vscode/*
.zed/
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
release/**
*.kiro/
.claude/
# npx electron-builder --mac --win

# Playwright
test-results
playwright-report/

# Vitest browser mode screenshots
__screenshots__/

# shell files
/shell.sh
# Nix
result
result-*
.direnv/
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*

node_modules
dist
dist-electron
dist-ssr
*.local
.env

# Native helper build outputs
/electron/native/wgc-capture/build/
/electron/native/bin/

# Editor directories and files
.vscode/*
.zed/
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
release/**
*.kiro/
.claude/
# npx electron-builder --mac --win

# Playwright
test-results
playwright-report/

# Vitest browser mode screenshots
__screenshots__/

# shell files
/shell.sh
# Nix
result
result-*
.direnv/

#kilocode
.kilo/
3 changes: 3 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -101,6 +101,9 @@ _I'm new to open source, idk what I'm doing lol. If something is wrong please ra
See the documentation here:
[OpenScreen Docs](https://deepwiki.com/siddharthvaddem/openscreen)

Developer notes:
- [Windows native cursor test pipeline](docs/testing/windows-native-cursor.md)

Comment on lines +104 to +106
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove this

## Contributing

Contributions are welcome! If you’d like to help out or see what’s currently being worked on, take a look at the open issues and the [project roadmap](https://github.com/users/siddharthvaddem/projects/3) to understand the current direction of the project and find ways to contribute.
Expand Down
39 changes: 39 additions & 0 deletions docs/architecture/native-bridge.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# Native Bridge Architecture

## Goal

Provide a single, resilient source of truth for platform-native capabilities while keeping Electron transport thin and renderer APIs unified.

## Layers

1. Native adapters
Platform-specific providers implement stable domain interfaces such as cursor telemetry or system asset discovery.

2. Main-process services
Services orchestrate adapters, own runtime state, and expose domain-level operations.

3. Unified IPC transport
Renderer code talks to a single `native-bridge:invoke` channel using versioned contracts.

4. Renderer client
React code should consume `src/native/client.ts` rather than binding directly to ad hoc Electron APIs.

## Principles

- Single source of truth: runtime-native state lives in the Electron main process.
- Capability-first: renderer can query support before attempting native behavior.
- Versioned contracts: requests and responses are explicit and evolve predictably.
- Resilience: every response uses a consistent result envelope with stable error codes.

## Current rollout

This repository now contains the initial scaffold:

- shared contracts in `src/native/contracts.ts`
- renderer SDK in `src/native/client.ts`
- main-process state store in `electron/native-bridge/store.ts`
- cursor telemetry adapter in `electron/native-bridge/cursor/telemetryCursorAdapter.ts`
- domain services in `electron/native-bridge/services/*`
- unified handler registration in `electron/ipc/nativeBridge.ts`

The legacy `window.electronAPI` surface still exists for backward compatibility. New native-facing features should prefer the unified bridge client.
248 changes: 248 additions & 0 deletions docs/engineering/windows-native-recorder-roadmap.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,248 @@
# Windows Native Recorder Roadmap

OpenScreen's Windows recorder should be owned by one native backend. Electron capture can remain available for non-Windows platforms and temporary developer diagnostics, but Windows production recording should not silently fall back to `getDisplayMedia` / `MediaRecorder`.

## Goals

- Capture displays and windows through Windows Graphics Capture (WGC).
- Render the native Windows cursor as OpenScreen's high-quality scalable cursor overlay.
- Capture system audio through WASAPI loopback.
- Capture microphone audio through WASAPI.
- Mix system audio and microphone audio into the primary screen recording.
- Capture webcam video natively and compose it into the Windows helper MP4 during the native-recording migration.
- Keep preview/export aligned because screen video, audio, webcam, and cursor share one native timing origin.
- Keep exported MP4s Windows-friendly: H.264 video plus AAC audio. Opus-in-MP4 is not an acceptable Windows export target.
- Package the native helper with the Windows app.

## Non-Goals

- Replacing the editor/export pipeline.
- Replacing the editor/export pipeline. A later pass can reintroduce a separate editable native `webcamVideoPath`; the current Windows-native milestone prioritizes a helper-owned multi-flux MP4 with deterministic screen/audio/mic/webcam sync.
- Adding a native fallback for macOS or Linux in this branch.

## Target Architecture

The renderer keeps the existing recording controls. On Windows, `useScreenRecorder` sends a complete recording request to Electron and does not assemble Windows `MediaStream` tracks with `MediaRecorder`.

Electron owns the native recording session:

- resolves the selected source;
- resolves output paths;
- starts cursor sampling;
- starts the helper process;
- sends pause/resume/stop/cancel commands;
- writes `RecordingSession` manifests;
- reports explicit errors when a Windows-native capability is unavailable.

The helper owns Windows media capture:

- WGC screen/window frames;
- WASAPI system loopback;
- WASAPI microphone input;
- Media Foundation webcam capture;
- DirectShow webcam fallback for virtual cameras not visible to Media Foundation;
- Media Foundation encoding/muxing;
- stream timestamp normalization.

## Helper Contract V2

The helper receives a single JSON argument:

```json
{
"schemaVersion": 2,
"recordingId": 1234567890,
"source": {
"type": "display",
"sourceId": "screen:0:0",
"displayId": 123,
"windowHandle": null,
"bounds": { "x": 0, "y": 0, "width": 1920, "height": 1080 }
},
"video": {
"fps": 60,
"width": 1920,
"height": 1080,
"bitrate": 18000000
},
"audio": {
"system": { "enabled": true },
"microphone": { "enabled": true, "deviceId": "default", "gain": 1.4 }
},
"webcam": {
"enabled": true,
"deviceId": "default",
"deviceName": "Camera (NVIDIA Broadcast)",
"width": 1280,
"height": 720,
"fps": 30,
"bitrate": 18000000
},
"outputs": {
"screenPath": "C:\\Users\\me\\recording-123.mp4",
"manifestPath": "C:\\Users\\me\\recording-123.session.json"
}
}
```

The helper emits newline-delimited JSON events to stdout:

```json
{ "event": "ready", "schemaVersion": 2 }
{ "event": "recording-started", "timestampMs": 1234567890 }
{ "event": "warning", "code": "audio-device-unavailable", "message": "..." }
{ "event": "recording-stopped", "screenPath": "..." }
{ "event": "error", "code": "unsupported-window-source", "message": "..." }
```

During migration, Electron also accepts the current textual helper messages so existing display-only smoke tests keep working.

## Implementation Phases

### 1. Native Session Boundary

- Add a structured Windows native recording request type.
- Pass source kind, audio flags, microphone device, webcam flags, and output paths into the helper.
- On Windows, do not silently fall back to Electron capture. If the helper is unavailable or a native feature is missing, show a clear error.
- Keep Electron fallback only for non-Windows and optional developer diagnostics.

Acceptance:

- Display-only recording still works.
- Enabling an unsupported native feature returns an explicit native error instead of recording through Electron.

### 2. WASAPI System Audio

Status: initial implementation landed. The helper captures the default render endpoint with WASAPI loopback, passes the runtime mix format into `MFEncoder`, and muxes AAC audio into the primary MP4. Long-run drift correction and explicit silence insertion remain follow-up hardening work.

- Add `WasapiLoopbackCapture`.
- Capture the default render endpoint in shared loopback mode.
- Keep `WasapiLoopbackCapture` responsible only for device activation, packet capture, and packet timestamps.
- Keep `MFEncoder` responsible for all Media Foundation stream definitions and muxing.
- Feed the endpoint mix format into `MFEncoder` as the single source of truth for audio stream shape: sample rate, channel count, bits per sample, block alignment, average bytes/sec, and subtype (`PCM` or `Float`).
- Encode the primary screen MP4 with H.264 video and AAC audio through one `IMFSinkWriter`.
- Timestamp audio from the captured frame count in 100ns units. The first implementation uses the WASAPI packet timeline; later drift correction will add explicit silence or resampling if long recordings show measurable clock skew.
- Treat microphone mixing as a later phase. System loopback must land first without introducing renderer-side audio code.

Acceptance:

- Screen MP4 has an AAC audio track when system audio is enabled.
- A 5-minute recording has audio/video duration drift below one frame.

SSOT rules for this phase:

- `src/lib/nativeWindowsRecording.ts` is the renderer/main TypeScript request contract.
- `docs/engineering/windows-native-recorder-roadmap.md` is the feature-level contract and phase checklist.
- `WgcSession::captureWidth()/captureHeight()` is the encoded screen frame size until a dedicated native scaling stage exists.
- `WasapiLoopbackCapture::inputFormat()` is the runtime audio format source used by `MFEncoder`.
- The renderer passes both the browser webcam `deviceId` and selected display label as `deviceName`; `electron/native/wgc-capture/src/webcam_capture.*` is the only place that maps those values to Media Foundation devices.
- Electron resolves the selected label to a DirectShow filter CLSID once and passes it as `webcamDirectShowClsid`; the helper must not independently guess among DirectShow filters.
- No duplicated hard-coded audio format assumptions in `main.cpp`.

### 3. WASAPI Microphone

Status: initial implementation in progress. The helper can open the default WASAPI capture endpoint, apply the OpenScreen microphone gain, encode mic-only audio, and mix system loopback plus microphone through a single queued `AudioMixer` timeline when both endpoints expose the same runtime format. Audio endpoints are warmed before WGC starts, the mixer drops pre-roll and begins its paced timeline on the first encoded video frame, then cuts queued tail audio on stop so the MP4 does not drift past the video. Browser `deviceId` to MMDevice id mapping, resampling between mismatched endpoint formats, and drift correction remain follow-up hardening work.

- Add microphone device enumeration and stable device-id mapping.
- Capture selected/default microphone through WASAPI.
- Apply OpenScreen's current mic gain policy.
- Mix microphone and system audio before AAC encoding.

Acceptance:

- Mic-only, system-only, and mixed audio recordings produce a valid AAC track.
- Device unplug/permission failure produces an explicit error or warning.

### 4. Webcam Capture

- Add Media Foundation webcam source reader.
- Select requested dimensions/fps or the nearest format accepted by Media Foundation.
- Convert webcam samples to BGRA and compose them into the primary helper MP4 as an initial bottom-right picture-in-picture overlay.
- Ignore black webcam warmup frames and keep the overlay hidden until the first visible frame is available, so virtual cameras do not flash a black picture-in-picture rectangle at recording start.
- Keep the helper process as the SSOT for screen/window, WASAPI system audio, microphone, webcam, and mux timing.
- Match the requested webcam through Media Foundation friendly names first, then browser device ids/symbolic links, so UI selection remains stable across Chromium and Windows native device namespaces.
- Use the Electron-resolved DirectShow CLSID when the selected virtual camera, for example NVIDIA Broadcast, is registered for DirectShow but absent from Media Foundation enumeration.
- Later: promote the same webcam capture source to a separate editable native `webcamVideoPath` if product requirements need post-recording layout edits.

Acceptance:

- Native display/window recordings can include webcam without returning to Electron capture.
- `npm run test:wgc-webcam:win` validates the helper path when a webcam is available and skips explicitly when no webcam device exists.
- Combined webcam + system audio + microphone produces one MP4 with H.264 video and AAC audio.

### 5. Native Window Capture

Status: initial implementation in progress. Electron parses the `window:<HWND>:...` desktop source id through the shared native Windows recording contract and passes `windowHandle` to the helper. The helper resolves the `HWND`, validates it with `IsWindow`, and creates the WGC item with `CreateForWindow(HWND)`. Resize/minimize/move hardening and protected-window diagnostics remain follow-up work.

- Resolve Electron `window:*` selections to an `HWND`.
- Use WGC `CreateForWindow(HWND)`.
- Handle window close, minimize, resize, DPI scaling, and monitor moves.
- Return clear errors for unsupported protected windows.

Acceptance:

- Capturing a normal app window works with cursor/audio/mic/webcam.
- Window resize and movement do not corrupt the recording.

### 6. Runtime Controls

- Add pause/resume commands to the helper.
- Add cancel command that removes partial screen/webcam outputs.
- Keep restart as stop-discard-start from Electron until the helper supports a native restart event.

Acceptance:

- Pause/resume keeps preview duration coherent.
- Cancel leaves no stale media/session/cursor files.

### 7. Test Pipeline

- `npm run test:wgc-helper:win`: display-only helper smoke test.
- `npm run test:wgc-audio:win`: validates AAC track presence and duration.
- `npm run test:wgc-window:win`: captures a fixture window by HWND.
- `npm run test:wgc-webcam:win`: validates webcam output when a webcam is available, otherwise skips explicitly.
- Packaging check: confirms the helper is in `app.asar.unpacked`.
- Export check: exported MP4s generated from native recordings keep an AAC audio track when the source has audio.
- `npm run test:wgc-mic:win`: validates default-microphone capture writes an AAC track when an input endpoint is available.
- `npm run test:wgc-mixed-audio:win`: validates system loopback plus microphone writes one mixed AAC track when endpoint formats are compatible.

## Backlog

### Native Cursor Click Bounce Is Not Visibly Applied

Status: open. Do not treat Windows native cursor `Click Bounce` as shipped.

Problem:

- The cursor settings UI exposes `Size`, `Smoothing`, `Motion Blur`, and `Click Bounce`.
- On Windows native cursor recordings, `Size`, `Smoothing`, and `Motion Blur` are visibly applied in preview/export.
- `Click Bounce` still has no visible effect in manual packaged-app testing, even after adding click-related sample metadata.

What has already been tried:

- Added `interactionType: "click" | "mouseup" | "move"` to native cursor samples.
- Added polling-based left-button state through `GetAsyncKeyState`.
- Added the `GetAsyncKeyState` low-bit path to catch quick clicks between samples.
- Added a PowerShell/C# `WH_MOUSE_LL` mouse hook experiment and launched the sampler through a temporary `.ps1` file to avoid Windows command-line length limits.
- Updated `npm run test:cursor-native:win` so the diagnostic can observe a synthetic short click and emit `clickSampleCount`.

Current diagnosis:

- The diagnostic can observe synthetic click events, but this has not translated into a visible `Click Bounce` effect in the real packaged app.
- The test currently proves that some click metadata can be recorded, not that the full OpenScreen record -> preview -> export path displays a bounce at the expected time.
- The current native implementation may be animating from metadata that is not present in the real recording session, may be using the wrong timestamp origin, or may be applying a scale change too subtle to notice on the DOM/native cursor path.

Next investigation when resumed:

- Inspect the actual `.cursor.json`/session sidecar generated by a packaged-app manual recording and confirm whether real clicks produce `interactionType: "click"` at the right `timeMs`.
- Add a targeted end-to-end fixture that records a known click, loads the generated project, and asserts the preview/export cursor scale changes across adjacent frames.
- Compare the native DOM cursor path against the older `PixiCursorOverlay` click visual state and decide whether native cursor bounce should be a scale-only animation, an additional click ring, or a short explicit keyframe animation independent of sample cadence.
- If event capture remains unreliable in the PowerShell sampler, move click events into a small native cursor helper instead of PowerShell/C# script injection.

## Ship Criteria

- Windows display capture works with cursor, system audio, microphone, and webcam.
- Windows window capture works with cursor, system audio, microphone, and webcam.
- Preview and export show no cursor position drift.
- Preview and export show no measurable audio/video/webcam drift.
- Windows production builds do not depend on Electron capture fallback.
Loading
Loading