
refactor: split dictation cleanup from agent invocations + inference scope & provider registries#677

Open
gabrielste1n wants to merge 2 commits into main from refactor/separate-cleanup-and-agent

Conversation

@gabrielste1n
Collaborator

Summary

Four layered abstractions that together let the wake-word agent path use its own model, prompt, and provider plumbing, separate from text cleanup.

1. Prompt registry (`src/config/prompts/`)

  • New `PROMPT_KINDS` map (cleanup, dictationAgent, chatAgent) with i18n key + fallback per kind.
  • Single `customPrompts: Record<PromptKind, string>` store field, store-backed and reactive.
  • One read API: `resolvePrompt(kind, opts)`.
  • Migration runs only if legacy `customUnifiedPrompt` / `agentSystemPrompt` keys exist; fresh installs are no-ops, no empty values written.
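The registry names above (`PROMPT_KINDS`, `PromptKind`, `resolvePrompt`) come from this PR; the exact field shapes and fallback strings below are illustrative assumptions, not the real definitions:

```typescript
// Sketch of the prompt registry described above (field names and fallback
// strings are hypothetical; only PROMPT_KINDS / resolvePrompt come from the PR).
type PromptKind = "cleanup" | "dictationAgent" | "chatAgent";

interface PromptKindDef {
  i18nKey: string; // locale key for the default prompt text
  fallback: string; // hard-coded English fallback
}

const PROMPT_KINDS: Record<PromptKind, PromptKindDef> = {
  cleanup: { i18nKey: "prompts.cleanup", fallback: "Clean up the dictated text." },
  dictationAgent: { i18nKey: "prompts.dictationAgent", fallback: "Act on the dictated command." },
  chatAgent: { i18nKey: "prompts.chatAgent", fallback: "You are a helpful assistant." },
};

// Single read path: a non-empty custom prompt wins, otherwise the registry fallback.
function resolvePrompt(
  kind: PromptKind,
  opts: { customPrompts: Partial<Record<PromptKind, string>> }
): string {
  const custom = opts.customPrompts[kind]?.trim();
  return custom || PROMPT_KINDS[kind].fallback;
}
```

With one read API, callers never touch the store fields directly, so the migration only has to get `customPrompts` right in one place.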

2. Inference scope resolver (`src/config/inferenceScopes.ts`)

  • `INFERENCE_SCOPES` map with `fallbackScope` baked in (noteFormatting → dictationCleanup).
  • `selectResolvedLLMConfig(state, scope)` + `setResolvedLLMConfig(scope, patch)`.
  • `selectResolvedMeetingReasoning` rewritten as a thin wrapper.
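A minimal sketch of the fallback semantics, assuming a simplified config shape (`model`/`provider` fields are placeholders; the real `selectResolvedLLMConfig` signature may differ):

```typescript
// Hypothetical sketch: scope map with the fallback baked into the definition.
type InferenceScope =
  | "dictationCleanup"
  | "dictationAgent"
  | "noteFormatting"
  | "chatIntelligence";

interface ScopeDef {
  fallbackScope?: InferenceScope;
}

const INFERENCE_SCOPES: Record<InferenceScope, ScopeDef> = {
  dictationCleanup: {},
  dictationAgent: {},
  noteFormatting: { fallbackScope: "dictationCleanup" }, // per the PR description
  chatIntelligence: {},
};

interface LLMConfig {
  model?: string;
  provider?: string;
}
type State = Partial<Record<InferenceScope, LLMConfig>>;

// Resolve a scope's config, deferring to its fallback scope when no model is set.
function selectResolvedLLMConfig(state: State, scope: InferenceScope): LLMConfig {
  const own = state[scope];
  if (own?.model) return own;
  const fb = INFERENCE_SCOPES[scope].fallbackScope;
  return fb ? selectResolvedLLMConfig(state, fb) : own ?? {};
}
```

Because the fallback lives in the scope definition, `selectResolvedMeetingReasoning` can indeed collapse to a thin wrapper around `selectResolvedLLMConfig(state, "noteFormatting")`.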

3. Dictation-agent split (user-facing feature)

  • New `dictationAgent` scope with its own model, provider, mode, custom API key, etc.
  • Wake-word detection extracted to `config/agentDetection.ts`.
  • `audioManager.processTranscription` routes to dictationAgent model + prompt when wake word fires AND a model is configured. Falls back to existing single-model behavior otherwise.
  • New 4th LLM settings tab "Dictation Agent" with `` panel + `<PromptStudio kind="dictationAgent">`.
  • i18n keys added across all 10 locales (English authoritative; others fall back to English per project i18n rule).
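The routing rule in `audioManager.processTranscription` reduces to a two-condition check; a hypothetical distillation (function name and parameters are illustrative, not the actual code):

```typescript
// Sketch of the routing rule: dictationAgent is used only when the wake word
// fired AND a model is configured for that scope; otherwise fall back to the
// existing single-model cleanup behavior.
function pickScope(
  wakeWordFired: boolean,
  dictationAgentModel: string | undefined
): "dictationAgent" | "dictationCleanup" {
  return wakeWordFired && dictationAgentModel ? "dictationAgent" : "dictationCleanup";
}
```

This keeps the feature strictly opt-in: users who never configure a dictation-agent model see no behavior change even if they say the wake word.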

4. Provider registry (`src/services/ai/inferenceProviders/`)

  • 7 of 8 providers extracted into single-purpose modules with shared `InferenceProvider` interface.
  • `ReasoningService.processText` switch reduced to a registry lookup (~510 LOC removed from ReasoningService).
  • OpenAI/custom path stays inline pending careful relocation of OpenAI's endpoint-discovery state (cache, probing, base resolution) into its own module.
  • All `logger.logReasoning` event names preserved.
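A sketch of the registry shape this implies — the `InferenceProvider` name comes from the PR, but its members and the dispatcher signature below are assumptions:

```typescript
// Hypothetical shared provider interface plus the registry lookup that
// replaces the per-provider switch in ReasoningService.processText.
interface InferenceProvider {
  id: string;
  processText(text: string, opts: { model: string; prompt: string }): Promise<string>;
}

const PROVIDER_REGISTRY = new Map<string, InferenceProvider>();

function registerProvider(p: InferenceProvider): void {
  PROVIDER_REGISTRY.set(p.id, p);
}

// Dispatcher: one lookup instead of an eight-arm switch.
async function dispatch(
  providerId: string,
  text: string,
  opts: { model: string; prompt: string }
): Promise<string> {
  const provider = PROVIDER_REGISTRY.get(providerId);
  if (!provider) throw new Error(`Unknown inference provider: ${providerId}`);
  return provider.processText(text, opts);
}

// Illustrative stub; a real module would call the provider's API and emit the
// same logger.logReasoning event names as before.
registerProvider({
  id: "anthropic",
  async processText(text, opts) {
    return `[${opts.model}] ${text}`;
  },
});
```

Adding an eighth provider (the deferred OpenAI/custom path) then becomes a `registerProvider` call rather than another switch arm.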

Backward compatibility

  • Existing flat store fields (`reasoningModel`, `agentModel`, `meetingReasoningModel`, etc.) untouched.
  • Existing localStorage keys, IPC channels, service classes, and logger event names preserved.
  • Migration is idempotent and conditional. `agentSystemPrompt` field removal is the only breaking shape change; all consumers updated to use `customPrompts.chatAgent`.
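The idempotent-and-conditional property can be sketched as follows (store field names `_promptsMigrated`, `customUnifiedPrompt`, `agentSystemPrompt`, `customPrompts` are from the PR; the function shape is hypothetical):

```typescript
// Sketch of the one-time prompt migration: conditional (only touches legacy
// keys that exist) and idempotent (guarded by a flag).
interface LegacyStore {
  _promptsMigrated?: boolean;
  customUnifiedPrompt?: string;
  agentSystemPrompt?: string;
  customPrompts?: { cleanup?: string; dictationAgent?: string; chatAgent?: string };
}

function migratePrompts(store: LegacyStore): void {
  if (store._promptsMigrated) return; // idempotent: runs at most once
  if (store.customUnifiedPrompt) {
    store.customPrompts = {
      ...store.customPrompts,
      cleanup: store.customUnifiedPrompt,
      dictationAgent: store.customUnifiedPrompt, // copied to BOTH, per the test plan
    };
    delete store.customUnifiedPrompt;
  }
  if (store.agentSystemPrompt) {
    store.customPrompts = { ...store.customPrompts, chatAgent: store.agentSystemPrompt };
    delete store.agentSystemPrompt;
  }
  store._promptsMigrated = true; // fresh installs set the flag and write nothing else
}
```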

Test plan

  • Fresh install: no `_promptsMigrated` flag until store init; afterwards no other prompt keys written.
  • Upgrade with custom `customUnifiedPrompt`: parsed once, copied to `customPrompts.cleanup` AND `customPrompts.dictationAgent`, legacy key removed.
  • Upgrade with custom `agentSystemPrompt`: copied to `customPrompts.chatAgent`, legacy key removed.
  • Upgrade with neither: migration runs once, writes nothing, sets flag.
  • PromptStudio edits propagate to chat overlay live (Zustand reactivity).
  • Dictate "Hey {agentName}, summarize this" → routes to dictationAgent model + prompt when configured; routes to cleanup model + FULL_PROMPT when not (existing behavior).
  • Smoke each provider through dictation pipeline: OpenAI, Anthropic, Gemini, Groq, Local llama, OpenWhispr cloud, LAN, plus one enterprise (Bedrock/Azure/Vertex).
  • Each provider logs the same event names as before (grep `ANTHROPIC_START`, etc.).
  • Streaming paths (chat overlay) still work — untouched.
  • `tsc --noEmit` clean, `npm run lint` clean, `npm run format:check` clean.

Follow-ups (deliberately out of scope)

  • Relocate OpenAI endpoint-discovery state (`openAiEndpointPreference`, `probedBases`, `getConfiguredOpenAIBase`, `detectReasoningServerType`, `getOpenAIEndpointCandidates`) into `inferenceProviders/openai.ts`.
  • Generic `<InferenceConfigEditor scope="...">` UI to replace 16-prop `` plumbing across SettingsPage, AgentModeSettings, MeetingSettings, DictationAgentSettings (~400 LOC removable).
  • Provider override in `ReasoningConfig` so dictationAgent and cleanup can use different non-heuristic providers (currently the model heuristic in `getModelProvider` covers standard model IDs).

… introduce inference scope and provider registries

- Add `PROMPT_KINDS` registry (cleanup, dictationAgent, chatAgent) with one
  store-backed `customPrompts` record and a single `resolvePrompt(kind, opts)`
  read path. Migration runs only if legacy `customUnifiedPrompt` /
  `agentSystemPrompt` keys exist; fresh installs are no-ops.

- Add `INFERENCE_SCOPES` map (dictationCleanup, dictationAgent,
  noteFormatting, chatIntelligence) with `selectResolvedLLMConfig` and
  `setResolvedLLMConfig`. Fallback semantics (noteFormatting →
  dictationCleanup) baked into the scope definition.

- Wake-word agent split: new dictationAgent scope + store fields + 4th LLM
  settings tab + DictationAgentSettings panel. audioManager routes to the
  dictationAgent model + prompt when the wake word fires AND a model is
  configured; falls back to existing single-model behavior otherwise.

- Provider registry: extract 7 of 8 providers (Anthropic, Gemini, Groq,
  Local, Enterprise, OpenWhispr, LAN) into `services/ai/inferenceProviders/`
  with a shared `InferenceProvider` interface. ReasoningService dispatcher
  shrinks from a 40-line switch to a registry lookup. OpenAI/custom path
  stays inline pending endpoint-discovery state relocation.

- PromptStudio gets a `kind` prop so it can edit any prompt kind.
  AgentModeSettings textarea now reactive via store-backed customPrompts.

- detectAgentName extracted to `config/agentDetection.ts`.
…figEditor

Rename the cleanup/note-formatting/chat-agent LLM settings to scope-specific
names (cleanup*, noteFormatting*, chatAgent*) so the four inference scopes are
distinct in storage, env vars, and IPC. Adds a one-time localStorage migration
and env-var fallbacks for legacy keys.

Extracts the shared mode-selector / model-picker UI from four near-identical
copies into InferenceConfigEditor scoped by InferenceScope, and pulls
processWithOpenAI out of ReasoningService into a standalone openai provider
behind PROVIDER_REGISTRY.
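A sketch of what such a one-time key rename could look like (the legacy/scoped key pairs below are illustrative placeholders, not the actual mapping in the commit):

```typescript
// Hypothetical one-time localStorage rename from legacy flat keys to
// scope-specific keys; copies each value once, then drops the legacy key.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const KEY_RENAMES: Record<string, string> = {
  reasoningModel: "cleanupModel", // illustrative legacy -> scoped pairs
  agentModel: "chatAgentModel",
};

function migrateScopedKeys(storage: KeyValueStore): void {
  for (const [legacy, scoped] of Object.entries(KEY_RENAMES)) {
    const value = storage.getItem(legacy);
    if (value !== null && storage.getItem(scoped) === null) {
      storage.setItem(scoped, value); // copy once; the scoped key wins afterwards
    }
    if (value !== null) storage.removeItem(legacy);
  }
}
```

Env-var fallbacks for legacy keys would sit on the read side, checking the scoped variable first and the legacy one second.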
