Releases: hrygo/hotplex

v1.29.1

16 Jun 01:11

Summary

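v1.29.1 is a patch release focused on hardening API Key security constraints. It adds a 1:1 unique mapping constraint between user_id and API Key (database-level UNIQUE INDEX + application-level pre-check + 409 Conflict), preventing the blurred permission boundaries that arise when one user is bound to multiple API Keys. It also corrects the error-code mapping for API Key lookups (transient database failures were previously misreported as 404 Not Found, masking the real error). Also included: a split of the config package by responsibility (1659-line god file → 5 files) and internal deduplication refactors of EventStore/gate/checkers (zero behavior change, no impact on end users).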

Security

  • Security: Enforce 1:1 user_id ↔ API Key mapping — migration 016 adds a UNIQUE INDEX on api_key_users(user_id) for both SQLite and PostgreSQL, backed by an application-layer requireUniqueUserID pre-check that returns 409 Conflict on duplicate attempts. The DB constraint serves as defense-in-depth. Includes standalone dedup scripts (SQLite + PG) with fail-closed guards for resolving pre-existing duplicates before migration. (#741)

Fixed

  • Security: API Key lookup error-code accuracy — HandleAPIKeyUserGet/Update/Delete previously returned a blanket 404 on any get() failure, masking transient DB/connection errors as not-found. Now maps sql.ErrNoRows → 404 while other DB errors → 500; update()/delete() not-found (concurrent-delete window) also aligned to 404 via %w sql.ErrNoRows wrapping. Swagger updated with matching 500 responses. (#741)

Contributors

@aaronwong1989 @hotplex-ai

v1.29.0

12 Jun 22:54

Summary

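v1.29.0 is a minor release focused on archived release artifacts and session storage reliability. GitHub Release assets move from raw binaries to tar.gz/zip archives (~31% compression ratio, 71MB→22MB); the self-updater and install scripts have been adapted and remain backward compatible. EventStore merges resolveGeneration into a single CTE query (2→1 DB round-trips), and oversized-turn truncation now cuts at a newline boundary and keeps the partial content instead of discarding the whole turn. ACP Worker adds an empty-sessionId guard so that legacy session data no longer causes "session not found" errors.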

Added

  • CLI: Release archive format — GitHub Release assets now publish as hotplex-{os}-{arch}.tar.gz (unix) / .zip (windows) with ~31% compression ratio. Updater and install scripts download, verify checksum, then extract. Legacy raw binary fallback for pre-archive releases. (#729)

Changed

  • Gateway Core: EventStore CTE merge — combine resolveGeneration + SELECT into single CTE for QueryTurns and QueryTurnStats, reducing DB round-trips from 2 to 1 (both SQLite and PostgreSQL). (#716, #727)
  • Gateway Core: Oversized turn truncation now cuts at newline boundary instead of discarding entire turn, preserving partial content for the newest turn. (#715, #727)
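
Newline-boundary truncation, as in the oversized-turn change above, might look like the sketch below. The function name truncateAtNewline, the byte-based budget, and the choice to keep the head of the text are assumptions for illustration; the release note does not say which end of the turn is preserved.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// truncateAtNewline keeps at most limit bytes of s and then cuts back to the
// last newline inside that budget, so the result ends on a complete line.
// If the budget contains no newline it falls back to a hard cut.
func truncateAtNewline(s string, limit int) string {
	if limit <= 0 {
		return ""
	}
	if len(s) <= limit {
		return s
	}
	// Step back so the hard cut never splits a multi-byte UTF-8 rune.
	for limit > 0 && !utf8.RuneStart(s[limit]) {
		limit--
	}
	head := s[:limit]
	if i := strings.LastIndexByte(head, '\n'); i >= 0 {
		return head[:i]
	}
	return head
}

func main() {
	turn := "line1\nline2\nline3"
	fmt.Printf("%q\n", truncateAtNewline(turn, 9)) // "line1"
}
```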

Fixed

  • Worker: ACP sessionId empty guard — reject empty sessionId from agent in callSessionMethod and Prompt, preventing "session not found" errors when resuming sessions with missing worker_session_id in DB. (#732)

Contributors

@aaronwong1989 @hotplex-ai @hrygo

v1.28.0

12 Jun 13:37

Summary

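v1.28.0 is a minor release focused on session history compression and cross-platform message quality. The Gateway adds asynchronous, Brain LLM-based history compression (replacing hard truncation), Slack and Feishu share a unified paragraph breaker that improves readability on mobile, and Codex Worker supports agent-configs system prompt injection. It also fixes a critical defect in which a CodexCLI idle drain deadlock made the manager completely unavailable.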

Added

  • Gateway Core: Async history compression with Brain LLM — TruncateHistory fast path (sub-ms) injects truncated history immediately, async Brain compression produces cached summary for next resume. (#714)
  • Messaging: Shared ParagraphBreaker for Slack + Feishu — unified threshold (150 chars) and sentence-end detection, extracted into textutil package. Slack gains paragraph breaks in streaming deltas. (#719)
  • Worker: Codex agent-configs injection — SystemPrompt (merged B/C channel prompt) now forwarded to codex app-server via thread/start baseInstructions parameter. Previously agent-configs had no effect on Codex Worker. (#722)

Changed

  • Messaging: Paragraph break threshold reduced from 200 → 150 chars based on mobile readability research (~8–10 lines produces better reading rhythm).

Fixed

  • Worker: CodexCLI idle drain deadlock — timer callback held CodexAppServerManager.mu while calling proc.Kill(), which blocked acquiring proc.mu already held by monitorProcess. Fix: snapshot pgid under lock, unlock, then SIGKILL directly via proc.ForceKill(pgid) without touching proc mutex. (#725)
  • Gateway Core: Async history compression edge cases — empty slice replaces nil to prevent meaningless async compression, extracted resolveCachedHistory with full test coverage. (#720)
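
The shape of the idle-drain deadlock fix above (snapshot under the manager lock, unlock, then kill without taking the process lock) can be sketched as follows. The types and the injected kill function are stand-ins invented for this example; the real code sends SIGKILL to the process group.

```go
package main

import (
	"fmt"
	"sync"
)

// proc stands in for a managed process. pgid is assumed immutable after
// start, so it may be read without holding mu.
type proc struct {
	mu   sync.Mutex // held by the monitor goroutine for long periods
	pgid int
}

type manager struct {
	mu   sync.Mutex
	proc *proc
	kill func(pgid int) // stand-in for a direct SIGKILL to the process group
}

// drainIdle copies what it needs while holding manager.mu, releases the
// lock, and only then kills. It never acquires proc.mu, so it cannot block
// behind a goroutine that already holds it.
func (m *manager) drainIdle() bool {
	m.mu.Lock()
	p := m.proc
	if p == nil {
		m.mu.Unlock()
		return false
	}
	pgid := p.pgid
	m.proc = nil
	m.mu.Unlock()

	m.kill(pgid)
	return true
}

func main() {
	p := &proc{pgid: 4242}
	killed := 0
	m := &manager{proc: p, kill: func(pgid int) { killed = pgid }}

	p.mu.Lock() // simulate the monitor goroutine holding the process lock
	ok := m.drainIdle()
	p.mu.Unlock()

	fmt.Println(ok, killed) // true 4242
}
```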

Contributors

@aaronwong1989 @hotplex-ai @hrygo

v1.27.0

11 Jun 14:41

Summary

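v1.27.0 is a minor release focused on Worker session recovery and message quality. CodexCLI adds session history recovery based on the turns table plus Brain LLM smart compression (replacing hard truncation), and the Feishu adapter introduces smart paragraph breaking (cumulative character threshold + sentence-end trigger). It also fixes several reliability issues, including the ACP WorkerSessionID persistence race condition, CodexCLI crash recovery, and dropped ACP notifications.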

Added

  • Worker: CodexCLI conversation history recovery — query persistent turns table on session resume, inject structured history prefix into new thread via crypto/rand boundary ID. (#704)
  • Worker: Smart history compression via Brain LLM — new HistoryCompressor module replaces hard truncation with intelligent summarization when history exceeds budget (60k chars), with graceful truncation fallback when Brain is unavailable. (#704)
  • Gateway Core: pull_request webhook trigger for fork PR support — dedup'd dual-path design (check_suite preferred, pull_request fallback) with 5min cooldown. (#704)
  • Messaging: Feishu smart paragraph break — cumulative char threshold (200 chars) + sentence-end trigger replaces naive single-newline append, producing proper double-newline paragraph separation. (#707)
  • Observability: hotplex.session.transition.guard_repersist_overwrites and hotplex.session.transition.concurrent_overwrite Prometheus counters for WorkerSessionID persist monitoring. (#710)

Changed

  • Worker: CodexCLI YOLO mode — default sandbox changed to danger-full-access for unrestricted network and filesystem access with never approval. (#704)
  • Worker: CodexCLI shutdown() no longer calls KillIfIdle — singleton process lifecycle managed by idle drain or explicit ShutdownSingleton, aligned with OCS pattern. (#704)
  • Configuration: ACP auto_approve default changed from false to true — sandboxed agents don't need manual approval.

Fixed

  • Worker: ACP WorkerSessionID persistence race condition — three-layer defense: targeted SQL UPDATE in createAndLaunchWorker, transitionState guard preserving concurrent updates, forwardEvents safety-net with forced persist. (#710)
  • Worker: ACP readLoop burst drain increased from 16→128 to prevent notification drops under high throughput (1402 observed in 2min). (#702)
  • Worker: ACP fatal JSONRPC errors now classified as ErrKindUnavailable to correctly trigger Bridge crash recovery; business errors (rate limit) continue returning nil. (#702)
  • Worker: CodexCLI crash recovery extended to fresh sessions (previously only resumed), using !doneReceived instead of exitCode as trigger for robust crash detection. (#702)
  • Worker: ACP rawInput path key normalized to file_path for toolfmt — file operations now show descriptive status instead of generic placeholder.
  • Gateway Core: ForwardEvents turn count restore uses bgCtx() (shutdown-scoped) instead of request ctx, preventing DB query failures on user disconnect. (#702)
  • Messaging: Feishu id_convert retries 3x with exponential backoff (100ms→200ms→400ms) for transient API failures, with permanent error skip (auth/404). (#702)

Contributors

@aaronwong1989 @hotplex-ai @hrygo

v1.26.3

09 Jun 14:12

Summary

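v1.26.3 is a patch release focused on CodexCLI and OCS Worker stability. It fixes three CodexCLI defects: Wait() blocking forever and leaking goroutines, elicitation requests being silently dropped, and pointless resume attempts for terminated sessions. It also fixes six OCS Worker concurrency-safety defects, including the SSE reader dying silently and leaving sessions hung, and critical events being dropped. In addition, a new CanResumeTerminated() capability interface unifies the terminated-session recovery policy at the Worker-type level.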

Changed

  • Worker: Add CanResumeTerminated() capability to Worker interface — CodexCLI returns false (singleton killed on release), all others return true. Bridge uses this instead of hard-coded type switch to skip resume for terminated sessions. (#692)
  • Worker: Register-time capability cache (capCache) in registry — CanResumeTerminated() queries cached value instead of creating temporary worker instances.
  • Infrastructure: DRY extractions across messaging STT/TTS PID tracking, worker base lock pattern, session store Upsert JSON helpers, and admin API key store CRUD. (#695)

Fixed

  • Worker: CodexCLI Wait() permanent block — release() niled doneCh, causing Wait() to receive from nil channel forever. Fix: atomic capture under mutex, release() closes but does not nil. (#691)
  • Worker: CodexCLI mcpServer/elicitation/request silently dropped — mapper had no case for this notification type, causing Codex agent permission prompts to hang indefinitely. (#698)
  • Worker: CodexCLI terminated session dead resume path — resume always failed (singleton killed), producing WARN spam and double config load before falling back to fresh start. (#699)
  • Worker: OCS SSE reader silent death — fatal errors now close all subscriber channels via closeAllSubscribers(), unblocking forwardEvents goroutines. (#697)
  • Worker: OCS critical events silently dropped — two-hop pipeline (singleton → worker) now classifies events as droppable/critical; critical events use blocking send with 5s timeout. (#697)
  • Worker: OCS crashSub false positive — Wait() checks IsRunning() when crashSub fires; returns 0 if singleton recovered. (#697)
  • Worker: OCS channel panic race — forwardBusEvents checks conn.closed under mutex before writing to recvCh. (#697)
  • Worker: OCS duplicate Done events — handleSessionIdle returns nil when stats already cleared by prior error handler. (#697)
  • Worker: OCS sync.Once reentrant deadlock — Wait() called release() via releaseOnce.Do, but release() itself called releaseOnce.Do. Fixed by calling release() directly (idempotent). (#697)
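
The Wait() fix in the first item of this list rests on a Go channel rule: receiving from a nil channel blocks forever, while receiving from a closed channel returns immediately. A minimal sketch of the "close but do not nil" pattern follows; the worker type and its closed flag are illustrative, not the project's actual structure.

```go
package main

import (
	"fmt"
	"sync"
)

type worker struct {
	mu     sync.Mutex
	doneCh chan struct{}
	closed bool
}

func newWorker() *worker {
	return &worker{doneCh: make(chan struct{})}
}

// release closes doneCh exactly once and leaves the field non-nil, so
// waiters that arrive later still observe the closed channel.
func (w *worker) release() {
	w.mu.Lock()
	defer w.mu.Unlock()
	if !w.closed {
		close(w.doneCh)
		w.closed = true
	}
}

// Wait captures the channel under the mutex and blocks on the local copy.
// Had release set the field to nil, this receive would never return.
func (w *worker) Wait() {
	w.mu.Lock()
	ch := w.doneCh
	w.mu.Unlock()
	<-ch
}

func main() {
	w := newWorker()
	w.release()
	w.release() // idempotent: a second call does not panic
	w.Wait()    // returns immediately because doneCh is closed, not nil
	fmt.Println("wait returned")
}
```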

Contributors

@aaronwong1989 @hotplex-ai @hrygo

v1.26.2

08 Jun 15:35

Summary

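v1.26.2 is a patch release focused on Cron delivery reliability and OpenCode Server lifecycle hardening. It adds a Cron delivery retry mechanism (exponential backoff, up to 3 retries), and OCS Worker implements the SystemPromptUpdater interface to support refreshing the system prompt dynamically. It also includes numerous concurrency-safety fixes (OCS sessionID data race, platform_writer channel guard, bridge session leak protection) and improved error handling in the Slack adapter.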

Added

  • Cron: Delivery retry with exponential backoff — in-memory retry queue (max 100 entries) for transient failures (429, timeout, 5xx), up to 3 retries with 30s→1m→2m backoff, new hotplex.cron.delivery.result metric. (#577)
  • Worker: OCS SystemPromptUpdater interface — UpdateSystemPrompt method updates conn.systemPrompt under mutex, enables bridge to push refreshed system prompt after /reset without session recreation. (#664)
  • Observability: hotplex.cron.delivery.result metric with {status,platform} labels — records all delivery outcomes (success, exhausted, permanent, transient).

Fixed

  • Worker: OCS sessionID data race — 6 read sites accessed conn.sessionID without mutex; unified via getSessionID() helper, symmetric with write-side lock discipline. (#664)
  • Gateway Core: Bridge lifecycle hardening — delete orphaned session on transition-to-running failure, rollback resume attach-failure to TERMINATED, 5s timeout on rollback context. (#577)
  • Gateway Core: Platform writer send-on-closed-channel — recover() guard eliminates TOCTOU window, atomic.Bool closed flag prevents writes after disconnect. (#577)
  • Messaging: Slack adapter clears Thinking status on error and shows generic error feedback; fallback message for empty error events. (#577)
  • Worker: OCS singleton goroutine leak — close stdout on discoverPort timeout; server-side session DELETE in release() for resource cleanup; non-blocking Wait() crash check eliminates 2s goroutine leak. (#664)
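
The platform-writer guard above combines two layers: an atomic closed flag that rejects writes after disconnect, and a recover() that catches the narrow window where the channel is closed between the flag check and the send. A minimal sketch, with the writer type and its non-blocking send policy assumed for illustration (atomic.Bool requires Go 1.19 or later):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type writer struct {
	ch     chan string
	closed atomic.Bool
}

// Send reports whether msg was queued. It never panics, even if Close
// runs between the flag check and the channel send.
func (w *writer) Send(msg string) (ok bool) {
	if w.closed.Load() {
		return false
	}
	defer func() {
		if r := recover(); r != nil {
			ok = false // send on closed channel: treat as "not delivered"
		}
	}()
	select {
	case w.ch <- msg:
		return true
	default:
		return false // buffer full; this sketch drops instead of blocking
	}
}

// Close is idempotent: only the first caller closes the channel.
func (w *writer) Close() {
	if w.closed.CompareAndSwap(false, true) {
		close(w.ch)
	}
}

func main() {
	w := &writer{ch: make(chan string, 1)}
	fmt.Println(w.Send("hello")) // true
	w.Close()
	w.Close()
	fmt.Println(w.Send("late")) // false
}
```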

Contributors

@aaronwong1989 @hotplex-ai

v1.26.1

07 Jun 18:10

Summary

v1.26.1 is a patch release that fixes a critical bug in which the first prompt after ACP Worker startup returned empty text, as well as a log-cleanup race in make dev-reset.

Fixed

  • Worker: ACP readLoop goroutine exited prematurely after initial handshake notifications — burst drain loop used return (exits entire function) instead of labeled break, killing the notification consumer before any prompt text arrived.
  • Infrastructure: make dev-reset log cleanup raced with running gateway — restructured as three-phase stop→clean→start sequence with confirmed process termination before file deletion.
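
The readLoop bug above comes down to how Go exits nested loops: inside a select within an inner for, a bare return ends the whole function, whereas a labeled break leaves only the inner loop. The sketch below shows the corrected shape; readLoop's signature and the handle callback are assumptions made for this example.

```go
package main

import "fmt"

// readLoop blocks for one notification, then drains up to burst more
// without blocking, and goes back to waiting.
func readLoop(in <-chan string, burst int, handle func(string)) {
	for msg := range in {
		handle(msg)
	drain:
		for i := 0; i < burst; i++ {
			select {
			case m, ok := <-in:
				if !ok {
					return // channel closed: the consumer is really done
				}
				handle(m)
			default:
				// Nothing buffered right now. Leave only the drain loop;
				// a bare return here would stop consuming for good.
				break drain
			}
		}
	}
}

func main() {
	in := make(chan string, 4)
	in <- "handshake-1"
	in <- "handshake-2"

	var got []string
	done := make(chan struct{})
	go func() {
		readLoop(in, 16, func(s string) { got = append(got, s) })
		close(done)
	}()

	in <- "prompt-text" // arrives after the initial burst
	close(in)
	<-done
	fmt.Println(got) // [handshake-1 handshake-2 prompt-text]
}
```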

v1.26.0

07 Jun 15:59

Summary

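v1.26.0 is a minor release focused on the multi-Bot configuration experience and Worker architecture modernization. The core change migrates Agent config paths from platform runtime IDs (ou_xxx/U04ABC) to YAML config names (my-bot), making paths readable and unaffected by Bot renames. Session state orchestration moves down from the Gateway layer into the Worker layer, eliminating 3 gateway interfaces and 7 type assertions. On the security side, it adds configurable CORS origins and an RFC 9116 security.txt endpoint. ACP Worker adds a notification drain mechanism that fixes lost output.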

Added

  • Configuration: Agent config path resolution uses YAML bots[].name instead of platform runtime IDs — readable paths (feishu/my-bot/SOUL.md), stable across Bot renames, with ValidateBotName path-traversal guard. (#678, #679)
  • Configuration: GATEWAY_BOT_NAME environment variable injected into Worker processes; Cron --bot-name flag for multi-Bot agent-config isolation.
  • Worker: SystemPromptUpdater interface — workers reload agent config on /reset without session recreation. (#659)
  • Worker: ForceKillTree — kill orphaned child processes that escape PGID (e.g. MCP servers spawned by Codex). Platform-native /proc (Linux) and sysctl (macOS) child discovery. (#659)
  • Security: Configurable CORS origins (security.allowed_origins), RFC 9116 security.txt endpoint (security.contact), tightened docs CSP connect-src to 'self'. (#663)
  • Worker: SessionStartParams struct replacing 11–13 positional parameters across StartSession/StartPlatformSession interfaces. (#679)

Changed

  • Gateway Core: Session state orchestration pushed from Bridge into Worker layer — workers return ResetResult{ConnReplaced}, emit internal_reset events. ResetGenerationer atomic guard prevents stale goroutine interference. (#659)
  • Worker: CodexCLI exec mode removed — app-server singleton is the sole implementation. use_app_server config deprecated and auto-normalized. (#659)
  • Infrastructure: Temp file management unified — os.TempDir() with hotplex/worker/ subdirectory, gateway startup cleanup for orphaned files (>2h worker, >24h media). (#675)
  • CLI: Setup skill rewritten for v1.25.0+ production alignment — 4-phase decision tree, 48% shorter. (#677)

Fixed

  • Session: GC deadlock — worker.Terminate() moved outside session mutex, preventing blocking all concurrent reads during graceful shutdown. (#656)
  • Worker: ACP notification drain — channel handshake ensures MessageDelta reaches bridge before Done, fixing empty text output for all ACP turns. (#685)
  • Gateway Core: TERMINATED session now attempts resume with fallback to fresh start — idle_timeout/gc no longer lose conversation context. (#683)
  • Events: ToInt64/ToFloat64 handle int32 type — atomic counter values from sessionAccumulator no longer cause turn summary loss and feishu card instability. (#688)
  • Cron: Timeout errors now logged with error_type + state confirmation in executor. (#667)
  • CLI: doctor config.required checker now recognizes multi-bot YAML configurations. (#671)
  • Configuration: Config watcher ~ expansion — fsnotify failed with "no such file" when home directory was specified with tilde. (#673, #674)
  • Docs: API Console localized Scalar JS — CDN unreachable from Chinese servers caused infinite spinner. Layout redesigned to fit header + Scalar in one viewport.

Contributors

@aaronwong1989 @hotplex-ai @hrygo

v1.25.0

04 Jun 15:44

Summary

v1.25.0 is a minor release focused on observability modernization, the API documentation system, and slimming down the Brain module.

Highlights

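  • OTel-native observability: a unified internal/observability/ package replaces the scattered metrics + tracing packages; ~55 metrics across 12 domains, W3C TraceContext propagation, trace_id injected into the AEP envelope (#642)
  • API docs + Scalar console: hybrid generation of Swagger JSON from swaggo annotations; the embedded Scalar API Console supports online debugging of both the Gateway and Admin port APIs (#632)
  • API Console branding: HotPlex branded header, light theme, CSP font allowlist fix (jsDelivr + fonts.scalar.com), quick links on the docs home page
  • Brain net deletion of 6,221 lines: removed three dead subsystems (SafetyGuard / IntentRouter / ContextCompressor) and extracted RedactSensitive() into a standalone package (#638)
  • Docs center performance: conditional mermaid.js loading (saves 3.2 MB on 55 of 56 pages), gzip compression, Cache-Control, locally hosted Google Fonts (#644)
  • Docs build cache: mermaid.js / fonts reused across builds, skipping ~25s of network I/O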

Changed

  • Observability: internal/metrics/ + internal/tracing/ → internal/observability/, promauto → OTel Meter API, hotplex.* namespace (#642)
  • Brain: brain.Close() replaces brain.GlobalGuard().Close(); GetRouter()/GetRateLimiter() removed (#638)
  • Docs Builder: assets directory cached at build time; Makefile adds .html change detection and excludes the generated swagger/ directory
  • Security CSP: DefaultDocsCSP adds https://fonts.scalar.com to font-src, fixing the error that blocked Scalar font loading
  • WebChat: Worker icons redrawn, NewSessionModal cards tidied up

Fixed

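  • Cron: Inject a TARGET_PR prefix into the worker prompt on webhook trigger, preventing the LLM from ignoring the specified PR and enumerating all of them (#647)
  • SDKs: 1.24.x compatibility fixes (examples + client)
  • Docs: CLI reference table layout improvements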

Contributors

@hrygo @hotplex-ai

v1.24.4

03 Jun 14:11

Summary

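v1.24.4 is a patch release focused on security hardening of the session identity migration. PR #635 adds the input sanitization and length validation that were missing from the WS init path, isolates the context timeout budgets for session resume and start, and unifies the frontend session ID generation logic. The specs index and WebSocket integration docs are updated accordingly.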

Changed

  • Gateway Core: WS init path now sanitizes title and sessionID via messaging.SanitizeText() — REST API path already had this; the WS path was unguarded. (#635)
  • Gateway Core: Resume→Start session fallback uses independent 30s timeout contexts instead of sharing a single budget — a slow resume no longer starves the heavier StartSession. (#635)
  • Gateway Core: Replace hardcoded 256 with session.MaxClientKeyLen constant in API handler for consistency with session manager's validation. (#635)
  • WebChat UI: Replace inline crypto.randomUUID() with existing newSessionId() utility — provides cross-browser fallback instead of silently failing. (#635)
  • WebChat UI: Consolidate MAIN_SESSION_CLIENT_ID/MAIN_SESSION_TITLE into single ANCHOR_SESSION_ID constant. (#635)

Fixed

  • Session: Title length validation missing in WS init path — client_session_id had a 256-char guard but title did not, allowing unbounded input. (#635)

Contributors

@hrygo @hotplex-ai