Releases: hrygo/hotplex
v1.29.1
Summary
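v1.29.1 is a patch release focused on hardening API Key security constraints. It adds a 1:1 unique mapping constraint between user_id and API Key (database-level UNIQUE INDEX + application-level pre-check + 409 Conflict), preventing the blurred permission boundaries that arise when one user is bound to multiple API Keys. It also corrects the error-code mapping for API Key lookups (transient database failures were previously misreported as 404 Not Found, masking the real error). Also included are a split of the config package by responsibility (a 1,659-line god file → 5 files) and internal deduplication refactors of EventStore/gate/checkers (zero behavior change, no impact on end users).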
Security
- Security: Enforce 1:1 user_id ↔ API Key mapping. Migration 016 adds a `UNIQUE INDEX` on `api_key_users(user_id)` for both SQLite and PostgreSQL, backed by an application-layer `requireUniqueUserID` pre-check that returns 409 Conflict on duplicate attempts. The DB constraint serves as defense-in-depth. Includes standalone dedup scripts (SQLite + PG) with fail-closed guards for resolving pre-existing duplicates before migration. (#741)
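The pre-check described above can be illustrated with a minimal in-memory sketch. It is not the project's `requireUniqueUserID`: the `keyStore` type, the `ErrDuplicateUser` sentinel, and the 201 status for a successful bind are all assumptions made for illustration. The check and the insert run under one lock here, which stands in for the atomicity that the `UNIQUE INDEX` provides at the database layer.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"sync"
)

// ErrDuplicateUser is a hypothetical sentinel meaning "this user_id already has a key".
var ErrDuplicateUser = errors.New("user_id already bound to an API key")

// keyStore is an in-memory stand-in for the api_key_users table.
type keyStore struct {
	mu     sync.Mutex
	byUser map[string]string // user_id -> API key
}

func newKeyStore() *keyStore {
	return &keyStore{byUser: make(map[string]string)}
}

// Bind runs the uniqueness check and the insert under one lock, so the check
// cannot race with a concurrent insert.
func (s *keyStore) Bind(userID, apiKey string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.byUser[userID]; exists {
		return fmt.Errorf("bind %q: %w", userID, ErrDuplicateUser)
	}
	s.byUser[userID] = apiKey
	return nil
}

// bindStatus maps the outcome of Bind to an HTTP status code.
func bindStatus(s *keyStore, userID, apiKey string) int {
	err := s.Bind(userID, apiKey)
	switch {
	case err == nil:
		return http.StatusCreated // assumed success code
	case errors.Is(err, ErrDuplicateUser):
		return http.StatusConflict
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	s := newKeyStore()
	fmt.Println(bindStatus(s, "alice", "key-1")) // first binding is accepted
	fmt.Println(bindStatus(s, "alice", "key-2")) // second binding for the same user_id is rejected
}
```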
Fixed
- Security: API Key lookup error-code accuracy. `HandleAPIKeyUserGet/Update/Delete` previously returned a blanket 404 on any `get()` failure, masking transient DB/connection errors as not-found. It now maps `sql.ErrNoRows` → 404 while other DB errors → 500; `update()`/`delete()` not-found (concurrent-delete window) is also aligned to 404 via `%w sql.ErrNoRows` wrapping. Swagger updated with matching 500 responses. (#741)
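A minimal sketch of the error mapping described above, using only the standard library. The helper names (`statusFor`, `wrapNotFound`, `errConnLost`) are hypothetical and not taken from the project. The point is that `errors.Is` sees through `%w` wrapping, so a wrapped `sql.ErrNoRows` still maps to 404 while any other error falls through to 500.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
	"net/http"
)

// errConnLost is a hypothetical stand-in for a transient connection failure.
var errConnLost = errors.New("connection reset by peer")

// wrapNotFound mimics an update/delete that matched zero rows and reports it
// by wrapping sql.ErrNoRows with %w.
func wrapNotFound(userID string) error {
	return fmt.Errorf("api key user %q: %w", userID, sql.ErrNoRows)
}

// statusFor maps a store error to an HTTP status code.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, sql.ErrNoRows):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(statusFor(nil))
	fmt.Println(statusFor(sql.ErrNoRows))
	fmt.Println(statusFor(wrapNotFound("u-1")))
	fmt.Println(statusFor(errConnLost))
}
```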
v1.29.0
Summary
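v1.29.0 is a minor release focused on archived release artifacts and session storage reliability. GitHub Release assets move from raw binaries to tar.gz/zip archives (compression ratio ~31%, 71MB→22MB); the self-updater and install scripts have both been adapted and remain backward compatible. EventStore merges resolveGeneration into a single CTE query (2→1 DB round-trips), and overlong turns are now truncated at a newline boundary instead of being dropped entirely. ACP Worker adds an empty-sessionId guard, preventing "session not found" errors caused by legacy session data.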
Added
- CLI: Release archive format. GitHub Release assets now publish as `hotplex-{os}-{arch}.tar.gz` (unix) / `.zip` (windows) with ~31% compression ratio. Updater and install scripts download, verify checksum, then extract. Legacy raw binary fallback for pre-archive releases. (#729)
Changed
- Gateway Core: EventStore CTE merge. Combine `resolveGeneration` + `SELECT` into a single CTE for `QueryTurns` and `QueryTurnStats`, reducing DB round-trips from 2 to 1 (both SQLite and PostgreSQL). (#716, #727)
- Gateway Core: Oversized turn truncation now cuts at a newline boundary instead of discarding the entire turn, preserving partial content for the newest turn. (#715, #727)
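Newline-boundary truncation can be sketched as follows. This is an illustration under assumptions, not the EventStore code: the function name `truncateAtNewline` is hypothetical, the budget is counted in bytes, and the sketch keeps the beginning of the turn (the release note does not say which end is kept).

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// truncateAtNewline keeps at most maxBytes of s. If a newline falls inside the
// budget, the cut happens at the last such newline so no line is split.
// Otherwise it falls back to a hard cut that never splits a UTF-8 rune.
func truncateAtNewline(s string, maxBytes int) string {
	if maxBytes <= 0 {
		return ""
	}
	if len(s) <= maxBytes {
		return s
	}
	head := s[:maxBytes]
	if i := strings.LastIndexByte(head, '\n'); i >= 0 {
		return head[:i]
	}
	// No newline in range: back off until the cut sits on a rune boundary.
	for len(head) > 0 && !utf8.RuneStart(s[len(head)]) {
		head = head[:len(head)-1]
	}
	return head
}

func main() {
	fmt.Printf("%q\n", truncateAtNewline("line1\nline2\nline3", 9))
	fmt.Printf("%q\n", truncateAtNewline("short", 10))
}
```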
Fixed
- Worker: ACP sessionId empty guard. Reject empty sessionId from agent in `callSessionMethod` and `Prompt`, preventing "session not found" errors when resuming sessions with missing worker_session_id in DB. (#732)
v1.28.0
Summary
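v1.28.0 is a minor release focused on session history compression and cross-platform message quality. Gateway adds asynchronous history compression based on Brain LLM (replacing hard truncation), Slack and Feishu share a unified paragraph breaker that improves the mobile reading experience, and Codex Worker supports agent-configs system prompt injection. It also fixes a critical defect in which a CodexCLI idle drain deadlock left the manager completely unavailable.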
Added
- Gateway Core: Async history compression with Brain LLM — TruncateHistory fast path (sub-ms) injects truncated history immediately, async Brain compression produces cached summary for next resume. (#714)
- Messaging: Shared `ParagraphBreaker` for Slack + Feishu. Unified threshold (150 chars) and sentence-end detection, extracted into the `textutil` package. Slack gains paragraph breaks in streaming deltas. (#719)
- Worker: Codex agent-configs injection. `SystemPrompt` (merged B/C channel prompt) is now forwarded to codex app-server via the `thread/start` `baseInstructions` parameter. Previously agent-configs had no effect on Codex Worker. (#722)
Changed
- Messaging: Paragraph break threshold reduced from 200 → 150 chars based on mobile readability research (~8–10 lines produces better reading rhythm).
Fixed
- Worker: CodexCLI idle drain deadlock. The timer callback held `CodexAppServerManager.mu` while calling `proc.Kill()`, which blocked acquiring `proc.mu` already held by `monitorProcess`. Fix: snapshot pgid under lock, unlock, then SIGKILL directly via `proc.ForceKill(pgid)` without touching the proc mutex. (#725)
- Gateway Core: Async history compression edge cases. An empty slice replaces nil to prevent meaningless async compression; extracted `resolveCachedHistory` with full test coverage. (#720)
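The "snapshot under lock, unlock, then act" pattern from the deadlock fix above can be sketched as follows. The types here are simplified stand-ins, and the injected `kill` callback is a hypothetical replacement for the real signal call so the sketch sends no signals. The key property is that `onIdleTimeout` never touches the process mutex, so it completes even while another goroutine holds it.

```go
package main

import (
	"fmt"
	"sync"
)

// proc stands in for a managed child process; mu may be held for a long time
// by a monitoring goroutine.
type proc struct {
	mu sync.Mutex
}

// manager owns the process and a copy of its process-group ID.
type manager struct {
	mu      sync.Mutex
	proc    *proc
	pgid    int
	running bool
	kill    func(pgid int) // hypothetical stand-in for a direct SIGKILL by pgid
}

func newManager(pgid int, kill func(int)) *manager {
	return &manager{proc: &proc{}, pgid: pgid, running: true, kill: kill}
}

// onIdleTimeout snapshots the pgid under the manager lock, releases the lock,
// and only then kills. It never acquires proc.mu.
func (m *manager) onIdleTimeout() {
	m.mu.Lock()
	if !m.running {
		m.mu.Unlock()
		return
	}
	pgid := m.pgid
	m.running = false
	m.mu.Unlock()

	m.kill(pgid)
}

func main() {
	var killed []int
	m := newManager(4242, func(pgid int) { killed = append(killed, pgid) })

	m.proc.mu.Lock() // simulate a monitor goroutine holding the process mutex
	m.onIdleTimeout()
	m.proc.mu.Unlock()

	fmt.Println(killed)
}
```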
v1.27.0
Summary
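v1.27.0 is a minor release focused on Worker session recovery and message quality. CodexCLI adds session history recovery based on the turns table plus Brain LLM smart compression (replacing hard truncation), and the Feishu adapter introduces smart paragraph breaking (cumulative character threshold + sentence-end trigger). It also fixes several reliability issues, including an ACP WorkerSessionID persistence race condition, CodexCLI crash recovery, and ACP notification drops.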
Added
- Worker: CodexCLI conversation history recovery — query persistent turns table on session resume, inject structured history prefix into new thread via crypto/rand boundary ID. (#704)
- Worker: Smart history compression via Brain LLM. New `HistoryCompressor` module replaces hard truncation with intelligent summarization when history exceeds budget (60k chars), with graceful truncation fallback when Brain is unavailable. (#704)
- Gateway Core: `pull_request` webhook trigger for fork PR support. Dedup'd dual-path design (check_suite preferred, pull_request fallback) with 5min cooldown. (#704)
- Messaging: Feishu smart paragraph break. Cumulative char threshold (200 chars) + sentence-end trigger replaces naive single-newline append, producing proper double-newline paragraph separation. (#707)
- Observability: `hotplex.session.transition.guard_repersist_overwrites` and `hotplex.session.transition.concurrent_overwrite` Prometheus counters for WorkerSessionID persist monitoring. (#710)
Changed
- Worker: CodexCLI YOLO mode. Default sandbox changed to `danger-full-access` for unrestricted network and filesystem access with `never` approval. (#704)
- Worker: CodexCLI `shutdown()` no longer calls `KillIfIdle`. Singleton process lifecycle is managed by idle drain or explicit `ShutdownSingleton`, aligned with OCS pattern. (#704)
- Configuration: ACP `auto_approve` default changed from `false` to `true`. Sandboxed agents don't need manual approval.
Fixed
- Worker: ACP WorkerSessionID persistence race condition. Three-layer defense: targeted SQL UPDATE in `createAndLaunchWorker`, `transitionState` guard preserving concurrent updates, `forwardEvents` safety-net with forced persist. (#710)
- Worker: ACP readLoop burst drain increased from 16→128 to prevent notification drops under high throughput (1402 observed in 2min). (#702)
- Worker: ACP fatal JSONRPC errors now classified as `ErrKindUnavailable` to correctly trigger Bridge crash recovery; business errors (rate limit) continue returning nil. (#702)
- Worker: CodexCLI crash recovery extended to fresh sessions (previously only resumed), using `!doneReceived` instead of `exitCode` as trigger for robust crash detection. (#702)
- Worker: ACP rawInput `path` key normalized to `file_path` for toolfmt. File operations now show descriptive status instead of generic placeholder.
- Gateway Core: ForwardEvents turn count restore uses `bgCtx()` (shutdown-scoped) instead of request `ctx`, preventing DB query failures on user disconnect. (#702)
- Messaging: Feishu `id_convert` retries 3x with exponential backoff (100ms→200ms→400ms) for transient API failures, with permanent error skip (auth/404). (#702)
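The retry policy in the last item (three retries, doubling delay, no retry for permanent errors) can be sketched like this. All names are hypothetical, and the sketch assumes "3x" means one initial attempt followed by up to three retries. The sleep function is injected so the backoff schedule can be observed without real waiting.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical error classes: permanent errors (auth, 404) are never retried.
var (
	errPermanent = errors.New("permanent failure")
	errTransient = errors.New("transient failure")
)

const baseDelay = 100 * time.Millisecond

// sleepRecorder records requested delays instead of sleeping.
type sleepRecorder struct {
	delaysMs []int64
}

func (r *sleepRecorder) Sleep(d time.Duration) {
	r.delaysMs = append(r.delaysMs, d.Milliseconds())
}

// retryWithBackoff calls op once, then retries up to maxRetries times with a
// delay that doubles after each failed attempt. Permanent errors return at once.
func retryWithBackoff(maxRetries int, base time.Duration, sleep func(time.Duration), op func() error) error {
	delay := base
	for attempt := 0; ; attempt++ {
		err := op()
		if err == nil || errors.Is(err, errPermanent) {
			return err
		}
		if attempt == maxRetries {
			return fmt.Errorf("gave up after %d retries: %w", maxRetries, err)
		}
		sleep(delay)
		delay *= 2
	}
}

func main() {
	rec := &sleepRecorder{}
	calls := 0
	err := retryWithBackoff(3, baseDelay, rec.Sleep, func() error {
		calls++
		return errTransient
	})
	fmt.Println(calls, rec.delaysMs, err)
}
```

In production code `time.Sleep` (or a context-aware timer) would be passed as the `sleep` argument.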
v1.26.3
Summary
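v1.26.3 is a patch release focused on CodexCLI and OCS Worker stability. It fixes three CodexCLI defects: Wait() blocking forever and leaking goroutines, elicitation requests being silently dropped, and futile resume attempts on terminated sessions. It also fixes six OCS Worker concurrency-safety defects, including the SSE reader dying silently and leaving sessions hung, and critical events being dropped. In addition, a new CanResumeTerminated() capability interface unifies the terminated-session recovery policy at the Worker type level.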
Changed
- Worker: Add `CanResumeTerminated()` capability to Worker interface. CodexCLI returns `false` (singleton killed on release), all others return `true`. Bridge uses this instead of hard-coded type switch to skip resume for terminated sessions. (#692)
- Worker: Register-time capability cache (`capCache`) in registry. `CanResumeTerminated()` queries cached value instead of creating temporary worker instances.
- Infrastructure: DRY extractions across messaging STT/TTS PID tracking, worker base lock pattern, session store Upsert JSON helpers, and admin API key store CRUD. (#695)
Fixed
- Worker: CodexCLI `Wait()` permanent block. `release()` niled `doneCh`, causing `Wait()` to receive from nil channel forever. Fix: atomic capture under mutex, `release()` closes but does not nil. (#691)
- Worker: CodexCLI `mcpServer/elicitation/request` silently dropped. Mapper had no case for this notification type, causing Codex agent permission prompts to hang indefinitely. (#698)
- Worker: CodexCLI terminated session dead resume path. Resume always failed (singleton killed), producing WARN spam and double config load before falling back to fresh start. (#699)
- Worker: OCS SSE reader silent death. Fatal errors now close all subscriber channels via `closeAllSubscribers()`, unblocking `forwardEvents` goroutines. (#697)
- Worker: OCS critical events silently dropped. Two-hop pipeline (singleton → worker) now classifies events as droppable/critical; critical events use blocking send with 5s timeout. (#697)
- Worker: OCS `crashSub` false positive. `Wait()` checks `IsRunning()` when crashSub fires; returns 0 if singleton recovered. (#697)
- Worker: OCS channel panic race. `forwardBusEvents` checks `conn.closed` under mutex before writing to recvCh. (#697)
- Worker: OCS duplicate Done events. `handleSessionIdle` returns nil when stats already cleared by prior error handler. (#697)
- Worker: OCS `sync.Once` reentrant deadlock. `Wait()` called `release()` via `releaseOnce.Do`, but `release()` itself called `releaseOnce.Do`. Fixed by calling `release()` directly (idempotent). (#697)
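The first fix in this list rests on a Go channel rule: receiving from a nil channel blocks forever, while receiving from a closed channel returns immediately. A minimal sketch of the "close but do not nil" approach follows; the `worker` type here is a simplified stand-in and not the CodexCLI worker.

```go
package main

import (
	"fmt"
	"sync"
)

// worker is a simplified stand-in with a completion channel.
type worker struct {
	mu       sync.Mutex
	doneCh   chan struct{}
	released bool
}

func newWorker() *worker {
	return &worker{doneCh: make(chan struct{})}
}

// release closes doneCh exactly once and leaves the field non-nil, so waiters
// that arrive later still observe a closed channel. It is safe to call twice.
func (w *worker) release() {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.released {
		return
	}
	w.released = true
	close(w.doneCh)
}

// Wait captures the channel under the mutex, then blocks outside the lock.
func (w *worker) Wait() {
	w.mu.Lock()
	ch := w.doneCh
	w.mu.Unlock()
	<-ch
}

func main() {
	w := newWorker()
	finished := make(chan struct{})
	go func() {
		w.Wait()
		close(finished)
	}()
	w.release()
	<-finished
	w.Wait() // a late waiter also returns because the channel stays closed
	fmt.Println("Wait returned for both waiters")
}
```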
v1.26.2
Summary
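v1.26.2 is a patch release focused on Cron delivery reliability and OpenCode Server lifecycle hardening. It adds a Cron delivery retry mechanism (exponential backoff, up to 3 retries), and OCS Worker implements the SystemPromptUpdater interface to support dynamic system prompt refresh. It also includes numerous concurrency-safety fixes (OCS sessionID data race, platform_writer channel guard, bridge session leak protection) and improved error handling in the Slack adapter.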
Added
- Cron: Delivery retry with exponential backoff. In-memory retry queue (max 100 entries) for transient failures (429, timeout, 5xx), up to 3 retries with 30s→1m→2m backoff, new `hotplex.cron.delivery.result` metric. (#577)
- Worker: OCS `SystemPromptUpdater` interface. `UpdateSystemPrompt` method updates conn.systemPrompt under mutex, enables bridge to push refreshed system prompt after `/reset` without session recreation. (#664)
- Observability: `hotplex.cron.delivery.result` metric with `{status,platform}` labels. Records all delivery outcomes (success, exhausted, permanent, transient).
Fixed
- Worker: OCS sessionID data race. 6 read sites accessed `conn.sessionID` without mutex; unified via `getSessionID()` helper, symmetric with write-side lock discipline. (#664)
- Gateway Core: Bridge lifecycle hardening. Delete orphaned session on transition-to-running failure, rollback resume attach-failure to TERMINATED, 5s timeout on rollback context. (#577)
- Gateway Core: Platform writer send-on-closed-channel. `recover()` guard eliminates TOCTOU window, `atomic.Bool` closed flag prevents writes after disconnect. (#577)
- Messaging: Slack adapter clears Thinking status on error and shows generic error feedback; fallback message for empty error events. (#577)
- Worker: OCS singleton goroutine leak. Close stdout on `discoverPort` timeout; server-side session DELETE in `release()` for resource cleanup; non-blocking Wait() crash check eliminates 2s goroutine leak. (#664)
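The platform writer fix combines two guards, sketched below with hypothetical names. The `atomic.Bool` flag rejects writes once the writer is known to be closed, and the deferred `recover()` covers the window in which the channel is closed after the flag was checked but before the send executes.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// platformWriter is a simplified stand-in for a per-connection outbound queue.
type platformWriter struct {
	closed atomic.Bool
	ch     chan string
}

func newPlatformWriter(buf int) *platformWriter {
	return &platformWriter{ch: make(chan string, buf)}
}

// Send reports whether msg was queued. It returns false if the writer is
// closed, if the buffer is full, or if the channel was closed concurrently.
func (w *platformWriter) Send(msg string) (ok bool) {
	if w.closed.Load() {
		return false
	}
	defer func() {
		// A send on a closed channel panics; convert that into ok == false.
		if recover() != nil {
			ok = false
		}
	}()
	select {
	case w.ch <- msg:
		return true
	default:
		return false
	}
}

// Close marks the writer closed and closes the channel at most once.
func (w *platformWriter) Close() {
	if w.closed.CompareAndSwap(false, true) {
		close(w.ch)
	}
}

func main() {
	w := newPlatformWriter(1)
	fmt.Println(w.Send("hello")) // queued
	w.Close()
	fmt.Println(w.Send("late")) // rejected by the closed flag
}
```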
v1.26.1
Summary
v1.26.1 is a patch release that fixes a critical bug in which the first prompt after ACP Worker startup returned empty text, as well as a log-cleanup race condition in make dev-reset.
Fixed
- Worker: ACP readLoop goroutine exited prematurely after initial handshake notifications. The burst drain loop used `return` (exits entire function) instead of labeled `break`, killing the notification consumer before any prompt text arrived.
- Infrastructure: `make dev-reset` log cleanup raced with running gateway. Restructured as three-phase stop→clean→start sequence with confirmed process termination before file deletion.
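The `return` versus labeled `break` distinction in the first fix can be shown with a small consumer loop. This sketch is an assumption about the general shape of such a loop, not the ACP `readLoop` itself. When the drain finds nothing pending, `break drain` leaves only the inner loop, so the outer loop goes back to waiting for the next notification.

```go
package main

import "fmt"

// readLoop blocks for one notification, then drains up to burst more without
// blocking. It returns only when the input channel is closed.
func readLoop(in <-chan string, burst int) []string {
	var got []string
	for msg := range in {
		got = append(got, msg)
	drain:
		for i := 0; i < burst; i++ {
			select {
			case m, ok := <-in:
				if !ok {
					break drain // channel closed; the outer range will also end
				}
				got = append(got, m)
			default:
				// Nothing pending. A plain `return` here would end the whole
				// consumer; the labeled break only ends the drain.
				break drain
			}
		}
	}
	return got
}

// runDemo sends a handshake notification, then prompt text later, and returns
// everything the consumer saw.
func runDemo() []string {
	in := make(chan string)
	out := make(chan []string)
	go func() { out <- readLoop(in, 128) }()
	in <- "handshake"
	in <- "prompt text"
	close(in)
	return <-out
}

func main() {
	fmt.Println(runDemo())
}
```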
v1.26.0
Summary
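v1.26.0 is a minor release focused on the multi-Bot configuration experience and Worker architecture modernization. The core change migrates Agent config paths from platform runtime IDs (ou_xxx/U04ABC) to YAML config names (my-bot), making paths readable and unaffected by Bot renames. Session state orchestration moves down from the Gateway layer into the Worker layer, eliminating 3 gateway interfaces and 7 type assertions. On the security side, it adds configurable CORS origins and an RFC 9116 security.txt endpoint. ACP Worker adds a notification drain mechanism to fix lost output.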
Added
- Configuration: Agent config path resolution uses YAML `bots[].name` instead of platform runtime IDs. Readable paths (`feishu/my-bot/SOUL.md`), stable across Bot renames, with `ValidateBotName` path-traversal guard. (#678, #679)
- Configuration: `GATEWAY_BOT_NAME` environment variable injected into Worker processes; Cron `--bot-name` flag for multi-Bot agent-config isolation.
- Worker: `SystemPromptUpdater` interface. Workers reload agent config on `/reset` without session recreation. (#659)
- Worker: `ForceKillTree`. Kill orphaned child processes that escape PGID (e.g. MCP servers spawned by Codex). Platform-native `/proc` (Linux) and `sysctl` (macOS) child discovery. (#659)
- Security: Configurable CORS origins (`security.allowed_origins`), RFC 9116 `security.txt` endpoint (`security.contact`), tightened docs CSP `connect-src` to `'self'`. (#663)
- Worker: `SessionStartParams` struct replacing 11–13 positional parameters across `StartSession`/`StartPlatformSession` interfaces. (#679)
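A path-traversal guard of the kind mentioned in the first item can be sketched with an allowlist pattern. The rules below (ASCII letters, digits, `_` and `-`, at most 64 characters, no leading punctuation) are assumptions for illustration; the release note does not state which rules the project's `ValidateBotName` applies, and the lowercase helper names here are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
)

// botNameRe is an assumed allowlist: it cannot match "", ".", "..", or any
// name containing a path separator.
var botNameRe = regexp.MustCompile(`^[A-Za-z0-9][A-Za-z0-9_-]{0,63}$`)

// validateBotName rejects names that could escape the config directory.
func validateBotName(name string) error {
	if !botNameRe.MatchString(name) {
		return fmt.Errorf("invalid bot name %q", name)
	}
	return nil
}

// agentConfigPath builds platform/bot/file only after the name is validated.
func agentConfigPath(platform, bot, file string) (string, error) {
	if err := validateBotName(bot); err != nil {
		return "", err
	}
	return path.Join(platform, bot, file), nil
}

func main() {
	fmt.Println(agentConfigPath("feishu", "my-bot", "SOUL.md"))
	fmt.Println(agentConfigPath("feishu", "../../etc", "passwd"))
}
```

Validating before joining matters because `path.Join` cleans `..` segments, which would otherwise silently resolve to a location outside the platform directory.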
Changed
- Gateway Core: Session state orchestration pushed from Bridge into Worker layer. Workers return `ResetResult{ConnReplaced}`, emit `internal_reset` events. `ResetGenerationer` atomic guard prevents stale goroutine interference. (#659)
- Worker: CodexCLI exec mode removed. App-server singleton is the sole implementation. `use_app_server` config deprecated and auto-normalized. (#659)
- Infrastructure: Temp file management unified. `os.TempDir()` with `hotplex/worker/` subdirectory, gateway startup cleanup for orphaned files (>2h worker, >24h media). (#675)
- CLI: Setup skill rewritten for v1.25.0+ production alignment. 4-phase decision tree, 48% shorter. (#677)
Fixed
- Session: GC deadlock. `worker.Terminate()` moved outside session mutex, preventing blocking all concurrent reads during graceful shutdown. (#656)
- Worker: ACP notification drain. Channel handshake ensures `MessageDelta` reaches bridge before `Done`, fixing empty text output for all ACP turns. (#685)
- Gateway Core: TERMINATED session now attempts resume with fallback to fresh start. idle_timeout/gc no longer lose conversation context. (#683)
- Events: `ToInt64`/`ToFloat64` handle `int32` type. Atomic counter values from `sessionAccumulator` no longer cause turn summary loss and feishu card instability. (#688)
- Cron: Timeout errors now logged with `error_type` + state confirmation in executor. (#667)
- CLI: `doctor config.required` checker now recognizes multi-bot YAML configurations. (#671)
- Configuration: Config watcher `~` expansion. `fsnotify` failed with "no such file" when home directory was specified with tilde. (#673, #674)
- Docs: API Console localized Scalar JS. CDN unreachable from Chinese servers caused infinite spinner. Layout redesigned to fit header + Scalar in one viewport.
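The `int32` fix in the Events item is a reminder that a Go type switch matches exact dynamic types, so a value loaded from an `atomic.Int32` does not match an `int` or `int64` case. A minimal sketch of a tolerant converter follows; `toInt64` is a hypothetical helper and not the project's `ToInt64`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// toInt64 converts common numeric dynamic types to int64. The second result is
// false for unsupported types. float64 values are truncated toward zero.
func toInt64(v any) (int64, bool) {
	switch n := v.(type) {
	case int:
		return int64(n), true
	case int32:
		return int64(n), true // without this case, atomic.Int32 values are lost
	case int64:
		return n, true
	case float64:
		return int64(n), true
	default:
		return 0, false
	}
}

func main() {
	var turns atomic.Int32
	turns.Add(3)

	// Load returns int32, which is what ends up inside the interface value.
	fmt.Println(toInt64(turns.Load()))
	fmt.Println(toInt64("not a number"))
}
```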
v1.25.0
Summary
v1.25.0 is a minor release focused on observability modernization, the API documentation system, and slimming down the Brain module.
Highlights
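- OTel-native observability: A unified `internal/observability/` package replaces the scattered metrics + tracing code; ~55 metrics cover 12 domains, with W3C TraceContext propagation and `trace_id` injected into the AEP envelope (#642)
- API docs + Scalar console: Swagger JSON generated with a hybrid swaggo-annotation approach; embedded Scalar API Console supports live debugging of Gateway/Admin dual-port endpoints (#632)
- API Console branding: HotPlex branded header, light theme, CSP font allowlist fix (jsDelivr + fonts.scalar.com), quick links on the docs home page
- Brain net deletion of 6,221 lines: Removed three dead subsystems (SafetyGuard / IntentRouter / ContextCompressor), extracted `RedactSensitive()` into a standalone package (#638)
- Docs site performance: Conditional mermaid.js loading (saves 3.2 MB on 55 of 56 pages), gzip compression, Cache-Control, Google Fonts localization (#644)
- Docs build cache: mermaid.js / fonts reused across builds, skipping ~25s of network I/O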
Changed
- Observability: `internal/metrics/` + `internal/tracing/` → `internal/observability/`, promauto → OTel Meter API, `hotplex.*` namespace (#642)
- Brain: `brain.Close()` replaces `brain.GlobalGuard().Close()`; removed `GetRouter()`/`GetRateLimiter()` (#638)
- Docs Builder: Assets directory cached at build time; `Makefile` adds `.html` change detection and excludes the generated `swagger/` directory
- Security CSP: `DefaultDocsCSP` adds `https://fonts.scalar.com` to `font-src`, fixing the blocked Scalar font loading error
- WebChat: Worker icons redrawn, NewSessionModal cards tidied up
Fixed
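- Cron: On webhook trigger, inject a `TARGET_PR` prefix into the worker prompt, preventing the LLM from ignoring the specified PR and enumerating all PRs (#647)
- SDKs: 1.24.x compatibility fixes (examples + client)
- Docs: CLI reference table layout improvements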
v1.24.4
Summary
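v1.24.4 is a patch release focused on security hardening of the session identity migration. PR #635 fills in the input sanitization and length validation that were missing from the WS init path, isolates the context timeout budgets for session resume and start, and unifies the frontend session ID generation logic. The specs index and WebSocket integration docs were updated alongside.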
Changed
- Gateway Core: WS init path now sanitizes `title` and `sessionID` via `messaging.SanitizeText()`. The REST API path already had this; the WS path was unguarded. (#635)
- Gateway Core: Resume→Start session fallback uses independent 30s timeout contexts instead of sharing a single budget. A slow resume no longer starves the heavier StartSession. (#635)
- Gateway Core: Replace hardcoded `256` with `session.MaxClientKeyLen` constant in API handler for consistency with session manager's validation. (#635)
- WebChat UI: Replace inline `crypto.randomUUID()` with existing `newSessionId()` utility, which provides a cross-browser fallback instead of silently failing. (#635)
- WebChat UI: Consolidate `MAIN_SESSION_CLIENT_ID`/`MAIN_SESSION_TITLE` into single `ANCHOR_SESSION_ID` constant. (#635)
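The independent-budget fallback in the second item can be sketched as below. The function names and the short demo timeout are assumptions; the release note states 30s per phase. Each phase derives its own deadline from the parent context, so a resume that consumes its entire budget leaves the start phase with a fresh one.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// resumeOrStart tries resume first and falls back to start. Each phase gets
// its own timeout derived from parent rather than sharing one deadline.
func resumeOrStart(parent context.Context, budget time.Duration,
	resume, start func(context.Context) error) (string, error) {

	rctx, cancelResume := context.WithTimeout(parent, budget)
	err := resume(rctx)
	cancelResume()
	if err == nil {
		return "resumed", nil
	}

	sctx, cancelStart := context.WithTimeout(parent, budget)
	defer cancelStart()
	if err := start(sctx); err != nil {
		return "", fmt.Errorf("start after failed resume: %w", err)
	}
	return "started", nil
}

// demoSlowResume runs the fallback with a resume that uses up its whole budget.
func demoSlowResume() (string, error) {
	slowResume := func(ctx context.Context) error {
		<-ctx.Done() // never finishes on its own
		return ctx.Err()
	}
	start := func(ctx context.Context) error {
		// With a shared deadline this would already be expired; here it is nil.
		return ctx.Err()
	}
	return resumeOrStart(context.Background(), 20*time.Millisecond, slowResume, start)
}

func main() {
	mode, err := demoSlowResume()
	fmt.Println(mode, err)
}
```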
Fixed
- Session: Title length validation missing in WS init path. `client_session_id` had a 256-char guard but `title` did not, allowing unbounded input. (#635)


