InterviewOps SDK is a standalone Node.js package for running an interview-note collection pipeline on top of:
- opencli for Xiaohongshu collection
- oh-my-codex / omx for stable Codex-side orchestration
It is designed for the workflow you built earlier in opencli, but split into a dedicated SDK repository with:
- reusable domain types
- OpenCLI adapters
- OMX stabilization wrapper
- a nightly interview collection pipeline
- seller / lead-gen note marking
- JSON + HTML exports
Current release:
- v0.1.0
- Git tag: v0.1.0
For Xiaohongshu interview notes, the SDK can:
- incrementally search notes by query
- hydrate note detail content
- enrich comments
- extract interview questions
- mark likely seller / lead-gen accounts or notes
- export:
  - xhs_notes.json
  - xhs_questions.json (loose / compatibility)
  - xhs_questions_strict.json (default reporting set)
  - topic buckets
  - company / round summary
  - filterable HTML overview
- optionally auto-commit each cycle
git clone https://github.com/jerry609/InterviewOps-SDK.git
cd InterviewOps-SDK
npm install
npm run build
Quick sanity checks:
npm run typecheck
npm test
npm run build
External prerequisites:
- Node.js 20+
- opencli
- omx
- Chrome logged into xiaohongshu.com
Package notes:
- The published SDK is ESM-only
- The primary supported surface is the bundled CLI plus ESM imports
- TypeScript consumers should use a Node runtime config (moduleResolution: "Node16" or "NodeNext")
After build:
node dist/cli.js --help
Main commands:
npm run dev -- init
npm run dev -- template
npm run dev -- sources
npm run dev -- seed-import --source-notes /path/to/xhs_notes.json
npm run dev -- harvest
npm run dev -- hydrate --limit 12
npm run dev -- comments --limit 8
npm run dev -- normalize
npm run dev -- questions
npm run dev -- overview
npm run dev -- status
npm run dev -- doctor
npm run dev -- export
npm run dev -- seller-summary
npm run dev -- ralph "analyze the current dataset"
npm run dev -- ralph-loop 6 --workspace ./workspaces/xhs-agent-algo-feb2026
node dist/cli.js stats
node dist/cli.js template
node dist/cli.js doctor
node dist/cli.js export
node dist/cli.js seller-summary
node dist/cli.js cycle
node dist/cli.js nightly 8
node dist/cli.js ralph-loop 6 --workspace ./workspaces/xhs-agent-algo-feb2026
node dist/cli.js validate
node dist/cli.js omx-safe doctor
During development:
npm run dev -- init
npm run dev -- harvest
npm run dev -- hydrate --limit 12
npm run dev -- comments --limit 8
npm run dev -- normalize
npm run dev -- questions
npm run dev -- overview
npm run dev -- status
npm run dev -- doctor
npm run dev -- stats
npm run dev -- export
npm run dev -- seller-summary
npm run dev -- cycle
npm run dev -- nightly 8
npm run dev -- omx-safe doctor
Command notes:
- template: copies the bundled LaTeX interview template into the workspace
- sources: lists currently built-in source adapters
- seed-import: imports scoped seed notes from an existing xhs_notes.json into the target workspace
- harvest: runs incremental search only
- hydrate: fills note detail content only
- comments: enriches comments only
- normalize: refreshes question extraction and seller flags only
- questions: rebuilds both loose and strict xhs_questions*.json outputs
- overview: rebuilds strict overview HTML/summary plus seller reports
- status: shows current stats plus last recorded stage runs
- doctor: verifies node, opencli, omx, config path, data dir, and report dir
- export: rebuilds question/topic/overview/seller outputs from existing note data
- seller-summary: refreshes seller-tagged reports from current note data
- ralph: shortcut for omx-safe exec --full-auto '$ralph "..."'
- ralph-loop: repeatedly runs bounded Ralph collection cycles for a dedicated workspace
The control plane is the typed orchestration layer that OMX/Codex reads before deciding the next action.
control-status prints the current typed orchestration snapshot:
npm run dev -- control-status --workspace ./workspaces/xhs-agent-algo-feb2026
run-operation executes exactly one typed operation and persists the result in workspace state and operation_journal.jsonl:
npm run dev -- run-operation hydrate --workspace ./workspaces/xhs-agent-algo-feb2026 --limit 12 --reason "pending_hydrate backlog dominates current cycle"
ralph-loop now uses control-status plus run-operation internally instead of a fixed stage sequence.
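To make the "one typed operation per cycle" idea concrete, here is a minimal sketch of how a selector over a status snapshot could look. The `ControlSnapshot` shape, its field names, the limit caps, and `selectOperation` are all hypothetical illustrations, not the SDK's actual types; only the operation names and the example reason string come from this README.

```typescript
// Hypothetical snapshot shape; the real control-status output may differ.
type Operation = "harvest" | "hydrate" | "comments" | "normalize";

interface ControlSnapshot {
  pendingHydrate: number;  // notes found by search but missing detail content
  pendingComments: number; // hydrated notes whose comments are not enriched yet
  staleQuestions: boolean; // question extraction is older than the note data
}

interface PlannedOperation {
  operation: Operation;
  limit?: number;
  reason: string; // plays the role of --reason on the CLI
}

// Pick exactly one operation for this cycle instead of walking a fixed stage list.
function selectOperation(s: ControlSnapshot): PlannedOperation {
  if (s.pendingHydrate > 0 && s.pendingHydrate >= s.pendingComments) {
    return {
      operation: "hydrate",
      limit: Math.min(s.pendingHydrate, 12), // assumed cap, mirrors --limit 12
      reason: "pending_hydrate backlog dominates current cycle",
    };
  }
  if (s.pendingComments > 0) {
    return {
      operation: "comments",
      limit: Math.min(s.pendingComments, 8), // assumed cap, mirrors --limit 8
      reason: "pending_comments backlog dominates current cycle",
    };
  }
  if (s.staleQuestions) {
    return { operation: "normalize", reason: "question extraction is stale" };
  }
  return { operation: "harvest", reason: "no backlog; search for new notes" };
}
```

A loop built this way would call the selector once per cycle, run the chosen operation, and record the reason next to the result, which is what keeps each journal entry explainable.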
By default the SDK calls:
opencli xiaohongshu search ...
opencli xiaohongshu note-detail ...
opencli xiaohongshu comments ...
If your working opencli is a local checkout instead of a globally installed binary, you can point the SDK at it:
export INTERVIEWOPS_OPENCLI_BINARY=npm
export INTERVIEWOPS_OPENCLI_ARGS_JSON='["-C","/path/to/opencli","run","dev","--"]'
That makes the SDK run commands like:
npm -C /path/to/opencli run dev -- xiaohongshu search ...
If live XHS search is unstable, you can seed a dedicated workspace from an existing notes library first:
npm run dev -- seed-import \
--workspace ./workspaces/xhs-agent-algo-feb2026 \
--source-notes /home/master1/opencli/interview_data/xhs_notes.json
omx-safe wraps omx with a stable policy:
- removes common proxy environment variables
- forces USE_OMX_EXPLORE_CMD=0
- creates .omx/state automatically
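The environment part of this policy can be pictured as a pure function over an environment map. This is a minimal sketch under stated assumptions: the `PROXY_VARS` list is a guess at what "common proxy environment variables" covers, `sanitizeEnv` is a hypothetical name, and creating the .omx/state directory is left out because it needs file system access.

```typescript
type Env = Record<string, string | undefined>;

// Assumed list of proxy variables; the SDK's actual list may differ.
const PROXY_VARS = ["http_proxy", "https_proxy", "all_proxy", "no_proxy"];

// Return a copy of the environment with proxy variables removed
// (matched case-insensitively) and USE_OMX_EXPLORE_CMD forced to "0".
function sanitizeEnv(env: Env): Env {
  const blocked = new Set(PROXY_VARS);
  const clean: Env = {};
  for (const [key, value] of Object.entries(env)) {
    if (!blocked.has(key.toLowerCase())) {
      clean[key] = value;
    }
  }
  clean["USE_OMX_EXPLORE_CMD"] = "0";
  return clean;
}
```

Returning a new map rather than mutating the input keeps the parent process environment untouched, so only the wrapped omx child would see the stabilized values.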
Example:
npm run dev -- omx-safe doctor
The SDK writes into the current workspace:
interview_data/
  xhs_notes.json
  xhs_questions.json
  xhs_questions_strict.json
  xhs_questions_nlp.json
  xhs_questions_nlp_strict.json
  xhs_questions_backend.json
  xhs_questions_backend_strict.json
  xhs_questions_algo.json
  xhs_questions_algo_strict.json
  company_round_summary.json
reports/xhs-miangjing/
  index.html
  status.json
  run_history.jsonl
  xhs_questions_nlp.html
  xhs_questions_backend.html
  xhs_questions_algo.html
  seller_candidates.json
  author_seller_summary.json
  seller_summary.md
  progress.log
templates/
  interview-note-template.tex
  interview-note-template.pdf
Default reporting behavior:
- xhs_questions.json and xhs_questions_{topic}.json stay as loose compatibility exports
- xhs_questions_strict.json and xhs_questions_{topic}_strict.json are the cleaned question-bank exports
- index.html, topic HTML files, and company_round_summary.json default to the strict set
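The file-name patterns above can be summarized in a small helper, which may be handy when a script needs to locate a particular export. `questionExportName` is a hypothetical function written for this illustration and is not part of the SDK's public surface.

```typescript
// Build a question export file name following the documented patterns:
// xhs_questions.json, xhs_questions_{topic}.json, and their _strict variants.
function questionExportName(topic?: string, strict = false): string {
  const topicPart = topic ? `_${topic}` : "";
  const strictPart = strict ? "_strict" : "";
  return `xhs_questions${topicPart}${strictPart}.json`;
}

// Example: the strict NLP export, which reports are expected to read by default.
const strictNlpExport = questionExportName("nlp", true);
```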
Create a local config and output directories in the current workspace:
npm run dev -- init
That writes:
./interviewops.xhs.json
./interview_data/
./reports/xhs-miangjing/
You can also initialize another workspace:
npm run dev -- init --workspace /data/interviewops
See:
By default the CLI uses:
- ./interviewops.xhs.json if it exists in the target workspace
- otherwise the packaged example file
You can override it explicitly:
npm run dev -- cycle --prd ./examples/xhs-miangjing.prd.json
The PRD now includes:
- source
- query list
- sellerWhitelist
- data/report/state paths
- search/detail/comment batch and timeout policy
- harvest/sleep cadence
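The config lookup order described in this section (an explicit --prd path first, then the workspace's interviewops.xhs.json, then the packaged example) can be sketched as below. `resolveConfigPath` and its parameters are hypothetical; the existence check is injected so the sketch does not touch the file system, and the packaged example path is passed in by the caller because its real location is not stated here.

```typescript
// Hypothetical helper: decide which PRD/config file to load.
function resolveConfigPath(
  workspace: string,
  packagedExample: string,
  exists: (path: string) => boolean,
  explicitPrd?: string,
): string {
  // An explicit --prd value always wins.
  if (explicitPrd) return explicitPrd;
  // Otherwise prefer the workspace-local config when it is present.
  const local = `${workspace.replace(/\/+$/, "")}/interviewops.xhs.json`;
  return exists(local) ? local : packagedExample;
}
```

Keeping the precedence in one function makes it easy to print the chosen path in a diagnostic command, so a surprising config choice is visible before a long collection run starts.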
The SDK does not drop seller-team notes.
It keeps them and marks them with:
- seller_flag
- seller_tags
- seller_confidence
These fields are propagated into:
- xhs_notes.json
- xhs_questions.json
- topic exports
- overview HTML
- seller_candidates.json
- author_seller_summary.json
- seller_summary.md
Whitelist config example:
{
  "sellerWhitelist": {
    "authors": ["可信作者A"],
    "note_ids": ["69c9d37b0000000023007921"],
    "title_keywords": ["内部分享"],
    "urls": ["example.com/trusted"]
  }
}
Whitelisted notes keep raw seller tags/confidence for debugging, but:
- seller_flag will be forced to false
- seller_whitelisted will be true
- seller_whitelist_reason records the match source
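The whitelist behavior above can be sketched as a small function. This is an illustration under assumptions: the `Note` shape beyond the seller_* fields, the substring matching for title keywords and URLs, and the reason strings ("author", "note_id", and so on) are hypothetical and may not match the SDK's implementation.

```typescript
interface SellerWhitelist {
  authors?: string[];
  note_ids?: string[];
  title_keywords?: string[];
  urls?: string[];
}

// Hypothetical note shape; only the seller_* field names come from the README.
interface Note {
  note_id: string;
  author: string;
  title: string;
  url: string;
  seller_flag: boolean;
  seller_tags: string[];
  seller_confidence: number;
  seller_whitelisted?: boolean;
  seller_whitelist_reason?: string;
}

// Return which whitelist rule matched, or undefined when none did.
function whitelistReason(note: Note, wl: SellerWhitelist): string | undefined {
  if (wl.authors?.includes(note.author)) return "author";
  if (wl.note_ids?.includes(note.note_id)) return "note_id";
  if (wl.title_keywords?.some((k) => note.title.includes(k))) return "title_keyword";
  if (wl.urls?.some((u) => note.url.includes(u))) return "url";
  return undefined;
}

function applyWhitelist(note: Note, wl: SellerWhitelist): Note {
  const reason = whitelistReason(note, wl);
  if (reason === undefined) return note;
  // Raw tags and confidence stay untouched for debugging; only the flag is cleared.
  return { ...note, seller_flag: false, seller_whitelisted: true, seller_whitelist_reason: reason };
}
```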
The SDK also marks notes that appear to contain purchase links.
Current outputs include:
- purchase_link_flag
- purchase_links
- purchase_link_tags
- purchase_link_confidence
Detection combines:
- explicit commerce URLs
- purchase-link phrases
- e-commerce platform mentions
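A minimal sketch of how three such signals could be combined into the purchase_link_* fields follows. The URL pattern, phrase list, platform list, tag names, and the equal-weight confidence score are all illustrative assumptions; the SDK's actual rules and weights are not documented here.

```typescript
interface PurchaseLinkResult {
  purchase_link_flag: boolean;
  purchase_links: string[];
  purchase_link_tags: string[];
  purchase_link_confidence: number;
}

// Illustrative patterns only (assumptions, not the SDK's real rules).
const COMMERCE_URL = /https?:\/\/\S*(?:taobao|tmall|jd|pinduoduo)\.com\S*/gi;
const PHRASES = ["购买链接", "下单", "buy link"];
const PLATFORMS = ["淘宝", "闲鱼", "拼多多"];

function detectPurchaseLinks(text: string): PurchaseLinkResult {
  const links = text.match(COMMERCE_URL) ?? [];
  const lower = text.toLowerCase();
  const tags: string[] = [];
  if (links.length > 0) tags.push("commerce_url");
  if (PHRASES.some((p) => lower.includes(p))) tags.push("purchase_phrase");
  if (PLATFORMS.some((p) => text.includes(p))) tags.push("platform_mention");
  return {
    purchase_link_flag: tags.length > 0,
    purchase_links: [...links],
    purchase_link_tags: tags,
    // Assumed scoring: each of the three signals contributes equal weight.
    purchase_link_confidence: tags.length / 3,
  };
}
```

Recording which signals fired as tags, rather than only a boolean, is what lets a reviewer see why a note was marked.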
These fields are surfaced in:
- note JSON
- question JSON
- topic HTML
- overview HTML
- seller summary markdown
Current built-in adapters:
xiaohongshu
The pipeline now resolves a source adapter from config:
{
  "source": "xiaohongshu"
}
That keeps the CLI stable while making it possible to add more sources later without rewriting pipeline orchestration.
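One common way to get this kind of decoupling is a small adapter registry keyed by the config's source value. The sketch below assumes a hypothetical `SourceAdapter` interface and `resolveAdapter` function; only the adapter name and the three opencli subcommand names are taken from this README.

```typescript
// Hypothetical adapter interface; the SDK's real adapter contract may differ.
interface SourceAdapter {
  name: string;
  // Subcommand names used for each collection stage.
  subcommands: { search: string; detail: string; comments: string };
}

const xiaohongshu: SourceAdapter = {
  name: "xiaohongshu",
  subcommands: { search: "search", detail: "note-detail", comments: "comments" },
};

// Registry of built-in adapters; adding a source means adding one entry here.
const ADAPTERS: Record<string, SourceAdapter> = { xiaohongshu };

function resolveAdapter(config: { source: string }): SourceAdapter {
  const adapter = ADAPTERS[config.source];
  if (!adapter) {
    const known = Object.keys(ADAPTERS).join(", ");
    throw new Error(`Unknown source "${config.source}"; built-in adapters: ${known}`);
  }
  return adapter;
}
```

Because the pipeline only talks to the interface, an unknown source fails fast with a clear message instead of surfacing later as a missing command.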
Bundled workspace:
This workspace is scoped to:
- Xiaohongshu
- 2026-02-01 onward
- major internet companies
- Agent / 智能体 (agents) / LLM / 大模型应用开发 (LLM application development)
- 算法岗 (algorithm roles) / NLP / 大模型算法 (LLM algorithms)
Run the dedicated Ralph loop:
npm run dev -- ralph-loop 6 --workspace ./workspaces/xhs-agent-algo-feb2026
What this dedicated loop is meant to collect:
- Xiaohongshu notes only
- major internet companies
- Agent / 智能体 (agents) / LLM / 大模型应用开发 (LLM application development)
- 算法岗 (algorithm roles) / NLP / 大模型算法 (LLM algorithms)
- interview-note style content
It now uses a two-step strategy:
- broad Xiaohongshu search queries
- local scopeFilter narrowing to the exact Agent/LLM + 算法岗 (algorithm roles) + 大厂 (major companies) + 2026-02-01+ slice
This is intentional because overly narrow XHS queries like 腾讯 agent 算法 面经 were timing out in practice.
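The second step of this strategy, narrowing broad search results locally, can be sketched as a predicate over collected notes. The `FoundNote` and `ScopeFilter` shapes, their field names, and the keyword matching are hypothetical; the actual scopeFilter config in the SDK may be structured differently.

```typescript
// Hypothetical shapes for illustration only.
interface FoundNote {
  title: string;
  body: string;
  published: string; // ISO date, e.g. "2026-02-14"
}

interface ScopeFilter {
  since: string;           // inclusive lower bound on publish date (ISO)
  topicKeywords: string[]; // Agent / LLM side
  roleKeywords: string[];  // algorithm-role side
  companies: string[];     // major internet companies
}

// A note is in scope only if it is recent enough and matches all three keyword groups.
function inScope(note: FoundNote, f: ScopeFilter): boolean {
  const text = `${note.title} ${note.body}`.toLowerCase();
  const hasAny = (words: string[]): boolean =>
    words.some((w) => text.includes(w.toLowerCase()));
  // ISO dates in YYYY-MM-DD form compare correctly as plain strings.
  return (
    note.published >= f.since &&
    hasAny(f.topicKeywords) &&
    hasAny(f.roleKeywords) &&
    hasAny(f.companies)
  );
}
```

Filtering locally keeps each search query short and broad, which is the property that avoided the timeouts, while the conjunction of keyword groups restores the narrow slice afterwards.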
Each ralph-loop cycle now reads control-status, selects one operation, and dispatches it through run-operation rather than stepping through a fixed stage list.
Primary broad query family includes:
- 腾讯 面经
- 字节 面经
- 阿里 面经
- 美团 面经
- 百度 面经
- 京东 面经
- 快手 面经
- LLM 算法 面经
- LLM 面经
- 智能体 面经
- Agent 面经
- 算法 面经
- NLP 面经
Persisted outputs land in that workspace:
- interview_data/xhs_notes.json
- interview_data/xhs_questions.json
- reports/xhs-agent-algo-feb2026/
Curated filtered outputs land here:
- reports/xhs-agent-algo-feb2026/scope_candidates.json
- reports/xhs-agent-algo-feb2026/scope_candidates.md
If live search keeps timing out, recommended flow is:
- seed-import from an existing xhs_notes.json
- export
- status
- resume ralph-loop only after opencli xiaohongshu search is stable again
Recommended ways to monitor it:
- tmux attach -t interviewops-agent-llm-algo
- sed -n '1,120p' /tmp/interviewops/agent-llm-algo-loop.log
- sed -n '1,120p' ./workspaces/xhs-agent-algo-feb2026/reports/xhs-agent-algo-feb2026/ralph-loop.log
And inspect structured status:
npm run dev -- status --workspace ./workspaces/xhs-agent-algo-feb2026
Bundled assets:
Copy them into your workspace:
npm run dev -- template
By default the SDK does not auto-commit.
Enable it per command:
npm run dev -- cycle --auto-commit
npm run dev -- nightly 8 --auto-commit
npm test
npm run typecheck