Merge Dev - Add LangSmith integration and bump cli version (0.3.17) by aliroberts · Pull Request #116 · WecoAI/weco-cli

aliroberts · 2026-03-09T12:12:52Z

Summary

Add LangSmith as a pluggable eval backend, enabling Weco to optimize code against LangSmith datasets and evaluators
Add an interactive browser-based setup wizard that launches when LangSmith args are missing in a TTY session
Add dataset split support (--langsmith-splits) following LangSmith best practices
Bump version to 0.3.17, bump minimum Python to 3.9, remove unused fastapi/slowapi deps

LangSmith Integration

Eval Backend (`weco/integrations/langsmith/backend.py`)

Pluggable backend implementing register_args(), validate_args(), and build_eval_command(). Adds 12 --langsmith-* CLI flags covering dataset, target, evaluators, splits, adapters, dashboard evaluators, and custom metric functions.
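A minimal sketch of what a backend with these three hooks might look like. The method names come from the description above; the class name, the `--langsmith-dataset` and `--langsmith-target` flag spellings, the bridge's own flags, and the `python -m` invocation are assumptions for illustration, not the actual implementation.

```python
import argparse
import shlex


class LangSmithBackendSketch:
    """Hypothetical stand-in for the real backend class."""

    def register_args(self, parser: argparse.ArgumentParser) -> None:
        group = parser.add_argument_group("langsmith")
        # Only --langsmith-splits is named in the PR; the other two
        # flag names are guesses.
        group.add_argument("--langsmith-dataset")
        group.add_argument("--langsmith-target")
        group.add_argument("--langsmith-splits", nargs="*", default=[])

    def validate_args(self, args: argparse.Namespace) -> list:
        """Return the flags that are still missing (empty list means valid)."""
        missing = []
        if not args.langsmith_dataset:
            missing.append("--langsmith-dataset")
        if not args.langsmith_target:
            missing.append("--langsmith-target")
        return missing

    def build_eval_command(self, args: argparse.Namespace) -> str:
        """Build the shell command that would run the evaluation bridge."""
        argv = [
            "python", "-m", "weco.integrations.langsmith.bridge",  # assumed entry point
            "--dataset", args.langsmith_dataset,
            "--target", args.langsmith_target,
        ]
        for split in args.langsmith_splits:
            argv += ["--split", split]
        # shlex.join quotes each argument, so values containing spaces
        # survive a trip through the shell.
        return shlex.join(argv)
```

Returning the missing flags from `validate_args()` rather than raising would let a caller decide between failing and launching the setup wizard, though the real backend may signal this differently.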

Evaluation Bridge (`weco/integrations/langsmith/bridge.py`)

Subprocess runner that dynamically imports user code, resolves evaluators (custom module:function specs, LangSmith built-ins, or local evaluators.py), runs client.evaluate(), polls for async dashboard evaluator scores, and prints metrics in Weco's key: value format. Supports target adapters for LangChain runnables and single-input functions.
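Two of the steps above, resolving a module:function spec and printing metrics as key: value lines, can be sketched as follows. The helper names `resolve_spec` and `format_metrics` are hypothetical, and the real bridge may handle errors and number formatting differently.

```python
import importlib


def resolve_spec(spec: str):
    """Import 'package.module:function' and return the callable."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    module = importlib.import_module(module_name)
    obj = getattr(module, attr)
    if not callable(obj):
        raise TypeError(f"{spec!r} does not refer to a callable")
    return obj


def format_metrics(metrics: dict) -> str:
    """Render metrics one per line in 'key: value' form."""
    return "\n".join(f"{name}: {value}" for name, value in metrics.items())


if __name__ == "__main__":
    sqrt = resolve_spec("math:sqrt")  # stdlib function used as a stand-in evaluator
    print(format_metrics({"accuracy": 0.5, "sqrt_of_nine": sqrt(9.0)}))
```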

Setup Wizard (`weco/integrations/langsmith/wizard/`)

Browser-based single-page app served from a local HTTP server. Guides users through API key setup, dataset selection (with split picker), source file selection, target function discovery (AST-based), and evaluator configuration. Mutates the CLI args namespace on submit.
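AST-based discovery means the wizard can list candidate target functions without importing, and therefore without executing, the user's source file. A minimal sketch, assuming only public top-level functions are offered (the function name `discover_functions` and that filtering rule are assumptions):

```python
import ast


def discover_functions(source: str) -> list:
    """Return (name, parameter names) for each public top-level function."""
    tree = ast.parse(source)
    found = []
    for node in tree.body:  # top level only; nested functions and methods are skipped
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name.startswith("_"):
                continue  # treat underscore-prefixed names as private
            params = [arg.arg for arg in node.args.args]
            found.append((node.name, params))
    return found


if __name__ == "__main__":
    sample = "def answer(question):\n    return question\n\ndef _helper():\n    pass\n"
    print(discover_functions(sample))
```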

Dataset Splits

All layers support --langsmith-splits to filter evaluation to specific dataset splits (e.g. opt, holdout) instead of requiring separate datasets per split. The wizard fetches available splits and shows a chip picker.
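The effect of split filtering can be illustrated locally. This sketch uses plain dicts with a hypothetical `splits` key in place of LangSmith example objects; in the integration itself the filtering is presumably delegated to LangSmith rather than done in memory.

```python
def filter_by_splits(examples: list, splits: list) -> list:
    """Keep examples that belong to at least one requested split.

    An empty `splits` list means no filtering, mirroring a run without
    --langsmith-splits.
    """
    if not splits:
        return list(examples)
    wanted = set(splits)
    return [ex for ex in examples if wanted & set(ex.get("splits", []))]


if __name__ == "__main__":
    dataset = [
        {"id": 1, "splits": ["opt"]},
        {"id": 2, "splits": ["holdout"]},
        {"id": 3, "splits": []},
    ]
    print([ex["id"] for ex in filter_by_splits(dataset, ["opt"])])
```

Keeping both splits in one dataset lets the optimization loop score against opt while holdout stays untouched for a final check.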

CLI Changes (`weco/cli.py`)

Added --eval-backend flag with backend registry (_EVAL_BACKENDS, _load_backend())
Made --source, --eval-command, --metric, --goal non-required when using a backend
Backend args registered dynamically via backend.register_args()
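The three changes above fit together as a two-pass parse: read `--eval-backend` first, let the chosen backend add its flags, then parse everything. The sketch below is an assumption about how that could work; `_EVAL_BACKENDS` and `_load_backend()` are named in the PR, while `_DemoBackend`, `--demo-dataset`, and `parse_cli` are invented for illustration.

```python
import argparse


class _DemoBackend:
    """Hypothetical backend used only to make the sketch runnable."""

    def register_args(self, parser: argparse.ArgumentParser) -> None:
        parser.add_argument("--demo-dataset")


_EVAL_BACKENDS = {"demo": _DemoBackend}


def _load_backend(name: str):
    try:
        return _EVAL_BACKENDS[name]()
    except KeyError:
        raise SystemExit(f"unknown eval backend: {name}") from None


def parse_cli(argv: list) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="weco-sketch")
    parser.add_argument("--eval-backend", choices=sorted(_EVAL_BACKENDS))
    for flag in ("--source", "--eval-command", "--metric", "--goal"):
        parser.add_argument(flag)  # not required at the argparse level

    # First pass: only find out which backend, ignoring flags not yet registered.
    known, _unknown = parser.parse_known_args(argv)
    if known.eval_backend:
        _load_backend(known.eval_backend).register_args(parser)

    # Second pass: backend flags are now known to the parser.
    args = parser.parse_args(argv)
    if not args.eval_backend:
        required = ("source", "eval_command", "metric", "goal")
        missing = [name for name in required if getattr(args, name) is None]
        if missing:
            parser.error("missing required arguments: " + ", ".join(missing))
    return args
```

Enforcing the four core flags after parsing, instead of with `required=True`, is what allows them to be omitted when a backend supplies the evaluation command itself.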

Example

examples/langsmith-zeph-hr-qa/ — HR QA agent optimized against a LangSmith dataset with custom evaluators, dashboard LLM judges, and a gated metric function. Uses dataset splits for train/holdout separation.

aliroberts added 2 commits March 9, 2026 11:14

Add LangSmith integration + wizard. Add langsmith-zeph-hr-qa example

2b4b97f

Update to support langsmith dataset splits

e36dfed

aliroberts force-pushed the dev branch 3 times, most recently from 7b1e198 to 8b4155b on March 9, 2026 12:39

Escape langsmith command when constructing + bump version

edeec3a

aliroberts force-pushed the dev branch from dd00b2b to edeec3a on March 9, 2026 12:41

aliroberts merged commit 323ed2e into main Mar 9, 2026
2 checks passed

aliroberts merged 3 commits into main from dev