feat(langsmith): add @temporalio/langsmith package #2099
Open
xumaple wants to merge 2 commits into
Port the LangSmith observability plugin into the sdk-typescript monorepo
as a new contrib package, converted to repo conventions:
- src/ plugin sources copied verbatim (logic unchanged), adapted from ESM
to CommonJS for the repo's module setup
- tests converted from vitest to ava under src/__tests__/
- workspace:* deps, tsconfig project references, registered in
pnpm-workspace.yaml
Status: tsc --build clean; 37/52 ava tests pass. The 15 workflow-bundle
E2E tests fail because langsmith's CommonJS build pulls node:fs /
node:path / node:worker_threads into the Workflow isolate via
run_trees -> client -> utils/fs (langsmith's browser-field redirect to
isolate-safe variants only covers its ESM files, not its .cjs files).
Isolate-bundling fix tracked separately.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fix workflow-bundle build: redirect langsmith CJS node-utils to browser variants
langsmith's run_trees -> client chain pulls dist/utils/fs.cjs and
worker_threads.cjs into the Workflow isolate, importing node:fs/node:path/
node:worker_threads and aborting the bundle build with UnhandledSchemeError.
langsmith's browser field already maps these to isolate-safe siblings but only
for its ESM modules, not the .cjs files webpack resolves on this CommonJS
package's path. Extend the existing bundler shim to apply the same swap to the
.cjs variants, scoped to langsmith.
Restores all 15 workflow-bundle E2E tests: full suite now 52/52.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
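The redirect described in the commit above can be sketched as a pure path rewrite. This is only an illustration: the helper name `redirectLangsmithCjs`, the list of module names, and the assumption that each isolate-safe sibling is named `<name>.browser.cjs` are not taken from the PR's code, and the real shim may be wired into webpack differently.

```typescript
// Hypothetical helper; the actual bundler shim in this PR is not shown.
// Module names taken from the commit message (fs, worker_threads).
const NODE_ONLY_MODULES = ['fs', 'worker_threads'];

/**
 * Rewrites a resolved langsmith CommonJS path to its assumed browser sibling,
 * e.g. .../langsmith/dist/utils/fs.cjs -> .../langsmith/dist/utils/fs.browser.cjs.
 * Anything outside langsmith, or not in the list, is returned unchanged.
 */
function redirectLangsmithCjs(resource: string): string {
  const normalized = resource.replace(/\\/g, '/');
  const match = /\/langsmith\/dist\/(?:.*\/)?([^/]+)\.cjs$/.exec(normalized);
  const name = match?.[1];
  if (!name || !NODE_ONLY_MODULES.includes(name)) return resource;
  return normalized.replace(/\.cjs$/, '.browser.cjs');
}
```

Scoping the match to paths under `langsmith/dist/` keeps the swap from touching other packages that happen to ship a file called `fs.cjs`.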
Address LangSmith audit findings and format to repo lint config
- Close the workflow-side signal marker (end + patchRun) so SignalChild/
ExternalWorkflow markers don't stay open; add a regression test.
- Set tracingEnabled on the Nexus RunTree to match the activity/client paths
(consistent emission); drop the LANGSMITH_TRACING crutch from test-nexus.
- Fix a stale isReplaying -> isReplayingHistoryEvents comment.
- Apply repo eslint/prettier (trailingComma, import order, type-only imports).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
langsmith: remove orphaned comment in plugin.ts
An import reorder left a comment describing the SimplePlugin base-class
merge behavior floating after the import block with nothing adjacent to
describe. The configure* methods already document the merge-then-append
behavior in their own JSDoc, so the comment is redundant; delete it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
langsmith: default tracing OFF, matching langsmith's isTracingEnabled
The plugin's tracing gate previously defaulted ON (tracing ran unless an
env var was explicitly falsy). Flip it to default OFF, matching the
underlying langsmith library: tracing is enabled only when one of
LANGSMITH_TRACING_V2 / LANGCHAIN_TRACING_V2 / LANGSMITH_TRACING /
LANGCHAIN_TRACING is explicitly the string "true".
The plugin invents no env semantics of its own. isTracingEnabled now
delegates to a private langsmithIsTracingEnabled replica that reproduces
langsmith 0.7.x's dist/env.cjs logic verbatim, behind a single boundary.
langsmith 0.7.x does not publicly export isTracingEnabled (no ./env
entry), so a TODO documents the one-line import that replaces the replica
once it does.
Tests updated for the new default-off contract:
- test-unit.ts: assert default-off (no var -> false), explicit "true"
enables across all four recognized vars, and non-"true" values
(including "TRUE"/"1") stay disabled.
- test-nexus.ts: re-add LANGSMITH_TRACING='true' (the Nexus interceptor
gates on isTracingEnabled, which is now off by default).
- test-env.ts: add a default-off regression test asserting the plugin
emits nothing when no recognized tracing env var is set; save/restore
the recognized vars and run both env cases serially so they don't leak.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
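A minimal sketch of the default-off contract described above, assuming the gate only inspects the four environment variables named in the commit message. The function name `tracingEnabledFromEnv` is hypothetical, and this is not a copy of langsmith's own logic.

```typescript
// The four variables recognized per the commit message.
const TRACING_VARS = [
  'LANGSMITH_TRACING_V2',
  'LANGCHAIN_TRACING_V2',
  'LANGSMITH_TRACING',
  'LANGCHAIN_TRACING',
] as const;

/**
 * Returns true only when at least one recognized variable is exactly the
 * string "true". Unset variables and values like "TRUE" or "1" leave tracing off.
 */
function tracingEnabledFromEnv(env: Record<string, string | undefined>): boolean {
  return TRACING_VARS.some((name) => env[name] === 'true');
}
```

Taking the environment as a parameter, rather than reading a global, keeps the gate easy to test without saving and restoring real variables.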
Fix crypto crash for root Workflow-body traceable (seed synthetic root run)
A traceable() inside a Workflow body with no enclosing LangSmith parent run
crashed the Workflow Task with "ReferenceError: crypto is not defined": with no
ambient parent, langsmith mints a uuid7 via crypto.getRandomValues, but the
Workflow V8 isolate has no crypto global (deliberately excluded for
determinism). This hit the common case (addTemporalRuns: false, with the Workflow
not started from inside a client-side traceable).
When no parent is propagated, install a synthetic anchor
(_RootReplaySafeRunTreeFactory) as the ambient run so langsmith takes its
deterministic createChild branch instead. The anchor never emits, and its
createChild produces an independent root (no parent link) so the user's run
isn't orphaned under a phantom parent -- mirroring the Python integration's
_RootReplaySafeRunTreeFactory. Covers execute / signal / query / update (via
runInbound) and the synchronous validateUpdate path.
Adds a regression test: a root Workflow-body traceable (addTemporalRuns: false,
no client wrapper) completes without crashing and emits only the user's run.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
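The placeholder-parent idea from the commit above can be modeled in a few lines. This is a simplified sketch and not the plugin's `_RootReplaySafeRunTreeFactory`: the `RunLike` interface, the class names, and the counter-based id source (standing in for a deterministic id generator) are all assumptions made for illustration.

```typescript
// Simplified, hypothetical model of a run tree node.
interface RunLike {
  readonly id: string;
  readonly name: string;
  readonly parentId: string | undefined;
  readonly emits: boolean;
  createChild(name: string): RunLike;
}

// Stand-in for a deterministic id source (no crypto global needed).
let nextSeq = 0;
function nextId(): string {
  nextSeq += 1;
  return `run-${nextSeq}`;
}

class UserRun implements RunLike {
  readonly id: string;
  readonly name: string;
  readonly parentId: string | undefined;
  readonly emits = true;
  constructor(name: string, parentId: string | undefined) {
    this.id = nextId();
    this.name = name;
    this.parentId = parentId;
  }
  createChild(name: string): RunLike {
    return new UserRun(name, this.id);
  }
}

// Installed as the ambient run when no parent was propagated.
class PlaceholderRoot implements RunLike {
  readonly id = 'placeholder-root';
  readonly name = 'placeholder-root';
  readonly parentId = undefined;
  readonly emits = false; // never sent to LangSmith
  createChild(name: string): RunLike {
    // Deliberately drop the parent link so the user's run is a true root.
    return new UserRun(name, undefined);
  }
}
```

Because the placeholder exists only to steer run creation down the child-creation path, its first-level children carry no parent link, while deeper runs nest normally under the user's run.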
AI-4: Finalize langsmith package — langchain plugin-name parity, README, langsmith range
- Plugin name langsmith.LangSmithPlugin -> langchain.LangSmithPlugin, matching
the Python plugin (sdk-python temporalio/contrib/langsmith). LangSmith Cloud
groups telemetry by this string, so the two SDKs must report the same name.
- Add the package README (it was dropped during the port), updated for current
behavior: tracing is OFF by default (set LANGSMITH_TRACING=true to enable),
and only langsmith is a peer dependency (the @temporalio/* packages are
workspace-locked dependencies).
- Tighten the langsmith peer/dev range ^0.7.0 -> ^0.7.1 (the validated version;
the plugin relies on 0.7.x internals and the .browser.cjs variants).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AI-4: Emit no run for continue-as-new (Python parity)
The Python plugin's continue_as_new creates no trace — it only injects the
ambient trace context into the continue-as-new headers ("No trace created, but
inject context from ambient run"). The TS plugin was emitting an extra
ContinueAsNew: marker. Drop it: continueAsNew now only propagates the context
header, so the successor stays on the same trace with no spurious marker run.
Removes the now-unused continueAsNewRunName builder; updates the continue-as-new
test to assert no marker is emitted (the trace-continuity assertions stand).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rewrite langsmith contrib tests as a comprehensive suite; fix interceptor bugs it surfaced
Replace the narrow per-interceptor tests (test-client-interceptor, test-nexus)
with a comprehensive Workflow-tree suite (test-comprehensive, test-comprehensive-tree)
plus focused integration tests for Signals, Queries, Updates, Nexus,
continue-as-new, replay, side effects, and flush behavior.
The broader coverage surfaced several real bugs, fixed here:
- client-interceptor: the Query interceptor read `input.queryName`, but the SDK
passes the Query name as `input.queryType`, so Query runs were emitted with an
undefined name. Read `queryType`.
- activity-interceptor (Nexus): the Nexus inbound interceptor read `service`,
`operation`, and `headers` off the top-level input, but the SDK nests them
under `input.ctx`. Context propagation never decoded (headers were always
undefined) and run names used an undefined service/operation. Read from
`input.ctx`.
- workflow-interceptors: install the inbound run for Signal/Query/Update
handlers on the synchronous stack (`ctx.run`) instead of a persistent ambient
(`ctx.withAmbient`), so a handler running concurrently with the Workflow body
cannot leak its run into the body's ambient context; `execute` keeps the
ambient install. validateUpdate now installs the reconstructed parent (or a
synthetic anchor when none was propagated) so a `traceable` in the validator
body nests under the Update's trace.
All 46 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AI-4: Clean up langsmith contrib package before release
Nothing is released yet, so no backwards compatibility is preserved.
Public surface: reduce to LangSmithPlugin + LangSmithPluginOptions; the
sink-construction helpers (createLangSmithSinks, LangSmithSinks,
SerializedRun) become internal. Mark both public symbols @experimental,
matching the sibling contrib plugins.
Cleanup: delete the unused newRun helper; inline the newRunId and
workflow-side updateName one-line wrappers; de-duplicate the asOutputs,
run-header, and marker-emit helpers into run-tree; collapse the
isTracingEnabled pass-through; and strip restating / implementation-narrating
comments, including an inaccurate claim that the context manager persists
across Workflow Executions.
Fix: on continue-as-new with addTemporalRuns off and no propagated parent,
the successor's user runs dangled under a never-emitted synthetic root. Skip
propagation when the ambient is a synthetic root so the successor keeps its
runs as proper roots. Adds a regression test.
Lockfile: record contrib/langsmith's nexus-rpc devDependency, which was
declared in package.json but missing from the lockfile importer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AI-4: Purge comment slop from langsmith contrib package
Delete history-narration comments, collapse the triple-explained
async_hooks handling into a single explanation, and trim JSDoc on
self-evident one-line methods. Also two behavior-preserving dead-code
removals: the redundant 'authorization' scrub prefix (already covered
by 'auth') and the nowMs() wrapper (inlined to Date.now()).
No behavior change; 47 tests green, tsc clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
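To illustrate why the 'authorization' prefix was redundant, here is a sketch of prefix-based metadata scrubbing. It assumes keys are matched case-insensitively by prefix and that matching entries are dropped rather than masked; the plugin's real prefix list and redaction behavior are not shown in this PR, and `scrubMetadata` is a hypothetical name.

```typescript
// Only 'auth' is confirmed by the commit message; the real list may be longer.
const SCRUB_PREFIXES = ['auth'];

/**
 * Returns a copy of the metadata without keys that start with a scrubbed
 * prefix. 'authorization' starts with 'auth', so it needs no separate entry.
 */
function scrubMetadata(metadata: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(metadata)) {
    const lower = key.toLowerCase();
    if (SCRUB_PREFIXES.some((prefix) => lower.startsWith(prefix))) continue;
    out[key] = value;
  }
  return out;
}
```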
AI-4: Dedup run-tree construction and tighten langsmith API surface
Collapse four near-identical RunTree construction sites into two
context-appropriate helpers: buildRunTree (emitter side — real client,
crypto UUID, direct emit) and buildReplaySafeRunTree (workflow isolate —
deterministic uuid4, NOOP_CLIENT, sink emission), plus a shared
runTreeFromContext parse for anchor reconstruction.
Type the EmitterConfig.client seam as the real langsmith Client, deleting
the as-unknown-as casts and the now-unused interface; restore type-checking
on the sink params (documenting langsmith's tags-on-create type gap);
replace the flush feature-probe with a direct awaitPendingTraceBatches().
Mark the ./workflow-interceptors entry and run-tree internals @internal
(worker/bundler machinery, not a hand-import API); the . entry still
exports only LangSmithPlugin + LangSmithPluginOptions.
No behavior change; 47 tests green, tsc clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AI-4: Isolate read-only-handler run ids from the workflow PRNG
Read-only handlers (handleQuery, validateUpdate) minted LangSmith run ids
via uuid4(), drawing from the workflow's main deterministic PRNG. On a
cached instance, that advanced the PRNG for the next real task, while a fresh
replay (which never runs the handler) did not: a latent nondeterminism.
Thread a per-invocation PRNG, seeded from the handler's queryId/updateId
via prngFromInputId, through ReplaySafeRunTree (root + every createChild),
so read-only run ids are minted off an independent stream and never touch
the main PRNG. Recorded paths (execute/signal/update-handler) are
unchanged. Per-invocation seeds keep read-only ids unique across reloads.
Adds test-readonly-determinism: a maxCachedWorkflows live-cache replay test
that fails before this fix and passes after.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
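The independent-stream technique above can be sketched as follows. This is not the plugin's `prngFromInputId`: the hash (FNV-1a), the small mulberry32-style generator, and the names `prngFromInputIdSketch` and `uuid4From` are assumptions chosen to keep the example self-contained.

```typescript
type Prng = () => number; // returns an unsigned 32-bit integer

// FNV-1a 32-bit hash, used here only to turn an id string into a seed.
function hashString(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Small mulberry32-style generator; each instance owns its own state.
function seededPrng(seed: number): Prng {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return (t ^ (t >>> 14)) >>> 0;
  };
}

// Hypothetical stand-in: one PRNG per queryId/updateId.
function prngFromInputIdSketch(inputId: string): Prng {
  return seededPrng(hashString(inputId));
}

// Formats 16 pseudo-random bytes as a version-4 style UUID string.
function uuid4From(prng: Prng): string {
  const bytes: number[] = [];
  for (let i = 0; i < 4; i++) {
    const word = prng();
    bytes.push((word >>> 24) & 0xff, (word >>> 16) & 0xff, (word >>> 8) & 0xff, word & 0xff);
  }
  bytes[6] = ((bytes[6] ?? 0) & 0x0f) | 0x40; // version nibble
  bytes[8] = ((bytes[8] ?? 0) & 0x3f) | 0x80; // variant bits
  const hex = bytes.map((b) => b.toString(16).padStart(2, '0')).join('');
  return `${hex.slice(0, 8)}-${hex.slice(8, 12)}-${hex.slice(12, 16)}-${hex.slice(16, 20)}-${hex.slice(20)}`;
}
```

A read-only handler that mints ids from `prngFromInputIdSketch(queryId)` produces the same ids for the same invocation, and never advances whatever generator the Workflow body uses for its recorded paths.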
AI-4: Clean up langsmith test suite and README
Hygiene: rename the misleadingly-swapped trace-tree tests
(test-comprehensive-tree -> test-comprehensive; the old edge-case
test-comprehensive -> test-parenting-edge-cases), drop the dead ava
`**/*.test.js` glob, trim never-invoked HandlersWorkflow fixtures,
centralize the SIMPLE_TREE constant in helpers, fix an orphaned comment,
and add the experimental caveat to the README.
Make the read-only determinism test non-flaky by design: drive the
perturbation through a validator-rejected update (durably delivered, so it
never races worker readiness) instead of a live query, run replay inside the
live env scope so it reuses the Runtime singleton (avoids a native
finalization race), and collapse the comment essays to one-line WHYs.
Tests/README only; no production code change. tsc clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AI-4: Rename coined vocabulary in langsmith contrib package
Replace insider jargon with clearer terms grounded in what each thing is:
syntheticRoot -> placeholderRoot, asReplaySafeAnchor -> asReplaySafeParent,
the activity-side anchor() -> reconstructParentRun, anchor vars -> parent,
client parentMessage -> parentMarker. Sweep the matching prose in comments
and tests too (anchor -> parent, synthetic -> placeholder,
kill-switch -> tracing gate, footgun -> pitfall).
Behavior-preserving rename; tsc clean. _RootReplaySafeRunTreeFactory kept
for Python parity.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AI-4: Tighten langsmith API surface and bundler error handling
Trim the @temporalio/langsmith public surface and harden the plugin
before release:
- Enable stripInternal so the ./workflow-interceptors subpath no longer
publishes the internal AsyncLocalStorage shim or WorkflowLangSmithConfig
in its .d.ts.
- Un-export encodeContextPayload, decodeContextPayload, and serializeRun
(each used only within its own module); drive the codec's never-throws
path through readContextHeader in the unit test.
- Narrow the three webpack require() catch blocks to swallow only
MODULE_NOT_FOUND and rethrow every other failure, so a real
DefinePlugin/alias error surfaces instead of silently shipping a
workflow bundle missing config injection and the async_hooks shim.
- Drop the stored configJson field; compute the double-encoded config
inline at bundler time.
- Remove restating JSDoc/banner comments and the README "advanced
extension" bullet that contradicted the code's bare-identifier access.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
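The narrowed catch blocks can be sketched like this. The helper names are hypothetical, and the loader is passed in as a callback so the example stays self-contained; in the plugin the callback would wrap a `require()` call, which in Node throws an error whose `code` is `'MODULE_NOT_FOUND'` when the module is absent.

```typescript
// True only for the "module is not installed" failure.
function isModuleNotFound(err: unknown): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { code?: unknown }).code === 'MODULE_NOT_FOUND'
  );
}

/**
 * Runs an optional loader. A missing module yields undefined; every other
 * failure is rethrown so real configuration errors are not swallowed.
 */
function loadOptional<T>(load: () => T): T | undefined {
  try {
    return load();
  } catch (err) {
    if (isModuleNotFound(err)) return undefined;
    throw err;
  }
}
```

Compared with a bare `catch {}`, this keeps the "webpack is not installed" case quiet while letting a misconfigured plugin or alias fail the build loudly.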
AI-4: Add regression test for emitter-side metadata scrub
Activity-side runs emit via the RunTree's native postRun/patchRun and never pass
through serializeRun, so credential scrubbing on that path comes solely from
buildRunTree. Pin it so a future change can't silently drop the scrub on the path
that bypasses serializeRun.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AI-4: Type langsmith interceptor factories with real SDK interfaces
The client, activity, and Nexus interceptor factories returned
Record<string, unknown> with hand-rolled local views of the SDK
interceptor-input types, forcing three `as unknown as` casts in plugin.ts.
Implement the real WorkflowClientInterceptor, ActivityInboundCallsInterceptorFactory,
and NexusInboundCallsInterceptor types so the casts and the local interfaces
are gone.
This surfaced an args-capture bug: per-call args live at different input
fields (start at options.args, signalWithStart at signalArgs, update at args),
but the code read a flat input.args off start inputs and silently captured [].
Read the correct field per call; test-client-args.ts pins it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
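A sketch of the per-call args lookup described above. The `kind` discriminant and the trimmed-down input shapes are assumptions for illustration; the real SDK passes a separate input interface to each interceptor method, and only the field locations (options.args, signalArgs, args) come from the commit message.

```typescript
// Hypothetical, simplified views of the three interceptor inputs.
type CallInput =
  | { kind: 'start'; options: { args?: unknown[] } }
  | { kind: 'signalWithStart'; signalArgs?: unknown[] }
  | { kind: 'update'; args?: unknown[] };

/** Reads the call's arguments from the field where that call type carries them. */
function capturedArgs(input: CallInput): unknown[] {
  switch (input.kind) {
    case 'start':
      return input.options.args ?? [];
    case 'signalWithStart':
      return input.signalArgs ?? [];
    case 'update':
      return input.args ?? [];
  }
}
```

Reading a flat `input.args` off a start input type-checks only against a loose local view of the input, which is how the original code ended up capturing an empty array.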
AI-4: Format two pre-existing langsmith files with prettier
test-comprehensive.ts and workflow-interceptors.ts failed CI's repo-wide
`prettier --check .` (they fail at the branch base too, untouched by the
cleanup). Reflow only, no behavior change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AI-4: Match Python's langsmith header key for cross-SDK propagation
The TS plugin carried trace context under `__temporal_langsmith_context`
while the Python plugin uses `_temporal-langsmith-context`, so a context
written by one SDK was invisible to the other. Python is the parity
reference; change TS to its key. The value encoding is already byte-compatible
both ways, so this is purely a key fix. The old comment claiming the underscore
form was required for Nexus-transport safety is false (the Nexus path only
lowercases keys; both forms are lowercase) and is removed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
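The effect of the key mismatch can be shown with a toy header map. The string-valued `Headers` type and the helper names are assumptions (real headers carry encoded payloads); only the two key strings come from the commit message.

```typescript
const PYTHON_PARITY_KEY = '_temporal-langsmith-context';
const OLD_TS_KEY = '__temporal_langsmith_context';

// Simplified: header values are opaque strings in this sketch.
type Headers = Record<string, string>;

function writeContext(headers: Headers, encoded: string, key: string): Headers {
  return { ...headers, [key]: encoded };
}

function readContext(headers: Headers, key: string): string | undefined {
  return headers[key];
}
```

A reader that looks up `PYTHON_PARITY_KEY` finds nothing in headers written under `OLD_TS_KEY`, even though the value bytes are compatible, which is why aligning the key alone restores cross-SDK propagation.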
AI-4: Use langsmith's exported isTracingEnabled
langsmith 0.7.9 exports a context-aware isTracingEnabled that checks the active
run tree before falling back to the LANGSMITH_/LANGCHAIN_ env vars. Drop the
local env-only reimplementation, import langsmith's, and bump the dependency to
^0.7.9. This matches the Python plugin's gate: a client call made inside an
active trace now continues tracing even when LANGSMITH_TRACING is unset, rather
than being gated on the env var alone.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
7d8f198 to 42e9670
- Rename plugin options defaultTags/defaultMetadata to tags/metadata for Python parity
- Move the worker sink to the reserved __temporal_langsmith name via a single constant, allowlisted in @temporalio/common's reserved-name guard
- Make user-provided Worker sinks win on a sink-key collision
- Guard context-header encoding so a converter failure can't fail the user's call
- Scrub metadata on the workflow-side run builder to match the documented contract
- Drop a needless double-cast and remove porting/slop comments
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
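The "user-provided Worker sinks win" rule can be sketched with object spread ordering. The function name `mergeSinks` and the generic sink type are hypothetical; only the reserved sink name comes from the change list above.

```typescript
// Reserved sink name, kept in a single constant as the change list describes.
const LANGSMITH_SINK_NAME = '__temporal_langsmith';

/**
 * Merges plugin-provided sinks with user-provided ones. Later spreads
 * override earlier ones, so a user sink wins on a key collision.
 */
function mergeSinks<T>(
  pluginSinks: Record<string, T>,
  userSinks: Record<string, T> | undefined,
): Record<string, T> {
  return { ...pluginSinks, ...(userSinks ?? {}) };
}
```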