A production-grade MCP server that connects any MCP-compatible AI agent to your Langfuse observability data.
Query traces, debug errors, inspect sessions, manage prompts, run evaluations, annotate data, and configure models — all through natural language.
Transport: Streamable HTTP on port 8080, compatible with Cursor, Claude Desktop, VS Code / GitHub Copilot, and any MCP client that supports HTTP transport.
| Capability | This server | Official Langfuse MCP |
|---|---|---|
| Traces & Observations | ✅ | ❌ |
| Sessions & Users | ✅ | ❌ |
| Exception tracking | ✅ | ❌ |
| Prompt management (read + write) | ✅ | ✅ read-only |
| Dataset & run management | ✅ | ❌ |
| Scores & score configs | ✅ | ❌ |
| Annotation queues | ✅ | ❌ |
| Comments | ✅ | ❌ |
| Model definitions | ✅ | ❌ |
| LLM connections | ✅ | ❌ |
| Project introspection | ✅ | ❌ |
| Schema introspection | ✅ | ❌ |
| Java / Spring AI | ✅ | ❌ (Python) |
- Java 21 or later
- Maven 3.9+ (or use the Docker build — no local Maven required)
- A Langfuse account with an API key pair (public-key + secret-key)

Build and run:

    # 1. Build
    mvn clean package -DskipTests
    # 2. Set credentials
    export LANGFUSE_PUBLIC_KEY=pk-lf-...
    export LANGFUSE_SECRET_KEY=sk-lf-...
    export LANGFUSE_HOST=https://cloud.langfuse.com
    # 3. Run (Streamable HTTP transport, port 8080)
    java -jar target/langfuse-mcp-1.0.0.jar
    # 4. Verify
    curl http://localhost:8080/actuator/health
    # 5. Inspect all tools
    npx @modelcontextprotocol/inspector http://localhost:8080/mcp

Get credentials from Langfuse Cloud → Settings → API Keys.
Self-hosted Langfuse? Set LANGFUSE_HOST to your instance URL.

All configuration is driven by environment variables (or application.yml for local overrides).
| Property | Env var | Required | Default | Description |
|---|---|---|---|---|
| langfuse.public-key | LANGFUSE_PUBLIC_KEY | ✅ | - | Langfuse project public key |
| langfuse.secret-key | LANGFUSE_SECRET_KEY | ✅ | - | Langfuse project secret key |
| langfuse.host | LANGFUSE_HOST | ✅ | - | Langfuse base URL, e.g. https://cloud.langfuse.com |
| langfuse.timeout | LANGFUSE_TIMEOUT | ❌ | 30s | HTTP request timeout in Spring Duration format, e.g. 30s, 1m, 90s |
| langfuse.read-only | - | ❌ | true | Informational flag; write operations are available through specific tools |

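The langfuse.timeout values above use Spring Boot's simple duration format. Spring performs this conversion itself; the following is only a JDK-only sketch of how a value such as 30s or 1m maps to a java.time.Duration. The class name SimpleDurationParser is hypothetical, and only the ms, s, m, and h suffixes are handled here.

```java
import java.time.Duration;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of the server: parses a subset of the
// simple duration format (a whole number followed by ms, s, m, or h).
final class SimpleDurationParser {

    private static final Pattern SIMPLE = Pattern.compile("^(\\d+)(ms|s|m|h)$");

    static Duration parse(String value) {
        Matcher matcher = SIMPLE.matcher(value.trim().toLowerCase(Locale.ROOT));
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Unsupported duration: " + value);
        }
        long amount = Long.parseLong(matcher.group(1));
        return switch (matcher.group(2)) {
            case "ms" -> Duration.ofMillis(amount);
            case "s" -> Duration.ofSeconds(amount);
            case "m" -> Duration.ofMinutes(amount);
            default -> Duration.ofHours(amount); // only "h" remains after the regex match
        };
    }

    public static void main(String[] args) {
        System.out.println(parse("30s"));
        System.out.println(parse("1m"));
    }
}
```

With this reading, 90s and 1m differ by thirty seconds, which matters when tuning LANGFUSE_TIMEOUT for slow self-hosted instances.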
LANGFUSE_HOST may be specified with or without a trailing slash — the server normalises it automatically.
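As a rough illustration of what such normalisation could involve, the sketch below trims surrounding whitespace and any trailing slashes. It assumes that is all the normalisation amounts to; HostNormalizer is a hypothetical name, and the server's actual implementation may differ.

```java
// Hypothetical helper, not the server's actual code: makes
// "https://host/" and "https://host" resolve to the same base URL.
final class HostNormalizer {

    static String normalize(String host) {
        if (host == null || host.isBlank()) {
            throw new IllegalArgumentException("LANGFUSE_HOST must not be empty");
        }
        String trimmed = host.trim();
        int end = trimmed.length();
        // Walk back over every trailing slash.
        while (end > 0 && trimmed.charAt(end - 1) == '/') {
            end--;
        }
        return trimmed.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(normalize("https://cloud.langfuse.com/"));
        System.out.println(normalize("http://host.docker.internal:3000"));
    }
}
```

Normalising once at startup means request paths such as /api/public/health can be appended without producing a double slash.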

Cursor:

    {
      "mcpServers": {
        "langfuse": {
          "url": "http://localhost:8080/mcp"
        }
      }
    }

Claude Desktop:

    {
      "mcpServers": {
        "langfuse": {
          "url": "http://localhost:8080/mcp"
        }
      }
    }

On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

VS Code / GitHub Copilot, URL mode:

    {
      "github.copilot.chat.mcp.servers": {
        "langfuse": {
          "url": "http://localhost:8080/mcp"
        }
      }
    }

Command mode (stdio-only clients):

    {
      "github.copilot.chat.mcp.servers": {
        "langfuse": {
          "command": "java",
          "args": ["-jar", "/absolute/path/to/langfuse-mcp-1.0.0.jar"],
          "env": {
            "LANGFUSE_PUBLIC_KEY": "pk-lf-...",
            "LANGFUSE_SECRET_KEY": "sk-lf-...",
            "LANGFUSE_HOST": "https://cloud.langfuse.com"
          }
        }
      }
    }

Note: The MCP endpoint is /mcp (streamable HTTP). The legacy SSE /sse endpoint is not used by this server.

The Dockerfile is a multi-stage build: it compiles the Spring Boot jar inside Docker and runs the MCP server on port 8080. No local Maven installation is needed.

    # Build image (compiles inside Docker)
    docker build -t langfuse-mcp:latest .
    # Run
    docker run --rm -p 8080:8080 \
      -e LANGFUSE_PUBLIC_KEY=pk-lf-... \
      -e LANGFUSE_SECRET_KEY=sk-lf-... \
      -e LANGFUSE_HOST=https://cloud.langfuse.com \
      langfuse-mcp:latest

After the container starts:
| Endpoint | URL |
|---|---|
| Health check | http://localhost:8080/actuator/health |
| Ping | http://localhost:8080/ping |
| MCP endpoint | http://localhost:8080/mcp |

Langfuse running in another container on the same host:

    -e LANGFUSE_HOST=http://host.docker.internal:3000

Every tool returns a consistent ApiResponse<T> envelope:

    { "success": true, "data": { ... }, "timestamp": "2025-01-15T10:30:00Z" }
    { "success": false, "errorCode": "TRACE_NOT_FOUND", "errorMessage": "...", "timestamp": "..." }

Paginated list responses wrap their items in a PagedResponse<T>:

    {
      "data": [ ... ],
      "meta": { "page": 1, "limit": 20, "totalItems": 142, "totalPages": 8 }
    }

Pagination is 1-based (page defaults to 1). limit defaults to 20 and is capped at 100 where noted. To page through results, increment page while keeping limit fixed.

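The pagination rules can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the server's code: PaginationMetaSketch is a hypothetical stand-in for the meta object, and the sketch assumes out-of-range values fall back to the defaults and that the cap of 100 applies.

```java
// Hypothetical stand-in for the "meta" object of a paged response.
record PaginationMetaSketch(int page, int limit, long totalItems, int totalPages) {

    static final int DEFAULT_LIMIT = 20;
    static final int MAX_LIMIT = 100;

    // Applies the documented defaults (page 1, limit 20) and the cap of 100,
    // then derives totalPages with a ceiling division.
    static PaginationMetaSketch of(Integer page, Integer limit, long totalItems) {
        int safePage = (page == null || page < 1) ? 1 : page;
        int safeLimit = (limit == null || limit < 1) ? DEFAULT_LIMIT : Math.min(limit, MAX_LIMIT);
        int totalPages = (int) ((totalItems + safeLimit - 1) / safeLimit);
        return new PaginationMetaSketch(safePage, safeLimit, totalItems, totalPages);
    }

    // True while another page can be requested with the same limit.
    boolean hasNext() {
        return page < totalPages;
    }

    public static void main(String[] args) {
        PaginationMetaSketch meta = of(1, 20, 142);
        System.out.println(meta.totalPages()); // 8, matching the example response
    }
}
```

A client paging through results would keep requesting page + 1 while hasNext() holds, leaving limit unchanged so that page boundaries stay stable.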

Trace tools:

| Tool | Description |
|---|---|
| fetch_traces | Paginated list of traces. Filter by userId, name, sessionId, tags, fromTimestamp, toTimestamp. |
| fetch_trace | Full detail of a single trace including nested observations, input/output, metadata, latency, and token usage. Requires traceId. |
| find_exceptions | Traces whose level equals ERROR. Supports time range and pagination. |
| find_exceptions_in_file | Error-level traces whose metadata contains a given file name substring. Requires fileName. |
| get_exception_details | Full detail of a single error trace. Requires traceId. |
| get_error_count | Count of ERROR-level traces in a time range (scans up to 500 traces). |
| delete_trace | Permanently deletes a single trace by ID. Irreversible. |
| delete_traces | Permanently deletes multiple traces. Pass a comma-separated list of trace IDs. Irreversible. |

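Because delete_traces takes its IDs as one comma-separated string, the value has to be split before use. The sketch below shows one way that could be done, assuming surrounding whitespace, empty entries, and duplicates should be ignored. TraceIdListParser and the parameter name traceIds are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns "id-1, id-2,id-3" into a clean list of IDs.
final class TraceIdListParser {

    static List<String> parse(String commaSeparatedIds) {
        if (commaSeparatedIds == null || commaSeparatedIds.isBlank()) {
            throw new IllegalArgumentException("traceIds is required");
        }
        return Arrays.stream(commaSeparatedIds.split(","))
                .map(String::trim)
                .filter(id -> !id.isEmpty()) // drop empty entries such as "a,,b"
                .distinct()                  // keep the first occurrence of each ID
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(parse("trace-1, trace-2,trace-3"));
    }
}
```

Since the deletion is irreversible, rejecting an empty list before any HTTP call is made is the safer default.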

Session tools:

| Tool | Description |
|---|---|
| fetch_sessions | Paginated list of sessions with optional time range filter. |
| get_session_details | Full session detail including all its traces. Requires sessionId. |
| get_user_sessions | All sessions for a specific user with pagination. Requires userId. |

Prompt tools:

| Tool | Description |
|---|---|
| list_prompts | Paginated list of all prompts in the project. |
| get_prompt | Fetch a prompt by name. Optionally pin to a version number or a label (e.g. production, staging). |
| create_prompt | Create a new prompt or append a new version to an existing prompt. type is text (plain string) or chat (JSON array of {role, content} messages). Supports comma-separated labels and tags. |
| delete_prompt | Delete prompt versions by name. Scope to a specific label or version; omit both to delete all versions. Irreversible. |
| update_prompt_labels | Replace the full label set on a specific prompt version. Supply an empty string to remove all labels. The latest label is reserved by Langfuse. |

Dataset tools:

| Tool | Description |
|---|---|
| list_datasets | Paginated list of all evaluation datasets. |
| get_dataset | Fetch a dataset by exact name. |
| create_dataset | Create a new dataset. Optionally supply description, metadataJson, inputSchemaJson, and expectedOutputSchemaJson (all as JSON strings). |
| list_dataset_items | Paginated list of items in a dataset. Requires datasetName. |
| get_dataset_item | Fetch a single dataset item by ID. |
| create_dataset_item | Create or upsert a dataset item. Optionally link to a sourceTraceId or sourceObservationId. Supports itemId for upsert semantics. |
| delete_dataset_item | Permanently delete a dataset item by ID. Irreversible. |

Dataset run tools:

| Tool | Description |
|---|---|
| list_dataset_runs | Paginated list of experiment runs for a dataset. Requires datasetName. |
| get_dataset_run | Full run detail including all run items. Requires datasetName and runName. |
| delete_dataset_run | Delete a run and all its items. Irreversible. Requires datasetName and runName. |
| list_dataset_run_items | Paginated list of items in a run. Requires datasetId and runName. |
| create_dataset_run_item | Create a run item linking a dataset item to a trace/observation. Creates the run automatically if it does not yet exist. |


Cost metrics tools:

| Tool | Description |
|---|---|
| get_cost_metrics | Query Langfuse cost, token, latency, and usage analytics via the Metrics API v1. Mirrors: GET /api/public/metrics?query=. Pass the full query as a JSON string. All aggregation is server-side. |

This tool accepts a single required parameter, query, which must be a JSON-serialised string matching the Metrics API schema. Examples (pass each one as a single JSON string):

- Total cost last 7 days:

      {"view":"traces","metrics":[{"measure":"totalCost","aggregation":"sum"}],"fromTimestamp":"2026-03-18T00:00:00Z","toTimestamp":"2026-03-25T23:59:59Z"}

- Daily cost trend this week:

      {"view":"traces","metrics":[{"measure":"totalCost","aggregation":"sum"},{"measure":"count","aggregation":"count"}],"timeDimension":{"granularity":"day"},"fromTimestamp":"2026-03-18T00:00:00Z","toTimestamp":"2026-03-25T23:59:59Z"}

- Cost by model:

      {"view":"observations","dimensions":[{"field":"providedModelName"}],"metrics":[{"measure":"totalCost","aggregation":"sum"},{"measure":"totalTokens","aggregation":"sum"}],"fromTimestamp":"2026-03-18T00:00:00Z","toTimestamp":"2026-03-25T23:59:59Z"}

- Cost for a specific user:

      {"view":"traces","metrics":[{"measure":"totalCost","aggregation":"sum"}],"filters":[{"column":"userId","operator":"=","value":"user-123","type":"string"}],"fromTimestamp":"2026-03-18T00:00:00Z","toTimestamp":"2026-03-25T23:59:59Z"}

- Production environment only (a filters member to add to any query):

      "filters":[{"column":"environment","operator":"=","value":"production","type":"string"}]

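Since the tool mirrors GET /api/public/metrics?query=, the JSON string ends up as a URL-encoded query parameter. The sketch below shows how such a request URI could be assembled with the JDK alone. MetricsQueryUrl is a hypothetical name, the host is assumed to be already normalised (no trailing slash), and the server itself may build the URI differently, for example with UriComponentsBuilder.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the Metrics API request URI for a JSON query.
final class MetricsQueryUrl {

    static URI build(String host, String queryJson) {
        // Percent-encode braces, quotes, colons, and commas so the JSON
        // survives as a single query parameter value.
        String encoded = URLEncoder.encode(queryJson, StandardCharsets.UTF_8);
        return URI.create(host + "/api/public/metrics?query=" + encoded);
    }

    public static void main(String[] args) {
        String query = "{\"view\":\"traces\",\"metrics\":"
                + "[{\"measure\":\"totalCost\",\"aggregation\":\"sum\"}]}";
        System.out.println(build("https://cloud.langfuse.com", query));
    }
}
```

Agents calling get_cost_metrics do not need to do this themselves; they pass the raw JSON string and the encoding happens on the server side of the tool.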

Score tools:

| Tool | Description |
|---|---|
| get_scores | Paginated list of evaluation scores. Filter by traceId, observationId, name, dataType (NUMERIC, CATEGORICAL, or BOOLEAN), and time range. |
| get_score | Fetch a single score by ID. |
| get_score_configs | Paginated list of score config schemas. |
| get_score_config | Fetch a single score config by ID. |
| create_score_config | Create a score config. NUMERIC supports optional minValue/maxValue. CATEGORICAL accepts a categoriesJson array of {label, value} objects. |
| update_score_config | Update an existing score config. Optionally set isArchived to archive it. |

Annotation queue tools:

| Tool | Description |
|---|---|
| list_annotation_queues | Paginated list of annotation queues. |
| get_annotation_queue | Fetch a single queue by ID. |
| create_annotation_queue | Create a queue for human-in-the-loop review. Optionally link a scoreConfigId. |
| list_annotation_queue_items | Paginated list of items in a queue. Optionally filter by status (PENDING or COMPLETED). Requires queueId. |
| get_annotation_queue_item | Fetch a specific queue item by queueId and itemId. |
| create_annotation_queue_item | Add a trace, observation, or session to a queue for review. objectType is TRACE, OBSERVATION, or SESSION. |
| update_annotation_queue_item | Update the status of a queue item (PENDING or COMPLETED). |
| delete_annotation_queue_item | Remove an item from a queue. Irreversible. |

Comment tools:

| Tool | Description |
|---|---|
| get_comments | Paginated list of comments. Optionally filter by objectType (TRACE or OBSERVATION) and objectId. |
| get_comment | Fetch a single comment by ID. |
| create_comment | Attach a comment to a trace, observation, session, or prompt. objectType values: TRACE, OBSERVATION, SESSION, PROMPT. |

Model tools:

| Tool | Description |
|---|---|
| list_models | Paginated list of all model definitions (Langfuse-managed and custom). |
| get_model | Fetch a model definition by ID. |
| create_model | Create a custom model for cost tracking. Requires modelName, matchPattern (regex), and unit (TOKENS, CHARACTERS, MILLISECONDS, SECONDS, IMAGES, or REQUESTS). Optionally set per-unit USD prices. |
| delete_model | Delete a custom model definition. Langfuse-managed models cannot be deleted. Irreversible. |

LLM connection tools:

| Tool | Description |
|---|---|
| list_llm_connections | Paginated list of LLM provider connections (secret keys are masked in the response). |
| upsert_llm_connection | Create or update a provider connection by provider name (e.g. openai, anthropic, azure, google). Upserts by provider: if a connection already exists, it is updated. |

Project tools:

| Tool | Description |
|---|---|
| get_projects_for_api_key | Returns the project(s) visible to the configured API key. Useful for confirming credentials and project metadata. |

User tools:

| Tool | Description |
|---|---|
| get_user_traces | All traces for a specific Langfuse user ID with pagination. Requires userId. |

Schema tools:

| Tool | Description |
|---|---|
| get_data_schema | Returns the full Langfuse data model: all entity types, fields, and valid enum values. Call this first to understand the available data structures before running queries. |


    MCP Client (Cursor / Claude Desktop / Copilot / other)
        │  Streamable HTTP transport (/mcp)
        ▼
    Tool class (@McpTool: validates required params, delegates to service)
        ▼
    Service interface + impl (business logic, filtering, error mapping)
        ▼
    LangfuseApiClient (HTTP gateway: GET / POST / PATCH / DELETE, typed exceptions)
        ▼
    Langfuse Public REST API

The architecture is strictly layered:

- client/: Langfuse integration boundary. HTTP with Basic-Auth (Apache HttpComponents 5), typed exceptions, UriComponentsBuilder for query params
- service/: domain logic. Filtering, mapping, pagination, error translation into ApiResponse
- tools/: MCP surface. Agent-friendly descriptions, parameter validation, delegation to services
- Spring Boot: runtime and transport wrapper only

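The Basic-Auth step in the client layer can be illustrated with the JDK alone. This sketch assumes the Langfuse public key is sent as the Basic-Auth username and the secret key as the password. BasicAuthHeader is a hypothetical name; the server configures this on its RestClient bean rather than through a helper like this.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: builds the value of an HTTP Authorization header
// from a public-key / secret-key pair.
final class BasicAuthHeader {

    static String of(String publicKey, String secretKey) {
        String credentials = publicKey + ":" + secretKey;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Placeholder keys; never print real credentials.
        System.out.println(of("pk-lf-example", "sk-lf-example"));
    }
}
```

Base64 is an encoding, not encryption, so the header is only as private as the transport. That is one reason to check that the scheme in LANGFUSE_HOST is https:// for anything other than local testing.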
LangfuseApiClient supports four HTTP methods. All methods throw LangfuseApiException or ResourceNotFoundException on error, which the service layer converts into structured ApiResponse.error(...) responses — agents never see raw stack traces.
| Method | Used for |
|---|---|
| GET | All read operations |
| POST | Create operations |
| PATCH | Update operations |
| DELETE | Delete operations |

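The translation from typed exceptions to a structured error response might look roughly like the sketch below. The three types are minimal stand-ins that reuse the class names mentioned in this README; their fields, the ErrorMapping helper, and the LANGFUSE_API_ERROR code are assumptions for illustration only.

```java
import java.time.Instant;
import java.util.function.Supplier;

// Minimal stand-ins for the real classes; fields are assumed.
class LangfuseApiException extends RuntimeException {
    final int statusCode;
    LangfuseApiException(String message, int statusCode) {
        super(message);
        this.statusCode = statusCode;
    }
}

class ResourceNotFoundException extends RuntimeException {
    ResourceNotFoundException(String message) {
        super(message);
    }
}

record ApiResponse<T>(boolean success, T data, String errorCode,
                      String errorMessage, Instant timestamp) {
    static <T> ApiResponse<T> ok(T data) {
        return new ApiResponse<>(true, data, null, null, Instant.now());
    }
    static <T> ApiResponse<T> error(String code, String message) {
        return new ApiResponse<>(false, null, code, message, Instant.now());
    }
}

// Hypothetical service-layer helper: runs an API call and converts
// failures into error envelopes instead of letting them propagate.
final class ErrorMapping {
    static <T> ApiResponse<T> call(Supplier<T> apiCall, String notFoundCode) {
        try {
            return ApiResponse.ok(apiCall.get());
        } catch (ResourceNotFoundException e) {
            return ApiResponse.error(notFoundCode, e.getMessage());
        } catch (LangfuseApiException e) {
            // "LANGFUSE_API_ERROR" is an invented code for this sketch.
            return ApiResponse.error("LANGFUSE_API_ERROR",
                    "HTTP " + e.statusCode + ": " + e.getMessage());
        }
    }
}
```

A call such as ErrorMapping.call(() -> fetchTrace(id), "TRACE_NOT_FOUND"), where fetchTrace is a placeholder for a client call, would then yield either a success envelope or an error envelope, never a stack trace.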

    com.langfuse.mcp
    ├── LangfuseMcpApplication.java         @SpringBootApplication @ConfigurationPropertiesScan
    ├── config/
    │   ├── LangfuseProperties.java         @ConfigurationProperties: publicKey, secretKey, host, timeout, readOnly
    │   ├── LangfuseClientConfig.java       RestClient bean: Basic-Auth, Apache HttpComponents 5, configurable timeout
    │   └── JacksonConfig.java              Primary ObjectMapper (JSR310, ignore unknown fields)
    ├── client/
    │   └── LangfuseApiClient.java          HTTP gateway (GET/POST/PATCH/DELETE); typed exceptions; UriComponentsBuilder queries
    ├── controller/
    │   └── PingController.java             GET /ping → {"status":"ok"}
    ├── exception/
    │   ├── LangfuseApiException.java       Wraps HTTP/connectivity errors: statusCode + endpoint
    │   └── ResourceNotFoundException.java  Thrown on HTTP 404
    ├── dto/
    │   ├── common/                         ApiResponse · PagedResponse · PaginationMeta
    │   ├── request/                        Filter/get request classes (12 classes)
    │   └── response/                       Response classes (19 classes; JsonNode for open-schema fields)
    ├── service/                            Interfaces (15): Trace · Session · Prompt · PromptWrite · Dataset · DatasetRun
    │   │                                   · Score · AnnotationQueue · Comment · Model · LlmConnection · Project · User · Schema · CostMetrics
    │   └── impl/                           *ServiceImpl (15): business logic, filtering, error mapping
    ├── tools/                              @McpTool classes (15): param validation, delegation, agent-friendly descriptions
    │   ├── TraceTools.java                 (8 tools)
    │   ├── SessionTools.java               (3 tools)
    │   ├── PromptTools.java                (2 tools)
    │   ├── PromptWriteTools.java           (3 tools)
    │   ├── DatasetTools.java               (7 tools)
    │   ├── DatasetRunTools.java            (5 tools)
    │   ├── ScoreTools.java                 (6 tools)
    │   ├── AnnotationQueueTools.java       (8 tools)
    │   ├── CommentTools.java               (3 tools)
    │   ├── ModelTools.java                 (4 tools)
    │   ├── LlmConnectionTools.java         (2 tools)
    │   ├── ProjectTools.java               (1 tool)
    │   ├── UserTools.java                  (1 tool)
    │   ├── SchemaTools.java                (1 tool)
    │   └── CostMetricsTools.java           (1 tool)
    └── util/
        └── JsonPageMapper.java             Centralised JSON → PagedResponse mapper (no duplication)


Run the tests:

    mvn test

Test coverage includes:

- LangfusePropertiesBindingTest: config binding from application-test.yml and property-level validation
- PromptWriteServiceImplTest: service logic for prompt create / delete / label update
- ProjectServiceImplTest: project API response mapping
- ObservationServiceImplTest: observation fetch and field mapping
- MetricsServiceImplTest: metrics aggregation logic

Tests run with spring.ai.mcp.server.enabled=false (set in src/test/resources/application-test.yml) so no MCP transport is started during test execution.

Connectivity issue, not a code bug. Check:

- LANGFUSE_HOST points to a running Langfuse instance
- The host is reachable from the JVM process
- For Docker: use host.docker.internal instead of localhost
- The scheme matches your server (http:// vs https://)
- Confirm the API is up: curl $LANGFUSE_HOST/api/public/health

A required parameter was not provided. All required = true parameters are validated at the tool layer before any HTTP call is made.

Increase the timeout:

    export LANGFUSE_TIMEOUT=60s

- Confirm the server is running: curl http://localhost:8080/actuator/health
- Confirm the MCP endpoint is reachable: curl http://localhost:8080/ping
- Check that the client config URL points to http://localhost:8080/mcp
- Inspect all available tools: npx @modelcontextprotocol/inspector http://localhost:8080/mcp

delete_model only works for custom model definitions you have created. To override a Langfuse-managed model's pricing, create a new custom model with the same modelName.