5 changes: 5 additions & 0 deletions cookbook/_routes.json
@@ -364,6 +364,11 @@
"docsPath": "integrations/frameworks/llamaindex",
"isGuide": true
},
{
"notebook": "integration_alibaba_cloud_model_studio.ipynb",
"docsPath": "integrations/model-providers/alibaba-cloud-model-studio",
"isGuide": true
},
{
"notebook": "integration_amazon_bedrock.ipynb",
"docsPath": "integrations/model-providers/amazon-bedrock",
368 changes: 368 additions & 0 deletions cookbook/integration_alibaba_cloud_model_studio.ipynb
@@ -0,0 +1,368 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<!-- NOTEBOOK_METADATA source: \"\\u26a0\\ufe0f Jupyter Notebook\" title: \"Integrate Alibaba Cloud Model Studio (DashScope) with Langfuse\" sidebarTitle: \"Alibaba Cloud Model Studio\" description: \"Guide on integrating Alibaba Cloud Model Studio (DashScope) Qwen models with Langfuse for observability, tracing, and cost tracking.\" category: \"Integrations\" -->"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Observability for Alibaba Cloud Model Studio (DashScope) with Langfuse\n",
"\n",
"This cookbook demonstrates how to integrate [Alibaba Cloud Model Studio](https://www.alibabacloud.com/en/product/model-studio) (DashScope) with [Langfuse](https://langfuse.com) for full observability of your Qwen model calls.\n",
"\n",
"DashScope provides an **OpenAI-compatible API endpoint**, which means you can use Langfuse's drop-in replacement for the OpenAI SDK to automatically trace all your calls with zero additional code.\n",
"\n",
"We will cover:\n",
"- Basic chat completions with automatic tracing\n",
"- Streaming responses\n",
"- Embedding model tracing\n",
"- `@observe()` decorator for grouping multiple generations\n",
"- Multi-model comparison (qwen-plus / qwen-turbo / qwen-max) in a single trace\n",
"- Viewing traces in the Langfuse dashboard"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **What is Alibaba Cloud Model Studio (DashScope)?** [Model Studio](https://www.alibabacloud.com/en/product/model-studio) is Alibaba Cloud's AI model service platform, providing access to the Qwen family of large language models (qwen-plus, qwen-turbo, qwen-max) and embedding models via an OpenAI-compatible API."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **What is Langfuse?** [Langfuse](https://langfuse.com) is an open source LLM engineering platform that helps teams trace API calls, monitor performance, and debug issues in their AI applications."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<!-- STEPS_START -->\n",
"## Step 1: Install Dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install openai langfuse --quiet"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": "## Step 2: Set Up Environment Variables\n\nYou need:\n- A **DashScope API key** from [Alibaba Cloud Model Studio console](https://bailian.console.alibabacloud.com/) (international) or [DashScope console](https://dashscope.console.aliyun.com/) (China)\n- **Langfuse credentials** from [Langfuse Cloud](https://cloud.langfuse.com) or your self-hosted instance\n\nAlibaba Cloud Model Studio is available in multiple regions. Use the corresponding API endpoint:\n\n| Region | Endpoint |\n|---|---|\n| International (Singapore) | `https://dashscope-intl.aliyuncs.com/compatible-mode/v1` |\n| US (Virginia) | `https://dashscope-us.aliyuncs.com/compatible-mode/v1` |\n| China (Beijing) | `https://dashscope.aliyuncs.com/compatible-mode/v1` |\n\n> **Note:** API keys are region-specific and not interchangeable. Ensure your key matches the endpoint region."
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "import os\n\n# Get keys for your project from the project settings page\n# https://cloud.langfuse.com\nos.environ[\"LANGFUSE_PUBLIC_KEY\"] = \"pk-lf-...\"\nos.environ[\"LANGFUSE_SECRET_KEY\"] = \"sk-lf-...\"\nos.environ[\"LANGFUSE_BASE_URL\"] = \"https://cloud.langfuse.com\" # 🇪🇺 EU region\n# os.environ[\"LANGFUSE_BASE_URL\"] = \"https://us.cloud.langfuse.com\" # 🇺🇸 US region\n\n# Get your API key from Alibaba Cloud Model Studio console\n# https://bailian.console.alibabacloud.com/\nos.environ[\"DASHSCOPE_API_KEY\"] = \"sk-...\""
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Verify the Langfuse connection:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langfuse import get_client\n",
"\n",
"langfuse = get_client()\n",
"\n",
"if langfuse.auth_check():\n",
" print(\"Langfuse client is authenticated and ready!\")\n",
"else:\n",
" print(\"Authentication failed. Please check your credentials and host.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 3: OpenAI Drop-in Replacement for DashScope\n",
"\n",
"Since DashScope exposes an OpenAI-compatible endpoint, we use the Langfuse drop-in replacement for the OpenAI SDK. Simply import `openai` from `langfuse.openai` and point the `base_url` to DashScope's endpoint.\n",
"\n",
"All API calls are then automatically traced by Langfuse."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "# Instead of: from openai import OpenAI\nfrom langfuse.openai import openai\n\nclient = openai.OpenAI(\n api_key=os.environ.get(\"DASHSCOPE_API_KEY\"),\n base_url=\"https://dashscope-intl.aliyuncs.com/compatible-mode/v1\",\n)"
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Basic Chat Completion\n",
"\n",
"A simple chat completion request using `qwen-plus`. The trace is automatically captured in Langfuse."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = client.chat.completions.create(\n",
" model=\"qwen-plus\",\n",
" messages=[\n",
" {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n",
" {\"role\": \"user\", \"content\": \"What is Langfuse and how does it help with LLM observability?\"},\n",
" ],\n",
" name=\"qwen-plus-basic\",\n",
")\n",
"\n",
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 4: Streaming Chat Completion\n",
"\n",
"Streaming works out of the box. Langfuse captures the full streamed response and token usage."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"stream = client.chat.completions.create(\n",
" model=\"qwen-plus\",\n",
" messages=[\n",
" {\"role\": \"system\", \"content\": \"You are a creative storyteller.\"},\n",
" {\"role\": \"user\", \"content\": \"Tell me a short story about a cloud that wanted to become a river.\"},\n",
" ],\n",
" stream=True,\n",
" name=\"qwen-plus-streaming\",\n",
")\n",
"\n",
"for chunk in stream:\n",
"    # Guard against chunks without choices (e.g., usage-only chunks)\n",
"    if chunk.choices and chunk.choices[0].delta.content:\n",
"        print(chunk.choices[0].delta.content, end=\"\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": "## Step 5: Embedding Model Tracing\n\nDashScope also provides embedding models. Langfuse traces these calls just like chat completions.\n\n> **Note:** Embedding models (e.g., `text-embedding-v3`) are currently available in the **China (Beijing)** and **International (Singapore)** regions only. They are not available in the US (Virginia) region."
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"embedding_response = client.embeddings.create(\n",
" model=\"text-embedding-v3\",\n",
" input=[\"Langfuse provides open source LLM observability.\"],\n",
" name=\"dashscope-embedding\",\n",
")\n",
"\n",
"print(f\"Embedding dimension: {len(embedding_response.data[0].embedding)}\")\n",
"print(f\"First 5 values: {embedding_response.data[0].embedding[:5]}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 6: Group Multiple Generations with `@observe()`\n",
"\n",
"Use the `@observe()` decorator to group multiple LLM calls into a single trace. This is useful when your application logic involves chaining several model calls."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langfuse import observe\n",
"\n",
"@observe()\n",
"def translate_and_summarize(text: str) -> str:\n",
" # Generation 1: Translate to English\n",
" translation = client.chat.completions.create(\n",
" model=\"qwen-plus\",\n",
" messages=[\n",
" {\"role\": \"system\", \"content\": \"You are a translator. Translate the following text to English.\"},\n",
" {\"role\": \"user\", \"content\": text},\n",
" ],\n",
" name=\"translate\",\n",
" ).choices[0].message.content\n",
"\n",
" # Generation 2: Summarize the translation\n",
" summary = client.chat.completions.create(\n",
" model=\"qwen-plus\",\n",
" messages=[\n",
" {\"role\": \"system\", \"content\": \"You are a summarizer. Provide a one-sentence summary.\"},\n",
" {\"role\": \"user\", \"content\": translation},\n",
" ],\n",
" name=\"summarize\",\n",
" ).choices[0].message.content\n",
"\n",
" return summary\n",
"\n",
"result = translate_and_summarize(\n",
" \"\\u4e91\\u8ba1\\u7b97\\u662f\\u4e00\\u79cd\\u901a\\u8fc7\\u4e92\\u8054\\u7f51\\u63d0\\u4f9b\\u8ba1\\u7b97\\u670d\\u52a1\\u7684\\u6280\\u672f\\uff0c\\u5305\\u62ec\\u670d\\u52a1\\u5668\\u3001\\u5b58\\u50a8\\u3001\\u6570\\u636e\\u5e93\\u548c\\u7f51\\u7edc\\u7b49\\u8d44\\u6e90\\u3002\"\n",
")\n",
"print(result)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 7: Multi-Model Comparison\n",
"\n",
"Compare responses from different Qwen models (qwen-turbo, qwen-plus, qwen-max) in a single trace. This is useful for evaluating model quality, latency, and cost."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langfuse import observe\n",
"\n",
"@observe()\n",
"def compare_qwen_models(prompt: str) -> dict:\n",
" models = [\"qwen-turbo\", \"qwen-plus\", \"qwen-max\"]\n",
" results = {}\n",
"\n",
" for model in models:\n",
" response = client.chat.completions.create(\n",
" model=model,\n",
" messages=[\n",
" {\"role\": \"system\", \"content\": \"You are a helpful assistant. Be concise.\"},\n",
" {\"role\": \"user\", \"content\": prompt},\n",
" ],\n",
" name=f\"{model}-comparison\",\n",
" temperature=0.7,\n",
" max_tokens=200,\n",
" )\n",
" results[model] = response.choices[0].message.content\n",
"\n",
" return results\n",
"\n",
"comparison = compare_qwen_models(\"Explain what makes a good API design in three bullet points.\")\n",
"\n",
"for model, answer in comparison.items():\n",
" print(f\"\\n--- {model} ---\")\n",
" print(answer)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 8: Enhance Tracing (Optional)\n",
"\n",
"You can enrich your DashScope traces with additional context:\n",
"\n",
"- Add [metadata](https://langfuse.com/docs/tracing-features/metadata), [tags](https://langfuse.com/docs/tracing-features/tags), [log levels](https://langfuse.com/docs/tracing-features/log-levels), and [user IDs](https://langfuse.com/docs/tracing-features/users) to traces\n",
"- Group traces by [sessions](https://langfuse.com/docs/tracing-features/sessions)\n",
"- Use [Langfuse Prompt Management](https://langfuse.com/docs/prompts/get-started) and link prompts to traces\n",
"- Add [scores](https://langfuse.com/docs/scores/custom) to traces\n",
"\n",
"Visit the [OpenAI SDK cookbook](https://langfuse.com/guides/cookbook/integration_openai_sdk) to see more examples on passing additional parameters."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = client.chat.completions.create(\n",
" model=\"qwen-plus\",\n",
" messages=[\n",
" {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n",
" {\"role\": \"user\", \"content\": \"What are the benefits of observability in LLM applications?\"},\n",
" ],\n",
" name=\"qwen-plus-enhanced\",\n",
" metadata={\n",
" \"langfuse_tags\": [\"dashscope\", \"qwen-plus\", \"demo\"],\n",
" \"langfuse_user_id\": \"user-123\",\n",
" \"langfuse_session_id\": \"session-abc\",\n",
" \"langfuse_metadata\": {\"use_case\": \"observability-demo\", \"region\": \"cn\"},\n",
" },\n",
")\n",
"\n",
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 9: View Traces in Langfuse\n",
"\n",
"After running the examples above, navigate to [Langfuse Cloud](https://cloud.langfuse.com) (or your self-hosted instance) to see detailed traces including:\n",
"\n",
"- Request and response content\n",
"- Token usage and latency metrics\n",
"- Cost tracking per model\n",
"- Nested traces from `@observe()` decorated functions\n",
"\n",
"For accurate cost tracking, define custom model prices for your DashScope models in the Langfuse dashboard ([see docs](https://langfuse.com/docs/model-usage-and-cost)).\n",
"<!-- STEPS_END -->"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<!-- MARKDOWN_COMPONENT name: \"LearnMore\" path: \"@/components-mdx/integration-learn-more.mdx\" -->"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}