This document explains how to integrate the SenseNova LLM API on SenseTime's SenseCore platform.
- 1. Sign up and obtain an API Key
- 2. Using with Agent Frameworks
- 3. Models
- 4. Basic invocation
- 5. Recommended sampling parameters
- 6. Multi-turn dialogue
- 7. Image (multimodal) input
- 8. Streaming output
- 9. Calling via the OpenAI SDK
- 10. Error codes
Visit the SenseNova platform and complete registration plus identity verification:
https://platform.sensenova.cn/console
In the console sidebar, go to Management Center → API Key Management → Create API Key.
After creation, copy and store the key immediately — the full value is shown only once. If a key leaks, delete or disable it on the same page and create a new one.
In the examples below, replace every <YOUR_API_KEY> with the key you created.
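To keep the key out of source files, it can be read from an environment variable, as the Python examples in this document do with `SENSENOVA_API_KEY`. The sketch below is a minimal illustration of that idea; the helper name `build_headers` is hypothetical and not part of any SenseNova SDK.

```python
import os


def build_headers(api_key=None):
    """Return HTTP headers for Bearer-token auth.

    Hypothetical helper: falls back to the SENSENOVA_API_KEY
    environment variable when no key is passed explicitly.
    """
    key = api_key or os.environ.get("SENSENOVA_API_KEY", "")
    # Reject an empty key or the unreplaced documentation placeholder.
    if not key or key == "<YOUR_API_KEY>":
        raise ValueError("Set SENSENOVA_API_KEY or pass a real API key")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing early on the placeholder value makes a forgotten substitution show up as a local error rather than a 401 from the server.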
SenseNova 6.7 Flash-Lite needs an agent runtime + the official skill library to deliver an end-to-end office-task workflow.
- Recommended runtime: OpenClaw or hermes-agent.
- Recommended LLM: pair it with the SenseNova platform API — use the API Key from Section 1 (free token plan available).
- Install & setup: see SenseNova-Skills INSTALL.md.
Recommended: just ask the agent to install them for you. Hand it the repo URL and let it clone and copy the contents into the right directory, e.g.:
"Please install https://github.com/OpenSenseNova/SenseNova-Skills into your skills directory."
After installation you may need to restart the agent service manually before the new skills are picked up.
| Agent | Target directory |
|---|---|
| OpenClaw | ~/.openclaw/skills/ |
| hermes-agent | ~/.hermes/skills/ |
Prefer to install manually?
Clone the repo, then copy (or symlink) the subdirectories under skills/ into the target directory:
git clone https://github.com/OpenSenseNova/SenseNova-Skills.git --depth=1
mkdir -p ~/.openclaw/skills
cp -r SenseNova-Skills/skills/* ~/.openclaw/skills/

For Hermes, just swap the directory to ~/.hermes/skills/.
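For scripted setups, the copy step above could also be done from Python. This is only a sketch under the assumption that the repository has already been cloned locally; `install_skills` is a hypothetical helper name, and it copies only the subdirectories under `skills/`.

```python
import shutil
from pathlib import Path


def install_skills(repo_dir, target_dir):
    """Copy each subdirectory of <repo_dir>/skills into target_dir.

    Returns the sorted names of the skill directories that were copied.
    """
    source = Path(repo_dir).expanduser() / "skills"
    target = Path(target_dir).expanduser()
    target.mkdir(parents=True, exist_ok=True)  # equivalent to `mkdir -p`
    installed = []
    for entry in sorted(source.iterdir()):
        if entry.is_dir():
            # dirs_exist_ok=True (Python 3.8+) lets a re-run overwrite an earlier copy.
            shutil.copytree(entry, target / entry.name, dirs_exist_ok=True)
            installed.append(entry.name)
    return installed
```

A call such as `install_skills("SenseNova-Skills", "~/.openclaw/skills")` would mirror the shell commands; for Hermes, pass `~/.hermes/skills` instead.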
SenseNova 6.7 Flash-Lite — a lightweight multimodal agent model built for real-world workflows.
- Lightweight & efficient, balancing quality, cost, and deployability
- Office-tuned, reliably powering complex long-horizon tasks
- Native multimodal architecture, well suited to real office content
- Better token efficiency, keeping complex tasks affordable
curl --location 'https://token.sensenova.cn/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <YOUR_API_KEY>' \
--data '{
"model": "sensenova-6.7-flash-lite",
"max_tokens": 2000,
"messages": [
{"role": "user", "content": "Hi, please briefly introduce yourself."}
]
}'

import os
import requests
API_KEY = os.environ["SENSENOVA_API_KEY"]
URL = "https://token.sensenova.cn/v1/chat/completions"
resp = requests.post(
URL,
headers={
"Authorization": f"Bearer {API_KEY}", # Bearer token auth
"Content-Type": "application/json",
},
json={
"model": "sensenova-6.7-flash-lite",
"max_tokens": 2000, # Max output tokens (incl. reasoning)
"messages": [
{"role": "user", "content": "Hi, please briefly introduce yourself."},
],
},
timeout=60,
)
resp.raise_for_status()
data = resp.json()
print(data["choices"][0]["message"])

Example response:
{
"id": "da48c12a-...",
"request_id": "da48c12a-...",
"model": "sensenova-6.7-flash-lite",
"object": "chat.completion",
"created": 1776952631,
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "Hi! I'm SenseNova...",
"reasoning": "Thinking Process: ..."
},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 37,
"completion_tokens": 762,
"total_tokens": 799,
"prompt_tokens_details": {"cached_tokens": 0, "audio_tokens": 0}
}
}

- `finish_reason`: `stop` for normal completion, `length` if `max_tokens` is hit.
- `message.content`: the final answer.
- `message.reasoning`: the chain of thought from a reasoning model.
- `total_tokens = prompt_tokens + completion_tokens`. The upper bound is the model's context window; there is no request parameter to cap it directly.
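The response fields described above can be pulled out with a small helper. This is a sketch, not an official client: `summarize_completion` is a hypothetical name, and the sample dictionary simply mirrors the example response shown in this section.

```python
def summarize_completion(data):
    """Extract the commonly used fields from a chat-completion response dict."""
    choice = data["choices"][0]
    message = choice["message"]
    usage = data.get("usage", {})
    return {
        "answer": message.get("content", ""),
        # `reasoning` is only present when the model emits a chain of thought.
        "reasoning": message.get("reasoning"),
        # finish_reason == "length" means max_tokens cut the output short.
        "truncated": choice.get("finish_reason") == "length",
        "total_tokens": usage.get("total_tokens"),
    }


sample = {
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": "Hi! I'm SenseNova...",
            "reasoning": "Thinking Process: ...",
        },
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 37, "completion_tokens": 762, "total_tokens": 799},
}
summary = summarize_completion(sample)
print(summary["answer"], summary["truncated"], summary["total_tokens"])
```

Because `max_tokens` also covers reasoning tokens, checking `truncated` is a cheap way to notice that the budget ran out before the final answer was complete.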
We suggest the following parameter combinations by mode and task type:
| Mode | Task type | temperature | top_p | top_k | min_p | presence_penalty | repetition_penalty |
|---|---|---|---|---|---|---|---|
| Thinking mode | General | 1.0 | 0.95 | 20 | 0.0 | 1.5 | 1.0 |
| Thinking mode | Precise coding (e.g. WebDev) | 0.6 | 0.95 | 20 | 0.0 | 0.0 | 1.0 |
| Instruct (non-thinking) mode | General | 0.7 | 0.8 | 20 | 0.0 | 1.5 | 1.0 |
| Instruct (non-thinking) mode | Reasoning | 1.0 | 1.0 | 40 | 0.0 | 2.0 | 1.0 |
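The table can be kept in code as a lookup so each request picks its preset by mode and task type. The sketch below copies the values from the table; the key names (`"thinking"`, `"instruct"`, `"precise_coding"`, and so on) and the `build_payload` helper are illustrative assumptions. It only sets sampling parameters and does not by itself switch the model between thinking and non-thinking modes.

```python
# Sampling presets copied from the table; the dictionary keys are illustrative labels.
SAMPLING_PRESETS = {
    ("thinking", "general"): {
        "temperature": 1.0, "top_p": 0.95, "top_k": 20,
        "min_p": 0.0, "presence_penalty": 1.5, "repetition_penalty": 1.0,
    },
    ("thinking", "precise_coding"): {
        "temperature": 0.6, "top_p": 0.95, "top_k": 20,
        "min_p": 0.0, "presence_penalty": 0.0, "repetition_penalty": 1.0,
    },
    ("instruct", "general"): {
        "temperature": 0.7, "top_p": 0.8, "top_k": 20,
        "min_p": 0.0, "presence_penalty": 1.5, "repetition_penalty": 1.0,
    },
    ("instruct", "reasoning"): {
        "temperature": 1.0, "top_p": 1.0, "top_k": 40,
        "min_p": 0.0, "presence_penalty": 2.0, "repetition_penalty": 1.0,
    },
}


def build_payload(mode, task, messages, max_tokens=2000):
    """Return a request body that merges the chosen preset with the messages."""
    preset = SAMPLING_PRESETS[(mode, task)]  # KeyError for unknown combinations
    return {
        "model": "sensenova-6.7-flash-lite",
        "max_tokens": max_tokens,
        **preset,
        "messages": messages,
    }
```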
Example — "general task + thinking mode":
curl 'https://token.sensenova.cn/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <YOUR_API_KEY>' \
-d '{
"model": "sensenova-6.7-flash-lite",
"max_tokens": 2000,
"temperature": 1.0,
"top_p": 0.95,
"top_k": 20,
"min_p": 0.0,
"presence_penalty": 1.5,
"repetition_penalty": 1.0,
"messages": [
{"role": "user", "content": "Write a poem about spring."}
]
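}'

# Python equivalent; reuses API_KEY and URL from the basic invocation example.
headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
resp = requests.post(URL, headers=headers, timeout=60, json={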
"model": "sensenova-6.7-flash-lite",
"max_tokens": 2000,
"temperature": 1.0,
"top_p": 0.95,
"top_k": 20,
"min_p": 0.0,
"presence_penalty": 1.5,
"repetition_penalty": 1.0,
"messages": [
{"role": "user", "content": "Write a poem about spring."},
],
})
print(resp.json()["choices"][0]["message"]["content"])

curl 'https://token.sensenova.cn/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <YOUR_API_KEY>' \
-d '{
"model": "sensenova-6.7-flash-lite",
"max_tokens": 2000,
"messages": [
{"role": "system", "content": "You are a concise assistant. Reply in 20 words or fewer."},
{"role": "user", "content": "What is the capital of France?"},
{"role": "assistant", "content": "Paris."},
{"role": "user", "content": "And Germany?"}
]
}'

# Maintain dialogue history: append in chronological order.
history = [
    {"role": "system", "content": "You are a concise assistant. Reply in 20 words or fewer."},
]

def chat(user_msg: str) -> str:
    history.append({"role": "user", "content": user_msg})
    resp = requests.post(URL, headers=headers, timeout=60, json={
        "model": "sensenova-6.7-flash-lite",
        "max_tokens": 2000,
        "messages": history,
    })
    resp.raise_for_status()
    reply = resp.json()["choices"][0]["message"].get("content", "")
    # When echoing history back, only keep `content`; never include `reasoning`.
    history.append({"role": "assistant", "content": reply})
    return reply

print(chat("What is the capital of France?"))  # -> Paris.
print(chat("And Germany?"))  # -> Berlin.

Notes:
- `role` may be `system`, `user`, or `assistant`.
- When echoing the previous reply, include only `content`; do not echo back `reasoning`.
- Multi-turn dialogue significantly increases `prompt_tokens`. For long histories, summarize or truncate.
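One simple way to truncate a long history is to keep the system prompt plus only the most recent messages. The sketch below illustrates that approach; `trim_history` and its `max_messages` parameter are hypothetical, and a production setup might summarize older turns instead of dropping them.

```python
def trim_history(history, max_messages=6):
    """Keep system messages plus the most recent non-system messages.

    Also strips any `reasoning` field so it is never echoed back.
    """
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    recent = rest[-max_messages:] if max_messages > 0 else []
    # Drop leading assistant replies so the kept window starts on a user turn.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    cleaned = [{"role": m["role"], "content": m["content"]} for m in recent]
    return system + cleaned


history = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris.", "reasoning": "..."},
    {"role": "user", "content": "And Germany?"},
    {"role": "assistant", "content": "Berlin.", "reasoning": "..."},
    {"role": "user", "content": "And Italy?"},
]
trimmed = trim_history(history, max_messages=3)
print([m["content"] for m in trimmed])
```

Trimming by message count is crude; it bounds `prompt_tokens` only loosely, since individual messages can differ widely in length.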
SenseNova 6.7 Flash-Lite accepts images in the OpenAI Vision-compatible format, supporting both URL and Base64.
curl 'https://token.sensenova.cn/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <YOUR_API_KEY>' \
-d '{
"model": "sensenova-6.7-flash-lite",
"max_tokens": 2000,
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "What is in the image?"},
{"type": "image_url", "image_url": {
"url": "https://example.com/photo.jpg"
}}
]
}]
}'

The server downloads the URL, so the image must be anonymously accessible. If it is unreachable, you'll get an `image down failed` error; switch to Base64 or upload the image to an accessible object store.
resp = requests.post(URL, headers=headers, timeout=120, json={
"model": "sensenova-6.7-flash-lite",
"max_tokens": 2000,
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "What is in the image?"},
{"type": "image_url", "image_url": {
"url": "https://example.com/photo.jpg",
}},
],
}],
})
print(resp.json()["choices"][0]["message"]["content"])

# Build a Data URL (Linux/macOS).
B64=$(base64 -w 0 photo.png 2>/dev/null || base64 -i photo.png)
curl 'https://token.sensenova.cn/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <YOUR_API_KEY>' \
-d '{
"model": "sensenova-6.7-flash-lite",
"max_tokens": 2000,
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "Describe this image."},
{"type": "image_url", "image_url": {
"url": "data:image/png;base64,'"${B64}"'"
}}
]
}]
}'

import base64
import mimetypes
import requests

def to_data_url(path: str) -> str:
    # Guess the MIME type from the file extension; fall back to PNG.
    mime = mimetypes.guess_type(path)[0] or "image/png"
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return f"data:{mime};base64,{b64}"

resp = requests.post(URL, headers=headers, timeout=120, json={
    "model": "sensenova-6.7-flash-lite",
    "max_tokens": 2000,
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {
                "url": to_data_url("./photo.png"),
            }},
        ],
    }],
})
print(resp.json()["choices"][0]["message"]["content"])

To send multiple images, append additional `image_url` objects within the same message's `content` array:
resp = requests.post(URL, headers=headers, timeout=120, json={
"model": "sensenova-6.7-flash-lite",
"max_tokens": 2000,
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "Compare the differences between the two images."},
{"type": "image_url", "image_url": {"url": "https://example.com/a.jpg"}},
{"type": "image_url", "image_url": {"url": "https://example.com/b.jpg"}},
],
}],
})

Image guidelines:
- Common formats are supported (PNG, JPEG, …).
- Compress to a long edge of at most 2048 px before upload to save tokens and latency.
- Prefer URL for large images; Base64 inflates the request body considerably.
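The two sizing guidelines above come down to simple arithmetic, sketched below. `fit_long_edge` computes the target dimensions for a 2048 px long edge while preserving aspect ratio, and `base64_size` shows why Base64 inflates the body: every 3 bytes become 4 characters. Both function names are illustrative, and the actual resizing would be done with an image library of your choice.

```python
import math


def fit_long_edge(width, height, max_edge=2048):
    """Return (width, height) scaled so the longer side is at most max_edge."""
    long_edge = max(width, height)
    if long_edge <= max_edge:
        return width, height  # already small enough; no upscaling
    scale = max_edge / long_edge
    return max(1, round(width * scale)), max(1, round(height * scale))


def base64_size(num_bytes):
    """Character length of the padded Base64 encoding of num_bytes bytes."""
    return 4 * math.ceil(num_bytes / 3)


print(fit_long_edge(4096, 2048))
print(base64_size(3_000_000))
```

For a 3 MB image the encoded payload is 4 million characters, roughly a one-third increase before JSON overhead.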
Set stream to true and the server pushes incremental results via Server-Sent Events (SSE).
curl -N 'https://token.sensenova.cn/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <YOUR_API_KEY>' \
-d '{
"model": "sensenova-6.7-flash-lite",
"stream": true,
"max_tokens": 2000,
"messages": [{"role": "user", "content": "Write a poem about spring."}]
}'

Event-stream excerpt:
data: {"choices":[{"delta":{"reasoning":"Thinking"},"finish_reason":""}], ...}
data: {"choices":[{"delta":{"content":"Spring"},"finish_reason":""}], ...}
data: {"choices":[{"delta":{"content":" wind"},"finish_reason":""}], ...}
...
data: {"choices":[{"delta":{},"finish_reason":"stop"}], ...}
data: {"choices":[], "usage":{"prompt_tokens":38,"completion_tokens":441,"total_tokens":479}}
data: [DONE]
import json
import requests

with requests.post(URL, headers=headers, stream=True, timeout=120, json={
    "model": "sensenova-6.7-flash-lite",
    "stream": True,
    "max_tokens": 2000,
    "messages": [{"role": "user", "content": "Write a poem about spring."}],
}) as r:
    r.raise_for_status()
    r.encoding = "utf-8"  # SSE is UTF-8; set it so non-ASCII text decodes correctly
    for line in r.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data:"):
            continue
        payload = line[5:].strip()
        if payload == "[DONE]":  # End-of-stream marker
            break
        chunk = json.loads(payload)
        # Before finishing, the server pushes a usage-only event (choices is empty).
        if not chunk.get("choices"):
            print("\n[usage]", chunk.get("usage"))
            continue
        delta = chunk["choices"][0].get("delta", {})
        # `delta` may carry `reasoning` (thinking) or `content` (final text).
        # Most front-ends only display `content` and ignore `reasoning`.
        if delta.get("content"):
            print(delta["content"], end="", flush=True)

Key points:
- Each event begins with `data:` and is separated by a blank line.
- `delta` may contain `reasoning` (incremental thinking) or `content` (incremental output). Reasoning models emit a large amount of `reasoning` before producing `content`.
- Just before completion, the server pushes a single usage-only event (`choices: []`).
- Receiving `data: [DONE]` indicates the stream is finished; clients should stop reading.
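The key points above can be exercised offline by folding a list of SSE lines into a final result. The sketch below is a minimal illustration; `collect_stream` is a hypothetical helper, and the sample lines are modelled on the event-stream excerpt, with the elided fields left out so that each payload is valid JSON.

```python
import json


def collect_stream(lines):
    """Fold an iterable of SSE lines into reasoning, content, and usage."""
    reasoning, content = [], []
    usage, finish_reason = None, None
    for line in lines:
        if not line or not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]  # usage-only event has empty choices
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            reasoning.append(delta.get("reasoning") or "")
            content.append(delta.get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return {
        "reasoning": "".join(reasoning),
        "content": "".join(content),
        "finish_reason": finish_reason,
        "usage": usage,
    }


SAMPLE_LINES = [
    'data: {"choices":[{"delta":{"reasoning":"Thinking"},"finish_reason":""}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Spring"},"finish_reason":""}]}',
    "",
    'data: {"choices":[{"delta":{"content":" wind"},"finish_reason":""}]}',
    "",
    'data: {"choices":[{"delta":{},"finish_reason":"stop"}]}',
    "",
    'data: {"choices":[], "usage":{"prompt_tokens":38,"completion_tokens":441,"total_tokens":479}}',
    "",
    "data: [DONE]",
]
result = collect_stream(SAMPLE_LINES)
print(result["content"], result["finish_reason"], result["usage"]["total_tokens"])
```

Keeping `reasoning` and `content` in separate buffers makes it easy to display only the final text while still logging the thinking trace if needed.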
The endpoint is compatible with OpenAI's Chat Completions protocol, so you can use openai-python directly:
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ["SENSENOVA_API_KEY"],
base_url="https://token.sensenova.cn/v1",
)
completion = client.chat.completions.create(
model="sensenova-6.7-flash-lite",
max_tokens=2000,
temperature=0.7,
messages=[{"role": "user", "content": "Hello"}],
)
print(completion.choices[0].message.content)

Streaming:
stream = client.chat.completions.create(
    model="sensenova-6.7-flash-lite",
    max_tokens=2000,
    stream=True,
    messages=[{"role": "user", "content": "Write a poem"}],
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

When using the SDK, the `reasoning` field may not be exposed on the canonical objects. To retrieve it, call the raw HTTP endpoint directly.
Error response shape:
{
"error": {
"message": "model is not found",
"type": "not_found_error",
"code": "5"
}
}

| HTTP | error.type | Meaning | Suggested action |
|---|---|---|---|
| 400 | invalid_request_error | Malformed parameters, e.g. image download failure | Check parameter shape and image URL accessibility |
| 401 | authentication_error | API key invalid or expired | Recreate the key in the console |
| 403 | - | Lacking permission or risk-blocked | Check account permissions and content compliance |
| 404 | not_found_error | Model or endpoint does not exist | Double-check the spelling of `model` |
| 429 | - | Rate limited | Retry with exponential backoff |
| 5xx | - | Server-side issue | Retry later; if persistent, file a ticket |
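The retry advice for 429 and 5xx responses can be sketched as a retry predicate plus a delay schedule. The base delay, cap, retry count, and jitter strategy below are illustrative assumptions, not values documented by the platform; `should_retry` and `backoff_delays` are hypothetical helper names.

```python
import random


def should_retry(status_code):
    """Retry only rate limits (429) and server-side errors (5xx)."""
    return status_code == 429 or 500 <= status_code <= 599


def backoff_delays(max_retries=5, base=1.0, cap=30.0, jitter=True):
    """Yield one delay in seconds per retry attempt, doubling up to `cap`."""
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        # Full jitter spreads retries out so clients do not retry in lockstep.
        yield random.uniform(0, delay) if jitter else delay


print(list(backoff_delays(jitter=False)))
print(should_retry(429), should_retry(401))
```

In a request loop, sleep for each yielded delay before the next attempt while `should_retry(resp.status_code)` holds, and give up once the generator is exhausted. Errors such as 400, 401, and 404 are not retried because repeating the same request would fail the same way.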
For further support, sign in to the SenseCore console to file a ticket or browse the latest official docs.