Personal single-user Telegram bot powered by a local LLM (Ollama), with a fully configurable persona, persistent layered memory, autonomous scheduled messages and photo generation via Stable Horde. Built to run entirely offline on modest hardware (Raspberry Pi 5 class) for private companion / roleplay use between consenting adults.
Adult content warning — the system optionally supports explicit roleplay and NSFW image generation through Stable Horde. It is meant for personal, local, adult use only. It is not a product intended for public or multi-user deployment.
- Local LLM via Ollama (recommended model: `qwen3.5:4b`, multimodal)
- LangGraph ReAct agent with tool calling
- Fully configurable persona in `config/persona.yaml` (identity, style, mood, examples, rules)
- Layered SQLite memory: raw messages + durable facts + dated events + daily mood + interaction stats
- Automatic fact extraction in background after each user message (LLM extracts personal info)
- Multimodality: the bot receives and "sees" photos you send (contextual caption)
- Autonomous messages (APScheduler): the bot writes you spontaneously during the day
- Daily mood: a mood is randomly picked every morning and influences the tone
- Intimacy level 1-10 that grows with the number of exchanges, modulating tone familiarity
- Photo generation via Stable Horde (free with anonymous API):
  - `/pic <english prompt>` — explicit prompt
  - `/selfie <hint>` — LLM generates scene + caption
- Automatic detection of photo requests in chat text ("send me a picture...")
- Context-aware: if the bot just mentioned a photo and you insist ("go on, send it"), it triggers
- Autonomous: N% of autonomous ticks become spontaneous photos
- Explicit mode on/off: a flag in the YAML enables/disables NSFW language and prompts
- Voice messages (optional): a % of text replies can be synthesized into Telegram voice messages via Piper TTS — fully local CPU, any language supported by Piper
- systemd deployment ready (`deploy/diana-bot.service`)
```
Telegram ──► Bot (chat_id filter) ──► LangGraph ReAct agent
                │                          │
                ├── on_text                ├── ChatOllama (text)
                ├── on_photo (vision)      ├── tools: datetime, remember_this
                ├── /pic /selfie /...      └── DB memory
                └── /fact /event /memory

Scheduler (APScheduler) ──► autonomous msg / photo
SQLite: messages, facts, events, daily_mood, stats, pic_log, autonomous_log
Horde client ──► Stable Horde API ──► photo (img2img / txt2img)
```
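The memory tables named above can be sketched as plain SQLite DDL. The column names below are illustrative assumptions for a few of the tables, not the actual definitions in `src/memory.py`:

```python
import sqlite3

# Illustrative schema only; the real tables live in src/memory.py.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    role TEXT NOT NULL,               -- 'user' or 'assistant'
    content TEXT NOT NULL,
    created_at TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS facts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    text TEXT NOT NULL,               -- durable, always injected into the prompt
    created_at TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    text TEXT NOT NULL,               -- drops out of the prompt after 7 days
    created_at TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS daily_mood (
    day TEXT PRIMARY KEY,             -- YYYY-MM-DD
    mood TEXT NOT NULL
);
"""

def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the DB and create the (illustrative) tables if missing."""
    con = sqlite3.connect(path)
    con.executescript(SCHEMA)
    return con
```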
- Raspberry Pi 5 (8 GB RAM) or equivalent ARM64 / x86_64 Linux machine
- Storage: at least 10 GB free (model ~3.4 GB + venv + data)
- Network access for Telegram + Stable Horde
On more powerful hardware (laptop with GPU, x86 server), replies are correspondingly faster.
- Linux (tested on Raspberry Pi OS bookworm)
- Python 3.11+
- Ollama ≥ 0.20
- systemd (for persistent deployment)
- git, curl
```sh
# Install Ollama (Linux / ARM64)
curl -fsSL https://ollama.com/install.sh | sh

# Verify
ollama --version

# Pull the recommended model (~3.4 GB)
ollama pull qwen3.5:4b
```

Model alternatives (edit `OLLAMA_MODEL` in `.env`):

- `qwen3.5:2b` — faster, lower quality
- `qwen3-vl:4b` — older, still multimodal
```sh
git clone git@github.com:mattabott/DIANA.git diana
cd diana
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip wheel
pip install -r requirements.txt
```

- On Telegram open @BotFather → `/newbot` → follow the prompts
- Copy the bot token
- Open @userinfobot → `/start` → copy your numeric user id
```sh
cp .env.example .env
cp config/persona.yaml.example config/persona.yaml
```

Edit `.env`:

- `TELEGRAM_BOT_TOKEN=<the bot token>`
- `ALLOWED_CHAT_ID=<your user id>`
- other fields have sensible defaults

Edit `config/persona.yaml`:

- `identity.name`, `location`, `occupation`, `appearance`, `short_bio`
- `visual_prompt_prefix` (English) if you want a specific look for generated photos
- `explicit_mode: true|false` to enable/disable adult content
- `personality.traits`, `use_phrases`, `avoid_phrases`, `examples`
- `daily_moods`, `intimacy_levels` → editable
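As an illustration only, a persona file covering the fields above might look like this (all values invented, and the exact nesting may differ; follow `config/persona.yaml.example` for the authoritative layout):

```yaml
identity:
  name: Diana
  location: Lisbon
  occupation: photographer
  appearance: long dark hair, green eyes
  short_bio: easy-going, curious, a bit of a night owl
  visual_prompt_prefix: "photo of an adult 26 year old woman, long dark hair"
explicit_mode: false
personality:
  traits: [playful, warm, teasing]
  use_phrases: ["hey you"]
  avoid_phrases: ["as an AI"]
```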
The bot generates photos through Stable Horde using the anonymous API key `0000000000` → it works immediately but with low priority (photos arrive in 2-10 min).
To speed things up:
- Register an account at https://stablehorde.net/register
- Copy your personal API key
- Paste it in `.env` under `HORDE_API_KEY`
```sh
source venv/bin/activate
python -m src.bot
```

Open Telegram → write to your bot → watch logs in the terminal. Ctrl+C to stop.
```sh
# Run from the project root — install.sh substitutes <USER> and <PROJECT_DIR>
# placeholders in the service file automatically.
sudo ./deploy/install.sh
```

The script:

- Copies `deploy/diana-bot.service` to `/etc/systemd/system/`
- Runs `systemctl daemon-reload`
- Runs `systemctl enable --now` → the bot starts immediately and at system boot
Useful commands:

```sh
sudo systemctl status diana-bot     # status
sudo journalctl -u diana-bot -f     # live logs
sudo systemctl restart diana-bot    # after persona.yaml or code changes
sudo systemctl stop diana-bot       # stop
./deploy/uninstall.sh               # remove completely
```

| Command | What it does |
|---|---|
| `/start` | Initial greeting |
| `/memory` | Show everything the bot knows about you (facts, events, stats) |
| `/fact <text>` | Save a durable fact (always injected into the prompt) |
| `/event <text>` | Save a recent event (injected into the prompt for 7 days) |
| `/pic <english prompt>` | Generate a photo with explicit prompt |
| `/selfie [hint]` | Generate a photo with scene decided by the LLM |
| `/refs` | List currently loaded reference photos |
| `/setref` | Set a reference (send a photo with caption `/setref`) |
| `/clearref` | Remove all references |
| `/ping` | Force an immediate autonomous message (debug) |
Additionally: the bot automatically detects photo requests in chat text (e.g., "send me a photo", "show me"), even as conversational continuations.
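A keyword-based detector of this kind can be quite small. The sketch below is illustrative: the patterns and function name are assumptions, the real logic lives in `src/pic_flow.py`:

```python
import re

# Illustrative patterns; the actual detection lives in src/pic_flow.py.
_PHOTO_REQUEST = re.compile(
    r"\b(send|show)\s+(me\s+)?(a\s+)?(photo|picture|pic|selfie)\b",
    re.IGNORECASE,
)
_CONTINUATION = re.compile(r"\b(go on|send it|do it)\b", re.IGNORECASE)

def wants_photo(text: str, bot_just_offered_photo: bool = False) -> bool:
    """True if the message asks for a photo, directly or as a follow-up."""
    if _PHOTO_REQUEST.search(text):
        return True
    # "go on, send it" only counts if the bot just mentioned a photo
    return bot_just_offered_photo and bool(_CONTINUATION.search(text))
```

The `bot_just_offered_photo` flag is how the context-aware follow-up described above ("go on, send it") can be distinguished from unrelated chatter.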
All text changes about personality, name, city, style go in config/persona.yaml. After each change:
```sh
sudo systemctl restart diana-bot
```

`config/persona.yaml` → `explicit_mode: true|false`. The `scenario_framing` and `explicit_guidance` sections are injected only when it is true. Combine with explicit examples for better results.
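Conceptually, the conditional injection works like the sketch below (the function and key names are assumptions for illustration; the real assembly is in `src/persona.py`):

```python
def build_system_prompt(persona: dict) -> str:
    """Assemble the system prompt, adding NSFW guidance only when enabled."""
    parts = [persona["base_prompt"]]
    if persona.get("explicit_mode"):
        # These sections reach the prompt only when explicit_mode is true
        parts.append(persona["scenario_framing"])
        parts.append(persona["explicit_guidance"])
    return "\n\n".join(parts)
```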
In .env:
- `HORDE_MODELS`: ordered list of preferred Horde models
- `HORDE_SEED`: fixed number → more consistent subject across photos
- `HORDE_NSFW`: true/false for the NSFW flag in Horde requests
In persona.yaml:
- `identity.visual_prompt_prefix`: English description of the subject (always adult: use `adult`, `26 year old`, etc.; NEVER `young`/`teen`/`youthful`, even as negatives they trigger the Horde CSAM filter)
- `identity.visual_shot_types`: random composition patterns (selfie / mirror / etc.)
The bot can reply with a Telegram voice message (waveform + inline playback) for a configurable % of text replies. Fully local, no cloud.
Prerequisites:
- `ffmpeg` installed on the host (used to encode OGG Opus): `sudo apt install ffmpeg`
- A Piper voice model — pick a language matching your persona:
  - English US female: `en_US-amy-medium`
  - English GB female: `en_GB-alba-medium`
  - Italian female: `it_IT-paola-medium`
  - Spanish female: `es_ES-sharvard-medium`
  - German: `de_DE-thorsten-medium`
  - French female: `fr_FR-siwis-medium`
  - See the full catalog at https://huggingface.co/rhasspy/piper-voices
Setup:
```sh
mkdir -p data/voices
cd data/voices
# example: English US female voice
curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/amy/medium/en_US-amy-medium.onnx
curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/amy/medium/en_US-amy-medium.onnx.json
```

Then in `.env`:

```ini
VOICE_ENABLED=true
TTS_VOICE_MODEL=en_US-amy-medium.onnx
VOICE_PROBABILITY=40    # % of replies sent as voice
VOICE_MAX_CHARS=400     # longer replies stay as text
```
Latency on a Raspberry Pi 5 CPU-only: ~0.5-1s extra per voice reply.
- Facts (durable): inserted via `/fact` or auto-extracted by the LLM after each user message
- Events (recent): inserted via `/event`, drop out of the prompt after 7 days
- Mood: randomly picked at the start of the day, persists until midnight
- Intimacy: `1 + interaction_count // 20` (max 10)
.env:
- `AUTONOMOUS_ENABLED`: master on/off
- `AUTONOMOUS_MIN_HOUR` / `AUTONOMOUS_MAX_HOUR`: allowed time window
- `AUTONOMOUS_MAX_PER_DAY`: quota
- `AUTONOMOUS_CHECK_INTERVAL_MIN`: check frequency
- `AUTONOMOUS_WEEKEND`: true/false to run (or not) on Saturday+Sunday
- `PIC_AUTONOMOUS_PROB`: probability (%) that an autonomous tick becomes a photo instead of text
| Issue | Cause / fix |
|---|---|
| Bot does not respond | Check `sudo journalctl -u diana-bot -f`. If Ollama is down: `sudo systemctl status ollama` |
| Extremely slow replies (>3 min) | First model load can take 1-2 min. `OLLAMA_KEEP_ALIVE=24h` avoids the reload |
| Timeout after 600 s | Context too large. Reduce `HISTORY_WINDOW` in `src/bot.py` or reduce `num_ctx` |
| `CorruptPrompt` from Horde | Your prompt contains words that trigger filters (especially CSAM). Avoid `teen`, `young`, `schoolgirl` even in negative prompts. Your IP is timed out for ~2 min afterwards |
| Photo arrives as black banner / censored | Horde CSAM post-filter. The bot detects files < 25 KB and replies "it came out wrong, try again". Reinforce `adult`/`mature` in `visual_prompt_prefix` |
| `KudosUpfront` | Requests > 576 px or > 50 steps require kudos. The 384x576 default is fine. Do not increase to 768 without a registered account and kudos |
| Very long Horde queue (>10 min) | Not much use waiting with the anonymous key. Register at stablehorde.net for initial kudos |
| Safe scenes when asking for NSFW | The LLM refused. The keyword fallback should handle tits/nude/topless/lingerie/shower/bed; add more keywords in `src/pic_prompt.py` (`_EXPLICIT_KEYWORD_MAP`) if needed |
- `.env` contains the Telegram token and chat_id → never commit (it's in `.gitignore`)
- `data/chatbot.db` contains conversation history → never commit
- `data/refs/` contains reference photos → never commit
- The bot hard-filters on `ALLOWED_CHAT_ID`: any other chat is ignored with `UNAUTHORIZED` in the logs
- All processing (LLM, DB, persona) is local on your machine. Only the Telegram API and Stable Horde see outbound traffic
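The hard filter is essentially a one-line guard at the top of every handler. A sketch of its assumed shape (the real check is in `src/bot.py`):

```python
import logging

ALLOWED_CHAT_ID = 123456789  # from .env: your numeric Telegram user id

def is_authorized(chat_id: int) -> bool:
    """Drop every update that is not from the single allowed chat."""
    if chat_id != ALLOWED_CHAT_ID:
        logging.warning("UNAUTHORIZED chat_id=%s", chat_id)
        return False
    return True
```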
```
.
├── .env.example                 # copy to .env and customize
├── .gitignore
├── config/
│   └── persona.yaml.example     # copy to persona.yaml and customize
├── deploy/
│   ├── diana-bot.service        # systemd unit
│   ├── install.sh               # deployment (requires sudo)
│   └── uninstall.sh
├── scripts/
│   ├── backfill_facts.py        # retroactive fact extraction from DB history
│   ├── bench_models.sh          # Ollama model benchmark
│   └── test_ollama.py           # LLM smoke test (text + image)
├── src/
│   ├── __init__.py
│   ├── agent.py                 # ReAct agent + vision LLM + fact extraction
│   ├── bot.py                   # entrypoint, Telegram handlers
│   ├── config.py                # .env loader
│   ├── horde.py                 # Stable Horde client
│   ├── memory.py                # SQLite (Memory + AsyncMemory)
│   ├── persona.py               # YAML loader → system prompt
│   ├── pic_flow.py              # photo orchestration (intent, flow, rate limit)
│   ├── pic_prompt.py            # LLM scene + caption generator
│   ├── scheduler.py             # autonomous messages (text + photo)
│   └── tools.py                 # ReAct agent tools
└── requirements.txt
```
Experimental software for personal use. Not tested for multi-user scalability or public deployment. The author is not responsible for generated content: LLM and Stable Diffusion models can produce unexpected output. Use responsibly and in compliance with local laws and the Terms of Service of Telegram, Ollama, and Stable Horde.