🚧 Status: This project will no longer be developed, because every time Ollama releases an update, new bugs appear.
In the future, I will rebuild it as a standalone application, without relying on any other software. 🚧
Local AI Assistant is an offline/local AI chat application featuring:
- Backend built with Flask — API communication and chat storage
- Frontend (UI) built with PySide6 (Qt) — interactive chat interface
- Ollama Integration — run local AI models (default: `gemma3:4b`, now supports model selection)
- Multi-language system with 9 languages and layered fallback

This project is designed to run AI fully locally, with customizable identity, persona, profile, memory, and chat history.
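The pieces above fit together in one round trip: the UI posts a chat turn to the Flask backend, which forwards it to the local Ollama server. A minimal sketch of that backend hop, assuming Ollama's default address (`http://localhost:11434`) and its `/api/chat` endpoint; the helper names and payload trimming are illustrative, not the project's actual code:

```python
import requests

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # Ollama's default port

def build_payload(messages, model="gemma3:4b", max_memory=32):
    """Trim history to the newest messages, mirroring the 32-message memory."""
    return {"model": model, "messages": messages[-max_memory:], "stream": False}

def ask(messages, model="gemma3:4b"):
    """Send one chat turn to the local Ollama server and return the reply text."""
    resp = requests.post(OLLAMA_CHAT_URL, json=build_payload(messages, model), timeout=120)
    resp.raise_for_status()
    # Non-streaming responses carry the reply under "message" -> "content".
    return resp.json()["message"]["content"]
```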
```
local-ai-assistant/
├── app.py
│
├── backend/
│   ├── i18n/
│   │   ├── ar.json
│   │   ├── en_gb.json
│   │   ├── en_us.json
│   │   ├── id.json
│   │   ├── es.json
│   │   ├── ja.json
│   │   ├── ko.json
│   │   ├── pt.json
│   │   └── zh.json
│   ├── locales/
│   │   ├── ar.json
│   │   ├── en_gb.json
│   │   ├── en_us.json
│   │   ├── id.json
│   │   ├── es.json
│   │   ├── ja.json
│   │   ├── ko.json
│   │   ├── pt.json
│   │   └── zh.json
│   ├── config.py
│   ├── core.py
│   ├── i18n.py
│   ├── persona.py
│   ├── storage.py
│   ├── ollama_client.py
│   └── routes.py
│
├── ui/
│   ├── main.py
│   ├── chat_window.py
│   ├── client.py
│   ├── worker.py
│   └── widgets/
│       ├── settings.py
│       ├── history.py
│       ├── bubbles.py
│       └── identity.py
│
├── data/
│   ├── chat_history.json
│   ├── chat_sessions.json
│   ├── ui_chat_config.json
│   └── config.json
├── config.json
├── requirement.txt
└── README.md
```
- 🖥 Modern UI using PySide6 (Qt)
- 📜 Chat History — rename, delete, or continue past sessions
- 🎭 Custom Persona — change the AI name, user name, and prompt
- 👤 Profile System — add personal info (e.g. what you do, anything else the AI should know)
- 🧠 Chat Memory — the AI remembers up to 32 previous messages
- 🌐 Multi-language Support — 9 languages, native names in the UI, with layered fallback (en_us → id → target): English (US, UK), Bahasa Indonesia, 日本語, 한국어, 中文（简体）, Português, Español, العربية
- 🎨 Custom Background — solid color or custom image
- ⚙️ Flask Backend with `/chat`, `/chats`, and `/config` endpoints
- 🤖 Ollama Integration — run local AI models; default `gemma3:4b`, now supports model selection
- 🔒 All data & configs stored locally under `data/`
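All of the local storage above is plain JSON, which the standard library handles directly. A generic sketch of how such files can be read and written safely; the helper names and example fields are illustrative, not the project's real schema:

```python
import json
import os

def save_json(path, data):
    """Write data as UTF-8 JSON, keeping non-ASCII text (e.g. 日本語) readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)

def load_json(path, default=None):
    """Read a JSON file, returning a default when it does not exist yet."""
    if not os.path.exists(path):
        return {} if default is None else default
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```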
```bash
git clone https://github.com/rillToMe/local-ai-assistant.git
cd local-ai-assistant

python -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows

pip install -r requirements.txt
```

Main dependencies: `flask`, `flask-cors`, `requests`, `PySide6`
⚠️ Note: Make sure you have installed Ollama and the required models (`gemma3:4b` by default). Other models can be selected if installed.
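You can check which models are already installed with `ollama list`, or programmatically via Ollama's local HTTP API. A small helper sketch (the `/api/tags` endpoint is part of Ollama's API; the helper names are mine):

```python
import requests

def installed_models(base_url="http://localhost:11434"):
    """Return the model tags the local Ollama server reports via /api/tags."""
    resp = requests.get(f"{base_url}/api/tags", timeout=5)
    resp.raise_for_status()
    return [m["name"] for m in resp.json().get("models", [])]

def has_model(wanted, names):
    """True if a tag matches exactly or by base name (e.g. 'gemma3')."""
    return any(n == wanted or n.split(":")[0] == wanted for n in names)
```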
```bash
python app.py
```

- The Flask backend will start on `http://127.0.0.1:5000`
- The PySide6 UI will open automatically
- Settings — change the background, update Identity & Prompt
- History — view, rename, or continue previous sessions
- Profile — add info about yourself (used by the AI in responses)
- Language — choose the UI language from the 9 available
- Models — select available Ollama models installed on your system
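The layered language fallback can be sketched as an ordered lookup: try the selected language first, then `id`, then `en_us`. The chain order here is my reading of the fallback note above, and the bundle dicts are illustrative:

```python
def translate(key, target, bundles):
    """Look up key in the target language, falling back to id, then en_us."""
    for lang in (target, "id", "en_us"):
        value = bundles.get(lang, {}).get(key)
        if value is not None:
            return value
    return key  # last resort: show the raw key

# Hypothetical bundles, loaded e.g. from the JSON files under backend/i18n/.
bundles = {
    "en_us": {"greeting": "Hello", "bye": "Goodbye"},
    "id": {"greeting": "Halo"},
    "ja": {"greeting": "こんにちは"},
}
```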
Based on official benchmarks and community tests:

| Model | Min RAM (CPU-only) | Approx. GPU VRAM (BF16 / 4-bit) | Notes |
|---|---|---|---|
| gemma3:1b | ≥ 2 GB | ~1.5 GB / ~0.9 GB | Lightweight - runs on old notebooks, but slow (~7-10 tokens/sec) (getdeploying.com, windowscentral.com) |
| qwen3:1.8b | ≥ 2 GB | ~2 GB / ~1 GB (est.) | Slightly better reasoning - light enough for laptops |
| gemma3:4b | ≥ 4 GB | ~6.4 GB / ~3.4 GB | Recommended default - good speed & quality (getdeploying.com, ai.google.dev) |
| qwen3:4b | ≥ 4 GB | ~6 GB / ~3 GB (est.) | Balanced - strong chat & reasoning |
| gemma3:12b | ≥ 9 GB | ~20 GB / ~8.7 GB | Requires a strong GPU or high RAM (getdeploying.com, ai.google.dev) |
| qwen3:8b | ≥ 9 GB | ~18 GB / ~8 GB (est.) | Good quality & context |
| deepseek-r1:8b | ≥ 9 GB | ~18 GB / ~8 GB (est.) | Specialized reasoning |
| gemma3:27b | ≥ 18 GB | ~46 GB / ~21 GB | Heavy - best on high-end GPUs or servers (getdeploying.com, ai.google.dev) |
| gpt-oss:20b | ≥ 32 GB | ~40 GB / ~20 GB (est.) | Large - better long context |
| gpt-oss:120b | ≥ 128 GB / Multi-GPU | ~120 GB+ / 60 GB+ (est.) | Experimental - extremely heavy compute requirement |
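As a rough cross-check of the table, weight memory scales with parameter count times bits per weight. The 20% overhead factor for KV cache and buffers below is a loose assumption of mine; measured numbers like those in the table always trump this estimate:

```python
def approx_vram_gb(params_billion, bits=4, overhead=1.2):
    """Back-of-envelope: params * bytes-per-weight, times a runtime overhead."""
    return params_billion * (bits / 8) * overhead

# A 4B-parameter model at 4-bit: 4 * 0.5 * 1.2 = 2.4 GB for weights alone;
# real usage (context length, activations) pushes it toward the table's figures.
```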
- Fix UI crash bugs
- Add multi-tab chat support
- Add chat export/import
- Optimize performance for long requests
- Expand model integration beyond Ollama
- This is still in early development; expect frequent bugs and issues
- UI/UX is minimal for now, focused on core functionality
- Default persona is simple, but can be extended with custom prompts
