A starter template for building agentic AI applications with multi-step reasoning, tool calling, and RAG capabilities.
- Overview
- Architecture
- Key Features
- Project Structure
- Technology Stack
- Getting Started
- Configuration
- Architecture Decisions
- Extending the Project
- License
## Overview

This project is a modular, extensible framework for building agentic AI applications in Python. It follows industry best practices inspired by frameworks like Microsoft Semantic Kernel, LangChain, and LlamaIndex.
- Agentic Reasoning: Multi-step task planning and execution
- Tool Calling: Automatic tool invocation based on agent decisions
- RAG (Retrieval-Augmented Generation): Semantic search over documents
- Flexible LLM Integration: Support for OpenAI, GitHub Models, Gemini
- Streaming API: Real-time responses via FastAPI
- Chat UI: React frontend with WebSocket support
## Architecture

```
User (React Frontend)
        ↓
API Gateway (FastAPI)
        ↓
Orchestrator (Task Router)
        ↓
Agent (Reasoning Engine)
        ↓
Tool Manager (Search, Math, FileReader, MCP)
        ↓
Memory Layer (Vector DB, RAG)
        ↓
LLM Provider (OpenAI / GitHub / Gemini)
        ↓
Response Stream → React UI
```
| Component | Purpose | Technologies |
|---|---|---|
| Frontend | User interaction & chat UI | React, TypeScript |
| API | HTTP/WebSocket server | FastAPI, Uvicorn |
| Agent | Decision making & planning | Python async, reasoning logic |
| Tools | Extensible capabilities | Search, Math, File I/O, MCP |
| Memory | Context & semantic search | ChromaDB, Vector embeddings |
| LLM | Language model calls | OpenAI, GitHub Models, Gemini |
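To make the flow above concrete, here is a toy, self-contained sketch of a request traveling through these layers. Every class and method name here is an illustrative stand-in for exposition, not the project's real code:

```python
# Toy end-to-end sketch of the layered flow above. All names are
# illustrative stand-ins, not the project's actual classes.
import asyncio

class Memory:
    async def search(self, query: str) -> str:
        return "retrieved context"            # stands in for a vector-DB lookup

class Tools:
    async def invoke(self, name: str, **kwargs) -> str:
        return f"{name} result"               # stands in for search/math/file tools

class LLM:
    async def complete(self, prompt: str) -> str:
        return f"answer to: {prompt}"         # stands in for an OpenAI/Gemini call

async def handle_request(query: str) -> str:
    memory, tools, llm = Memory(), Tools(), LLM()
    context = await memory.search(query)                  # Memory layer (RAG)
    tool_out = await tools.invoke("search", q=query)      # Tool manager
    return await llm.complete(f"{context}\n{tool_out}\n{query}")  # LLM provider

print(asyncio.run(handle_request("What is RAG?")))
```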
## Key Features

- ✅ **Ask anything** - Natural language queries with full context
- ✅ **Multi-step reasoning** - Agent plans and executes complex tasks
- ✅ **Tool calling** - Automatic tool selection and invocation
- ✅ **Search capabilities** - Web search, document search, code search
- ✅ **RAG integration** - Semantic search over your documents
- ✅ **Streaming responses** - Real-time output via WebSocket (see the client sketch below)
- ✅ **Modular design** - Easy to add new tools, providers, memory backends
- ✅ **Production-ready** - Docker support, structured logging, error handling
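The streaming feature can be exercised from any WebSocket client. Below is a hypothetical client sketch; the route (`/ws`) and the message shape are assumptions, so check `src/server/main.py` for the actual endpoint:

```python
# Hypothetical streaming client. The websocket route ("/ws") and message
# format are assumptions for illustration, not the project's confirmed API.
# Requires: pip install websockets
import asyncio
import websockets

async def main() -> None:
    async with websockets.connect("ws://localhost:8000/ws") as ws:
        await ws.send("Summarize this repository")
        async for chunk in ws:            # tokens arrive as separate messages
            print(chunk, end="", flush=True)

asyncio.run(main())
```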
## Project Structure

```
ntg-python-agent/
│
├── src/
│   ├── agent/
│   │   ├── __init__.py
│   │   ├── runner.py          # Agent loop
│   │   ├── planner.py         # Reasoning / planning
│   │   └── context.py         # Agent state
│   │
│   ├── llm/
│   │   ├── __init__.py
│   │   ├── provider.py        # OpenAI / GitHub / Gemini wrapper
│   │   └── models.py
│   │
│   ├── tools/
│   │   ├── __init__.py
│   │   ├── search.py
│   │   ├── math.py
│   │   └── tool_registry.py
│   │
│   ├── memory/
│   │   ├── __init__.py
│   │   ├── vector_store.py
│   │   └── embedder.py
│   │
│   └── server/
│       ├── main.py            # FastAPI application
│       └── routers/           # API endpoints (future)
│
├── tests/                     # Unit & integration tests
├── pyproject.toml             # Python project configuration
├── .env                       # Environment variables
├── .gitignore
├── LICENSE
└── README.md
```
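As a taste of how the pieces in `src/tools/` fit together, here is an illustrative sketch of what `tools/tool_registry.py` might look like. The names and signatures are assumptions based on the file layout above, not the project's actual API:

```python
# Illustrative sketch of a tool registry; names and signatures are
# assumptions based on the file layout, not the project's actual API.
import asyncio
from typing import Awaitable, Callable

ToolFn = Callable[..., Awaitable[str]]

class ToolRegistry:
    """Maps tool names to async callables the agent can invoke."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolFn] = {}

    def register(self, name: str) -> Callable[[ToolFn], ToolFn]:
        def decorator(fn: ToolFn) -> ToolFn:
            self._tools[name] = fn
            return fn
        return decorator

    async def invoke(self, name: str, **kwargs) -> str:
        if name not in self._tools:
            raise KeyError(f"Unknown tool: {name}")
        return await self._tools[name](**kwargs)

registry = ToolRegistry()

@registry.register("math.add")
async def add(a: float, b: float) -> str:
    return str(a + b)

print(asyncio.run(registry.invoke("math.add", a=2, b=3)))  # -> "5"
```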
## Technology Stack

| Layer | Technology | Version |
|---|---|---|
| Runtime | Python | 3.12+ |
| API Framework | FastAPI | Latest |
| Async | asyncio | Built-in |
| LLM Integration | OpenAI SDK | Latest |
| Environment | python-dotenv | Latest |
| Frontend | React | 18+ |
| Containerization | Docker | Latest |
## Getting Started

### Prerequisites

- Python 3.12 or higher (required by the project configuration)
- pip or the uv package manager
- API keys for your LLM providers (OpenAI, etc.)
- Optional: Docker for containerized deployment
### Installation

1. **Clone the repository**

   ```bash
   git clone https://github.com/nashtech-garage/ntg-python-agent.git
   cd ntg-python-agent
   ```

2. **Create a virtual environment**

   ```bash
   python -m venv .venv
   source .venv/bin/activate   # on Linux/macOS
   .venv\Scripts\activate      # on Windows
   ```

3. **Install pre-commit in the venv**

   ```bash
   python -m pip install pre-commit
   pre-commit --version   # check the installed version
   ```

4. **Install base dependencies**

   ```bash
   pip install -e .
   ```

   This installs:
   - `openai>=1.60.0` - OpenAI API client
   - `fastapi>=0.115.0` - Web framework
   - `uvicorn>=0.30.0` - ASGI server
   - `httpx>=0.27.0` - HTTP client
   - `python-dotenv>=1.0.1` - Environment variables
   - `pydantic>=2.7.0` - Data validation

5. **Install optional dependencies (as needed)**

   For RAG/Memory features:

   ```bash
   pip install -e ".[rag]"
   ```

   Includes: `faiss-cpu`, `sentence-transformers`, `chromadb`

   For development:

   ```bash
   pip install -e ".[dev]"
   ```

   Includes: `pre-commit`, `ruff`, `pytest`

6. **Configure environment variables**: copy everything from `.env.example` into a `.env` file.

7. **Run the server**

   ```bash
   uvicorn src.server.main:app --reload
   ```

   The server runs at http://localhost:8000.
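With the server running, a quick smoke test can list the available routes. FastAPI publishes its OpenAPI schema at `/openapi.json` by default, and `httpx` is already a core dependency:

```python
# Quick smoke test against the running server; assumes the default
# FastAPI behavior of serving the schema at /openapi.json.
import httpx

resp = httpx.get("http://localhost:8000/openapi.json")
resp.raise_for_status()
print(sorted(resp.json().get("paths", {})))
```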
### Quick start with uv (alternative)

```bash
# Install uv (one-time)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and set up
git clone https://github.com/nashtech-garage/ntg-python-agent.git
cd ntg-python-agent

# Sync dependencies
uv sync --all-groups

# Run
uv run uvicorn src.server.main:app --reload
```

## Configuration

Create a `.env` file in the project root with the following variables:
```bash
# Required: your OpenAI API key
OPENAI_API_KEY=sk-proj-xxx
```

Follow these steps to create and obtain an OpenAI API key (secret):

1. Create an account or sign in at the OpenAI platform: https://platform.openai.com
2. Verify your email address and finish any required account setup.
3. Open the API keys page: https://platform.openai.com/api-keys
4. Click **Create new secret key** (or similar). Copy the generated key immediately; you won't be able to see it again in full.
5. Paste the copied key into your project's `.env` file as the value of `OPENAI_API_KEY`.
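For reference, this is how such a key is typically read with `python-dotenv` (already a core dependency); the project's own settings handling may differ:

```python
# Minimal sketch of loading .env with python-dotenv; the project's actual
# configuration code may wrap this differently.
import os
from dotenv import load_dotenv

load_dotenv()                            # reads .env from the project root
api_key = os.environ["OPENAI_API_KEY"]   # raises KeyError if the key is missing
```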
### Dependencies

The project defines its dependencies in `pyproject.toml`.

**Core dependencies** (always installed):

- `openai>=1.60.0` - OpenAI API client for LLM calls
- `fastapi>=0.115.0` - Modern web framework with async support
- `uvicorn>=0.30.0` - ASGI web server
- `httpx>=0.27.0` - Async HTTP client
- `python-dotenv>=1.0.1` - Load environment variables from `.env`
- `pydantic>=2.7.0` - Data validation and settings management

**Optional: RAG group** - for retrieval-augmented generation (`pip install -e ".[rag]"`):

- `faiss-cpu>=1.8.0` - Vector search library
- `sentence-transformers>=3.0.0` - Embedding models
- `chromadb>=0.5.0` - Vector database

**Optional: Dev group** - for development and testing (`pip install -e ".[dev]"`):

- `pre-commit>=4.5.0` - Git hooks framework
- `ruff>=0.14.8` - Python linter and formatter
- `pytest>=8.3.0` - Testing framework
### Project Metadata

- Name: ntg-python-agent
- Version: 0.1.0
- Python Version: >=3.12 (required)
- License: MIT
- Repository: github.com/nashtech-garage/ntg-python-agent
## Architecture Decisions

This section documents key architectural decisions made in this project.

### LLM Provider Abstraction
Status: Accepted
Context: Supporting multiple LLM providers (OpenAI, GitHub Models, Gemini) requires flexibility.
Decision: Implement provider abstraction layer with pluggable implementations.
Rationale:
- Reduces vendor lock-in
- Allows easy switching between providers
- Supports cost optimization (choose cheapest provider per task)
- Enables A/B testing different models
Consequences:
- ✅ Easy to add new providers
- ✅ Consistent interface across providers
- ⚠️ Need to handle provider-specific features gracefully
Implementation: `src/llm/provider.py` - base class and wrappers for all LLM providers
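A minimal sketch of what this abstraction can look like; the class names, model string, and method signatures here are illustrative, and the real interface in `src/llm/provider.py` may differ:

```python
# Hedged sketch of the provider abstraction described above. Names and
# signatures are assumptions, not the project's actual API.
from abc import ABC, abstractmethod
from typing import AsyncIterator

from openai import AsyncOpenAI  # core dependency (openai>=1.60.0)

class LLMProvider(ABC):
    """Common interface so the agent never talks to a vendor SDK directly."""

    @abstractmethod
    async def complete(self, prompt: str) -> str: ...

class OpenAIProvider(LLMProvider):
    def __init__(self, api_key: str, model: str = "gpt-4o-mini") -> None:
        self._client = AsyncOpenAI(api_key=api_key)
        self._model = model

    async def complete(self, prompt: str) -> str:
        resp = await self._client.chat.completions.create(
            model=self._model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content or ""

    async def stream(self, prompt: str) -> AsyncIterator[str]:
        stream = await self._client.chat.completions.create(
            model=self._model,
            messages=[{"role": "user", "content": prompt}],
            stream=True,
        )
        async for chunk in stream:
            delta = chunk.choices[0].delta.content
            if delta:
                yield delta
```

Swapping in a GitHub Models or Gemini provider then means adding another `LLMProvider` subclass without touching the agent code.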
### Async-First Design

Status: Accepted
Context: Agent operations are I/O bound (API calls, DB queries, tool invocations).
Decision: Use Python asyncio throughout the codebase.
Rationale:
- Better resource utilization (non-blocking I/O)
- Natural fit for FastAPI
- Handles concurrent requests efficiently
- Enables real-time streaming responses
Consequences:
- ✅ High concurrency support
- ✅ Responsive APIs
- ⚠️ Requires careful handling of blocking operations
Implementation: all agent methods are `async def`; FastAPI serves async endpoints natively
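The "blocking operations" caveat in practice: push any blocking call onto a worker thread with the standard library's `asyncio.to_thread` so the event loop stays responsive:

```python
# Offloading a blocking call so it doesn't stall the event loop.
import asyncio
import time

def blocking_io() -> str:
    time.sleep(1)        # stands in for a blocking SDK or disk call
    return "done"

async def handler() -> str:
    return await asyncio.to_thread(blocking_io)

print(asyncio.run(handler()))
```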
### Vector Database for Memory

Status: Planned
Context: Need efficient semantic search over documents for context retrieval.
Decision: Use ChromaDB as primary vector database with embedding service.
Rationale:
- Lightweight, in-process option for development
- Production options available (Pinecone, Weaviate, Milvus)
- Fast approximate nearest neighbor search
- Supports metadata filtering
Consequences:
- ✅ Fast semantic search
- ✅ Flexible storage options
- ⚠️ Requires embedding model selection
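Since this decision is still planned, here is a minimal ChromaDB sketch of the intended pattern; the collection name and documents are placeholders, and ChromaDB's default embedding function may download a small model on first use. Requires the rag extras (`pip install -e ".[rag]"`):

```python
# Minimal ChromaDB sketch for the planned memory layer; collection name
# and documents are placeholders, not project data.
import chromadb

client = chromadb.Client()                     # in-process, ephemeral store
collection = client.create_collection("docs")
collection.add(
    ids=["doc-1", "doc-2"],
    documents=[
        "Agents plan and execute multi-step tasks.",
        "RAG retrieves relevant context before generation.",
    ],
)
hits = collection.query(query_texts=["how do agents plan?"], n_results=1)
print(hits["documents"])
```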
## License

MIT License - see the [LICENSE](LICENSE) file.