A Custom GPT powered by RAG (Retrieval-Augmented Generation) that provides expert assistance for learning about ether.fi, liquid staking, and EigenLayer restaking.
- Semantic Search: Vector-based search across EtherFi documentation and smart contracts
- Interactive Study: Commands like /brief, /deepdive, and /walk for different learning styles
- Quiz Generation: AI-generated questions from actual protocol documentation
- Interactive Learning: Question-and-answer sessions with progressively increasing difficulty
- Source Citations: Every answer includes links to source documentation
EtherFi GPT
├── Crawler Module (Milestone 2)
│   ├── GitBook documentation crawler
│   └── GitHub repository crawler
├── Embedding Pipeline (Milestone 3)
│   ├── Text chunking (1200 tokens, 200 overlap)
│   └── OpenAI embeddings (text-embedding-3-large)
├── Vector Database (Milestone 3)
│   ├── Supabase + pgvector
│   └── Cosine similarity search
├── API (Milestone 4)
│   ├── POST /api/search - Semantic search
│   ├── GET /api/browse - Fetch URL content
│   ├── GET /api/version - Crawl timestamps
│   └── POST /api/quiz - Generate quizzes
└── Custom GPT (Milestone 6)
    └── OpenAI GPT Builder with Actions
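The chunking step in the tree above (1200 tokens with a 200-token overlap) can be illustrated with a sliding window. This is a minimal sketch, not the project's actual `utils/chunker.py`: the function name `chunk_tokens` is hypothetical, and whitespace-separated words stand in for real tokenizer tokens.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 1200, overlap: int = 200) -> List[List[str]]:
    """Split a token list into windows of `chunk_size` that share `overlap` tokens.

    Hypothetical helper for illustration; the real chunker may count tokens differently.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


# Assumption: words approximate tokens well enough for a demonstration.
words = ("token " * 3000).split()
chunks = chunk_tokens(words)
print([len(chunk) for chunk in chunks])
```

The overlap means the last 200 tokens of one chunk reappear at the start of the next, so a sentence that straddles a boundary is still seen whole in at least one chunk.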
| Milestone | Status | Description |
|---|---|---|
| 1. Project Scaffold | ✅ Complete | Directory structure, requirements, FastAPI server |
| 2. Crawler Module | ✅ Complete | 90 files crawled (49 docs + 42 GitHub files) |
| 3. Chunk + Embed | ✅ Code Complete | Chunking, embedding, vector DB ready |
| 4. Retrieval API | ✅ Complete | All 4 endpoints implemented |
| 5. OpenAPI Spec | ✅ Complete | Full OpenAPI 3.1 specification |
| 6. Custom GPT | ✅ Complete | Integration guide and instructions |
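Milestone 3 relies on cosine similarity to rank stored chunks against a query embedding. In this project that ranking happens inside Supabase with pgvector; the sketch below only illustrates the idea in memory. The names `cosine_similarity` and `top_k` and the toy two-dimensional vectors are assumptions for the example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          rows: Sequence[Tuple[str, Sequence[float]]],
          k: int = 5) -> List[Tuple[str, float]]:
    """Score every (chunk_id, embedding) row and keep the k most similar."""
    scored = [(chunk_id, cosine_similarity(query, vector)) for chunk_id, vector in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data: real embeddings have far more dimensions.
rows = [("chunk-a", [1.0, 0.0]), ("chunk-b", [0.0, 1.0]), ("chunk-c", [1.0, 1.0])]
print(top_k([1.0, 0.0], rows, k=2))
```

A database index avoids scoring every row as this sketch does, which matters once the collection grows beyond a few thousand chunks.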
- Python 3.10+
- OpenAI API key
- Supabase account (free tier works)
- GitHub token (optional, for re-crawling)
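The prerequisites above can be checked before running anything else. The following is a minimal sketch, assuming the variable names listed in the setup steps of this README have been exported to the process environment (values that live only in a `.env` file are not visible to `os.environ` until something loads them). The helper names are hypothetical.

```python
import os
import sys
from typing import List, Mapping, Sequence

REQUIRED_VARS = ("OPENAI_API_KEY", "SUPABASE_URL", "SUPABASE_KEY")
OPTIONAL_VARS = ("GITHUB_TOKEN",)  # only needed for re-crawling


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def python_ok(version_info: Sequence[int] = sys.version_info) -> bool:
    """True when the interpreter is Python 3.10 or newer."""
    return tuple(version_info[:2]) >= (3, 10)


# Report problems without exiting, so the check is safe to import.
if not python_ok():
    print("Python 3.10+ is required")
for name in missing_settings(os.environ):
    print(f"Missing required setting: {name}")
for name in OPTIONAL_VARS:
    if not os.environ.get(name):
        print(f"Optional setting not set: {name}")
```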
# Clone and navigate
cd etherfi-gpt
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Configure environment variables
cp .env.example .env
# Edit .env with your API keys:
# - OPENAI_API_KEY
# - SUPABASE_URL
# - SUPABASE_KEY
# - GITHUB_TOKEN (optional)

# Execute SQL in Supabase SQL Editor
# Copy contents of storage/setup.sql
# Or run:
python storage/setup_database.py

# Test with 2 files
python test_embedding_small.py
# Process all 90 files (~10-15 minutes, ~$0.07 cost)
python embeddings/embed_all.py

# Development
uvicorn api.main:app --reload
# Production
uvicorn api.main:app --host 0.0.0.0 --port 8000
# API will be available at:
# - http://localhost:8000
# - Docs: http://localhost:8000/docs
# - Health: http://localhost:8000/health

# Health check
curl http://localhost:8000/health
# Search
curl -X POST http://localhost:8000/api/search \
-H "Content-Type: application/json" \
-d '{"query": "eETH vs weETH", "top_k": 5}'
# Version info
curl http://localhost:8000/api/version
# Browse URL
curl "http://localhost:8000/api/browse?url=https://etherfi.gitbook.io/etherfi/"
# Generate quiz
curl -X POST http://localhost:8000/api/quiz \
-H "Content-Type: application/json" \
-d '{"topic": "liquid staking", "num_questions": 5}'

etherfi-gpt/
├── api/
│   ├── main.py               # FastAPI application
│   ├── routers.py            # API endpoints
│   └── openapi_actions.yaml  # OpenAPI spec for GPT Actions
├── crawler/
│   ├── crawl_docs.py         # GitBook crawler
│   └── crawl_github.py       # GitHub crawler
├── embeddings/
│   └── embed_all.py          # Embedding pipeline
├── storage/
│   ├── vector_db.py          # Vector database operations
│   ├── setup.sql             # Database schema
│   └── setup_database.py     # Setup helper
├── utils/
│   └── chunker.py            # Text chunking
├── data/
│   └── raw/                  # Crawled data (90 files)
│       ├── docs/             # GitBook docs (49 files)
│       └── github/           # GitHub files (42 files)
├── requirements.txt          # Python dependencies
├── .env                      # Environment variables
├── Dockerfile                # Docker configuration
└── README.md                 # This file
- CUSTOM_GPT_SETUP.md - Complete Custom GPT integration guide
- MILESTONE_1_VERIFICATION.md - Project scaffold verification
- CRAWLER_STATUS.md - Crawler implementation details
- MILESTONE_3_STATUS.md - Embedding pipeline status
- MILESTONE_3_VERIFICATION.md - Implementation verification
# Re-crawl GitBook docs
python crawler/crawl_docs.py
# Re-crawl GitHub repos
python crawler/crawl_github.py

# Test chunking
python test_chunking.py
# Test embedding (small sample)
python test_embedding_small.py
# Test API health
python test_health.py

# After re-crawling, regenerate embeddings
python embeddings/embed_all.py

See CUSTOM_GPT_SETUP.md for detailed deployment instructions.
fly launch
fly secrets set OPENAI_API_KEY="..." SUPABASE_URL="..." SUPABASE_KEY="..."
fly deploy

- Connect GitHub repo
- Set environment variables
- Deploy
- OpenAI embeddings: ~$0.07 (one-time for 90 files)
- Supabase: Free tier sufficient
- Hosting: Free tier available
- OpenAI API: $5-20 (depends on usage)
- Hosting: $5-10 (Fly.io or Render)
- Supabase: Free tier or $25/month
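The one-time embedding figure of about $0.07 can be sanity-checked with simple arithmetic. The sketch below assumes a price of $0.13 per million tokens for text-embedding-3-large; that rate is an assumption and should be verified against current OpenAI pricing. The function names are hypothetical.

```python
# Assumed price; verify against current OpenAI pricing before relying on it.
PRICE_PER_MILLION_TOKENS_USD = 0.13


def embedding_cost(total_tokens: int,
                   price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Cost in USD of embedding `total_tokens` tokens once."""
    return total_tokens / 1_000_000 * price_per_million


def tokens_for_budget(budget_usd: float,
                      price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> int:
    """Number of tokens that a given budget covers at the assumed price."""
    return int(budget_usd / price_per_million * 1_000_000)


tokens = tokens_for_budget(0.07)
print(f"$0.07 covers about {tokens:,} tokens")
print(f"That is about {tokens // 90:,} tokens per file across 90 files")
```

Under that assumed rate, $0.07 corresponds to a little over half a million tokens in total, which is a plausible size for 90 documentation and source files.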
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Health check |
| /api/search | POST | Semantic search |
| /api/browse | GET | Fetch URL content |
| /api/version | GET | Crawl timestamps |
| /api/quiz | POST | Generate quiz questions |
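The POST endpoints in the table above can also be called from Python using only the standard library. This is a sketch under stated assumptions: the server is running locally on port 8000 as in the curl examples, responses are JSON, and the helper names (`build_post`, `search_request`, `quiz_request`, `send`) are hypothetical. The response schema is not documented here, so `send` simply returns the parsed JSON.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:8000"  # assumed local development server


def build_post(path: str, payload: Dict[str, Any], base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    return urllib.request.Request(
        url=base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_request(query: str, top_k: int = 5,
                   filters: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Request for POST /api/search; `filters` is included only when given."""
    payload: Dict[str, Any] = {"query": query, "top_k": top_k}
    if filters:
        payload["filters"] = filters
    return build_post("/api/search", payload)


def quiz_request(topic: str, num_questions: int = 5) -> urllib.request.Request:
    """Request for POST /api/quiz."""
    return build_post("/api/quiz", {"topic": topic, "num_questions": num_questions})


def send(request: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send the request and parse the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, `send(search_request("eETH vs weETH"))` mirrors the curl search example. Separating request construction from sending keeps the payload easy to inspect and test without a live server.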
POST /api/search
{
  "query": "eETH vs weETH differences",
  "top_k": 8
}

POST /api/search
{
  "query": "staking contract functions",
  "top_k": 5,
  "filters": {
    "repo": ["smart-contracts"]
  }
}

POST /api/quiz
{
  "topic": "EigenLayer restaking",
  "num_questions": 5
}

This is a private project to learn interactively about EtherFi. If you'd like to contribute:
- Ensure all tests pass
- Follow existing code style
- Update documentation
- Test with the Custom GPT
Private - All Rights Reserved
- ether.fi for the amazing protocol
- EigenLayer for restaking infrastructure
- OpenAI for GPT-4 and embeddings API
- Supabase for vector database hosting
Built with ❤️ for EtherFi
For setup questions, see CUSTOM_GPT_SETUP.md