asmitadhungana/etherfi-gpt

EtherFi GPT

A Custom GPT powered by RAG (Retrieval-Augmented Generation) that provides expert assistance for learning about ether.fi, liquid staking, and EigenLayer restaking.

🌟 Features

  • Semantic Search: Vector-based search across EtherFi documentation and smart contracts
  • Interactive Study: Commands like /brief, /deepdive, /walk for different learning styles
  • Quiz Generation: AI-generated questions from actual protocol documentation
  • Interactive Learning: Question-and-answer sessions with progressive difficulty
  • Source Citations: Every answer includes links to source documentation

🏗️ Architecture

EtherFi GPT
├── Crawler Module (Milestone 2)
│   ├── GitBook documentation crawler
│   └── GitHub repository crawler
├── Embedding Pipeline (Milestone 3)
│   ├── Text chunking (1200 tokens, 200 overlap)
│   └── OpenAI embeddings (text-embedding-3-large)
├── Vector Database (Milestone 3)
│   ├── Supabase + pgvector
│   └── Cosine similarity search
├── API (Milestone 4)
│   ├── POST /api/search - Semantic search
│   ├── GET /api/browse - Fetch URL content
│   ├── GET /api/version - Crawl timestamps
│   └── POST /api/quiz - Generate quizzes
└── Custom GPT (Milestone 6)
    └── OpenAI GPT Builder with Actions
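
The chunking step in the diagram (1200 tokens with a 200-token overlap) can be pictured as a sliding window. The sketch below is only an illustration under stated assumptions: it is not the code in utils/chunker.py, the function name `chunk_tokens` is hypothetical, and it operates on an already-tokenized list, whereas the real pipeline presumably counts tokens with a model tokenizer.

```python
def chunk_tokens(tokens, chunk_size=1200, overlap=200):
    """Split a token list into windows of chunk_size that share `overlap` tokens.

    Hypothetical helper for illustration; defaults mirror the numbers
    quoted in the architecture diagram.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # each new window starts this many tokens later
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window has reached the end of the input,
        # so no trailing chunk made purely of overlap is emitted.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With the defaults, a 2500-token document yields three chunks starting at tokens 0, 1000, and 2000, so each consecutive pair shares 200 tokens of context.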

📊 Current Status

Milestone           | Status           | Description
1. Project Scaffold | ✅ Complete      | Directory structure, requirements, FastAPI server
2. Crawler Module   | ✅ Complete      | 90 files crawled (49 docs + 42 GitHub files)
3. Chunk + Embed    | ✅ Code Complete | Chunking, embedding, vector DB ready
4. Retrieval API    | ✅ Complete      | All 4 endpoints implemented
5. OpenAPI Spec     | ✅ Complete      | Full OpenAPI 3.1 specification
6. Custom GPT       | ✅ Complete      | Integration guide and instructions

🚀 Quick Start

Prerequisites

  • Python 3.10+
  • OpenAI API key
  • Supabase account (free tier works)
  • GitHub token (optional, for re-crawling)

1. Setup Environment

# Clone and navigate
git clone https://github.com/asmitadhungana/etherfi-gpt.git
cd etherfi-gpt

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env with your API keys:
# - OPENAI_API_KEY
# - SUPABASE_URL
# - SUPABASE_KEY
# - GITHUB_TOKEN (optional)
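
A quick pre-flight check can catch a missing key before the pipeline or the API server starts. The sketch below is an assumption about how such a check might look; `missing_env_vars` is a hypothetical helper, not part of the repository. It inspects the process environment, so the values from .env must already be loaded or exported.

```python
import os

# Variable names taken from the setup step above.
REQUIRED_VARS = ("OPENAI_API_KEY", "SUPABASE_URL", "SUPABASE_KEY")
OPTIONAL_VARS = ("GITHUB_TOKEN",)  # only needed for re-crawling


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example usage:
#   missing = missing_env_vars()
#   if missing:
#       raise SystemExit("Missing environment variables: " + ", ".join(missing))
```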

2. Setup Database

# Option A: paste the contents of storage/setup.sql into the
# Supabase SQL Editor and run it
# Option B: run the setup helper
python storage/setup_database.py

3. Run Embedding Pipeline

# Test with 2 files
python test_embedding_small.py

# Process all 90 files (~10-15 minutes, ~$0.07 cost)
python embeddings/embed_all.py

4. Start API Server

# Development
uvicorn api.main:app --reload

# Production
uvicorn api.main:app --host 0.0.0.0 --port 8000

# API will be available at:
# - http://localhost:8000
# - Docs: http://localhost:8000/docs
# - Health: http://localhost:8000/health

5. Test Endpoints

# Health check
curl http://localhost:8000/health

# Search
curl -X POST http://localhost:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "eETH vs weETH", "top_k": 5}'

# Version info
curl http://localhost:8000/api/version

# Browse URL
curl "http://localhost:8000/api/browse?url=https://etherfi.gitbook.io/etherfi/"

# Generate quiz
curl -X POST http://localhost:8000/api/quiz \
  -H "Content-Type: application/json" \
  -d '{"topic": "liquid staking", "num_questions": 5}'
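
The same search call can be made from Python using only the standard library. This is a minimal sketch that mirrors the curl command above; the helper names `build_search_request` and `search` are hypothetical, and the shape of the JSON response is not assumed, so it is returned as decoded.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local development server from step 4


def build_search_request(query, top_k=5, base_url=BASE_URL):
    """Build (but do not send) a POST request for /api/search."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query, top_k=5, base_url=BASE_URL):
    """Send the search request and return the decoded JSON body."""
    request = build_search_request(query, top_k, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires the API server to be running):
#   results = search("eETH vs weETH", top_k=5)
```

Separating request construction from sending keeps the payload easy to inspect without a running server.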

📁 Project Structure

etherfi-gpt/
├── api/
│   ├── main.py              # FastAPI application
│   ├── routers.py           # API endpoints
│   └── openapi_actions.yaml # OpenAPI spec for GPT Actions
├── crawler/
│   ├── crawl_docs.py        # GitBook crawler
│   └── crawl_github.py      # GitHub crawler
├── embeddings/
│   └── embed_all.py         # Embedding pipeline
├── storage/
│   ├── vector_db.py         # Vector database operations
│   ├── setup.sql            # Database schema
│   └── setup_database.py    # Setup helper
├── utils/
│   └── chunker.py           # Text chunking
├── data/
│   └── raw/                 # Crawled data (90 files)
│       ├── docs/            # GitBook docs (49 files)
│       └── github/          # GitHub files (42 files)
├── requirements.txt         # Python dependencies
├── .env                     # Environment variables
├── Dockerfile               # Docker configuration
└── README.md                # This file

📚 Documentation

🔧 Development

Re-crawl Data

# Re-crawl GitBook docs
python crawler/crawl_docs.py

# Re-crawl GitHub repos
python crawler/crawl_github.py

Run Tests

# Test chunking
python test_chunking.py

# Test embedding (small sample)
python test_embedding_small.py

# Test API health
python test_health.py

Update Embeddings

# After re-crawling, regenerate embeddings
python embeddings/embed_all.py

🚒 Deployment

See CUSTOM_GPT_SETUP.md for detailed deployment instructions.

Quick Deploy to Fly.io

fly launch
fly secrets set OPENAI_API_KEY="..." SUPABASE_URL="..." SUPABASE_KEY="..."
fly deploy

Quick Deploy to Render

  1. Connect GitHub repo
  2. Set environment variables
  3. Deploy

💰 Cost Breakdown

Development/Testing

  • OpenAI embeddings: ~$0.07 (one-time for 90 files)
  • Supabase: Free tier sufficient
  • Hosting: Free tier available
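
The one-time embedding figure can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the price of $0.13 per million tokens used in the usage note is an assumption about text-embedding-3-large pricing and should be checked against OpenAI's current price list, and both function names are hypothetical.

```python
def estimate_embedding_cost(total_tokens, usd_per_million_tokens):
    """Cost in USD of embedding `total_tokens` at the given price."""
    return total_tokens / 1_000_000 * usd_per_million_tokens


def tokens_for_budget(budget_usd, usd_per_million_tokens):
    """Whole number of tokens that fit within `budget_usd` at the given price."""
    return int(budget_usd / usd_per_million_tokens * 1_000_000)


# Example usage, with the price as an explicit, assumed input:
#   estimate_embedding_cost(540_000, 0.13)   # about 0.07 USD
#   tokens_for_budget(0.07, 0.13)            # a little over half a million tokens
```

Under that assumed price, the quoted ~$0.07 corresponds to roughly half a million embedded tokens across the crawled files.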

Production (Monthly)

  • OpenAI API: $5-20 (depends on usage)
  • Hosting: $5-10 (Fly.io or Render)
  • Supabase: Free tier or $25/month

🎯 API Endpoints

Endpoint     | Method | Description
/health      | GET    | Health check
/api/search  | POST   | Semantic search
/api/browse  | GET    | Fetch URL content
/api/version | GET    | Crawl timestamps
/api/quiz    | POST   | Generate quiz questions

🧪 Example Queries

Search for comparisons

POST /api/search
{
  "query": "eETH vs weETH differences",
  "top_k": 8
}

Search with filters

POST /api/search
{
  "query": "staking contract functions",
  "top_k": 5,
  "filters": {
    "repo": ["smart-contracts"]
  }
}
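
Conceptually, a filtered search narrows the candidate chunks by metadata and then ranks the remainder by cosine similarity to the query embedding. The in-memory sketch below only illustrates that idea; the project itself delegates the search to Supabase + pgvector, and the record fields (`repo`, `embedding`) and function names here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_chunks(query_vec, chunks, top_k=5, repos=None):
    """Return the top_k chunks most similar to query_vec.

    `chunks` is a list of dicts with hypothetical keys "repo" and
    "embedding"; `repos` optionally restricts results to those repos.
    """
    candidates = [c for c in chunks if repos is None or c["repo"] in repos]
    ranked = sorted(
        candidates,
        key=lambda c: cosine_similarity(query_vec, c["embedding"]),
        reverse=True,  # highest similarity first
    )
    return ranked[:top_k]
```

Filtering before ranking means `top_k` always counts matching chunks only, which is the behavior the request body above suggests.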

Generate quiz

POST /api/quiz
{
  "topic": "EigenLayer restaking",
  "num_questions": 5
}

🀝 Contributing

This is a private project for learning about EtherFi interactively. If you'd like to contribute:

  1. Ensure all tests pass
  2. Follow existing code style
  3. Update documentation
  4. Test with the Custom GPT

📄 License

Private - All Rights Reserved

🙏 Acknowledgments

  • ether.fi for the amazing protocol
  • EigenLayer for restaking infrastructure
  • OpenAI for GPT-4 and embeddings API
  • Supabase for vector database hosting

Built with ❤️ for EtherFi

For setup questions, see CUSTOM_GPT_SETUP.md

About

RAG-powered Custom GPT for ether.fi protocol. Semantic search across GitBook docs and smart contracts using OpenAI embeddings, Supabase pgvector, and FastAPI.
