whoisdsmith/ExaHub

🚀 ExaHub: Enhanced GitHub Search

Supercharge your GitHub searches with semantic AI

✨ Features

  • 🔍 Advanced GitHub Search - Find repositories, code, issues, and users with GitHub's powerful search syntax
  • 🧠 AI-Powered Semantic Enhancement - Leverage Exa's semantic AI to discover conceptually relevant results
  • 🎯 Customizable Filters - Refine results with filters for language, stars, forks, dates, and more
  • 💻 Modern UI - Beautiful and responsive interface built with Bootstrap 5
  • 🔌 RESTful API - Seamlessly integrate search capabilities into your applications
  • 🔗 URL Similarity Search - Find websites similar to any GitHub project
  • 🌐 Content Similarity Search - Discover conceptually related content across the web
  • 📊 Rate Limiting - Built-in protection against API abuse
  • 📝 Comprehensive Logging - Detailed logs for monitoring and debugging

🚀 Quick Start

Prerequisites

  • Python 3.8 or higher
  • GitHub API token (recommended to avoid rate limits)
  • Exa API key (required for semantic search capabilities)

Installation

  1. Clone the repository
git clone https://github.com/yourusername/ExaHub.git
cd ExaHub
  2. Create and activate a virtual environment
# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On Windows
venv\Scripts\activate

# On macOS/Linux
source venv/bin/activate
  3. Install dependencies
pip install -r requirements.txt
  4. Configure environment variables

Create a .env file in the project root:

GITHUB_TOKEN=your_github_token
EXA_API_KEY=your_exa_api_key
SECRET_KEY=a_secure_random_string
FLASK_ENV=development
  5. Start the application
python run.py
  6. Open your browser

Navigate to http://localhost:5000 to start searching!

🧩 How It Works

ExaHub combines the power of GitHub's traditional search with Exa AI's semantic understanding:

  1. Your search query is processed through the GitHub API to find exact matches
  2. Exa AI enhances these results by identifying semantically related content
  3. Results are ranked by relevance, with highlighted matches
  4. Additional metadata enriches each result for better context
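The merge-and-rank step described above can be illustrated with a minimal sketch. This is not ExaHub's actual implementation: the function name `merge_and_rank`, the `url` and `score` fields, and the rule of keeping the higher-scoring duplicate are all assumptions made for illustration.

```python
from typing import Dict, List


def merge_and_rank(keyword_hits: List[dict], semantic_hits: List[dict]) -> List[dict]:
    """Merge two result lists by URL and sort by descending score.

    Hypothetical result shape: {"url": str, "score": float}.
    """
    merged: Dict[str, dict] = {}
    for hit in keyword_hits + semantic_hits:
        existing = merged.get(hit["url"])
        # Keep the higher-scoring copy when both sources return the same URL.
        if existing is None or hit["score"] > existing["score"]:
            merged[hit["url"]] = dict(hit)
    return sorted(merged.values(), key=lambda h: h["score"], reverse=True)


# Made-up sample data standing in for GitHub and Exa results.
keyword_hits = [
    {"url": "https://github.com/a/one", "score": 0.9},
    {"url": "https://github.com/b/two", "score": 0.4},
]
semantic_hits = [
    {"url": "https://github.com/b/two", "score": 0.7},
    {"url": "https://github.com/c/three", "score": 0.5},
]
ranked = merge_and_rank(keyword_hits, semantic_hits)
print([hit["url"] for hit in ranked])
```

Deduplicating by URL keeps a repository from appearing twice when both the keyword search and the semantic search return it.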

🔍 Search Capabilities

GitHub Search Syntax

keyword                   # Search for a keyword
"exact phrase"            # Search for the exact phrase
repo:owner/name           # Filter by repository
user:username             # Filter by user
org:organization          # Filter by organization
language:python           # Filter by language
stars:>1000               # Repositories with more than 1000 stars
created:>2022-01-01       # Created after January 1, 2022
pushed:>2023-01-01        # Last updated after January 1, 2023
topic:machine-learning    # Filter by topic
is:public                 # Only public repositories
fork:true                 # Include forks
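Qualifiers like the ones above can be combined into a single query string, separated by spaces. The helper below is a small sketch of how a client might assemble such a query; `build_query` is a hypothetical name and is not part of ExaHub or the GitHub API.

```python
def build_query(keywords: str, **qualifiers: str) -> str:
    """Join free-text keywords with qualifier:value pairs into one query string."""
    parts = [keywords.strip()] if keywords.strip() else []
    for name, value in qualifiers.items():
        # Quote values that contain spaces so they stay a single token.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


query = build_query(
    "machine learning",
    language="python",
    stars=">1000",
    topic="machine-learning",
)
print(query)  # machine learning language:python stars:>1000 topic:machine-learning
```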

Semantic Search

ExaHub integrates Exa AI to enhance search results by:

  • Finding conceptually similar results beyond exact keyword matches
  • Ranking results based on semantic relevance to your query
  • Highlighting the most relevant sections of the code or text
  • Discovering content that traditional keyword search might miss

🔌 API Usage

Search Repositories

import requests

# Example search parameters
search_data = {
    "query": "machine learning",
    "type": "repositories",
    "language": "python",
    "stars": [100, None],  # Repos with at least 100 stars (no upper bound)
    "topics": ["ai", "deep-learning"],
    "enhance_with_exa": True,
    "page": 1,
    "per_page": 10
}

# Send the request as a JSON body
response = requests.post(
    "http://localhost:5000/search/api",
    json=search_data,
    headers={"Content-Type": "application/json"}
)

# Raise on HTTP errors, then parse the JSON results
response.raise_for_status()
results = response.json()

Find Similar URLs

import requests

# Find websites similar to a GitHub repository
url_data = {
    "url": "https://github.com/tensorflow/tensorflow",
    "num_results": 5
}

response = requests.post(
    "http://localhost:5000/api/similar-urls",
    data=url_data
)

Similarity Search

import requests

# Find content similar to a concept
similarity_data = {
    "prompt": "Transformer architecture for natural language processing",
    "num_results": 10,
    "search_type": "neural"
}

response = requests.post(
    "http://localhost:5000/api/similarity-search",
    data=similarity_data
)

🔧 Configuration Options

Variable       Description                             Default
GITHUB_TOKEN   GitHub API token                        None
EXA_API_KEY    Exa API key                             None
SECRET_KEY     Flask secret key                        'dev'
FLASK_ENV      Environment (development/production)    'development'
HOST           Server host                             '0.0.0.0'
PORT           Server port                             5000
FLASK_CONFIG   Configuration profile to use            'default'
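As a rough illustration of how these variables and defaults might be read at startup, here is a sketch using only the standard library. The function `load_settings` is hypothetical; the real app/config.py may be organized differently (for example as Flask config classes).

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "GITHUB_TOKEN": env.get("GITHUB_TOKEN"),        # default: None
        "EXA_API_KEY": env.get("EXA_API_KEY"),          # default: None
        "SECRET_KEY": env.get("SECRET_KEY", "dev"),
        "FLASK_ENV": env.get("FLASK_ENV", "development"),
        "HOST": env.get("HOST", "0.0.0.0"),
        "PORT": int(env.get("PORT", "5000")),           # stored as an integer
        "FLASK_CONFIG": env.get("FLASK_CONFIG", "default"),
    }


# Pass an explicit mapping to see the defaults without touching os.environ.
settings = load_settings({"PORT": "8080"})
print(settings["HOST"], settings["PORT"])  # 0.0.0.0 8080
```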

🛠️ Project Structure

ExaHub/
├── app/                          # Application package
│   ├── __init__.py               # Flask app initialization
│   ├── config.py                 # Configuration settings
│   ├── routes/                   # Route definitions
│   │   ├── main.py               # Main/index routes
│   │   ├── search.py             # Search routes
│   │   └── api.py                # API routes
│   ├── models/                   # Data models
│   │   ├── search_params.py      # Search parameters model
│   │   └── search_result.py      # Search results model
│   ├── services/                 # Service layer
│   │   ├── search_service.py     # GitHub search service
│   │   └── similarity_service.py # Exa AI service
│   ├── static/                   # Static assets
│   └── templates/                # HTML templates
│       ├── base.html             # Base template
│       ├── index.html            # Home page
│       ├── results.html          # Search results
│       └── api_playground.html   # API testing page
├── run.py                        # Application entry point
├── requirements.txt              # Python dependencies
├── .env                          # Environment variables
└── README.md                     # Project documentation

🧪 Development

Running Tests

# Run all tests
pytest

# Run with coverage report
coverage run -m pytest
coverage report

Rate Limiting

ExaHub includes built-in rate limiting to prevent API abuse:

  • Global: 200 requests per day, 50 per hour
  • Search API: 30 requests per minute
  • Similarity search: 20 requests per minute
  • Web interface: 60 searches per hour
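To make a limit such as "30 requests per minute" concrete, the sketch below implements a simple in-memory sliding-window limiter. It only illustrates the idea and is not ExaHub's own mechanism, which this README does not describe; the class name `SlidingWindowLimiter` and its parameters are hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds interval."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._timestamps: Deque[float] = deque()

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False


# 30 requests per minute, matching the Search API limit listed above.
limiter = SlidingWindowLimiter(max_requests=30, window_seconds=60)
decisions = [limiter.allow() for _ in range(31)]
print(decisions.count(True), decisions.count(False))  # 30 1
```

A client that receives a rejection would typically wait until the oldest request leaves the window before retrying.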

Logging

The application includes comprehensive logging:

  • Development: Console logging at DEBUG level
  • Production: Rotating file logs (10MB max size, 10 backups)
  • Log location: instance/logs/exahub.log
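The logging behaviour listed above maps naturally onto Python's standard logging module. The following is a sketch under stated assumptions, not the project's actual setup: the function `build_logger`, the logger name, the message format, and the INFO level for production are all assumptions; only the DEBUG console level and the 10 MB / 10 backups rotation come from the list above.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def build_logger(production: bool, log_path: str) -> logging.Logger:
    """Console DEBUG logging in development, rotating file logs in production."""
    logger = logging.getLogger("exahub_example")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    if production:
        os.makedirs(os.path.dirname(log_path), exist_ok=True)
        # 10 MB per file, keeping 10 rotated backups.
        handler = RotatingFileHandler(
            log_path, maxBytes=10 * 1024 * 1024, backupCount=10
        )
        logger.setLevel(logging.INFO)  # assumed level; not stated in this README
    else:
        handler = logging.StreamHandler()
        logger.setLevel(logging.DEBUG)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger


# A temporary directory stands in for the project root in this example.
log_path = os.path.join(tempfile.mkdtemp(), "instance", "logs", "exahub.log")
logger = build_logger(production=True, log_path=log_path)
logger.info("search completed")
```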

🔑 Getting API Keys

GitHub API Token

  1. Go to GitHub Developer Settings
  2. Generate a new token with repo and read:user scopes
  3. Copy the token to your .env file

Exa API Key

  1. Visit Exa.ai to sign up for an account
  2. Navigate to your profile settings to create an API key
  3. Add the key to your .env file

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • GitHub API for repository search capabilities
  • Exa AI for semantic search technology

Made with ❤️ by Dustin Smith

Report Bug • Request Feature

About

ExaHub: Because scrolling endlessly through GitHub should be for procrastination, not actual searching. 🚀🧠
