Find what you're looking for in your files, even when you don't know the exact words.
SemiSearch is a privacy-focused command-line tool that helps you search through your local files using intelligent text analysis. Unlike traditional search tools that only match exact keywords, SemiSearch understands relationships between words and concepts.
Traditional search tools look for exact word matches. If you search for "car", you won't find documents about "automobile" or "vehicle". SemiSearch uses statistical analysis (TF-IDF) to understand relationships between words and find related concepts.
Example: Searching for "error handling" will find:
- Code with try/catch blocks
- Documentation about exception management
- Comments mentioning "error recovery"
- Functions named "handleFailure" or "processException"
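To make the TF-IDF idea more concrete, here is a minimal Python sketch of scoring documents against a query. It is an illustration only, not SemiSearch's actual implementation: the tokenizer, the smoothed IDF formula, the helper names, and the sample file names are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf_scores(query, docs):
    """Score each document by summing the TF-IDF weights of the query terms.

    docs maps a (hypothetical) file name to its text.
    """
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    scores = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if counts[term] == 0:
                continue
            tf = counts[term] / len(tokens)
            # Smoothed IDF (an assumed variant): rare terms weigh more.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            score += tf * idf
        scores[name] = score
    return scores


# Made-up sample documents for illustration.
docs = {
    "errors.md": "error handling and error recovery for failed requests",
    "setup.md": "install the tool and run the setup script",
    "api.md": "handling requests in the api layer",
}
scores = tf_idf_scores("error handling", docs)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this sketch, a file that mentions a rare query term several times ranks above one that mentions only a common term once, and files with no query terms score zero.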
Everything happens on your computer. No data is sent to the cloud. No AI services are called. Your files stay yours.
SemiSearch adapts to your system:
- Any computer: Statistical text analysis (TF-IDF) with no special hardware required
- Zero configuration: Just install and start searching
- Progressive enhancement: Advanced features unlock as you learn
Option 1: Build from Source (Recommended)
# Clone repository
git clone https://github.com/kxrm/semisearch.git
cd semisearch
# Build release version
cargo build --release --features neural-embeddings
# The binary will be in target/release/semisearch
./target/release/semisearch --help
Option 2: Download Pre-built Binaries
Download the pre-built binary for your system from the releases page (when available).
# Simple search (no subcommand needed)
semisearch "database connection"
# Search in a specific directory
semisearch "user authentication" src/
# Handle typos automatically
semisearch "databse" --fuzzy
# Get interactive help
semisearch help-me
SemiSearch includes a comprehensive help system:
# Interactive help with examples and guidance
semisearch help-me
# Check if everything is working
semisearch status
# Detailed system diagnostics
semisearch doctor
# Quick command reference
semisearch --help
# Advanced options for power users
semisearch --advanced --help
The interactive help (semisearch help-me) is perfect for beginners. It provides:
- Real-time examples based on your queries
- Personalized suggestions
- Step-by-step guidance
- Common use cases for your situation
SemiSearch automatically detects what kind of search you need and chooses the best strategy for your query:
- Simple terms → Fast keyword search
- Conceptual queries → Semantic search for meaning and relationships
- Poor keyword results → Automatic semantic fallback
- Code patterns → Code-aware search
- Typos detected → Automatic fuzzy matching
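The routing rules above could be sketched roughly as follows. Every heuristic here (the code-hint pattern, the three-word cutoff, the tiny vocabulary used to guess at typos) and every name is invented for illustration; the source does not describe how SemiSearch actually classifies queries.

```python
import re

# Hypothetical vocabulary; a real tool would derive this from the indexed files.
KNOWN_WORDS = {"database", "connection", "error", "handling", "user", "todo"}

# Characters and keywords that suggest the query is a code fragment (assumed).
CODE_HINTS = re.compile(r"[(){}\[\]_;]|::|->|\bfn\b|\bdef\b|\bclass\b")


def choose_strategy(query, keyword_hits=None):
    """Pick a strategy name for a query using illustrative heuristics.

    keyword_hits is the number of results a prior keyword search returned,
    or None if no keyword search has run yet.
    """
    words = query.lower().split()
    if CODE_HINTS.search(query):
        return "code-aware"
    # A single unknown word is treated as a likely typo.
    if len(words) == 1 and words[0].isalpha() and words[0] not in KNOWN_WORDS:
        return "fuzzy"
    if keyword_hits == 0:
        return "semantic"  # fallback when keyword search found nothing
    if len(words) >= 3:
        return "semantic"  # longer queries are treated as conceptual
    return "keyword"


for q in ["TODO", "databse", "error handling patterns", "fn main"]:
    print(q, "->", choose_strategy(q))
```

The order of the checks matters in this sketch: code hints win first, typo detection second, and keyword search is the default when nothing else applies.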
Traditional exact word matching. Fast and precise.
When used: For exact text you know exists (like "TODO" comments)
Handles typos and partial matches.
semisearch "authentcation" --fuzzy
When used: When you add --fuzzy or when typos are detected
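One common way to tolerate typos is edit distance. The sketch below uses Levenshtein distance with an assumed cutoff of two edits; whether SemiSearch uses this algorithm or this cutoff is not stated in the source, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def fuzzy_matches(query, words, max_distance=2):
    """Return the candidate words within max_distance edits of the query."""
    return [w for w in words if edit_distance(query, w) <= max_distance]


candidates = ["authentication", "authorization", "automation"]
print(fuzzy_matches("authentcation", candidates))
```

A small cutoff keeps near-misses such as a dropped letter while rejecting words that merely share a prefix.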
Statistical ranking based on word importance and relationships.
When used: For conceptual searches like "error handling patterns"
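Ranking by statistical similarity is often done by comparing term-weight vectors with cosine similarity and dropping results below a cutoff. The sketch below assumes that approach; raw word counts stand in for real term weights, and the file names, function names, and the 0.3 cutoff are made up for the example.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs, threshold=0.0):
    """Return (name, similarity) pairs above threshold, best match first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in doc_vecs.items()]
    kept = [(name, s) for name, s in scored if s > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Raw word counts keep the example short; real weights would differ.
query_vec = Counter("error handling patterns".split())
doc_vecs = {
    "retry.rs": Counter("error handling with retry patterns".split()),
    "readme.md": Counter("project overview and setup".split()),
    "errors.rs": Counter("error types".split()),
}
results = rank(query_vec, doc_vecs, threshold=0.3)
for name, similarity in results:
    print(f"{name}: {similarity:.2f}")
```

Raising the cutoff trades recall for precision: fewer files come back, but each shares more of the query's vocabulary.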
Pattern matching for complex searches.
semisearch --advanced "user_[0-9]+" --mode regex
When used: In advanced mode for pattern matching
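To see what a pattern like user_[0-9]+ would match, here is a small sketch using Python's re module on made-up lines. The regex engine SemiSearch uses may differ in details, so treat this only as a way to reason about the pattern itself; find_matches is a hypothetical helper.

```python
import re


def find_matches(pattern, lines):
    """Return (line_number, matched_text) for every match in the lines."""
    compiled = re.compile(pattern)
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in compiled.finditer(line):
            hits.append((number, match.group()))
    return hits


# Made-up sample lines for illustration.
lines = [
    "let id = user_42;",
    "fn load_user(name: &str) {}",
    "// migrated user_7 and user_1001",
]
hits = find_matches(r"user_[0-9]+", lines)
for number, text in hits:
    print(f"{number}: {text}")
```

The second line is skipped because the pattern requires an underscore followed by at least one digit after "user".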
Find code examples and patterns:
# Find all error handling code
semisearch "error handling"
# Find async patterns
semisearch "async await"
# Find TODO comments
semisearch "TODO"
# Find function definitions
semisearch "fn main"
Search through papers and notes:
# Find content about a research topic
semisearch "machine learning" docs/
# Find related concepts
semisearch "neural networks" --fuzzy
Search through drafts and documents:
# Find all mentions of a topic
semisearch "climate change"
# Find similar content
semisearch "renewable energy"
For large directories, create an index for instant searches:
# Create an index
semisearch index ./my-project
# Now searches are much faster
semisearch "database queries"
Access power-user features:
# Enable advanced options
semisearch --advanced --help
# Use specific search modes
semisearch --advanced "pattern" --mode regex
# Include/exclude file patterns
semisearch --advanced "TODO" --include "*.rs"
semisearch --advanced "test" --exclude "*test*"
# Fine-tune relevance
semisearch --advanced "query" --semantic-threshold 0.8
Control what results you see:
# Show surrounding context lines
./target/release/semisearch --advanced "password" --context 3
# Use semantic search for conceptual queries
./target/release/semisearch --advanced "authentication" --mode semantic
# Output as JSON
./target/release/semisearch --advanced "config" --format json
# Show only file paths
./target/release/semisearch --advanced "function" --files-only
Check what search capabilities your system supports:
semisearch doctor
This shows:
- Available search methods
- Database status and indexed files
- Performance metrics
- Recommendations for your system
SemiSearch grows with you:
- Encouraging tips: "Great start! Keep exploring"
- Clear guidance: When searches fail, you know exactly what to try next
- Zero setup: Works immediately without configuration
- Feature discovery: "Try --fuzzy for spelling variations"
- Contextual suggestions: Based on your actual usage patterns
- Learning progression: Tips become more advanced as you use the tool
- Power features: "Try --advanced for more options"
- Efficiency tips: "You're using semisearch a lot! Here are advanced features..."
- All capabilities: Full access to regex, filtering, and advanced modes
Imagine you're looking for a book in a library:
- Traditional search: You can only find books if you know the exact title
- SemiSearch: A librarian who understands what you're looking for and shows you related books
SemiSearch acts like that smart librarian for your files.
- Text Understanding: SemiSearch reads your files and understands what they're about
- Statistical Analysis: It uses TF-IDF to find relationships between words and concepts
- Similarity Matching: When you search, it finds files with related meanings
- Smart Ranking: Results are sorted by how closely they match your intent
- Local Processing: All analysis happens on your computer
- No Cloud Services: Never sends your data anywhere
- Offline Operation: Works without internet
- Your Data Stays Yours: No tracking, no analytics, no external APIs
# Clone the repository
git clone https://github.com/kxrm/semisearch.git
cd semisearch
# Build with neural embeddings support
cargo build --release --features neural-embeddings
# Run the tool
./target/release/semisearch "your query"
Pre-built binaries may be available on the releases page for some versions.
# Clone repository
git clone https://github.com/kxrm/semisearch.git
cd semisearch
# Build release version
cargo build --release
# The binary will be in target/release/semisearch
# Search (default command)
semisearch "your query"
semisearch "query" path/to/search/
# Get help and status
semisearch help-me # Interactive help
semisearch status # Quick health check
semisearch doctor # Detailed diagnostics
semisearch --help # Command reference
# Handle typos
semisearch "databse" --fuzzy
# Find exact matches
semisearch "exact phrase" --exact
# Show more context
semisearch "function" --context 2
# JSON output
semisearch "config" --format json
# Enable all options
semisearch --advanced --help
# Specific search modes
semisearch --advanced "query" --mode semantic
semisearch --advanced "pattern" --mode regex
# File filtering
semisearch --advanced "TODO" --include "*.rs"
semisearch --advanced "test" --exclude "*test*"
# Fine-tune relevance
semisearch --advanced "query" --semantic-threshold 0.8
semisearch index . # Index current directory
semisearch index ./src # Index specific directory
semisearch status # Check indexed files
When no matches are found, SemiSearch provides helpful suggestions:
semisearch "nonexistent"
# Shows: Try different words, check spelling, search broader locations
Common fixes:
- Try fuzzy search: semisearch "query" --fuzzy
- Use simpler terms: Break complex searches into parts
- Check the location: Make sure you're searching the right directory
SemiSearch automatically provides tips:
semisearch "function"
# Shows: "Many results found. Use more specific terms or search in specific folders"
Try:
- Be more specific: semisearch "function validateUser"
- Search specific folders: semisearch "TODO" src/
- Use exact phrases: semisearch "exact phrase" --exact
Speed up searches:
- Create an index: semisearch index .
- Search specific folders: semisearch "query" ./src
- Check system status: semisearch doctor
SemiSearch provides contextual help based on what you're trying to do:
- Interactive guidance: semisearch help-me
- System status: semisearch status
- Detailed diagnostics: semisearch doctor
- Command reference: semisearch --help
- Index large directories:
  semisearch index ./large-project
  # Subsequent searches will be much faster
- Use specific paths:
  # Faster: searches only src/
  semisearch "function" src/
- Leverage automatic optimization:
  - SemiSearch automatically chooses the fastest method for your query
  - Simple searches use fast keyword matching
  - Complex searches use statistical analysis
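Why an index makes later searches faster can be shown with a tiny inverted index: the files are read once, and each query becomes a set lookup instead of a scan. This is a generic sketch under assumed names and sample data, not a description of the index format SemiSearch stores.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of (hypothetical) file names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def lookup(index, query):
    """Return the files that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up sample documents for illustration.
docs = {
    "db.rs": "run database queries in a transaction",
    "ui.rs": "render the database status widget",
    "notes.md": "queries about deployment",
}
index = build_index(docs)
print(lookup(index, "database queries"))
```

Building the index costs one pass over every file, which is why it pays off mainly for large directories that are searched repeatedly.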
SemiSearch provides these search capabilities:
- ✅ Keyword search: Fast exact matching for precise queries
- ✅ Fuzzy search: Handles typos and similar words (--fuzzy)
- ✅ Semantic search: Automatically used for conceptual queries and as fallback when keyword search fails
- ✅ TF-IDF analysis: Statistical text analysis for concept matching
- ✅ Regex patterns: Advanced pattern matching (--advanced --mode regex)
- ✅ Context lines: Show surrounding lines around matches (--advanced --context N)
- ✅ Project detection: Automatically adapts to Rust, JavaScript, Python, and documentation projects
- ✅ File filtering: Include/exclude patterns (--include "*.rs")
- ✅ Multiple output formats: Plain text and JSON (--format json)
Note: Semantic search works automatically in basic searches - no --advanced flag needed!
Check your specific capabilities: ./target/release/semisearch doctor
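The include/exclude file filters listed above can be pictured as glob matching on file names. The sketch below assumes shell-style globs applied to the file name only; SemiSearch may match against full paths or use different glob rules, and filter_paths is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def filter_paths(paths, include=None, exclude=None):
    """Keep paths whose file name matches include and does not match exclude."""
    kept = []
    for path in paths:
        name = path.rsplit("/", 1)[-1]  # compare against the file name only
        if include and not fnmatchcase(name, include):
            continue
        if exclude and fnmatchcase(name, exclude):
            continue
        kept.append(path)
    return kept


# Made-up sample paths for illustration.
paths = ["src/main.rs", "src/search_test.rs", "docs/guide.md", "tests/test_index.rs"]
print(filter_paths(paths, include="*.rs"))
print(filter_paths(paths, exclude="*test*"))
```

Filtering before searching narrows the set of files that have to be read at all, which helps both speed and result quality.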
We welcome contributions! See CONTRIBUTING.md for guidelines.
MIT License - see LICENSE for details.
- Built with Rust for performance and safety
- TF-IDF implementation for intelligent text analysis
- Inspired by the need for private, intelligent search
Remember: Your files stay on your computer. Your privacy is preserved. Search smarter, not harder. 🔍