Production-grade document retrieval API with hybrid search combining BM25 keyword matching and HNSW vector similarity using Apache Lucene 9+.
- Hybrid Search: Combine keyword (BM25) and vector (HNSW) search with adjustable weighting
- High Performance: Lucene HNSW for fast approximate nearest neighbor search
- Hexagonal Architecture: Clean separation of domain, ports, and adapters
- Observability: Prometheus metrics, OpenTelemetry instrumentation, Grafana dashboards
- Pluggable Embeddings: HTTP-based or ONNX Runtime providers
- RSS Ingestion: Automated document ingestion from RSS feeds and URLs
- Modern Dashboard: Next.js 14 UI with real-time search
- Docker Ready: Complete docker-compose setup with all services
- 90%+ Test Coverage: Comprehensive test suite with JUnit 5, WireMock, Testcontainers
┌──────────────────────────────────────────────────────────┐
│                      REST API Layer                      │
│  (SearchController, AdminController, HealthController)   │
└──────────────────────┬───────────────────────────────────┘
                       │
┌──────────────────────┴───────────────────────────────────┐
│                       Domain Layer                       │
│  Models: DocumentChunk, SearchQuery, SearchResult        │
│  Ports: EmbeddingProvider, Indexer, Searcher             │
└──────────────────────┬───────────────────────────────────┘
                       │
┌──────────────────────┴───────────────────────────────────┐
│                   Infrastructure Layer                   │
│                                                          │
│  ┌─────────────────┐  ┌───────────────────────────────┐  │
│  │  Lucene HNSW    │  │  Embedding Providers          │  │
│  │  - Indexer      │  │  - HttpEmbeddingProvider      │  │
│  │  - Searcher     │  │  - OnnxEmbeddingProvider      │  │
│  └─────────────────┘  └───────────────────────────────┘  │
│                                                          │
│  ┌────────────────────────────────────────────────────┐  │
│  │  Ingestion Pipeline                                │  │
│  │  - RssIngestService                                │  │
│  │  - HtmlCleaner (jsoup)                             │  │
│  │  - Chunker (token-based splitting)                 │  │
│  └────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────┘
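The ingestion pipeline's Chunker is described only as "token-based splitting". As a rough illustration of that idea, the sketch below splits text into overlapping windows of whitespace-delimited tokens. The class name `OverlapChunker`, its method signature, and the use of whitespace tokens are assumptions made for this example; the project's actual Chunker may tokenize and split differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of token-based chunking with overlap; not the project's actual Chunker. */
final class OverlapChunker {

    private OverlapChunker() {}

    /**
     * Splits text into chunks of at most {@code size} whitespace-delimited tokens.
     * Consecutive chunks share {@code overlap} tokens.
     */
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.strip();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] tokens = trimmed.split("\\s+");
        int step = size - overlap; // how far the window advances each iteration
        for (int start = 0; start < tokens.length; start += step) {
            int end = Math.min(start + size, tokens.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(tokens, start, end)));
            if (end == tokens.length) {
                break; // the window reached the end; avoid a redundant tail chunk
            }
        }
        return chunks;
    }
}
```

For example, `OverlapChunker.chunk("a b c d e f g", 4, 1)` yields the two chunks `a b c d` and `d e f g`, which share the token `d`. With the configured chunker size of 512 and overlap of 50, each window would advance by 462 tokens.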
- Java 21+
- Docker & Docker Compose (optional)
- Node.js 20+ (for dashboard)
# Start all services
docker-compose up --build
# API available at http://localhost:8080
# Dashboard at http://localhost:3000
# Prometheus at http://localhost:9090
# Grafana at http://localhost:3001 (admin/admin)

# Run API in dev mode (with ONNX stub embeddings)
make dev
# Or manually:
./gradlew bootRun --args='--spring.profiles.active=dev'

# See all available commands
make help
# Build and test
make build
make test
# Run with Docker
make docker-up

Once the API is running (via make dev or docker-compose up), run the interactive demo:
make demo

This will:
- Ingest sample documents from Hacker News RSS
- Run a search with alpha=0.0 (pure BM25 keyword search)
- Run a search with alpha=1.0 (pure KNN vector search)
- Show how rankings differ based on the alpha parameter
Expected output:
🔦 Searchlight Hybrid Search Demo
====================================
📥 Ingesting sample RSS feed (Hacker News)...
✅ Indexed 15 documents, 73 chunks
──────────────────────────────────────────────────
🔍 Search 1: Pure Keyword (alpha=0.0, BM25 only)
──────────────────────────────────────────────────
[0.95] Best Programming Languages for 2025
[0.82] Learn to Code: A Beginner's Guide
[0.71] Programming Paradigms Explained
...
──────────────────────────────────────────────────
🔍 Search 2: Pure Vector (alpha=1.0, KNN only)
──────────────────────────────────────────────────
[0.91] Machine Learning Fundamentals
[0.87] Deep Learning in Practice
[0.79] Building Neural Networks
...
✅ Demo complete! Rankings differ based on alpha parameter.
Once running, access interactive API docs at:
http://localhost:8080/swagger-ui.html
curl -X POST http://localhost:8080/api/v1/search \
-H "Content-Type: application/json" \
-d '{
"q": "machine learning",
"k": 10,
"alpha": 0.5
}'

Request Body (copy to Swagger UI):
{
"q": "machine learning neural networks",
"k": 5,
"alpha": 0.5
}

Parameters:
- q (string, required): Query text
- k (int, optional): Number of results (default: 10)
- alpha (float 0-1, optional): Hybrid weight (default: 0.5)
  - 0.0 = pure keyword search (BM25 only)
  - 1.0 = pure vector search (KNN only)
  - 0.5 = balanced hybrid search
- offset (int, optional): Pagination offset (default: 0)
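To make the role of alpha concrete, here is a minimal sketch of one common way to blend the two scores: a linear interpolation between a keyword score and a vector score. The helper `HybridScore.blend` is hypothetical, and it assumes both inputs are already normalized to the range 0 to 1. Searchlight's real scorer may normalize or fuse scores differently, so values in API responses need not match this formula.

```java
/** Hypothetical helper illustrating a linear blend; not taken from the Searchlight codebase. */
final class HybridScore {

    private HybridScore() {}

    /**
     * Linear interpolation between keyword and vector scores.
     * alpha = 0.0 returns the keyword score, alpha = 1.0 returns the vector score.
     * Assumes both scores are already normalized to [0, 1].
     */
    static double blend(double keywordScore, double vectorScore, double alpha) {
        if (alpha < 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be between 0.0 and 1.0");
        }
        return (1.0 - alpha) * keywordScore + alpha * vectorScore;
    }
}
```

Under this assumption, a chunk with a keyword score of 0.2 and a vector score of 0.8 would score 0.2 at alpha=0.0, 0.8 at alpha=1.0, and roughly 0.5 at alpha=0.5, which is why rankings can reorder as alpha changes.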
Sample Response:
{
"query": "machine learning neural networks",
"results": [
{
"id": "doc-1-chunk-0",
"sourceId": "doc-1",
"title": "Introduction to Neural Networks",
"url": "https://example.com/neural-networks",
"snippet": "Neural networks are the foundation of modern machine learning...",
"score": 0.85,
"keywordScore": 0.72,
"vectorScore": 0.91,
"source": "hacker-news",
"timestamp": "2025-10-03T10:30:00Z"
}
],
"total": 1,
"took": 45
}

curl -X POST http://localhost:8080/api/v1/admin/ingest \
-H "Content-Type: application/json" \
-d '{
"urls": ["https://news.ycombinator.com/rss"],
"mode": "RSS"
}'

Request Body (copy to Swagger UI):
{
"urls": [
"https://news.ycombinator.com/rss",
"https://example.com/blog/feed.xml"
],
"mode": "RSS"
}

Sample Response:
{
"message": "Ingestion completed",
"documentsProcessed": 25,
"chunksIndexed": 142,
"errors": 0
}

# Get a document by ID
curl http://localhost:8080/api/v1/docs/{id}
# Trigger a reindex
curl -X POST http://localhost:8080/api/v1/admin/reindex
# Health check
curl http://localhost:8080/api/v1/health
# Prometheus metrics
curl http://localhost:8080/actuator/prometheus

The Next.js dashboard provides a search UI with:
- Real-time search with results
- Adjustable alpha slider for hybrid search tuning
- Result scoring visualization
- Responsive design
cd dashboard
npm install
npm run dev
# Open http://localhost:3000

# Run all tests
./gradlew test
# Generate coverage report
./gradlew jacocoTestReport
# View coverage
open build/reports/jacoco/test/html/index.html

- Unit Tests: Domain logic, embeddings, chunking
- Integration Tests: Lucene indexing & searching
- API Tests: Controller endpoints with MockMvc
- E2E Tests: Full stack smoke tests
searchlight:
  index:
    path: data/index
    similarity: COSINE # COSINE, DOT_PRODUCT, EUCLIDEAN
    hnsw:
      m: 16
      ef-construction: 100
  embedding:
    provider: onnx # http or onnx
    url: http://localhost:8000/embed
    dimension: 384
    timeout: 30000
  chunker:
    size: 512
    overlap: 50

SEARCHLIGHT_EMBEDDING_PROVIDER=http|onnx
SEARCHLIGHT_EMBEDDING_URL=http://embedder:8000/embed
SEARCHLIGHT_EMBEDDING_DIMENSION=384
SPRING_PROFILES_ACTIVE=dev|ci|prod

Run load tests with k6:
make bench

Sample Results (local machine, mock embeddings):
| Metric | Value |
|---|---|
| Requests/sec | ~200 RPS |
| P50 Latency | 45ms |
| P95 Latency | 120ms |
| P99 Latency | 180ms |
| Error Rate | <0.1% |
Note: Performance varies based on index size, hardware, and embedding provider.
searchlight/
├── src/main/java/com/searchlight/
│   ├── app/              # Spring Boot application & config
│   ├── domain/           # Core domain models & ports
│   │   ├── model/
│   │   └── ports/
│   ├── infra/            # Infrastructure implementations
│   │   ├── embeddings/   # Embedding providers
│   │   ├── index/        # Lucene HNSW
│   │   └── ingest/       # RSS/HTML processing
│   └── api/              # REST controllers & DTOs
│       ├── controller/
│       └── dto/
├── src/test/java/com/searchlight/
│   ├── fixtures/
│   ├── infra/
│   ├── api/
│   └── e2e/
├── dashboard/            # Next.js frontend
├── scripts/              # Helper scripts
├── config/               # Prometheus, Grafana configs
└── docker-compose.yml
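The domain/ports package holds the interfaces that infrastructure adapters implement. The sketch below shows what such a port and a trivial adapter could look like. The two method signatures on `EmbeddingProvider` are an assumption, since the real interface is not shown here, and `HashingEmbeddingProvider` is a hypothetical stub that uses feature hashing instead of a real model. It is only meant to illustrate the port/adapter split.

```java
import java.util.Locale;

/** Assumed shape of the port; the real interface in domain/ports may differ. */
interface EmbeddingProvider {
    float[] embed(String text);

    int dimension();
}

/** Hypothetical deterministic stub adapter, e.g. for tests or local development. */
final class HashingEmbeddingProvider implements EmbeddingProvider {

    private final int dimension;

    HashingEmbeddingProvider(int dimension) {
        if (dimension <= 0) {
            throw new IllegalArgumentException("dimension must be positive");
        }
        this.dimension = dimension;
    }

    @Override
    public float[] embed(String text) {
        float[] vector = new float[dimension];
        // Feature hashing: each lowercase token increments one bucket.
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            vector[Math.floorMod(token.hashCode(), dimension)] += 1.0f;
        }
        // L2-normalize so cosine similarity reduces to a dot product.
        double sumOfSquares = 0.0;
        for (float x : vector) {
            sumOfSquares += x * x;
        }
        double norm = Math.sqrt(sumOfSquares);
        if (norm > 0.0) {
            for (int i = 0; i < vector.length; i++) {
                vector[i] /= (float) norm;
            }
        }
        return vector;
    }

    @Override
    public int dimension() {
        return dimension;
    }
}
```

Because the domain layer depends only on the interface, a stub like this can stand in for an HTTP or ONNX provider without changes to callers. The vectors it produces carry no semantic meaning, so it is unsuitable for real search quality.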
- Implement the EmbeddingProvider interface
- Add @ConditionalOnProperty for configuration
- Register in Spring context
- Update configuration
Example:
@Component
@ConditionalOnProperty(name = "searchlight.embedding.provider", havingValue = "custom")
public class CustomEmbeddingProvider implements EmbeddingProvider {
// Implementation...
}

Key metrics exposed via Prometheus:
- search_requests_total - Total search requests
- search_latency - Search latency histogram
- embedding_latency - Embedding generation time
- index_docs_count - Total documents in index
- ingest_documents_total - Documents ingested
- ingest_errors_total - Ingestion errors
Pre-configured dashboards available at http://localhost:3001:
- Request rates and latencies
- JVM metrics (heap, GC, threads)
- Index statistics
- Error rates
- Real ONNX Runtime integration with MiniLM-L6-v2
- Multi-tenant indexes with namespace isolation
- Synonym expansion for keyword search
- Query rewriting and expansion
- Re-ranking with cross-encoder models
- Postgres integration for source document registry
- Incremental indexing and updates
- Document deduplication
- Faceted search support
- Saved searches and query history
- API rate limiting
- Authentication & authorization
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure ./gradlew build passes
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Apache Lucene - High-performance search library
- Spring Boot - Application framework
- Next.js - React framework for dashboard
- HNSW - Hierarchical Navigable Small World algorithm
For questions or feedback, please open an issue on GitHub.
Built using Java 21, Spring Boot 3, and Apache Lucene