BPlexica is an AI-based search service. It performs web searches through SearXNG and uses LangChain with Large Language Models (LLMs) to analyze and synthesize the results into detailed, accurate answers.
- AI-based Search: Performs web searches through a SearXNG instance
- LLM Integration: Connects to OpenAI GPT models via LangChain (supports custom API URLs)
- Real-time Streaming: Streams responses via Server-Sent Events (SSE)
- RESTful API: Modern API design based on FastAPI
- Multiple Search Modes: Supports focus modes including web, image, news, and maps
- Korean Language Support: Korean interface and response support
- Privacy Protection: Automatic detection and masking of Korean personally identifiable information (PII)
- Comprehensive Logging: Date-based log file management and automatic cleanup
- Comprehensive Error Handling: Custom exceptions and detailed error messages
bplexica/
├── app/                              # Core application code
│   ├── __init__.py
│   ├── main.py                       # FastAPI application entry point
│   ├── api/                          # API endpoints
│   │   ├── __init__.py
│   │   └── routers/
│   │       ├── __init__.py
│   │       └── search.py             # Search API router
│   ├── core/                         # Core configuration and utilities
│   │   ├── __init__.py
│   │   ├── config.py                 # Environment configuration
│   │   ├── exceptions.py             # Custom exceptions
│   │   ├── logger.py                 # Logging system
│   │   ├── middleware.py             # Privacy protection middleware
│   │   ├── privacy_filter.py         # Personal information detection and masking
│   │   └── prompts.py                # LLM prompt templates
│   ├── schemas/                      # Pydantic schemas
│   │   ├── __init__.py
│   │   └── search.py                 # Search request/response schemas
│   └── services/                     # Business logic
│       ├── __init__.py
│       ├── llm_service.py            # LLM service
│       ├── search_service.py         # Search service coordination
│       └── searxng_client.py         # SearXNG client
├── searchxng/                        # SearXNG configuration
│   ├── settings.yml                  # SearXNG settings file
│   ├── uwsgi.ini                     # uWSGI configuration
│   ├── limiter.toml                  # Rate limiting configuration
│   └── searxng-docker/               # Docker container configuration
│       ├── docker-compose.yaml       # Docker Compose configuration
│       ├── Caddyfile                 # Caddy web server configuration
│       └── searxng/                  # SearXNG container configuration
├── tests/                            # Test files
│   ├── test_privacy_filter.py        # Personal information masking tests
│   └── test_privacy_protection.sh    # Privacy protection integration tests
├── log/                              # Log files directory
│   ├── bplexica_YYYY-MM-DD.log       # General logs (by date)
│   └── bplexica_error_YYYY-MM-DD.log # Error logs (by date)
├── requirements.txt                  # Python dependencies
└── README.md                         # Project documentation
- Web Framework: FastAPI (high-performance asynchronous web framework)
- AI/ML: LangChain, OpenAI GPT (or other LLM providers)
- Search Engine: SearXNG (meta search engine)
- HTTP Client: HTTPX (asynchronous HTTP requests)
- Data Validation: Pydantic (type safety and data validation)
- Configuration Management: python-dotenv, pydantic-settings
- HTML Parsing: BeautifulSoup4, lxml
- Web Server: Uvicorn
git clone https://github.com/sh2orc/bplexica.git
cd bplexica

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Basic installation
pip install -r requirements.txt

# Or use Makefile (recommended)
make install

# Include development/testing dependencies
make install-dev

Create a .env file in the project root and add the following settings:
# SearXNG Configuration
SEARXNG_BASE_URL=http://localhost:8080
# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_API_URL=https://api.openai.com/v1
LLM_MODEL=gpt-3.5-turbo
# Embedding Model Configuration (optional)
EMBEDDING_MODEL=text-embedding-ada-002
# Logging Configuration (optional)
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_DIR=log # Directory where log files will be stored
LOG_MAX_BYTES=10485760 # Maximum log file size (10MB)
LOG_BACKUP_COUNT=30 # Number of backup files
LOG_CLEANUP_DAYS=30 # Automatic deletion period for old log files (days)

Run a SearXNG instance using Docker:
# Direct execution
cd searchxng/searxng-docker
docker-compose up -d
# Or use Makefile (recommended)
make run-searxng

Verify that SearXNG is running at http://localhost:8080.
# Direct execution
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
# Or use Makefile (recommended)
make run-server

uvicorn app.main:app --host 0.0.0.0 --port 8000

# Set up the entire development environment
make quickstart
# Run the API server in a separate terminal
make run-server

# Run development environment with Docker (recommended)
make quickstart-docker
# Or manually
make docker-build
make docker-dev
# Production environment
make docker-run
# Check logs
make docker-logs
# Stop containers
make docker-stop

The application will run at http://localhost:8000.
You can test the API through Swagger UI at the following URLs:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
POST /api/v1/search performs a regular or streaming search.
Request Example:
{
"query": "What is FastAPI?",
"focus_mode": "web",
"stream": false
}

Response Example:
{
"message": "FastAPI is a modern, fast web framework for Python...",
"sources": [
{
"title": "FastAPI Official Documentation",
"url": "https://fastapi.tiangolo.com/",
"snippet": "FastAPI is a modern, fast web framework for Python 3.7+..."
}
]
}

GET /api/v1/search performs a simple search using query parameters.
Request Example:
GET /api/v1/search?query=Python&focus_mode=web
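The request and response payloads above can be sketched as simple data models. The project itself defines Pydantic schemas in app/schemas/search.py; the dataclass version below is only an illustration of the shapes:

```python
from dataclasses import dataclass

@dataclass
class SearchRequest:
    query: str
    focus_mode: str = "web"  # e.g. web, image, news, maps
    stream: bool = False

@dataclass
class Source:
    title: str
    url: str
    snippet: str

@dataclass
class SearchResponse:
    message: str
    sources: list  # list of Source entries

# Build the example request from above; focus_mode and stream take defaults
req = SearchRequest(query="What is FastAPI?")
print(req)
```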
Set stream: true to receive real-time responses via Server-Sent Events (SSE).
JavaScript Client Example:
Since the browser's EventSource API only supports GET requests, a POST streaming request is consumed with fetch and a stream reader:

const response = await fetch('/api/v1/search', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    query: 'The future of artificial intelligence',
    focus_mode: 'web',
    stream: true
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Each chunk holds one or more "data: {...}" SSE lines
  for (const line of decoder.decode(value).split('\n')) {
    if (line.startsWith('data: ')) {
      const data = JSON.parse(line.slice(6));
      console.log('Response chunk:', data.content);
    }
  }
}

Search Service (app/services/search_service.py):
- Coordinates the SearXNG client and the LLM service
- Handles regular and streaming search requests
- Context extraction and source management

SearXNG Client (app/services/searxng_client.py):
- Asynchronous communication with the SearXNG instance
- Support for various search categories
- Browser user-agent emulation
- Comprehensive error handling

LLM Service (app/services/llm_service.py):
- OpenAI GPT model integration through LangChain
- Regular and streaming response generation
- Uses custom prompt templates
- Ready to support multiple LLM providers
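The coordination described above can be sketched as a small async pipeline. The function names below are illustrative stand-ins, not the project's actual API:

```python
import asyncio

async def fetch_search_results(query: str) -> list:
    # Stand-in for the SearXNG client call
    return [{"title": "Example", "url": "https://example.com", "snippet": "..."}]

async def generate_answer(query: str, context: str) -> str:
    # Stand-in for the LangChain/LLM call
    return f"Answer to {query!r} based on {len(context)} chars of context"

async def search(query: str) -> dict:
    # Coordinate: gather results, build context, then ask the LLM
    results = await fetch_search_results(query)
    context = "\n".join(r["snippet"] for r in results)
    answer = await generate_answer(query, context)
    return {"message": answer, "sources": results}

result = asyncio.run(search("What is FastAPI?"))
print(result["message"])
```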
Fine-grained error handling through custom exception classes:
- SearxngConnectionError: SearXNG connection failure
- SearxngRateLimitError: Rate limit exceeded
- SearxngSearchError: Search service error
- LLMProcessingError: LLM processing error
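A hierarchy like this is commonly built by deriving the specific errors from a shared base class, so callers can catch them together. A sketch (the actual definitions in app/core/exceptions.py may differ):

```python
class SearxngError(Exception):
    """Base class for SearXNG-related failures (illustrative)."""

class SearxngConnectionError(SearxngError):
    """Raised when the SearXNG instance cannot be reached."""

class SearxngRateLimitError(SearxngError):
    """Raised when SearXNG reports a rate limit."""

def handle(exc: Exception) -> str:
    # Catching the base class covers all SearXNG failures at once
    if isinstance(exc, SearxngError):
        return f"search backend error: {exc}"
    return "unexpected error"

print(handle(SearxngConnectionError("connection refused")))
```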
You can use various OpenAI-compatible services:
# Azure OpenAI
OPENAI_API_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment
# OpenAI proxy server
OPENAI_API_URL=https://your-proxy-server.com/v1
# Local LLM (e.g., Ollama, LocalAI)
OPENAI_API_URL=http://localhost:11434/v1
# Default OpenAI API
OPENAI_API_URL=https://api.openai.com/v1

You can add new providers in app/services/llm_service.py:
elif model_provider == "anthropic":
    from langchain_anthropic import ChatAnthropic
    self.llm = ChatAnthropic(
        model=self.model_name,
        api_key=self.api_key
    )

You can modify prompt templates in app/core/prompts.py to adjust the style and format of AI responses.
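A search prompt typically interpolates the retrieved context and the user's question. The sketch below uses plain string formatting to show the shape; the template text and variable names are illustrative, and the real templates in app/core/prompts.py may differ:

```python
# Illustrative template: {context} and {question} are filled in per request
SEARCH_PROMPT = (
    "You are a search assistant. Answer the question using only the "
    "context below, and cite your sources.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
)

prompt = SEARCH_PROMPT.format(
    context="FastAPI is a modern, fast web framework for Python.",
    question="What is FastAPI?",
)
print(prompt)
```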
You can add new search engines or categories by modifying the SearXNG configuration (searchxng/settings.yml).
Test framework setup:
pip install pytest pytest-asyncio
pytest tests/

BPlexica provides a comprehensive logging system to monitor and debug the application's behavior.
- Date-based Log Files: Automatically created in log/bplexica_YYYY-MM-DD.log format
- Error-only Logs: Error-level and higher messages recorded in log/bplexica_error_YYYY-MM-DD.log
- Log Rotation: Automatic rotation when a file reaches the configured size threshold
- Automatic Cleanup: Automatic deletion of old log files (default 30 days)
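Date-based filenames with size rotation can be combined using the standard library, e.g. a RotatingFileHandler on a dated filename. This is a sketch of the general technique; the project's app/core/logger.py may be organized differently:

```python
import logging
import tempfile
from datetime import date
from logging.handlers import RotatingFileHandler
from pathlib import Path

log_dir = Path(tempfile.mkdtemp())  # stands in for LOG_DIR
log_file = log_dir / f"bplexica_{date.today():%Y-%m-%d}.log"

# Rotate at 10MB, keeping up to 30 backups (mirrors LOG_MAX_BYTES / LOG_BACKUP_COUNT)
handler = RotatingFileHandler(log_file, maxBytes=10 * 1024 * 1024, backupCount=30)
handler.setFormatter(logging.Formatter("%(asctime)s | %(levelname)s | %(message)s"))

logger = logging.getLogger("bplexica-demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("application started")
handler.flush()
print(log_file.read_text().strip())
```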
- HTTP Request Logs
  - Client IP, request method, URL, response code, processing time
  - Search query and configuration information
- Search Service Logs
  - Search request start/completion time
  - SearXNG search result collection information
  - Context extraction and source management
- LLM Service Logs
  - Model name used, prompt length, response length
  - Processing time and token usage tracking
  - Number of streaming chunks and total response length
- System Logs
  - Application start/stop
  - Configuration changes
  - Error and exception information
You can control logging behavior through environment variables:
# Set log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
LOG_LEVEL=INFO
# Log file storage directory
LOG_DIR=log
# Maximum log file size (bytes)
LOG_MAX_BYTES=10485760 # 10MB
# Number of backup files
LOG_BACKUP_COUNT=30
# Automatic deletion period for old log files (days)
LOG_CLEANUP_DAYS=30

# Real-time log monitoring
tail -f log/bplexica_$(date +%Y-%m-%d).log
# Monitor error logs only
tail -f log/bplexica_error_$(date +%Y-%m-%d).log
# Search for specific keywords
grep "search request" log/bplexica_*.log
# Find requests with long processing times
grep "processing time.*[5-9]\.[0-9]" log/bplexica_*.log

# Daily request count statistics
grep "search API call" log/bplexica_2025-07-18.log | wc -l
# Calculate average response time
grep "processing time" log/bplexica_2025-07-18.log | \
grep -o '[0-9]\+\.[0-9]\+s' | sed 's/s//' | \
awk '{sum+=$1; count++} END {print "Average:", sum/count, "seconds"}'
# Error frequency analysis
grep "ERROR" log/bplexica_error_2025-07-18.log | \
cut -d'|' -f4 | sort | uniq -c | sort -nr

You can monitor the following performance metrics through logs:
- Response Time: Processing time for each API request
- LLM Performance: Processing time and token usage by model
- Search Performance: SearXNG search and result processing time
- Error Rate: Error frequency by time period/function
BPlexica provides automatic detection and masking of Korean personally identifiable information (PII). It detects and safely masks personal information before processing any API request.
- Resident Registration Number: 901212-1234567 → 901212-*******
- Foreign Registration Number: 901212-5123456 → 901212-*******
- Passport Number: M12345678 → M1****78
- Driver's License Number: 11-12-123456-78 → 11-12-****-**78
- Email: test@example.com → te**@e******.com
- Mobile Phone Number: 010-1234-5678 → 010-****-5678
- Phone Number: 02-1234-5678 → 02-****-5678
- Card Number: 1234-5678-9012-3456 → 1234-****-****-3456
- Account Number: 123-45-678901 → 123-**-**8901
- Business Registration Number: 123-45-67890 → 123-**-***90
- Corporate Registration Number: 123456-1234567 → 123456-****567
- IP Address: 192.168.1.100 → 192.168.*******
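Each pattern above boils down to a regular expression plus a replacement that keeps only the non-sensitive digits. For instance, mobile phone numbers could be masked like this (an illustrative sketch, not the project's actual privacy_filter.py code):

```python
import re

# Korean mobile numbers: carrier prefix, middle block, last four digits
MOBILE_RE = re.compile(r"\b(01[016789])-(\d{3,4})-(\d{4})\b")

def mask_mobile(text: str) -> str:
    # Keep the carrier prefix and last four digits, hide the middle block
    return MOBILE_RE.sub(
        lambda m: f"{m.group(1)}-{'*' * len(m.group(2))}-{m.group(3)}", text
    )

print(mask_mobile("Call me at 010-1234-5678"))  # → Call me at 010-****-5678
```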
- Automatic Detection: Real-time personal information detection using regular expressions
- Safe Masking: Mask sensitive parts and keep only necessary parts
- Detailed Logging: Record detected personal information types and locations in logs
- Response Headers: Indicate protection status with X-Privacy-Protected and X-PII-Detected headers
- Multi-layer Processing: Supports URL parameters, JSON bodies, and nested structures
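Supporting nested JSON bodies usually means walking the structure recursively and applying the maskers to every string leaf. A minimal sketch of that traversal (function names are illustrative; mask_pii stands in for the full set of PII maskers):

```python
def mask_pii(text: str) -> str:
    # Stand-in for the full set of PII maskers
    return text.replace("010-1234-5678", "010-****-5678")

def mask_nested(value):
    # Recursively mask every string leaf in dicts and lists
    if isinstance(value, dict):
        return {k: mask_nested(v) for k, v in value.items()}
    if isinstance(value, list):
        return [mask_nested(v) for v in value]
    if isinstance(value, str):
        return mask_pii(value)
    return value

body = {"query": "my number is 010-1234-5678", "options": {"notes": ["010-1234-5678"]}}
print(mask_nested(body))
```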
# Run privacy protection feature test
./tests/test_privacy_protection.sh
# Real-time monitoring of personal information detection logs
tail -f log/bplexica_$(date +%Y-%m-%d).log | grep "personal_information"
# Search for specific personal information types
grep "personal_information detection.*resident_registration_number" log/bplexica_*.log

The privacy protection feature is enabled by default, and the following paths are excluded:
- /docs - API documentation (Swagger UI)
- /redoc - API documentation (ReDoc)
- /openapi.json - OpenAPI schema
- /favicon.ico - Favicon
- Priority Processing: Executed as the very first step of all requests
- Lossless Masking: Maintains the structure and form of the original data as much as possible
- Performance Optimization: Fast regex matching and efficient string processing
- Detailed Auditing: Records all personal information detection and masking processes in logs
This project is distributed under the MIT License. See the LICENSE file for more information.
- FastAPI - Modern Python web framework
- LangChain - LLM application development framework
- SearXNG - Privacy-focused meta search engine
- OpenAI - GPT model provider
If you have questions or suggestions about the project, please create an issue or contact us.
BPlexica - A smarter search experience with AI