Automatic FAQ generation from repeated Slack questions using semantic similarity clustering.
Portfolio project demonstrating knowledge management automation, embedding-based clustering, and cross-platform integration (Slack → API → Notion).
In active Slack communities, the same questions get asked repeatedly by different people. This creates:
- Information fragmentation - answers scattered across threads
- Duplicated effort - team members answering the same questions
- Poor discoverability - no central FAQ or knowledge base
The solution is an automated system that:
- Monitors incoming Slack messages
- Clusters semantically similar questions using embeddings
- Triggers FAQ draft creation when a question is asked 3+ times
- Routes drafts to Notion for review and publication
Architecture:

```
┌─────────────────┐
│  Slack Channel  │
│   (Monitored)   │
└────────┬────────┘
         │ message event
         ▼
┌────────────────────────┐
│ n8n Workflow (Railway) │
│                        │
│   ┌────────────────┐   │
│   │ Trigger        │   │ ← Slack event subscriptions
│   └───────┬────────┘   │
│   ┌───────▼────────┐   │
│   │ IF (Regex)     │   │ ← Basic filter: has "?" or question words
│   └───────┬────────┘   │
│   ┌───────▼────────┐   │
│   │ LLM Filter     │   │ ← Anthropic Claude: "Is this a genuine question?"
│   │ (Anthropic)    │   │
│   └───────┬────────┘   │
│   ┌───────▼────────┐   │
│   │ IF (yes?)      │   │ ← LLM response check
│   └───────┬────────┘   │
│   ┌───────▼────────┐   │
│   │ HTTP POST      │   │ ← /check endpoint
│   └───────┬────────┘   │
│   ┌───────▼────────┐   │
│   │ IF (≥3 && not  │   │ ← Cluster threshold + draft prevention
│   │ drafted?)      │   │
│   └───────┬────────┘   │
│   ┌───────▼────────┐   │
│   │ LLM Draft FAQ  │   │ ← Anthropic Claude: generate FAQ entry
│   │ (Anthropic)    │   │
│   └───────┬────────┘   │
│   ┌───────▼────────┐   │
│   │ Code (parse)   │   │ ← Extract Question/Answer
│   └───────┬────────┘   │
│   ┌───────▼────────┐   │
│   │ Notion         │   │ ← Create page in FAQ DB
│   └───────┬────────┘   │
│   ┌───────▼────────┐   │
│   │ HTTP POST      │   │ ← /mark-drafted endpoint
│   └───────┬────────┘   │
│   ┌───────▼────────┐   │
│   │ Slack Message  │   │ ← Notify docs channel
│   └────────────────┘   │
└────────┬───────────────┘
         │
         │ POST /check
         ▼
┌─────────────────┐
│   Python API    │  (Railway)
│    FastAPI      │
│  ┌───────────┐  │
│  │ Embedding │  │ ← OpenAI text-embedding-3-small
│  └─────┬─────┘  │
│  ┌─────▼─────┐  │
│  │ Similarity│  │ ← Cosine similarity (threshold: 0.70)
│  └─────┬─────┘  │
│  ┌─────▼─────┐  │
│  │  Cluster  │  │ ← Assign to cluster or create new
│  └─────┬─────┘  │
│  ┌─────▼─────┐  │
│  │  SQLite   │  │ ← Persistent storage (Railway volume)
│  └───────────┘  │
└─────────────────┘
```
Two-stage question filtering:
- Stage 1 (Regex): Fast pattern matching for obvious questions (a `?` or question words like "how", "what", "where"); a sketch of this gate follows below
- Stage 2 (LLM, Anthropic Claude): Semantic analysis to filter out:
  - Rhetorical questions ("Why is this so hard???")
  - Greetings ("How are you?")
  - Time-sensitive queries ("What time is the meeting?")
  - Vague requests ("Can someone help me?")
- Why: Reduces API costs by filtering 90% of non-questions with cheap regex before the LLM is called
- Trade-off: Adds ~1-2 seconds of latency per message, but ensures only genuine FAQ-worthy questions are clustered
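
A minimal sketch of the Stage 1 gate (the real check is an n8n IF node; the question-word list here is illustrative):

```python
import re

QUESTION_WORDS = ("how", "what", "where", "when", "why", "who", "which", "can", "does", "is")
QUESTION_RE = re.compile(rf"^\s*(?:{'|'.join(QUESTION_WORDS)})\b", re.IGNORECASE)

def looks_like_question(text: str) -> bool:
    """Cheap Stage 1 gate: contains a '?' or starts with a common question word."""
    return "?" in text or bool(QUESTION_RE.match(text))
```
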
LLM choice (Anthropic Claude):
- Why Claude: More reliable at following strict formatting instructions (the `Question:` / `Answer:` structure the parsing step expects, sketched below)
- Experience: OpenAI models (GPT-4o-mini, GPT-3.5-turbo) produced inconsistent or empty outputs with structured prompts
- Use cases: Claude handles both question filtering and FAQ draft generation
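
The `Code (parse)` node splits the drafted FAQ into fields for the Notion page. A rough Python equivalent, assuming the draft follows the requested `Question:` / `Answer:` format:

```python
import re

def parse_faq_draft(llm_output: str) -> dict:
    """Split a 'Question: ... Answer: ...' draft into the fields used for the Notion page."""
    match = re.search(r"Question:\s*(.+?)\s*Answer:\s*(.+)", llm_output, re.DOTALL)
    if not match:
        raise ValueError("Draft did not follow the Question:/Answer: format")
    return {"question": match.group(1).strip(), "answer": match.group(2).strip()}
```
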
Embedding model (OpenAI text-embedding-3-small):
- Why: Good balance of performance (1536 dimensions) and cost ($0.02 per 1M tokens); see the sketch below
- Alternatives considered: `text-embedding-3-large` (higher accuracy but 3x the cost) and Sentence-BERT (self-hosted but more complex)
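
Fetching an embedding with the OpenAI Python SDK looks roughly like this (a sketch; the service's actual helper names may differ):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(text: str) -> list[float]:
    """Return the 1536-dimensional embedding for a question."""
    response = client.embeddings.create(model="text-embedding-3-small", input=text)
    return response.data[0].embedding
```
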
Similarity metric (cosine similarity):
- Why: Standard for comparing embeddings, robust to variations in question length
- Threshold: 0.70 (empirically tuned, see TUNING_LOG.md; sketched below)
- Rationale: Paraphrased questions score 0.70-0.75, while different topics score below 0.55
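
The comparison itself is only a few lines; a self-contained sketch with the 0.70 threshold:

```python
import math

SIMILARITY_THRESHOLD = 0.70  # empirically tuned (see TUNING_LOG.md)

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_paraphrase(a: list[float], b: list[float]) -> bool:
    """Treat two questions as the same once similarity clears the threshold."""
    return cosine_similarity(a, b) >= SIMILARITY_THRESHOLD
```
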
Clustering strategy (incremental):
- Why: Real-time clustering as questions arrive (vs. batch processing); see the assign-or-create sketch below
- Trade-off: Simpler to implement, but doesn't handle cluster merging or splitting
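
A sketch of the assign-or-create step, reusing `cosine_similarity` and `SIMILARITY_THRESHOLD` from the block above (the real service persists clusters in SQLite; the centroid field and any status string other than "matched" are illustrative):

```python
def assign_to_cluster(embedding: list[float], clusters: list[dict]) -> dict:
    """Attach a new question to the closest existing cluster, or start a new one."""
    best_cluster, best_score = None, 0.0
    for cluster in clusters:
        score = cosine_similarity(embedding, cluster["centroid"])
        if score > best_score:
            best_cluster, best_score = cluster, score

    if best_cluster is not None and best_score >= SIMILARITY_THRESHOLD:
        best_cluster["count"] += 1
        return {"status": "matched", "cluster_id": best_cluster["id"],
                "cluster_count": best_cluster["count"]}

    new_cluster = {"id": len(clusters) + 1, "centroid": embedding, "count": 1}
    clusters.append(new_cluster)
    return {"status": "new", "cluster_id": new_cluster["id"], "cluster_count": 1}
```
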
Storage (SQLite):
- Why: Simple, serverless-friendly, and sufficient for moderate scale (<10K questions); an illustrative schema follows below
- Limitations: No built-in vector search (could migrate to PostgreSQL with pgvector for larger scale)
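
An illustrative schema for the volume-mounted database (actual table and column names may differ; embeddings are stored as JSON text because SQLite has no native vector type):

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS clusters (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    faq_drafted INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS questions (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    cluster_id     INTEGER NOT NULL REFERENCES clusters(id),
    text           TEXT NOT NULL,
    embedding      TEXT NOT NULL,  -- JSON-encoded vector
    source_channel TEXT,
    source_user    TEXT,
    created_at     TEXT DEFAULT CURRENT_TIMESTAMP
);
"""

def init_db(path: str = "/data/questions.db") -> sqlite3.Connection:
    """Open the database on the Railway volume and ensure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```
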
FAQ trigger threshold (3+ occurrences):
- Why: Balances noise reduction (not every question becomes an FAQ) with timeliness (real patterns are caught quickly)
- Draft prevention: After the first FAQ is created, the cluster is marked `faq_drafted: true` to prevent duplicates
- Configurable: The threshold can be adjusted based on channel activity (the gate itself is shown below)
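
The n8n IF node's condition reduces to a single check; in Python terms (function name illustrative):

```python
def should_draft_faq(cluster_count: int, faq_drafted: bool, threshold: int = 3) -> bool:
    """Draft an FAQ only once a cluster reaches the threshold and hasn't been drafted yet."""
    return cluster_count >= threshold and not faq_drafted
```
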
| Method | Endpoint | Purpose | Status |
|---|---|---|---|
| GET | `/` | Service info | ✅ Live |
| GET | `/health` | Health check | ✅ Live |
| POST | `/check` | Cluster a new question | ✅ Live |
| GET | `/clusters` | List all clusters | ✅ Live |
| POST | `/clusters/{id}/mark-drafted` | Mark FAQ as drafted | ✅ Live |
| POST | `/reset` | Clear database (testing only) | |
| POST | `/debug` | Inspect similarity scores | |
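
To connect the table to the code, a minimal sketch of how the `/check` handler might be shaped in FastAPI (request/response models mirror the documented example below; the handler body is a placeholder, not the production logic):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Question Cluster API")

class CheckRequest(BaseModel):
    text: str
    source_channel: str | None = None
    source_user: str | None = None

class CheckResponse(BaseModel):
    status: str
    cluster_id: int
    cluster_count: int
    similar_questions: list[str]

@app.post("/check", response_model=CheckResponse)
def check(req: CheckRequest) -> CheckResponse:
    # Placeholder: the real handler embeds req.text, runs the cosine-similarity
    # clustering described above, and persists the result to SQLite.
    return CheckResponse(status="new", cluster_id=1, cluster_count=1,
                         similar_questions=[req.text])

@app.get("/health")
def health() -> dict:
    return {"status": "ok"}
```
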
Production URL: https://question-cluster-api-production.up.railway.app
Example request:

```bash
curl -X POST https://question-cluster-api-production.up.railway.app/check \
  -H "Content-Type: application/json" \
  -d '{
    "text": "How do I reset my password?",
    "source_channel": "C12345",
    "source_user": "U67890"
  }'
```

Response:

```json
{
  "status": "matched",
  "cluster_id": 6,
  "cluster_count": 3,
  "similar_questions": [
    "How do I reset my password?",
    "Where do I change my password?",
    "I forgot my password, how do I get a new one?"
  ]
}
```

Tech stack:
- API: FastAPI (Python 3.11)
- Hosting: Railway (Docker)
- Database: SQLite (volume-mounted at `/data`)
- Workflow: n8n (also on Railway)
- LLM: Anthropic Claude (question filtering + FAQ drafting)
- Embeddings: OpenAI text-embedding-3-small
Python API Service:
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI API key for embeddings |
| `DB_PATH` | Database file path (default: `/data/questions.db`) |
| `PORT` | Auto-assigned by Railway |
n8n Service:
| Variable | Description |
|---|---|
| `ANTHROPIC_API_KEY` | Anthropic API key for Claude LLM nodes |
| `SLACK_BOT_TOKEN` | Slack bot token for event subscriptions |
| `NOTION_API_KEY` | Notion integration token |
Local development:

```bash
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export OPENAI_API_KEY="sk-..."
export DB_PATH="./questions.db"

# Run server
uvicorn main:app --reload --port 8000
```

Known limitations:
- No cluster merging - If similar questions are added before the threshold is reached, they may form separate clusters
- No multi-language support - Embeddings are English-optimized
- No user deduplication - The same exact question from the same user can inflate the cluster count
- Partial draft prevention - The `/mark-drafted` endpoint exists, but the API doesn't return `faq_drafted` status yet
- No auth - The `/reset` and `/debug` endpoints are public (dev only)
- LLM filtering adds latency - ~1-2 seconds per message for the Claude API call
Future improvements:
- Complete the draft prevention loop (return `faq_drafted` in the `/check` response, check it in the n8n IF node)
- Add user deduplication (don't count the same user asking the same question twice)
- Implement cluster merging (periodic job to merge high-similarity clusters)
- Add authentication for admin endpoints
- Migrate to PostgreSQL with pgvector for better vector search
- Add analytics dashboard (cluster trends, top questions, response times)
- Support multi-language questions (use multilingual embedding models)
- Add feedback loop (mark drafted FAQs as "helpful" or "not helpful")
- Cache LLM filter results to reduce API costs for repeated message patterns
- Add confidence scores to LLM filter (not just yes/no, but 0-100% confidence)
See TUNING_LOG.md for threshold calibration methodology.
Manual testing:

```bash
# Reset database
curl -X POST http://localhost:8000/reset

# Add questions and verify clustering
curl -X POST http://localhost:8000/check \
  -H "Content-Type: application/json" \
  -d '{"text": "How do I reset my password?"}'

# Check similarity scores
curl -X POST http://localhost:8000/debug \
  -H "Content-Type: application/json" \
  -d '{"text": "Where do I change my password?"}'
```

MIT License - see LICENSE for details.