An intelligent RSS feed monitoring Slackbot that curates and summarizes content based on your interests.
- 📡 RSS Feed Monitoring: Monitors multiple RSS feeds from a configurable text file
- 🎯 Semantic Curation: Uses OpenAI embeddings to find content semantically similar to your topics of interest
- ✍️ AI Summaries: Uses Claude to generate concise, insightful summaries for each item
- 💬 Clean Slack Layout: Each paper posted as a top-level message with AI summary
- 🤖 GitHub Actions: Runs automatically on schedule via GitHub Actions
- Fetch: Retrieves items from all RSS feeds in feeds.txt
- Deduplicate: Removes duplicate articles within the current run (based on URL)
- Filter by Time: Keeps only items from the configured timeframe
- Filter Previously Posted: Removes articles that have been posted before (if caching enabled)
- Curate: Uses OpenAI embeddings to calculate semantic similarity between article titles and topics in topics.txt, then ranks by relevance and limits to top N items
- Summarize: Uses Claude to generate summaries for curated items
- Post: Posts header, then each paper as a top-level Slack message with summary
- Cache: Saves posted article URLs to prevent reposting
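The deduplication and time-filter steps above can be sketched as follows. This is a minimal illustration, not the code in src/; it assumes each item is a dict with hypothetical `url` and `published` keys, where `published` is a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def deduplicate_by_url(items):
    """Keep the first item seen for each URL, preserving feed order."""
    seen = set()
    unique = []
    for item in items:
        if item["url"] not in seen:
            seen.add(item["url"])
            unique.append(item)
    return unique


def filter_by_time(items, timeframe_hours, now=None):
    """Keep items published within the last `timeframe_hours` hours."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=timeframe_hours)
    return [item for item in items if item["published"] >= cutoff]
```

Deduplicating before filtering keeps the later, more expensive steps (embedding and summarization) from seeing the same URL twice.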
Important: You should fork this repository to your own GitHub account rather than cloning it directly. This allows:
- GitHub Actions to run automatically on your schedule
- Persistent embedding cache across workflow runs
- Easy customization and updates specific to your needs
To fork:
- Click the "Fork" button at the top right of this repository
- Select your GitHub account as the destination
- GitHub will create a copy of this repository under your account
Once forked, you can clone your fork and configure it following the setup instructions below.
First, install uv if you haven't already:
# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# On Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
Then clone your forked repository and install dependencies:
git clone https://github.com/YOUR-USERNAME/wellread.git
cd wellread
uv sync
Replace YOUR-USERNAME with your GitHub username.
Install the pre-commit hook to prevent committing debug code:
./install-hooks.sh
This hook prevents committing any lines containing NOCOMMIT, which is useful for temporary debugging changes.
Edit feeds.txt and add your RSS feed URLs (one per line):
https://arxiv.org/rss/cs.AI
https://arxiv.org/rss/cs.LG
https://blog.example.com/feed
Edit topics.txt and add your topics of interest (one per line):
machine learning
large language models
neural networks
artificial intelligence
Edit config.json to adjust settings:
{
  "timeframe_hours": 24,
  "max_items_to_post": 20,
  "min_relevance_score": 60,
  "cache_posted_articles": true,
  "posted_articles_cache_file": "cache/posted_articles.json",
  "embedding_cache_dir": "cache/embeddings",
  "llm_models": {
    "summarization": "claude-sonnet-4-5-20250929"
  }
}
- timeframe_hours: How far back to look for new posts (default: 24)
- max_items_to_post: Maximum number of articles to post (default: 20)
- min_relevance_score: Minimum semantic similarity score (0-100) for articles to be included (default: 60)
- cache_posted_articles: Whether to cache posted articles to avoid reposting (default: true)
- posted_articles_cache_file: File path for the posted articles cache (default: cache/posted_articles.json)
- embedding_cache_dir: Directory for caching OpenAI embeddings (default: cache/embeddings)
- llm_models.summarization: Claude model for article summaries (default: claude-sonnet-4-5-20250929)
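To make the interaction between min_relevance_score and max_items_to_post concrete, here is a rough sketch of threshold-then-rank curation. The scoring rule is an assumption (best cosine similarity against any topic, scaled to 0-100); the actual formula in src/curator.py may differ, and the `embedding` key and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance_score(article_vec, topic_vecs):
    # Assumption: score is the best match against any topic, scaled to 0-100.
    best = max((cosine_similarity(article_vec, t) for t in topic_vecs), default=0.0)
    return max(0.0, best) * 100


def curate(articles, topic_vecs, min_relevance_score=60, max_items_to_post=20):
    """Drop articles below the threshold, then keep the top N by score."""
    scored = [(relevance_score(a["embedding"], topic_vecs), a) for a in articles]
    kept = [pair for pair in scored if pair[0] >= min_relevance_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [article for _, article in kept[:max_items_to_post]]
```

Under this sketch, raising min_relevance_score shrinks the candidate pool, while max_items_to_post only caps how many of the surviving candidates are posted.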
- Go to https://api.slack.com/apps
- Click "Create New App" → "From scratch"
- Name it "WellRead Bot" and select your workspace
Under "OAuth & Permissions", add these scopes:
- chat:write
- chat:write.public
- channels:read
- groups:read
- Click "Install to Workspace"
- Copy the "Bot User OAuth Token" (starts with xoxb-)
In Slack, right-click your target channel → "View channel details" → Copy the Channel ID at the bottom
- Go to https://platform.openai.com/api-keys
- Create a new API key
- Copy the key (starts with sk-)
Note: The bot uses text-embedding-3-small for semantic similarity. Cost is ~$0.02 per 1M tokens. Embeddings are cached locally and in GitHub Actions to minimize API calls.
- Go to https://console.anthropic.com/
- Create an API key
- Copy the key (starts with sk-ant-)
In your GitHub repository, go to Settings → Secrets and variables → Actions, and add:
- OPENAI_API_KEY: Your OpenAI API key (required for semantic curation)
- ANTHROPIC_API_KEY: Your Anthropic API key (required for summaries)
- SLACK_BOT_TOKEN: Your Slack bot token (xoxb-...)
- SLACK_CHANNEL: Your Slack channel ID (e.g., C01234ABCD)
- SLACK_WEBHOOK: (Optional) Slack webhook URL
- TIMEFRAME_HOURS: (Optional) Override default timeframe (e.g., 48)
- MAX_ITEMS_TO_POST: (Optional) Override maximum items to post (e.g., 10)
Edit .github/workflows/daily-digest.yml to change the schedule:
on:
  schedule:
    - cron: '0 9 * * *'  # 9 AM UTC daily
Run locally with environment variables:
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export SLACK_BOT_TOKEN="your-token"
export SLACK_CHANNEL="your-channel-id"
uv run python src/main.py
Or trigger manually in GitHub Actions:
- Go to Actions tab
- Select "Daily RSS Digest" workflow
- Click "Run workflow"
wellread/
├── src/
│   ├── main.py              # Main entry point
│   ├── rss_parser.py        # RSS feed fetching and parsing
│   ├── curator.py           # Content curation logic
│   ├── summarizer.py        # Claude AI integration
│   ├── slack_poster.py      # Slack posting with threading
│   └── article_cache.py     # Posted articles cache management
├── .github/
│   └── workflows/
│       └── daily-digest.yml # GitHub Actions workflow
├── feeds.txt                # RSS feed URLs
├── topics.txt               # Topics of interest
├── config.json              # Configuration
├── pyproject.toml           # Project metadata and dependencies
├── uv.lock                  # Locked dependencies (auto-generated)
└── cache/
    ├── embeddings/          # OpenAI embeddings cache (gitignored)
    └── posted_articles.json # Posted articles cache (gitignored)
The bot uses two types of caching to optimize performance and avoid reposting:
Automatically caches OpenAI embeddings to minimize API costs:
- Local Development: Cache stored in cache/embeddings/ (gitignored)
- GitHub Actions: Cache persists across workflow runs (expires after 7 days of inactivity)
- Cache Key: Embeddings are keyed by model:text_hash to handle model upgrades
- Benefits: Topics are typically cached permanently; recurring articles skip re-embedding
View cache stats in the bot output:
Topic embeddings: 6 cached, 0 new
Article embeddings: 42 cached, 8 new
Total API calls saved: 48
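As an illustration of the model:text_hash keying and the cached/new counts shown above, the sketch below uses SHA-256 for the text hash and a plain dict as the cache. Both are assumptions; the real cache is stored on disk, and `embed_fn` stands in for the actual OpenAI call.

```python
import hashlib


def embedding_cache_key(model, text):
    """Build a `model:text_hash` key so a model change never reuses old vectors."""
    text_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"{model}:{text_hash}"


def embed_with_cache(texts, model, cache, embed_fn):
    """Return vectors for `texts`, calling `embed_fn` only on cache misses."""
    stats = {"cached": 0, "new": 0}
    vectors = []
    for text in texts:
        key = embedding_cache_key(model, text)
        if key in cache:
            stats["cached"] += 1
        else:
            cache[key] = embed_fn(text)  # hypothetical stand-in for the API call
            stats["new"] += 1
        vectors.append(cache[key])
    return vectors, stats
```

Because the model name is part of the key, switching from one embedding model to another produces cache misses instead of mixing vectors from different models.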
Tracks previously posted articles to avoid reposting (enabled by default):
- Local Development: Cache stored in cache/posted_articles.json (gitignored)
- GitHub Actions: Cache persists across all workflow runs
- Cache Key: Article URLs are stored in a JSON file
- Benefits: Prevents duplicate posts even across multiple runs
- Configuration: Can be disabled by setting cache_posted_articles: false in config.json
When enabled, the bot logs:
📚 Loaded article cache with 47 previously posted articles
🔍 Filtering out previously posted articles...
15 unposted items (3 already posted)
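A posted-articles cache of this kind could look roughly like the sketch below. It assumes the JSON file holds a flat list of URLs and that items are dicts with a `url` key; the actual layout used by src/article_cache.py is not shown here and may differ.

```python
import json
from pathlib import Path


def load_posted_urls(cache_file):
    """Return the set of previously posted URLs, or an empty set on first run."""
    path = Path(cache_file)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_posted_urls(cache_file, urls):
    """Write the URL set as a sorted JSON list, creating the cache dir if needed."""
    path = Path(cache_file)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(sorted(urls), indent=2), encoding="utf-8")


def split_unposted(items, posted_urls):
    """Return (unposted items, number of items skipped as already posted)."""
    unposted = [item for item in items if item["url"] not in posted_urls]
    return unposted, len(items) - len(unposted)
```

Saving only after a successful Slack post would keep a failed run from marking articles as posted.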
Edit config.json to change Claude model:
{
  "llm_models": {
    "summarization": "claude-3-5-haiku-20241022"
  }
}
Available models:
- claude-sonnet-4-5-20250929: Best quality (default)
- claude-3-5-haiku-20241022: Fast and cheap
Or use an environment variable:
- SUMMARIZATION_MODEL: Override the summarization model
Edit config.json or use environment variables:
- max_items_to_post: Maximum number of articles to post (use MAX_ITEMS_TO_POST env var)
- timeframe_hours: Adjust lookback period (use TIMEFRAME_HOURS env var)
- min_relevance_score: Minimum relevance threshold (use MIN_RELEVANCE_SCORE env var)
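One plausible way to layer these settings is defaults first, then config.json, then environment variables. The precedence order, integer parsing, and the `load_settings` helper are assumptions made for illustration, not a description of src/main.py.

```python
import json
import os

DEFAULTS = {"timeframe_hours": 24, "max_items_to_post": 20, "min_relevance_score": 60}

# Config key -> environment variable that overrides it.
ENV_OVERRIDES = {
    "timeframe_hours": "TIMEFRAME_HOURS",
    "max_items_to_post": "MAX_ITEMS_TO_POST",
    "min_relevance_score": "MIN_RELEVANCE_SCORE",
}


def load_settings(config_text, env=None):
    """Merge defaults, config.json contents, and env overrides (env wins)."""
    env = os.environ if env is None else env
    settings = {**DEFAULTS, **json.loads(config_text)}
    for key, var in ENV_OVERRIDES.items():
        raw = env.get(var)
        if raw:  # unset or empty variables leave the config value in place
            settings[key] = int(raw)
    return settings
```

For example, `load_settings('{"max_items_to_post": 5}', env={"TIMEFRAME_HOURS": "48"})` yields a 48-hour timeframe, 5 items, and the default relevance threshold of 60.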
To change the embedding model, edit src/curator.py:
def __init__(self, openai_api_key: str):
    self.embedding_model = "text-embedding-3-small"  # or "text-embedding-3-large"
Edit src/summarizer.py to modify the Claude prompt:
async def summarize_paper(self, item, topics):
    prompt = """Your custom prompt here..."""
    # ...
Edit src/slack_poster.py to change message formatting:
async def post_paper_with_summary(self, channel, thread_ts, paper, index, total):
    # Customize Slack message blocks here
    pass
- Check that feeds in feeds.txt are valid RSS/Atom feeds
- Verify the timeframe isn't too restrictive
- Ensure topics in topics.txt match content in feeds
- Verify bot token has correct permissions
- Check that channel ID is correct
- Ensure bot is added to the channel
- Adjust the max_concurrent parameter in src/summarizer.py
- Add delays between API calls if needed
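For reference, a concurrency cap like max_concurrent is commonly implemented with an asyncio semaphore. The sketch below shows that general pattern with an optional per-call delay; `summarize_all`, `summarize_one`, and `delay_seconds` are hypothetical names, and the real src/summarizer.py may be structured differently.

```python
import asyncio


async def summarize_all(items, summarize_one, max_concurrent=5, delay_seconds=0.0):
    """Run `summarize_one` over `items` with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(item):
        async with semaphore:
            result = await summarize_one(item)
            if delay_seconds:
                # Hold the slot briefly to space out API calls.
                await asyncio.sleep(delay_seconds)
            return result

    # gather() returns results in the same order as `items`.
    return await asyncio.gather(*(worker(item) for item in items))
```

Lowering max_concurrent reduces the number of simultaneous API requests, and a small delay_seconds spreads them out further if rate-limit errors persist.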
MIT