WellRead 📰🤖

An intelligent RSS feed monitoring Slackbot that curates and summarizes content based on your interests.

Features

📡 RSS Feed Monitoring: Monitors multiple RSS feeds from a configurable text file
🎯 Semantic Curation: Uses OpenAI embeddings to find content semantically similar to your topics of interest
✍️ AI Summaries: Uses Claude to generate concise, insightful summaries for each item
💬 Clean Slack Layout: Each paper is posted as a top-level message with its AI summary
🤖 GitHub Actions: Runs automatically on schedule via GitHub Actions

How It Works

Fetch: Retrieves items from all RSS feeds in feeds.txt
Deduplicate: Removes duplicate articles within the current run (based on URL)
Filter by Time: Keeps only items from the configured timeframe
Filter Previously Posted: Removes articles that have been posted before (if caching enabled)
Curate: Uses OpenAI embeddings to calculate semantic similarity between article titles and topics in topics.txt, then ranks by relevance and limits to top N items
Summarize: Uses Claude to generate summaries for curated items
Post: Posts a header, then each paper as a top-level Slack message with its summary
Cache: Saves posted article URLs to prevent reposting
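
As a rough illustration of the Deduplicate and Filter by Time steps, the sketch below works on plain dictionaries. The item shape (a "url" string and a timezone-aware "published" datetime) and the function names are assumptions made for this example, not the structures used in src/rss_parser.py.

```python
from datetime import datetime, timedelta, timezone


def dedupe_by_url(items):
    """Keep the first item seen for each URL, preserving feed order."""
    seen = set()
    unique = []
    for item in items:
        if item["url"] not in seen:
            seen.add(item["url"])
            unique.append(item)
    return unique


def filter_by_time(items, timeframe_hours, now=None):
    """Keep items published within the last `timeframe_hours` hours."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=timeframe_hours)
    return [item for item in items if item["published"] >= cutoff]


# Hypothetical sample data: one duplicate URL and one item older than 24 hours.
now = datetime(2025, 1, 2, 9, 0, tzinfo=timezone.utc)
items = [
    {"url": "https://example.com/a", "published": now - timedelta(hours=1)},
    {"url": "https://example.com/a", "published": now - timedelta(hours=1)},
    {"url": "https://example.com/b", "published": now - timedelta(hours=30)},
]
recent = filter_by_time(dedupe_by_url(items), timeframe_hours=24, now=now)
print([item["url"] for item in recent])
```

Passing `now` explicitly keeps the time filter deterministic, which is convenient when testing against fixed sample data.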

Getting Started

Fork This Repository

Important: You should fork this repository to your own GitHub account rather than cloning it directly. This allows:

GitHub Actions to run automatically on your schedule
Persistent embedding cache across workflow runs
Easy customization and updates specific to your needs

To fork:

Click the "Fork" button at the top right of this repository
Select your GitHub account as the destination
GitHub will create a copy of this repository under your account

Once forked, you can clone your fork and configure it following the setup instructions below.

Setup

1. Clone Your Fork and Install

First, install uv if you haven't already:

# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

Then clone your forked repository and install dependencies:

git clone https://github.com/YOUR-USERNAME/wellread.git
cd wellread
uv sync

Replace YOUR-USERNAME with your GitHub username.

2. Install Git Hooks (Optional but Recommended)

Install the pre-commit hook to prevent committing debug code:

./install-hooks.sh

This hook prevents committing any lines containing NOCOMMIT, which is useful for temporary debugging changes.

3. Configure RSS Feeds

Edit feeds.txt and add your RSS feed URLs (one per line):

https://arxiv.org/rss/cs.AI
https://arxiv.org/rss/cs.LG
https://blog.example.com/feed

4. Configure Topics

Edit topics.txt and add your topics of interest (one per line):

machine learning
large language models
neural networks
artificial intelligence

5. Configure Settings (Optional)

Edit config.json to adjust settings:

{
  "timeframe_hours": 24,
  "max_items_to_post": 20,
  "min_relevance_score": 60,
  "cache_posted_articles": true,
  "posted_articles_cache_file": "cache/posted_articles.json",
  "embedding_cache_dir": "cache/embeddings",
  "llm_models": {
    "summarization": "claude-sonnet-4-5-20250929"
  }
}

timeframe_hours: How far back to look for new posts (default: 24)
max_items_to_post: Maximum number of articles to post (default: 20)
min_relevance_score: Minimum semantic similarity score (0-100) for articles to be included (default: 60)
cache_posted_articles: Whether to cache posted articles to avoid reposting (default: true)
posted_articles_cache_file: File path for the posted articles cache (default: cache/posted_articles.json)
embedding_cache_dir: Directory for caching OpenAI embeddings (default: cache/embeddings)
llm_models.summarization: Claude model for article summaries (default: claude-sonnet-4-5-20250929)
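
To make the interaction between min_relevance_score and max_items_to_post concrete, here is a minimal sketch of threshold-and-rank curation over precomputed embedding vectors. It assumes the 0-100 score is cosine similarity multiplied by 100 and that an article is scored against its best-matching topic; both are assumptions for illustration, and the actual scoring in src/curator.py may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def curate(article_vectors, topic_vectors, min_relevance_score, max_items_to_post):
    """Score each article against its best topic, drop low scores, keep the top N."""
    scored = []
    for title, vector in article_vectors.items():
        best = max(cosine_similarity(vector, topic) for topic in topic_vectors)
        score = round(max(0.0, best) * 100, 1)  # assumed mapping to the 0-100 scale
        if score >= min_relevance_score:
            scored.append((title, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:max_items_to_post]


# Toy 2-dimensional vectors standing in for real embeddings.
topics = [[1.0, 0.0]]
articles = {"A": [1.0, 0.0], "B": [1.0, 1.0], "C": [0.0, 1.0]}
print(curate(articles, topics, min_relevance_score=60, max_items_to_post=20))
```

With these toy vectors, article C is orthogonal to the only topic and falls below the threshold, while A and B are kept in descending score order.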

6. Set Up Slack

Create a Slack App

Go to https://api.slack.com/apps
Click "Create New App" → "From scratch"
Name it "WellRead Bot" and select your workspace

Configure Bot Token Scopes

Under "OAuth & Permissions", add these scopes:

chat:write
chat:write.public
channels:read
groups:read

Install to Workspace

Click "Install to Workspace"
Copy the "Bot User OAuth Token" (starts with xoxb-)

Get Channel ID

In Slack, right-click your target channel → "View channel details" → Copy the Channel ID at the bottom

7. Get OpenAI API Key

Go to https://platform.openai.com/api-keys
Create a new API key
Copy the key (starts with sk-)

Note: The bot uses text-embedding-3-small for semantic similarity. Cost is ~$0.02 per 1M tokens. Embeddings are cached locally and in GitHub Actions to minimize API calls.

8. Get Anthropic API Key

Go to https://console.anthropic.com/
Create an API key
Copy the key (starts with sk-ant-)

9. Configure GitHub Secrets

In your GitHub repository, go to Settings → Secrets and variables → Actions, and add:

OPENAI_API_KEY: Your OpenAI API key (required for semantic curation)
ANTHROPIC_API_KEY: Your Anthropic API key (required for summaries)
SLACK_BOT_TOKEN: Your Slack bot token (xoxb-...)
SLACK_CHANNEL: Your Slack channel ID (e.g., C01234ABCD)
SLACK_WEBHOOK: (Optional) Slack webhook URL
TIMEFRAME_HOURS: (Optional) Override default timeframe (e.g., 48)
MAX_ITEMS_TO_POST: (Optional) Override maximum items to post (e.g., 10)

10. Configure Schedule

Edit .github/workflows/daily-digest.yml to change the schedule:

on:
  schedule:
    - cron: '0 9 * * *'  # 9 AM UTC daily

Manual Testing

Run locally with environment variables:

export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export SLACK_BOT_TOKEN="your-token"
export SLACK_CHANNEL="your-channel-id"
uv run python src/main.py

Or trigger manually in GitHub Actions:

Go to Actions tab
Select "Daily RSS Digest" workflow
Click "Run workflow"

Project Structure

wellread/
├── src/
│   ├── main.py            # Main entry point
│   ├── rss_parser.py      # RSS feed fetching and parsing
│   ├── curator.py         # Content curation logic
│   ├── summarizer.py      # Claude AI integration
│   ├── slack_poster.py    # Slack posting with threading
│   └── article_cache.py   # Posted articles cache management
├── .github/
│   └── workflows/
│       └── daily-digest.yml  # GitHub Actions workflow
├── feeds.txt              # RSS feed URLs
├── topics.txt             # Topics of interest
├── config.json            # Configuration
├── pyproject.toml         # Project metadata and dependencies
├── uv.lock                # Locked dependencies (auto-generated)
└── cache/
    ├── embeddings/        # OpenAI embeddings cache (gitignored)
    └── posted_articles.json  # Posted articles cache (gitignored)

Caching

The bot uses two types of caching to optimize performance and avoid reposting:

OpenAI Embeddings Cache

Automatically caches OpenAI embeddings to minimize API costs:

Local Development: Cache stored in cache/embeddings/ (gitignored)
GitHub Actions: Cache persists across workflow runs (expires after 7 days of inactivity)
Cache Key: Embeddings are keyed by model:text_hash to handle model upgrades
Benefits: Topics are typically cached permanently; recurring articles skip re-embedding

View cache stats in the bot output:

Topic embeddings: 6 cached, 0 new
Article embeddings: 42 cached, 8 new
Total API calls saved: 48
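
The model:text_hash keying can be sketched as follows. This is only an illustration: the choice of SHA-256, the one-JSON-file-per-entry layout, and the EmbeddingCache class are assumptions, and the embedding call is replaced by a stand-in function so that no API request is made.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_key(model, text):
    """Build a key that changes when either the model or the text changes."""
    text_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"{model}:{text_hash}"


class EmbeddingCache:
    """Hypothetical file-backed cache: one JSON file per (model, text) pair."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.hits = 0
        self.misses = 0

    def _path(self, model, text):
        # ':' is not allowed in Windows filenames, so swap it for '_' on disk.
        return self.cache_dir / (cache_key(model, text).replace(":", "_") + ".json")

    def get_or_compute(self, model, text, embed_fn):
        path = self._path(model, text)
        if path.exists():
            self.hits += 1
            return json.loads(path.read_text())
        self.misses += 1
        vector = embed_fn(text)
        path.write_text(json.dumps(vector))
        return vector


def fake_embed(text):
    """Stand-in for a real embedding API call."""
    return [float(len(text)), 1.0]


with tempfile.TemporaryDirectory() as tmp:
    cache = EmbeddingCache(tmp)
    first = cache.get_or_compute("text-embedding-3-small", "machine learning", fake_embed)
    second = cache.get_or_compute("text-embedding-3-small", "machine learning", fake_embed)
    print(f"{cache.hits} cached, {cache.misses} new")
```

Including the model name in the key means that switching to a different embedding model produces cache misses instead of silently reusing vectors from the old model.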

Posted Articles Cache

Tracks previously posted articles to avoid reposting (enabled by default):

Local Development: Cache stored in cache/posted_articles.json (gitignored)
GitHub Actions: Cache persists across all workflow runs
Cache Key: Article URLs are stored in a JSON file
Benefits: Prevents duplicate posts even across multiple runs
Configuration: Can be disabled by setting cache_posted_articles: false in config.json

When enabled, the bot logs:

📚 Loaded article cache with 47 previously posted articles
🔍 Filtering out previously posted articles...
15 unposted items (3 already posted)
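
A URL-based posted-articles cache could look roughly like the sketch below. The on-disk format (a flat JSON list of URLs) and the helper names are assumptions for illustration; src/article_cache.py may store additional fields.

```python
import json
import tempfile
from pathlib import Path


def load_posted(cache_file):
    """Return the set of previously posted URLs, or an empty set on first run."""
    path = Path(cache_file)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def save_posted(cache_file, urls):
    """Write the URL set as a sorted JSON list, creating the directory if needed."""
    path = Path(cache_file)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(sorted(urls), indent=2))


def filter_unposted(items, posted):
    """Drop items whose URL is already in the posted set."""
    return [item for item in items if item["url"] not in posted]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "cache" / "posted_articles.json"
    save_posted(cache_file, {"https://example.com/a"})
    posted = load_posted(cache_file)

items = [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}]
unposted = filter_unposted(items, posted)
print(f"{len(unposted)} unposted items ({len(items) - len(unposted)} already posted)")
```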

Customization

Adjust Model Selection

Edit config.json to change the Claude model:

{
  "llm_models": {
    "summarization": "claude-3-5-haiku-20241022"
  }
}

Available models:

claude-sonnet-4-5-20250929 - Best quality (default)
claude-3-5-haiku-20241022 - Fast and cheap

Or use an environment variable:

SUMMARIZATION_MODEL - Override summarization model

Adjust Curation Parameters

Edit config.json or use environment variables:

max_items_to_post: Maximum number of articles to post (use MAX_ITEMS_TO_POST env var)
timeframe_hours: Adjust lookback period (use TIMEFRAME_HOURS env var)
min_relevance_score: Minimum relevance threshold (use MIN_RELEVANCE_SCORE env var)
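
One way to picture how these settings combine is a small loader in which environment variables take precedence over config.json, which in turn takes precedence over the built-in defaults. The precedence order, the integer parsing, and the load_settings helper are assumptions for this sketch, not a description of src/main.py.

```python
import json
import os

# Defaults taken from the settings documented above.
DEFAULTS = {"timeframe_hours": 24, "max_items_to_post": 20, "min_relevance_score": 60}

# Environment variable name -> config.json key.
ENV_OVERRIDES = {
    "TIMEFRAME_HOURS": "timeframe_hours",
    "MAX_ITEMS_TO_POST": "max_items_to_post",
    "MIN_RELEVANCE_SCORE": "min_relevance_score",
}


def load_settings(config_text, env=None):
    """Merge defaults, config.json contents, and environment overrides."""
    env = os.environ if env is None else env
    settings = {**DEFAULTS, **json.loads(config_text)}
    for env_name, key in ENV_OVERRIDES.items():
        value = env.get(env_name)
        if value:  # ignore unset or empty variables
            settings[key] = int(value)
    return settings


config_text = '{"max_items_to_post": 10}'
settings = load_settings(config_text, env={"TIMEFRAME_HOURS": "48"})
print(settings)
```

Accepting `env` as a parameter lets the merge be exercised without touching the real process environment.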

To change the embedding model, edit src/curator.py:

def __init__(self, openai_api_key: str):
    self.embedding_model = "text-embedding-3-small"  # or "text-embedding-3-large"

Change Summary Style

Edit src/summarizer.py to modify the Claude prompt:

async def summarize_paper(self, item, topics):
    prompt = """Your custom prompt here..."""
    # ...

Modify Slack Format

Edit src/slack_poster.py to change message formatting:

async def post_paper_with_summary(self, channel, thread_ts, paper, index, total):
    # Customize Slack message blocks here
    pass
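
As a starting point for customization, the sketch below builds Slack Block Kit section blocks for a single paper. The build_paper_blocks helper and the paper keys ("title", "url", "summary") are hypothetical names for this example; only the block structure and the mrkdwn link syntax follow Slack's documented format.

```python
def escape_mrkdwn(text):
    """Escape the three characters Slack treats as control characters."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def build_paper_blocks(paper, index, total):
    """Return Block Kit blocks: a linked title line followed by the summary."""
    title = escape_mrkdwn(paper["title"])
    header = f"*{index}/{total}* <{paper['url']}|{title}>"
    return [
        {"type": "section", "text": {"type": "mrkdwn", "text": header}},
        {"type": "section", "text": {"type": "mrkdwn", "text": escape_mrkdwn(paper["summary"])}},
    ]


paper = {
    "title": "A < B",
    "url": "https://example.com/p",
    "summary": "Short summary.",
}
blocks = build_paper_blocks(paper, index=1, total=3)
print(blocks[0]["text"]["text"])
```

The resulting list can be passed as the blocks argument when posting a message; escaping the title keeps a stray "<" or ">" from breaking the link markup.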

Troubleshooting

No items found

Check that feeds in feeds.txt are valid RSS/Atom feeds
Verify the timeframe isn't too restrictive
Ensure topics in topics.txt match content in feeds

Slack posting fails

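Verify the bot token has the required scopes (chat:write, chat:write.public, channels:read, groups:read)
Check that the channel ID is correct
Ensure the bot has been added to the channel (required for private channels)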

Rate limiting

Adjust max_concurrent parameter in src/summarizer.py
Add delays between API calls if needed
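
Limiting concurrency and spacing out calls can be sketched with asyncio.Semaphore. The summarize_all wrapper and its delay_seconds parameter are assumptions for illustration, and the summarizer is replaced by a stand-in coroutine; how max_concurrent is actually applied in src/summarizer.py may differ.

```python
import asyncio


async def summarize_all(items, summarize_one, max_concurrent=3, delay_seconds=0.0):
    """Run `summarize_one` over items with at most `max_concurrent` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item):
        async with semaphore:
            result = await summarize_one(item)
            if delay_seconds:
                # Hold the slot briefly to space out consecutive API calls.
                await asyncio.sleep(delay_seconds)
            return result

    # gather preserves the input order of results.
    return await asyncio.gather(*(guarded(item) for item in items))


# Stand-in summarizer that records the peak number of concurrent calls.
state = {"active": 0, "peak": 0}


async def fake_summarize(item):
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.01)
    state["active"] -= 1
    return f"summary of {item}"


results = asyncio.run(summarize_all(["a", "b", "c", "d"], fake_summarize, max_concurrent=2))
print(results, "peak concurrency:", state["peak"])
```

Lowering max_concurrent or raising delay_seconds trades throughput for fewer simultaneous requests, which is usually the first thing to try when an API returns rate-limit errors.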

License

MIT
