Feature Request: Store and filter by article categories/tags from RSS/Atom feeds #10

@weronikakombat

Summary

Currently, blogwatcher does not store category/tag information from RSS/Atom feeds, even though this data is commonly available (e.g., <category> elements in RSS 2.0).

Motivation

Many blogs (like TechCrunch) categorize articles by topic (Security, AI, Startups, etc.). Users would benefit from being able to:

  1. See what categories an article belongs to
  2. Filter articles by category (e.g., blogwatcher articles --category "AI")
  3. Get a better overview of their reading without maintaining a separate blog entry for each category feed

Proposed Solution

Database Schema

Add a categories column to the articles table:

ALTER TABLE articles ADD COLUMN categories TEXT; -- JSON array or comma-separated
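If the JSON-array option is chosen, the round trip to and from the TEXT column could look roughly like the sketch below. The helper names encodeCategories and decodeCategories are hypothetical and not part of blogwatcher; the sketch also assumes that rows created before the migration read back as an empty string (for example via sql.NullString).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeCategories (hypothetical helper) serializes categories as a JSON
// array suitable for a TEXT column. A nil slice is stored as "[]".
func encodeCategories(cats []string) (string, error) {
	if cats == nil {
		cats = []string{}
	}
	b, err := json.Marshal(cats)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeCategories (hypothetical helper) parses the column value.
// An empty string, as assumed for rows that predate the column,
// yields no categories instead of an error.
func decodeCategories(raw string) ([]string, error) {
	if raw == "" {
		return nil, nil
	}
	var cats []string
	if err := json.Unmarshal([]byte(raw), &cats); err != nil {
		return nil, err
	}
	return cats, nil
}

func main() {
	raw, err := encodeCategories([]string{"Security", "cyberattack", "iran"})
	if err != nil {
		panic(err)
	}
	fmt.Println(raw)

	cats, err := decodeCategories(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cats)
}
```

JSON is arguably the safer of the two options, since a category name that itself contains a comma would be split incorrectly by a comma-separated format.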

RSS Parsing

The gofeed library already exposes item.Categories []string. The feed-parsing code under internal/rss can capture this field and store it.

CLI Changes

  • blogwatcher articles --category "AI" - filter by category
  • blogwatcher articles - display categories in output
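A rough sketch of the filtering behind --category, assuming an Article type with a Categories field. The type, the function names, and the choice of case-insensitive matching are all assumptions for illustration, not existing blogwatcher code:

```go
package main

import (
	"fmt"
	"strings"
)

// Article is a hypothetical stand-in for the model type; only the
// fields needed for this sketch are included.
type Article struct {
	Title      string
	Categories []string
}

// hasCategory reports whether the article carries the given category.
// Matching is case-insensitive here, which is an assumption: feeds mix
// "Security" and "iran" style capitalization.
func hasCategory(a Article, want string) bool {
	for _, c := range a.Categories {
		if strings.EqualFold(c, want) {
			return true
		}
	}
	return false
}

// filterByCategory returns the articles tagged with want. An empty
// want means no --category flag was given, so nothing is filtered out.
func filterByCategory(articles []Article, want string) []Article {
	if want == "" {
		return articles
	}
	var out []Article
	for _, a := range articles {
		if hasCategory(a, want) {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	articles := []Article{
		{Title: "Model release notes", Categories: []string{"AI"}},
		{Title: "Outage report", Categories: []string{"Security", "iran"}},
	}
	for _, a := range filterByCategory(articles, "ai") {
		fmt.Printf("%s [%s]\n", a.Title, strings.Join(a.Categories, ", "))
	}
}
```

Filtering in Go after loading rows keeps the SQL unchanged; pushing the filter into the query would depend on how the column is encoded.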

Example RSS Snippet

<item>
  <title>Hackers and internet outages hit Iran amid U.S. air strikes</title>
  <category>Security</category>
  <category>cyberattack</category>
  <category>iran</category>
</item>
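To show what would be captured from the snippet above, here is a minimal sketch that uses the standard library's encoding/xml as a stand-in for gofeed, so it runs without extra dependencies. In blogwatcher itself the values would come from gofeed's item.Categories; the rssItem type and parseItem function are hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// rssItem (hypothetical) maps only the elements used in this sketch.
// Repeated <category> elements are collected into the slice.
type rssItem struct {
	Title      string   `xml:"title"`
	Categories []string `xml:"category"`
}

// parseItem decodes a single <item> element.
func parseItem(data string) (rssItem, error) {
	var it rssItem
	err := xml.Unmarshal([]byte(data), &it)
	return it, err
}

const sample = `<item>
  <title>Hackers and internet outages hit Iran amid U.S. air strikes</title>
  <category>Security</category>
  <category>cyberattack</category>
  <category>iran</category>
</item>`

func main() {
	it, err := parseItem(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(it.Title)
	fmt.Println(it.Categories)
}
```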

Implementation Notes

  • Size: This should be a small change (~50 lines)
  • Backward compatibility: Existing databases without the column should be handled gracefully
  • Performance: Minimal impact - just one additional column
  • The change would need to touch:
    • internal/model/model.go - add Categories field
    • internal/storage/database.go - update schema and queries
    • internal/rss/rss.go - capture categories from gofeed.Item
    • internal/cli/commands.go - add --category flag

Would you accept a PR?

I'm happy to submit a PR implementing this if you're open to it. Let me know if you'd prefer a specific approach!
