[recipes] Slack message deduplication pattern for thought ingestion#89
claydunker-yalc wants to merge 3 commits into NateBJones-Projects:main
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
justfinethanku left a comment
Code Review - PR #89: Slack Message Deduplication Pattern
What's Good
Strong contribution. This extracts a proven pattern from production code, documents it clearly, and solves a real problem that many users will encounter. The code is clean, well-commented, and includes important design decisions (fail-open behavior).
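For readers unfamiliar with the term, here is a minimal sketch of what a fail-open dedup check can look like. It assumes "fail-open" means that a failed lookup is treated as "not yet processed"; `TsLookup` and `alreadyProcessedFailOpen` are hypothetical names for illustration, not the code in this PR.

```typescript
// Hypothetical lookup signature: resolves to true when a row with this
// slack_ts already exists. In the recipe this would wrap a database query.
type TsLookup = (slackTs: string) => Promise<boolean>;

// Fail-open dedup check: if the lookup itself throws, report "not processed"
// so the message is still ingested rather than silently dropped.
async function alreadyProcessedFailOpen(
  lookup: TsLookup,
  slackTs: string,
): Promise<boolean> {
  try {
    return await lookup(slackTs);
  } catch (err) {
    console.error("Dedup check failed, continuing with ingestion:", err);
    return false;
  }
}
```

The trade-off under this assumption is an occasional duplicate row during a database outage in exchange for never losing a message.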
README quality is excellent:
- Clear "What It Does" and "Why This Matters" sections
- Step-by-step flow with visual badges
- Good use of callouts (IMPORTANT, TIP)
- Troubleshooting covers the key failure modes
- Code examples are realistic and copy-paste ready
metadata.json is complete and valid:
- All required fields present
- Correct category, difficulty, and time estimate
- Proper version format and tags
Required Changes
1. Missing Prerequisites Section Header
Your README includes prerequisites in the "How It Works" section, but CONTRIBUTING.md requires a dedicated Prerequisites section as one of the 5 required README sections:
Your contribution's README must include these sections:
- What it does
- Prerequisites
- Step-by-step instructions
...
Fix: Add a top-level ## Prerequisites section before "How It Works" that clearly lists:
- Working Open Brain setup (core `thoughts` table with a jsonb `metadata` column)
- A Supabase Edge Function that ingests thoughts from Slack (like `ingest-thought`)
- Slack Events API webhook delivering messages to your edge function
You can remove the duplicate text from the current location under "How It Works."
2. Unclear index.ts Description in README "Files" Table
Your README's "Files" table lists:
- `index.ts`
- `README.md`
- `metadata.json`
But the table describes `index.ts` as a "Standalone example showing the dedup pattern." This is confusing because:
- The code IS in the PR (good)
- But the description doesn't make clear whether this is reference code or production-ready code
Fix: Update the "Files" table description for index.ts to be more accurate:
| File | Purpose |
|------|---------|
| `index.ts` | Reference implementation showing the dedup helper function and handler placement pattern |
| `README.md` | This guide |
| `metadata.json` | Contribution metadata for the OB1 repo |
Nice-to-Haves (Not Blocking)
3. Add a "When to Use This" Section
This pattern is specifically for Slack webhook dedup. Consider adding a brief section after "What It Does" explaining when this pattern applies vs. when it doesn't:
## When to Use This
Use this pattern if:
- You're ingesting thoughts from Slack via webhooks
- You're experiencing duplicate rows in your `thoughts` table
- You want to avoid burning API credits on retry events
This pattern is NOT needed if:
- You're using Slack's Socket Mode (it has built-in dedup)
- Your ingestion source already provides idempotency (like email Message-IDs)
This would help users quickly determine if they need this recipe.
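One way to make the applicability check concrete: the recipe keys on the Slack message timestamp, so it only applies to payloads that carry one. Below is a minimal sketch, assuming the standard Events API `event_callback` envelope with a `message` event. `SlackEnvelope` and `extractSlackTs` are hypothetical names, not identifiers from this PR.

```typescript
// Minimal shape of a Slack Events API envelope; only the fields used here.
interface SlackEnvelope {
  type?: string;
  event?: { type?: string; ts?: string };
}

// Returns the message timestamp to dedup on, or null when the payload is
// not a message event and the pattern does not apply.
function extractSlackTs(payload: SlackEnvelope): string | null {
  if (payload.type !== "event_callback") return null;
  const event = payload.event;
  if (!event || event.type !== "message" || !event.ts) return null;
  return event.ts;
}
```

A `null` result means there is nothing to dedup on, for example for Slack's `url_verification` handshake.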
4. Consider a Simple Test Command
Your "Expected Outcome" section is clear, but adding a concrete SQL query users can run to verify dedup is working would be helpful:
## Expected Outcome
When a duplicate Slack event arrives, you should see `Skipping duplicate message: <timestamp>` in your edge function logs. The function returns `200` immediately without generating embeddings, calling the LLM, or writing any database rows.
**Verify it's working:** Run this query to confirm you have exactly one row per unique `slack_ts`:
\```sql
select metadata->>'slack_ts' as slack_ts, count(*)
from thoughts
where source = 'slack'
group by metadata->>'slack_ts'
having count(*) > 1;
\```
If the query returns no rows, dedup is working correctly. If it returns rows, you have duplicates.
Automated Check Compliance
| Rule | Status | Notes |
|---|---|---|
| 1. Folder structure | ✅ Pass | Correctly in recipes/ |
| 2. Required files | ✅ Pass | README.md, metadata.json, index.ts all present |
| 3. Metadata valid | ✅ Pass | Valid JSON, all required fields |
| 4. No credentials | ✅ Pass | Uses env vars correctly |
| 5. SQL safety | ✅ Pass | No destructive SQL, uses jsonb query pattern correctly |
| 6. Category artifacts | ✅ Pass | Has code file (index.ts) with detailed instructions |
| 7. PR format | ✅ Pass | Title starts with [recipes] |
| 8. No binary blobs | ✅ Pass | Text files only |
| 9. README completeness | Needs fix | Missing dedicated Prerequisites section (see Required Change #1) |
| 10. Primitive deps | N/A | No primitive dependencies declared |
| 11. Scope check | ✅ Pass | All changes in recipes/slack-message-dedup/ |
| 12. Internal links | ✅ Pass | No broken links |
| 13. Remote MCP pattern | N/A | Not an MCP contribution |
Verdict: Minor fixes needed
What needs to happen before merge:
- Add a dedicated `## Prerequisites` section before "How It Works"
- Update the "Files" table description for `index.ts` to clarify it's a reference implementation
Once these are addressed, this is ready to merge. Excellent work on a high-quality, practical contribution.
Update Files table description per PR review feedback.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Address nice-to-have suggestions from PR review:
- Add section clarifying when this pattern applies vs not
- Add SQL query to verify dedup is working correctly
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
- [...] `ingest-thought` as a standalone, reusable recipe
- [...] `slack_ts` stored in the `thoughts` table's jsonb `metadata` column to prevent duplicate processing when Slack delivers the same webhook event multiple times
- [...] `alreadyProcessed()` helper function, handler placement guidance, and a GIN index recommendation for performance
Requirements
- [...] `thoughts` table
Testing
Tested on my own Open Brain instance. Confirmed that duplicate Slack events are skipped with a console log and return 200 without generating embeddings, calling the LLM, or writing duplicate rows.
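The tested behavior can be sketched as follows. This is an illustration under assumptions, not the PR's `index.ts`: `IngestDeps`, `HandlerResult`, and `handleSlackMessage` are hypothetical names, the dependencies are injected so the sketch runs without Supabase or an LLM provider, and a plain result object stands in for the HTTP response an edge function would return.

```typescript
// Hypothetical dependencies, injected so the sketch runs on its own.
interface IngestDeps {
  // Resolves to true when this slack_ts has already been stored.
  alreadyProcessed: (slackTs: string) => Promise<boolean>;
  // Stands in for the expensive path: embeddings, LLM call, database insert.
  ingest: (text: string, slackTs: string) => Promise<void>;
}

// Stand-in for the HTTP response; `ingested` is only here to make the
// outcome observable in this sketch.
interface HandlerResult {
  status: number;
  ingested: boolean;
}

async function handleSlackMessage(
  deps: IngestDeps,
  message: { ts: string; text: string },
): Promise<HandlerResult> {
  // The dedup check runs first, before any paid or slow work.
  if (await deps.alreadyProcessed(message.ts)) {
    console.log(`Skipping duplicate message: ${message.ts}`);
    return { status: 200, ingested: false };
  }
  await deps.ingest(message.text, message.ts);
  return { status: 200, ingested: true };
}
```

Both paths return `200`, so Slack treats a skipped duplicate as delivered and does not retry it again.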
Generated with Claude Code