Summary
Add an abstractive summarization step so each article gets a short “TL;DR.”
Motivation
- Helps end-users absorb long articles quickly.
- Demonstrates multi-document summarization (e.g. daily digest).
Scope
None
Acceptance Criteria
Additional Context
Details
- Category: nlp
- Priority: P1
- Estimate: 2d
- Dependencies:
  - Database connection module (nlp/db.py) in place
  - Articles already normalized and persisted
Tasks
- Add dependencies
  - Add transformers and torch to /nlp/requirements.txt.
- Core function signature
  - Define in /nlp/core.py:
    def summarize(text: str) -> str
- Celery task hook
  - In /nlp/tasks.py, register:
    @app.task
    def summarize_task(article_id: str) -> str
- CLI entrypoint
  - In /nlp/cli.py, expose:
    python -m nlp.cli summarize --article-id=<id>
- Tests & documentation
  - Unit test that summarize() returns a non-empty string under 200 chars.
  - Test that summarize_task() updates the DB with a "summary" field.
  - Update /nlp/README.md with:
    - Installation steps
    - How to run the Celery task
    - How to invoke the CLI command
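A rough sketch of how the core function and the task body from the list above could fit together. Nothing here is decided yet: the optional summarizer argument, the summarize_article helper, the "text" field name, and the dict standing in for nlp/db.py are all hypothetical, and the model call assumes a transformers release that still ships the "summarization" pipeline with its default checkpoint.

```python
from functools import lru_cache
from typing import Callable, MutableMapping, Optional

MAX_SUMMARY_CHARS = 200  # the ticket asks for summaries under 200 characters


@lru_cache(maxsize=1)
def _default_summarizer() -> Callable[[str], str]:
    """Build the model-backed summarizer once, on first use."""
    # Lazy import: keeps module import cheap and lets tests run with a stub.
    from transformers import pipeline

    # Assumption: the library's default summarization checkpoint is acceptable.
    pipe = pipeline("summarization")

    def run(text: str) -> str:
        result = pipe(text, max_length=60, min_length=10, truncation=True)
        return result[0]["summary_text"]

    return run


def summarize(text: str, summarizer: Optional[Callable[[str], str]] = None) -> str:
    """Return a one-line summary shorter than MAX_SUMMARY_CHARS."""
    if not text.strip():
        raise ValueError("text must not be empty")
    run = summarizer or _default_summarizer()
    summary = " ".join(run(text).split())  # collapse newlines and repeated spaces
    if len(summary) >= MAX_SUMMARY_CHARS:
        # Safety net: cut to 199 chars, then back off to the last word boundary.
        cut = summary[: MAX_SUMMARY_CHARS - 1]
        summary = cut.rsplit(" ", 1)[0] if " " in cut else cut
    return summary


def summarize_article(
    article_id: str,
    articles: MutableMapping[str, dict],
    summarizer: Optional[Callable[[str], str]] = None,
) -> str:
    """Task body: load the article, summarize it, persist the "summary" field."""
    record = articles[article_id]  # hypothetical stand-in for an nlp/db.py lookup
    summary = summarize(record["text"], summarizer)
    record["summary"] = summary  # hypothetical stand-in for a DB update
    return summary
```

In /nlp/tasks.py, the @app.task-decorated summarize_task(article_id) would then delegate to a helper like summarize_article, with the real DB module in place of the dict. Injecting the summarizer keeps the unit tests fast, since they can pass a stub instead of loading a model.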
Acceptance Criteria
- summarize(text) produces a concise summary (<200 chars)
- summarize_task(article_id) saves "summary" to the article record
- summarize command runs without errors and prints confirmation
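For the CLI criterion, one possible shape for /nlp/cli.py using argparse from the standard library. This is only a sketch: the run_summary parameter and the confirmation wording are hypothetical, and the fallback import assumes summarize_task from /nlp/tasks.py can be called synchronously like a plain function.

```python
import argparse
from typing import Callable, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Parser for: python -m nlp.cli summarize --article-id=<id>"""
    parser = argparse.ArgumentParser(prog="python -m nlp.cli")
    sub = parser.add_subparsers(dest="command", required=True)
    cmd = sub.add_parser("summarize", help="Summarize one stored article")
    cmd.add_argument("--article-id", required=True, help="ID of the article to summarize")
    return parser


def main(
    argv: Optional[Sequence[str]] = None,
    run_summary: Optional[Callable[[str], str]] = None,
) -> int:
    """Run the requested command and return a process exit code."""
    args = build_parser().parse_args(argv)
    if run_summary is None:
        # Assumption: the task defined in /nlp/tasks.py is importable here
        # and runs in-process when called directly.
        from nlp.tasks import summarize_task as run_summary
    summary = run_summary(args.article_id)
    # Confirmation line required by the acceptance criteria (wording is a placeholder).
    print(f"Saved summary for article {args.article_id}: {summary}")
    return 0
```

The module would end with the usual __main__ guard calling sys.exit(main()). Passing run_summary explicitly lets a test exercise argument parsing and the confirmation output without Celery or a database.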