Is your feature request related to a problem? Please describe.
Currently, uploaded documents are processed, chunked, and embedded successfully, but users have no quick way to understand the document contents before starting a conversation.
While reviewing the codebase, I noticed that:
The Document model already contains a summary field (summary = Column(Text, nullable=True)).
A summarization utility already exists in backend/app/rag/summarizer.py.
However, the summarizer is never called from the document ingestion pipeline, and summaries are never surfaced in the frontend UI. As a result, users get no immediate AI-generated overview of their uploaded documents.
Describe the solution you'd like
Complete the existing summarization workflow in two parts:
Backend
After document ingestion, chunking, and embedding generation are completed, automatically generate a concise summary using the existing summarizer and store it in doc.summary.
Example:
from app.rag.summarizer import summarize_document

# Run once chunking and embedding have completed successfully
doc.summary = summarize_document(
    chunks,
    hf_token=user.hf_token,
)
db.commit()
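One thing the example above leaves open is what happens when the summarizer fails or returns nothing. A minimal sketch of a guard follows, assuming summarize_document accepts the chunks plus an hf_token keyword as shown above, and that chunks is a sequence of strings. The wrapper name generate_summary_safely is hypothetical and not part of the codebase.

```python
import logging
from typing import Callable, Optional, Sequence

logger = logging.getLogger(__name__)


def generate_summary_safely(
    chunks: Sequence[str],
    summarize: Callable[..., Optional[str]],
    hf_token: Optional[str] = None,
) -> Optional[str]:
    """Return a summary, or None if there is nothing to summarize or the call fails.

    `summarize` stands in for summarize_document; its signature is assumed.
    """
    if not chunks:
        return None
    try:
        summary = summarize(chunks, hf_token=hf_token)
    except Exception:
        # A summarizer failure should not undo an otherwise successful ingestion.
        logger.exception("Summary generation failed; leaving summary empty")
        return None
    if not summary:
        return None
    # Normalize whitespace-only output to None so the UI can test for truthiness.
    return summary.strip() or None
```

With this in place the ingestion step could read doc.summary = generate_summary_safely(chunks, summarize_document, hf_token=user.hf_token), and a failed summary would simply leave the nullable field empty.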
Frontend
Display the generated summary inside the document sidebar/card using a collapsible section.
Example UI:
{doc.summary && (
  <details>
    <summary>AI Summary</summary>
    <p className="mt-1 leading-relaxed">
      {doc.summary}
    </p>
  </details>
)}
This would allow users to quickly understand document contents immediately after upload.
Describe alternatives you've considered
Alternative 1: Generate summaries on demand
Instead of generating summaries during ingestion, a button could trigger summary generation when requested by the user.
Drawback: Adds latency and extra API calls whenever a user wants a summary.
Alternative 2: Generate summaries client-side
The frontend could request summaries separately after upload.
Drawback: Increases complexity and duplicates logic already available in the backend ingestion workflow.
Generating summaries during ingestion appears to be the cleanest approach: the chunks are already available at that point, and the summary is ready by the time the user opens the document.
Additional Context
This feature appears to be partially implemented already:
The Document.summary database field exists.
backend/app/rag/summarizer.py exists.
No database migration should be required.
This makes the feature relatively low-risk and well-scoped while providing a meaningful UX improvement.
Benefits
Immediate document understanding
Better user onboarding experience
Increased visibility of AI capabilities
Reuses existing infrastructure
Minimal implementation complexity
Suggested Acceptance Criteria
Summary is generated automatically after successful document ingestion.
Summary is saved to Document.summary.
Summary is returned by document APIs.
Summary is displayed in the document sidebar/card when available.
Summary section is collapsible.
No database migration is required.
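To illustrate the "returned by document APIs" criterion, here is a rough sketch of a response shape that always carries a summary key, using only the standard library. DocumentOut, serialize_document, and the id and filename fields are assumptions made for illustration; the project's actual schema and serializer may look different.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class DocumentOut:
    """Hypothetical response shape; the real schema may carry more fields."""

    id: int
    filename: str
    summary: Optional[str] = None  # None until a summary has been generated


def serialize_document(doc: Any) -> Dict[str, Any]:
    """Build the API payload, always including the summary key."""
    return asdict(
        DocumentOut(
            id=doc.id,
            filename=doc.filename,
            # Fall back to None for objects loaded without a summary.
            summary=getattr(doc, "summary", None),
        )
    )
```

Returning summary as null rather than omitting the key keeps the frontend check (doc.summary && ...) simple for documents ingested before this change.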
GSSoC '26