[FEAT] : Display AI-Generated Document Summary After Ingestion #641

Description

@vivek0369

Is your feature request related to a problem? Please describe.

Currently, uploaded documents are processed, chunked, and embedded successfully, but users have no quick way to understand the document contents before starting a conversation.

While reviewing the codebase, I noticed that:

The Document model already contains a summary field (summary = Column(Text, nullable=True)).

A summarization utility already exists in backend/app/rag/summarizer.py.

However, this functionality is not connected to the document ingestion pipeline and summaries are never surfaced in the frontend UI. As a result, users miss an opportunity to get immediate AI-generated insights about their uploaded documents.

Describe the solution you'd like

Complete the existing summarization workflow by:

Backend
After document ingestion, chunking, and embedding generation are completed, automatically generate a concise summary using the existing summarizer and store it in doc.summary.

Example:

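from app.rag.summarizer import summarize_document

# Run after chunking and embedding have completed successfully.
doc.summary = summarize_document(
    chunks,
    hf_token=user.hf_token,
)

# Persist the summary together with the rest of the ingestion results.
db.commit()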
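
One detail the snippet above leaves open is what happens when summarization fails (for example, a missing token or an upstream API error). Below is a minimal sketch, assuming ingestion should still succeed and simply leave the summary empty in that case. `attach_summary`, `FakeDoc`, and the `summarizer` parameter are hypothetical names used for illustration, and the real `summarize_document` signature should be checked against backend/app/rag/summarizer.py.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

logger = logging.getLogger(__name__)


@dataclass
class FakeDoc:
    """Hypothetical stand-in for the Document model; only the field used here."""
    summary: Optional[str] = None


def attach_summary(
    doc,
    chunks: Sequence[str],
    summarizer: Callable[..., str],
    hf_token: Optional[str] = None,
) -> bool:
    """Set doc.summary from chunks and return True on success.

    Any summarizer error is logged and swallowed so that a failed
    summary does not fail an otherwise successful ingestion.
    """
    if not chunks:
        return False
    try:
        summary = summarizer(chunks, hf_token=hf_token)
    except Exception:
        logger.exception("Summary generation failed; leaving summary empty")
        return False
    if not summary or not summary.strip():
        return False
    doc.summary = summary.strip()
    return True
```

In the pipeline this would be called as `attach_summary(doc, chunks, summarize_document, hf_token=user.hf_token)` just before `db.commit()`. Because the field is nullable, a failed summary leaves the document usable and the frontend simply hides the section.
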
Frontend
Display the generated summary inside the document sidebar/card using a collapsible section.

Example UI:

{doc.summary && (
  <details>
    <summary>AI Summary</summary>
    <p className="mt-1 leading-relaxed">
      {doc.summary}
    </p>
  </details>
)}

This would allow users to quickly understand document contents immediately after upload.

Describe alternatives you've considered

Alternative 1: Generate summaries on demand
Instead of generating summaries during ingestion, a button could trigger summary generation when requested by the user.

Drawback: Adds latency and extra API calls whenever a user wants a summary.

Alternative 2: Generate summaries client-side
The frontend could request summaries separately after upload.

Drawback: Increases complexity and duplicates logic already available in the backend ingestion workflow.

Generating summaries once during ingestion, where the chunks are already available, appears to be the cleanest and most efficient approach.

Additional Context

This feature appears to be partially implemented already:

Document.summary database field exists.

rag/summarizer.py exists.

No database migration should be required.

This makes the feature relatively low-risk and well-scoped while providing a meaningful UX improvement.
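
The "no migration" claim can be checked directly against a database before merging. Below is a minimal sketch using the standard-library `sqlite3` module. The table name `documents` and the use of SQLite are assumptions for illustration (the project may use a different table name or database engine), and `has_column` is a hypothetical helper.

```python
import sqlite3


def has_column(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Return True if `table` has a column named `column` (SQLite only).

    `table` is interpolated into the statement, so pass trusted names only.
    """
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


# Throwaway in-memory schema that mirrors the nullable summary field.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, summary TEXT)")
```

Running `has_column(conn, "documents", "summary")` against the real database would confirm that the column is already present and no schema change is needed.
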

Benefits
Immediate document understanding

Better user onboarding experience

Increased visibility of AI capabilities

Reuses existing infrastructure

Minimal implementation complexity

Suggested Acceptance Criteria

Summary is generated automatically after successful document ingestion.

Summary is saved to Document.summary.

Summary is returned by document APIs.

Summary is displayed in the document sidebar/card when available.

Summary section is collapsible.

No database migration is required.
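
The criterion "Summary is returned by document APIs" could be covered by a small serialization check. Below is a sketch, assuming a plain-dict response. `DocumentRecord`, `document_to_response`, and the `id` and `filename` fields are hypothetical and not taken from the codebase.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class DocumentRecord:
    """Hypothetical stand-in for the Document model."""
    id: int
    filename: str
    summary: Optional[str] = None


def document_to_response(doc: DocumentRecord) -> Dict[str, Any]:
    """Always include the `summary` key; None means no summary yet."""
    return {"id": doc.id, "filename": doc.filename, "summary": doc.summary}
```

Returning the key with a null value, rather than omitting it, lets the frontend guard `doc.summary && (...)` hide the section for documents ingested before this feature existed.
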

GSSoC '26

  • Yes, I am participating in GirlScript Summer of Code and would like to build this.

Metadata

Labels

enhancement: New feature or improvement
gssoc: GirlScript Summer of Code 2026 issue/PR
