An AI-powered research synthesis platform that transforms uploaded documents into comprehensive research papers, presentations, and analytics using advanced language models and agent-based processing.
- Overview
- Features
- Tech Stack
- Architecture
- Prerequisites
- Quick Start
- Environment Configuration
- Project Structure
- AI Agent Pipeline
- Testing
- Usage Guide
- Database Schema
- Deployment
- Troubleshooting
- Contributing
- License
- Support
- Acknowledgments
Synthesis is a comprehensive research synthesis platform designed to streamline the academic research process. It leverages advanced AI capabilities through a multi-agent pipeline to transform raw PDF documents into polished research papers, presentations, and actionable analytics.
The platform uses an agent-based architecture in which specialized AI agents each handle one stage of research synthesis, from document parsing and outline generation to final paper writing and presentation creation.
| Feature | Description |
|---|---|
| Document Upload | Support for PDF files with text extraction using the pdf-parse library |
| AI Agent Pipeline | Automated research paper generation using 12 specialized agents |
| Multi-format Export | Export papers in PDF, DOCX, LaTeX, and Markdown formats |
| Analytics Dashboard | Quality metrics, citation analysis, word count trends, and concept networks |
| Interactive Chat | Ask questions about your research and get AI-powered insights |
| Hypothesis Generation | Automated generation of research hypotheses with scoring |
| Presentation Generation | Create PowerPoint presentations from research findings |
| Real-time Pipeline Tracking | Monitor agent progress and status as the pipeline runs |
- Reader Agent: Parses and extracts content from uploaded PDF documents
- Summarizer Agent: Creates concise summaries of document content
- Outliner Agent: Generates structured outlines for research papers
- Writer Agent: Produces comprehensive research paper content
- Reviewer Agent: Reviews and provides feedback on generated content
- Presenter Agent: Creates PowerPoint presentations from research
- Graph Agent: Generates concept networks and visual analytics
- Hypothesis Agent: Generates and scores research hypotheses
- Statistics Agent: Analyzes statistical patterns in research data
- Experiment Agent: Designs experimental frameworks
- Verifier Agent: Validates accuracy of generated content
- Orchestrator: Coordinates the entire agent pipeline
- Framework: Next.js 16 (App Router)
- UI Library: React 19
- Styling: Tailwind CSS 4, shadcn/ui components
- State Management: React hooks and context
- Charts: Recharts, D3.js-based custom components
- Rich Text Editor: TipTap editor
- Math Rendering: KaTeX for mathematical expressions
- Runtime: Next.js API Routes
- ORM: Prisma 6
- Authentication: Supabase Auth
- AI Integration: Google Gemini 1.5 Flash, LangChain
- Development: SQLite
- Production: PostgreSQL
- PDF Processing: pdf-parse, mammoth
- Document Generation: docx, pptxgenjs, jspdf, html2pdf.js
- Vector Database: ChromaDB
- Error Tracking: Sentry
- Testing: Playwright
- MCP Integration: Model Context Protocol server
┌─────────────────────────────────────────────────────────────┐
│ API Routes (Next.js) │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌───────────┐ │
│ │/api/proj │ │/api/agent│ │/api/upload│ │/api/export│ │
│ │ ects │ │ s │ │ │ │ │ │
│ └──────────┘ └──────────┘ └──────────┘ └───────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌───────────┐ │
│ │/api/chat │ │/api/analy│ │/api/downl│ │/api/memory│ │
│ │ │ │ tics │ │ oad │ │ │ │
│ └──────────┘ └──────────┘ └──────────┘ └───────────┘ │
└──────────────────────────┬──────────────────────────────────┘
│
▼
Before setting up the project, ensure you have the following installed:
| Requirement | Version | Description |
|---|---|---|
| Node.js | >=18.0.0 | JavaScript runtime |
| npm | >=9.0.0 | Package manager |
| Git | >=2.30.0 | Version control |
| Database | SQLite or PostgreSQL | Data storage |
- Docker: For containerized deployment
- PostgreSQL: For production database
- Google Cloud Account: For Gemini API access
git clone https://github.com/your-username/synthesis.git
cd synthesis
npm install
cp .env.example .env.local
Edit .env.local and add your configuration:
# Required: Google Gemini API
GEMINI_API_KEY=your_gemini_api_key_here
# Database (SQLite for development)
DATABASE_URL="file:./dev.db"
# Supabase (Required for authentication)
NEXT_PUBLIC_SUPABASE_URL=your_supabase_url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_anon_key
SUPABASE_SERVICE_ROLE_KEY=your_service_role_key
# Sentry (Optional - for error tracking)
SENTRY_DSN=your_sentry_dsn
npx prisma migrate dev
npx prisma generate
npm run dev
Navigate to http://localhost:3000
| Variable | Description | Required |
|---|---|---|
| GEMINI_API_KEY | Google Gemini API key | Yes |
| DATABASE_URL | Prisma database connection string | Yes |

| Variable | Description | Default |
|---|---|---|
| NEXT_PUBLIC_SUPABASE_URL | Supabase project URL | - |
| NEXT_PUBLIC_SUPABASE_ANON_KEY | Supabase anon key | - |
| SUPABASE_SERVICE_ROLE_KEY | Supabase service role key (server-side) | - |
| SENTRY_DSN | Sentry error tracking DSN | - |
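As a minimal sketch of how the variables above could be checked at startup, the helper below reports which required and optional names are unset. The variable names come from the tables above; `checkEnv` and its return shape are hypothetical and are not taken from the project code.

```typescript
// Variable names are from the tables above; the helper itself is an illustrative assumption.
const REQUIRED_VARS = ["GEMINI_API_KEY", "DATABASE_URL"] as const;
const OPTIONAL_VARS = [
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "SUPABASE_SERVICE_ROLE_KEY",
  "SENTRY_DSN",
] as const;

type EnvSource = Record<string, string | undefined>;

interface EnvCheck {
  missingRequired: string[];
  missingOptional: string[];
}

// Treats undefined, empty, and whitespace-only values as unset.
function checkEnv(env: EnvSource): EnvCheck {
  const isUnset = (name: string): boolean => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  };
  return {
    missingRequired: REQUIRED_VARS.filter(isUnset),
    missingOptional: OPTIONAL_VARS.filter(isUnset),
  };
}
```

In a Node.js process this could be called as `checkEnv(process.env)`, failing fast when `missingRequired` is non-empty and only logging a warning for `missingOptional`.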
- Go to Google AI Studio
- Create a new API key
- Copy the key to your environment variables
synthesis/
├── app/ # Next.js App Router pages
│ ├── auth/ # Authentication pages (signin)
│ ├── api/ # API routes
│ │ ├── agents/ # Agent execution endpoints
│ │ ├── analytics/ # Analytics data endpoints
│ │ ├── chat/ # AI chat endpoints
│ │ ├── download/ # File download endpoints
│ │ ├── export/ # Export endpoints (PDF, DOCX, etc.)
│ │ ├── memory/ # Research memory endpoints
│ │ ├── projects/ # Project CRUD endpoints
│ │ └── upload/ # File upload endpoints
│ ├── privacy/ # Privacy policy page
│ ├── terms/ # Terms of service page
│ ├── layout.tsx # Root layout
│ └── page.tsx # Home page
├── components/ # React components
│ ├── ui/ # Reusable UI components (shadcn/ui)
│ ├── agent-activity-feed.tsx # Real-time agent activity display
│ ├── agent-performance-metrics.tsx # Agent performance charts
│ ├── agent-pipeline.tsx # Pipeline visualization
│ ├── agent-status.tsx # Agent status indicators
│ ├── chat-with-research.tsx # Interactive research chat
│ ├── citation-chart.tsx # Citation analysis charts
│ ├── concept-network.tsx # Concept network visualization
│ ├── file-upload.tsx # Document upload component
│ ├── hypothesis-table.tsx # Hypothesis display table
│ ├── outline-editor.tsx # Research outline editor
│ ├── paper-viewer.tsx # Paper content viewer
│ ├── project-card.tsx # Project card component
│ ├── project-details-view.tsx # Detailed project view
│ ├── rich-text-editor.tsx # TipTap rich text editor
│ ├── sidebar.tsx # Navigation sidebar
│ ├── statistics-viewer.tsx # Statistics display
│ └── word-count-trend.tsx # Word count trend charts
├── lib/ # Core libraries
│ ├── agents/ # AI agent implementations
│ │ ├── orchestrator.ts # Pipeline orchestration
│ │ ├── reader-agent.ts # Document parsing
│ │ ├── writer-agent.ts # Content generation
│ │ ├── reviewer-agent.ts # Content review
│ │ ├── presenter-agent.ts # Presentation creation
│ │ ├── graph-agent.ts # Concept networks
│ │ ├── hypothesis-agent.ts # Hypothesis generation
│ │ ├── outliner-agent.ts # Outline generation
│ │ ├── summarizer-agent.ts # Summarization
│ │ ├── statistics-agent.ts # Statistical analysis
│ │ ├── experiment-agent.ts # Experiment design
│ │ └── verifier-agent.ts # Content verification
│ ├── prisma.ts # Prisma client singleton
│ ├── gemini-client.ts # Google Gemini AI client
│ ├── supabase.ts # Supabase client
│ ├── langchain-config.ts # LangChain configuration
│ ├── vector-store.ts # ChromaDB vector store
│ ├── file-processor.ts # PDF/file processing
│ ├── export-utils.ts # Export utilities
│ ├── ppt-generator.ts # PowerPoint generation
│ ├── analytics-utils.ts # Analytics calculations
│ ├── citation-extractor.ts # Citation extraction
│ ├── citation-verifier.ts # Citation verification
│ ├── prompt-loader.ts # Prompt template loader
│ ├── rate-limit.ts # API rate limiting
│ ├── streaming-response.ts # Streaming response utils
│ ├── validation.ts # Input validation
│ ├── version-control.ts # Version control utils
│ ├── themes.ts # Theme configuration
│ ├── types.ts # TypeScript types
│ └── utils.ts # General utilities
├── hooks/ # Custom React hooks
├── prompts/ # AI agent prompts
│ ├── system-prompts/ # System prompt templates
│ └── agent-prompts/ # Agent-specific prompts
├── prisma/ # Database schema
│ ├── schema.prisma # Database models
│ └── migrations/ # Migration files
├── mcp/ # Model Context Protocol integration
├── tests/ # Test files
├── scripts/ # Utility scripts
├── docs/ # Documentation
├── public/ # Static assets
├── playwright.config.ts # Playwright E2E test config
├── middleware.ts # Next.js middleware
└── sentry.*.config.ts # Sentry error tracking configs
| Directory | Description |
|---|---|
| app/ | Next.js App Router with pages and API routes |
| app/api/ | RESTful API endpoints for all features |
| components/ | React components organized by feature |
| components/ui/ | Reusable shadcn/ui components |
| lib/agents/ | AI agent implementations (12 specialized agents) |
| lib/ | Core utilities, database clients, and AI integrations |
| hooks/ | Custom React hooks for state management |
| prompts/ | Prompt templates for AI agents |
| prisma/ | Database schema and migrations |
| mcp/ | Model Context Protocol server integration |
| tests/ | Playwright E2E and unit tests |
| docs/ | Project documentation |
The platform uses a sophisticated multi-agent pipeline to transform documents into research outputs:
┌─────────┐ ┌────────────┐ ┌─────────┐ ┌─────────┐
│ Upload │───▶│ Reader │───▶│ Outline │───▶│ Writer │
│ PDF │ │ Agent │ │ Agent │ │ Agent │
└─────────┘ └────────────┘ └─────────┘ └─────────┘
│
┌─────────┐ ┌───────────┴────┐
│Presenter│◀───│ Reviewer │
│ Agent │ │ Agent │
└─────────┘ │
│ ┌────────────┐ │
│ │ Graph │◀────┤
│ │ Agent │ │
│ └────────────┘ │
│ ▼
│ ┌────────────┐ ┌─────────┐
└────────▶│ Hypothesis │◀───│ Output │
│ Agent │ │ Files │
└────────────┘ └─────────┘
- Reader Agent extracts text content from uploaded PDFs
- Summarizer Agent creates document summaries
- Outliner Agent generates research paper structure
- Writer Agent produces full research paper content
- Reviewer Agent evaluates and improves quality
- Presenter Agent creates PowerPoint presentations
- Graph Agent builds concept networks and visualizations
- Hypothesis Agent generates research hypotheses
- Statistics Agent analyzes statistical patterns
- Experiment Agent designs experimental frameworks
- Verifier Agent ensures factual accuracy
- Orchestrator coordinates all agents and manages workflow
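The coordination role described in the list above can be sketched as a loop that runs agents in order, passes each agent's output forward, and records a status per run. This is an illustration only: the `Agent`, `PipelineContext`, and `RunRecord` shapes and the `runPipeline` function are assumptions and do not reflect the actual code in lib/agents/orchestrator.ts. Stopping at the first failure is also an assumed policy.

```typescript
// Hypothetical shapes; none of these names are taken from lib/agents/.
type PipelineContext = Record<string, unknown>;

interface Agent {
  name: string;
  run(context: PipelineContext): Promise<PipelineContext>;
}

interface RunRecord {
  agentName: string;
  status: "running" | "completed" | "failed";
  error?: string;
  executionTimeMs?: number;
}

// Runs agents sequentially, merging each result into a shared context.
async function runPipeline(
  agents: Agent[],
  initial: PipelineContext,
  onUpdate: (record: RunRecord) => void = () => {},
): Promise<{ context: PipelineContext; records: RunRecord[] }> {
  let context = initial;
  const records: RunRecord[] = [];
  for (const agent of agents) {
    const record: RunRecord = { agentName: agent.name, status: "running" };
    records.push(record);
    onUpdate({ ...record }); // progress hook, e.g. for a status display
    const started = Date.now();
    let failed = false;
    try {
      context = { ...context, ...(await agent.run(context)) };
      record.status = "completed";
    } catch (err) {
      failed = true;
      record.status = "failed";
      record.error = err instanceof Error ? err.message : String(err);
    }
    record.executionTimeMs = Date.now() - started;
    onUpdate({ ...record });
    if (failed) break; // assumed policy: later agents need earlier output
  }
  return { context, records };
}
```

The `onUpdate` callback is where a real implementation might persist run records or push progress to the UI; here it is left as a plain function so the sketch stays self-contained.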
# Install Playwright browsers
npx playwright install
# Run all tests
npx playwright test
# Run tests in UI mode
npx playwright test --ui
# Run specific test file
npx playwright test tests/example.spec.ts
npm test
- Click "New Project" on the dashboard
- Enter project name and description
- Set research type (paper, presentation, analysis)
- Click "Create" to initialize
- Navigate to your project
- Click "Upload" in the document panel
- Select PDF files (multiple files supported)
- Wait for upload and text extraction
- View extracted content in the reader
- Click "Run Pipeline" button
- Select agents to run (or run all)
- Monitor real-time progress
- View results in respective tabs
| Tab | Description |
|---|---|
| Overview | Project summary and status |
| Outline | Generated paper outline |
| Paper | Full research paper content |
| Analytics | Quality metrics and insights |
| Statistics | Word count, citation analysis |
| Chat | Interactive AI assistant |
- Navigate to the Paper tab
- Click "Export" button
- Select format (PDF, DOCX, LaTeX, Markdown)
- Download generated file
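One way the format selection in the export steps above could be modeled is a lookup table from format to file extension and MIME type. This is a sketch under stated assumptions: `ExportFormat`, `EXPORT_FORMATS`, `isExportFormat`, and `exportFilename` are hypothetical names, and the project's real export utilities may be organized differently.

```typescript
// The four formats are those listed in this README; everything else is illustrative.
type ExportFormat = "pdf" | "docx" | "latex" | "markdown";

const EXPORT_FORMATS: Record<ExportFormat, { extension: string; mimeType: string }> = {
  pdf: { extension: "pdf", mimeType: "application/pdf" },
  docx: {
    extension: "docx",
    mimeType: "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  },
  latex: { extension: "tex", mimeType: "application/x-tex" },
  markdown: { extension: "md", mimeType: "text/markdown" },
};

// Type guard for untrusted input such as a query-string parameter.
function isExportFormat(value: string): value is ExportFormat {
  return Object.prototype.hasOwnProperty.call(EXPORT_FORMATS, value);
}

// Builds a download filename from a project name, falling back to "paper".
function exportFilename(projectName: string, format: ExportFormat): string {
  const slug = projectName
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `${slug || "paper"}.${EXPORT_FORMATS[format].extension}`;
}
```

Validating the requested format with a type guard before dispatching keeps unsupported values out of the export path.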
- Navigate to the Presentation tab
- Click "Generate Slides"
- Customize slide layout
- Export as PowerPoint
The application uses Prisma ORM with SQLite (development) or PostgreSQL (production). Here are the main models:
model User {
id String @id @default(uuid())
email String @unique
name String?
avatarUrl String?
createdAt DateTime @default(now())
updatedAt DateTime @default(now())
projects Project[]
}
model Project {
id String @id @default(uuid())
name String
description String?
status String @default("idle")
progress Int @default(0)
tags String?
archived Boolean @default(false)
favorited Boolean @default(false)
userId String?
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
user User? @relation(fields: [userId], references: [id])
documents Document[]
agentRuns AgentRun[]
hypotheses Hypothesis[]
conceptNodes ConceptNode[]
outlines Outline[]
presentations Presentation[]
statistics Statistic[]
}
model Document {
id String @id @default(uuid())
projectId String
filename String
filepath String
filesize Int
mimetype String
extractedText String?
metadata String?
createdAt DateTime @default(now())
project Project @relation(fields: [projectId], references: [id], onDelete: Cascade)
}
model AgentRun {
id String @id @default(uuid())
projectId String
agentName String
status String @default("pending")
input String?
output String?
error String?
executionTime Float?
startedAt DateTime?
completedAt DateTime?
createdAt DateTime @default(now())
project Project @relation(fields: [projectId], references: [id], onDelete: Cascade)
}
model Hypothesis {
id String @id @default(uuid())
projectId String
title String
description String
testability Int
novelty Int
feasibility Int
category String
createdAt DateTime @default(now())
project Project @relation(fields: [projectId], references: [id], onDelete: Cascade)
}
- ConceptNode: Stores concept network nodes with importance scores and cluster assignments
- Outline: Stores generated research paper outlines with sections
- Presentation: Stores generated PowerPoint presentation data
- Statistic: Stores statistical analysis results with categories and values
For complete schema, see prisma/schema.prisma
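To illustrate how the three integer scores on the Hypothesis model might be consumed, the sketch below ranks hypotheses by the unweighted mean of `testability`, `novelty`, and `feasibility`. The field names come from the model above; the equal weighting, the `overallScore` and `rankHypotheses` helpers, and the sample score values are assumptions, since this README does not state the score scale or how the scores are combined.

```typescript
// Mirrors only the scoring fields of the Hypothesis model shown above.
interface HypothesisScores {
  title: string;
  testability: number;
  novelty: number;
  feasibility: number;
}

// Assumed aggregation: a plain average with equal weights.
function overallScore(h: HypothesisScores): number {
  return (h.testability + h.novelty + h.feasibility) / 3;
}

// Returns a new array sorted from highest to lowest overall score.
function rankHypotheses<T extends HypothesisScores>(items: T[]): T[] {
  return [...items].sort((a, b) => overallScore(b) - overallScore(a));
}

// Sample values are made up for the example.
const ranked = rankHypotheses([
  { title: "A", testability: 3, novelty: 9, feasibility: 3 },
  { title: "B", testability: 8, novelty: 6, feasibility: 7 },
]);
console.log(ranked.map((h) => h.title).join(", "));
```

Copying the array before sorting leaves the caller's data untouched, which matters if the same list also feeds a table that keeps its own ordering.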
npm run build
npm run start
- Build the image:
docker build -t synthesis:latest .
- Run the container:
docker run -p 3000:3000 --env-file .env.production synthesis:latest
- Install Vercel CLI:
npm i -g vercel
- Run deployment:
vercel
- Configure environment variables in Vercel dashboard
| Environment | Database | API URL |
|---|---|---|
| Development | SQLite (local) | http://localhost:3000 |
| Production | PostgreSQL | Your production URL |
Error: GEMINI_API_KEY is not defined
Solution: Ensure your .env.local file contains a valid GEMINI_API_KEY
Error: Can't reach database server
Solution:
- For SQLite: Ensure the path in DATABASE_URL is correct
- For PostgreSQL: Verify the connection string and database availability
Error: Failed to extract text from PDF
Solution:
- Ensure PDF is not password-protected
- Check that PDF is not corrupted
- Verify pdf-parse library is properly installed
Error: Agent execution failed
Solution:
- Check API quota limits
- Verify network connectivity
- Review agent prompt configurations
- Check logs in the test-results/ directory
Error: TypeScript or linting errors
Solution:
npm run lint
# Fix any reported issues
npm run build
- Use pagination for large document collections
- Implement caching for repeated AI queries
- Use streaming for long-form content generation
- Optimize database queries with proper indexing
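The caching tip above could be approached with a small in-memory wrapper around whatever function sends a prompt to the model. This is a hedged sketch, not the project's implementation: `withCache`, the `Generate` type, and the time-to-live policy are assumptions, and a multi-instance deployment would likely need a shared store rather than a per-process `Map`.

```typescript
// Hypothetical signature for any function that turns a prompt into model output.
type Generate = (prompt: string) => Promise<string>;

// Wraps `generate` so identical prompts reuse a cached result until it expires.
function withCache(
  generate: Generate,
  ttlMs: number,
  now: () => number = Date.now,
): Generate {
  const cache = new Map<string, { value: Promise<string>; expiresAt: number }>();
  return (prompt) => {
    const hit = cache.get(prompt);
    if (hit && hit.expiresAt > now()) return hit.value;
    // Cache the promise itself so concurrent identical requests share one call.
    const value: Promise<string> = generate(prompt).catch((err) => {
      // Do not keep failed results; only evict the entry this call created.
      if (cache.get(prompt)?.value === value) cache.delete(prompt);
      throw err;
    });
    cache.set(prompt, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

The `now` parameter exists only to make expiry testable with a fake clock; callers would normally omit it.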
We welcome contributions! Please see our Contributing Guide for details on:
- Code style guidelines
- Pull request process
- Development workflow
- Testing requirements
- Documentation standards
- Fork the repository
- Create a feature branch: git checkout -b feature/your-feature
- Make your changes
- Run tests: npm test
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
If you encounter any issues or have questions:
- Check the Quick Start Guide
- Review the Testing Guide
- Search existing GitHub Issues
- Open a new issue if needed
- Built with Next.js 16 and React 19
- Powered by Google Gemini AI
- UI components built with shadcn/ui
- Rich text editing with TipTap
- Mathematical expressions with KaTeX
- Charts and visualizations with Recharts
Made with ❤️ by the Synthesis Team