30 changes: 30 additions & 0 deletions Dockerfile.webapp
@@ -0,0 +1,30 @@
FROM python:3.13-slim

# Install LibreOffice for local conversion (optional, can use the separate container)
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        libreoffice \
        curl \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Install UV first
RUN pip install --no-cache-dir uv

# Copy project files
COPY pyproject.toml ./
COPY src/ ./src/

# Install dependencies (without frozen lock to allow updates)
RUN uv sync

# Set Python path
ENV PYTHONPATH=/app

# Expose port
EXPOSE 8000

# Run the web application
CMD ["uv", "run", "python", "-m", "uvicorn", "src.webapp:app", "--host", "0.0.0.0", "--port", "8000"]
166 changes: 166 additions & 0 deletions LOCALHOST_GUIDE.md
@@ -0,0 +1,166 @@
# Running PPT2Desc on Localhost

This guide explains how to run the PPT2Desc web application on your local machine.

## Quick Start

### Option 1: Using Docker Compose (Recommended)

This is the easiest way to get started. Both the LibreOffice converter and the web application run in containers.

1. **Start the services:**
   ```bash
   docker compose up -d
   ```

2. **Access the web interface:**
   - Open your browser and navigate to: **http://localhost:5001**

3. **Stop the services:**
   ```bash
   docker compose down
   ```

That's it! The web interface will be available at http://localhost:5001, and you can upload PowerPoint files directly through your browser.

### Option 2: Running Locally with UV

If you prefer to run the application directly on your machine without Docker:

1. **Install dependencies:**
   ```bash
   uv sync
   ```

2. **Start the LibreOffice converter (optional, if you want to use Docker-based conversion):**
   ```bash
   docker compose up -d libreoffice-converter
   ```

3. **Run the web application:**
   ```bash
   uv run uvicorn src.webapp:app --host 0.0.0.0 --port 5001
   ```

4. **Access the web interface:**
   - Open your browser and navigate to: **http://localhost:5001**

## Using the Web Interface

Once the application is running, you can:

1. **Upload a PowerPoint file** (.ppt or .pptx)
2. **Select an AI provider** (Gemini, OpenAI, Anthropic, etc.)
3. **Configure model settings** (API keys, model name, etc.)
4. **Add optional instructions** to customize the output
5. **Click "Convert Presentation"** to process your file

The results will be displayed directly in the browser, showing detailed descriptions for each slide.

The web service is reachable on **port 5001** by default. With Docker Compose, this is host port 5001 mapped to port 8000 inside the container.

## Configuration Options

### AI Provider Settings

The web interface supports multiple AI providers:

- **Google Gemini API**: Requires API key
- **Google Vertex AI**: Requires GCP project ID, region, and service account credentials
- **OpenAI**: Requires API key
- **Anthropic Claude**: Requires API key
- **Azure OpenAI**: Requires API key, endpoint, and deployment name
- **AWS Bedrock**: Requires access key ID, secret access key, and region

### LibreOffice Configuration

By default, the web application uses the Docker-based LibreOffice converter at `http://libreoffice-converter:2002` (when using Docker Compose) or `http://localhost:2002` (when running locally).

If you have LibreOffice installed locally, you can leave the LibreOffice URL field blank, and the application will attempt to find it in your system PATH.
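
The fallback described above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: `resolve_converter` is a hypothetical helper, and the binary names `soffice` and `libreoffice` are assumed from the troubleshooting commands in this guide.

```python
import shutil
from typing import Callable, Optional


def resolve_converter(
    libreoffice_url: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> tuple[str, str]:
    """Pick a conversion backend (hypothetical helper, for illustration only).

    Returns ("remote", url) when a converter URL is given, otherwise
    ("local", path) for the first LibreOffice binary found on PATH.
    """
    url = libreoffice_url.strip()
    if url:
        return ("remote", url)
    # Assumed binary names; adjust if your platform uses different ones.
    for name in ("soffice", "libreoffice"):
        path = which(name)
        if path:
            return ("local", path)
    raise FileNotFoundError("No converter URL given and LibreOffice not found on PATH")
```

Passing `which` as a parameter keeps the lookup easy to test without a real LibreOffice installation.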

## API Endpoints

To call the service programmatically, use the following HTTP endpoints:

### Health Check
```bash
curl http://localhost:5001/health
```
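
When scripting against the service, it can help to wait until the health endpoint responds before sending work. The sketch below polls `/health` using only the standard library. The function names are hypothetical, and because the response body is not documented in this guide, only the HTTP status code is checked.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, refused connections, and timeouts are all OSError subclasses.
        return False


def wait_until_healthy(
    url: str,
    attempts: int = 10,
    delay: float = 1.0,
    probe: Callable[[str], bool] = http_ok,
) -> bool:
    """Call `probe(url)` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_until_healthy("http://localhost:5001/health")` returns `True` once the web application answers, or `False` after ten failed attempts.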

### Convert Presentation
```bash
curl -X POST http://localhost:5001/convert \
-F "file=@presentation.pptx" \
-F "client=gemini" \
-F "api_key=YOUR_API_KEY" \
-F "model=gemini-2.5-flash"
```
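
The same request can be sent from Python. The sketch below uses only the standard library and reuses the form field names from the curl example (`file`, `client`, `api_key`, `model`). The helper names are hypothetical, and since the response format is not described in this guide, the raw response body is returned unparsed.

```python
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(
    fields: dict[str, str], file_field: str, filename: str, file_bytes: bytes
) -> tuple[bytes, str]:
    """Encode text fields plus one file as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    Assumes field names and the filename contain no double quotes.
    """
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n'.encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def convert_presentation(
    path: str, api_key: str, base_url: str = "http://localhost:5001"
) -> bytes:
    """POST a presentation to /convert and return the raw response body."""
    body, content_type = encode_multipart(
        {"client": "gemini", "api_key": api_key, "model": "gemini-2.5-flash"},
        file_field="file",
        filename=Path(path).name,
        file_bytes=Path(path).read_bytes(),
    )
    request = urllib.request.Request(
        f"{base_url}/convert",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Conversion may take a while for large decks, hence the generous timeout.
    with urllib.request.urlopen(request, timeout=300) as response:
        return response.read()
```

A call such as `convert_presentation("presentation.pptx", "YOUR_API_KEY")` mirrors the curl command above.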

## Troubleshooting

### Port Already in Use

If port 5001 is already in use, you can change it:

**Docker Compose:**
Edit `docker-compose.yml` and change the port mapping:
```yaml
ports:
  - "5002:8000"  # Change 5002 to any available host port; 8000 is the container port
```

**Local Running:**
```bash
uv run uvicorn src.webapp:app --host 0.0.0.0 --port 5002
```
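
To find out whether a port is taken before changing the configuration, a quick check like the following may help. It is a minimal sketch with hypothetical function names: it simply tries to bind a TCP socket and treats a failure as "port in use".

```python
import socket


def is_port_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start: int = 5001, end: int = 5100, host: str = "0.0.0.0") -> int:
    """Return the first free port in [start, end), or raise if none is free."""
    for port in range(start, end):
        if is_port_free(port, host):
            return port
    raise RuntimeError(f"No free port between {start} and {end - 1}")
```

The result is only a snapshot: another process could claim the port between the check and the moment the server starts.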

### LibreOffice Connection Issues

If you get errors about LibreOffice conversion:

1. Make sure the LibreOffice converter is running:
   ```bash
   docker compose ps
   ```

2. Check the health of the converter:
   ```bash
   curl http://localhost:2002/health
   ```

3. If using local LibreOffice, ensure it's installed:
   ```bash
   which soffice
   # or
   which libreoffice
   ```

### Memory Issues

For large presentations or high rate limits, you may need to increase Docker memory limits. Edit your Docker settings or add resource limits to `docker-compose.yml`.

## Development

To run in development mode with auto-reload:

```bash
uv run uvicorn src.webapp:app --host 0.0.0.0 --port 5001 --reload
```

## Environment Variables

You can set default values using environment variables:

```bash
export GEMINI_API_KEY=your_api_key
export OPENAI_API_KEY=your_api_key
export ANTHROPIC_API_KEY=your_api_key
```

Then you won't need to enter API keys in the web interface each time.
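
One way such defaults could be resolved is sketched below. This is an illustration under assumptions, not the application's actual logic: the environment variable names come from this guide, while the provider identifiers, the `default_api_key` helper, and the rule that a value typed into the form takes precedence are all hypothetical.

```python
import os
from typing import Mapping, Optional

# Variable names are taken from the guide; the provider keys are assumed identifiers.
API_KEY_ENV_VARS = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def default_api_key(
    provider: str,
    form_value: str = "",
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer a key typed into the form; otherwise fall back to the environment."""
    if form_value.strip():
        return form_value.strip()
    env_name = API_KEY_ENV_VARS.get(provider.lower())
    if env_name is None:
        return None
    # Treat an empty environment variable the same as an unset one.
    return environ.get(env_name) or None
```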

## Next Steps

- Check the main [README.md](README.md) for detailed information about the project
- Learn about customizing prompts and instructions
- Explore the CLI version for batch processing
22 changes: 22 additions & 0 deletions README.md
@@ -10,6 +10,8 @@ ppt2desc is a command-line tool that converts PowerPoint presentations into deta

## Features

- **Web Interface**: Easy-to-use browser-based interface for converting presentations
- **CLI Tool**: Command-line interface for batch processing and automation
- Convert PPT/PPTX files to semantic descriptions
- Process individual files or entire directories
- Support for visual elements interpretation (charts, graphs, figures)
@@ -80,6 +82,26 @@ This will create a virtual environment and install all dependencies from `pyproj

## Usage

### Web Interface (Recommended for Quick Start)

The easiest way to use ppt2desc is through the web interface:

1. **Start the web application:**
   ```bash
   docker compose up -d
   ```

2. **Open your browser and navigate to:**
   ```
   http://localhost:5001
   ```

3. **Upload your PowerPoint file, configure your AI provider, and convert!**

For detailed instructions, see [LOCALHOST_GUIDE.md](LOCALHOST_GUIDE.md).

### Command Line Interface

Basic usage with Gemini API:
```bash
uv run src/main.py \
19 changes: 18 additions & 1 deletion docker-compose.yml
@@ -1,6 +1,6 @@
services:
  libreoffice-converter:
    build:
      context: ./src/libreoffice_docker
      dockerfile: Dockerfile
    ports:
@@ -11,4 +11,21 @@ services:
      test: ["CMD", "curl", "-f", "http://localhost:2002/health"]
      interval: 300s
      timeout: 10s
      retries: 3

  ppt2desc-web:
    build:
      context: .
      dockerfile: Dockerfile.webapp
    ports:
      - "5001:8000"
    restart: unless-stopped
    depends_on:
      - libreoffice-converter
    environment:
      - LIBREOFFICE_URL=http://libreoffice-converter:2002
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
3 changes: 3 additions & 0 deletions pyproject.toml
@@ -15,6 +15,7 @@ dependencies = [
"charset-normalizer==3.4.1",
"distro==1.9.0",
"docstring-parser==0.16",
"fastapi>=0.115.0",
"google-ai-generativelanguage==0.6.10",
"google-api-core==2.24.0",
"google-api-python-client==2.156.0",
@@ -53,6 +54,7 @@ dependencies = [
"pymupdf==1.25.1",
"pyparsing==3.2.1",
"python-dateutil==2.9.0.post0",
"python-multipart>=0.0.12",
"requests==2.32.3",
"rsa==4.9",
"s3transfer==0.10.4",
@@ -63,6 +65,7 @@ dependencies = [
"typing-extensions==4.12.2",
"uritemplate==4.1.1",
"urllib3==2.3.0",
"uvicorn>=0.32.0",
"pytest==8.3.3",
"pytest-mock==3.14.0",
]
1 change: 1 addition & 0 deletions src/__init__.py
@@ -0,0 +1 @@
# Package initialization
18 changes: 13 additions & 5 deletions src/processor.py
@@ -10,11 +10,19 @@
from concurrent.futures import ThreadPoolExecutor, as_completed
from tqdm import tqdm

-from llm import LLMClient
-from converters.ppt_converter import convert_pptx_to_pdf
-from converters.pdf_converter import convert_pdf_to_images
-from converters.docker_converter import convert_pptx_via_docker
-from schemas.deck import DeckData, SlideData
+# Support both relative imports (for webapp) and absolute imports (for main.py)
+try:
+    from .llm import LLMClient
+    from .converters.ppt_converter import convert_pptx_to_pdf
+    from .converters.pdf_converter import convert_pdf_to_images
+    from .converters.docker_converter import convert_pptx_via_docker
+    from .schemas.deck import DeckData, SlideData
+except ImportError:
+    from llm import LLMClient
+    from converters.ppt_converter import convert_pptx_to_pdf
+    from converters.pdf_converter import convert_pdf_to_images
+    from converters.docker_converter import convert_pptx_via_docker
+    from schemas.deck import DeckData, SlideData

# Create a type alias for all possible clients
logger = logging.getLogger(__name__)