Transform raw documents into structured BPMN workflows — automatically.
Audio meetings, PDFs, images and text files are transcribed, analysed and converted into ready-to-use workflow JSON that can be visualised or injected directly into the OpenBEE portal.
| Capability | Detail |
|---|---|
| Audio Transcription | Whisper Large v3 — French and multilingual |
| OCR / Vision | Llama 4 Scout 17B |
| Text and PDF Ingestion | PDF-to-image conversion + plain text pipeline |
| BPMN Extraction | Few-shot prompt — structured JSON (actors, nodes, edges) |
| Diagram Generation | HTML/SVG swimlane, Mermaid .mmd, Graphviz .dot |
| OpenBEE Automation | Browser-use agent auto-creates the workflow in the portal |
| Local or Cloud Inference | Works with local GPU/CPU or via Groq cloud API |
+----------------------------------------------------------+
| USER BROWSER |
| Streamlit UI :8501 |
+---------------------+--------------------+---------------+
| REST /ingest | /status polling
v v
+----------------------------------------------------------+
| BACKEND FastAPI :5000 (Ngrok tunnel) |
| |
| +----------+ +----------+ +----------+ +----------+ |
| | Audio | | PDF | | Image | | Text | |
| | Whisper | |pdf2image | | Vision | | Raw | |
| +-----+----+ +-----+----+ +-----+----+ +----+-----+ |
| +-------------+--------------+----------+ |
| Raw Context (text) |
| | |
| v |
| BPMN Extraction Prompt (few-shot) |
| Llama 4 Scout 17B |
| | |
| v |
| { actors, nodes[], edges[] } JSON BPMN |
+----------------------------------------------------------+
|
+--------+--------+
v v
+---------+ +---------------+
|Diagrams | | OpenBEE |
|HTML/SVG | | Portal |
|Mermaid | | browser-use |
|Graphviz | | agent |
+---------+ +---------------+
solution/
├── backend/
│ └── agents/
│ └── openbee_backend.py # FastAPI server — ingestion pipeline
├── frontend/
│ ├── app_OPENBEE.py # Streamlit UI
│ ├── generate_workflow_cerebras.py # Diagram generator (pure Python, no LLM)
│ └── generate_workflow_openbee.py # Browser-use agent — OpenBEE portal
├── diagram_output/ # Generated diagrams (HTML, .mmd, .dot)
├── .env # API keys (never commit this)
├── .env.example # Template for new contributors
├── requirements.txt
├── Dockerfile.backend
├── Dockerfile.frontend
├── docker-compose.yml
└── start.ps1 # One-shot Windows launcher
# 1. Clone and configure
git clone https://github.com/CorleoneAUP/solution.git
cd solution
cp .env.example .env # Fill in your API keys
# 2. Build and run
docker compose up --build
# 3. Open the UI
# http://localhost:8501# 1. Create virtual environment
python -m venv .venv
.\.venv\Scripts\activate
# 2. Install dependencies
pip install -r requirements.txt
playwright install chromium
# 3. Configure environment
copy .env.example .env # Edit with your keys
# 4. Launch everything (auto-loads .env)
.\start.ps1Or manually in two terminals:
# Terminal 1 — Backend
python backend\agents\openbee_backend.py
# Terminal 2 — Frontend
streamlit run frontend\app_OPENBEE.pyCreate a .env file at the root of solution/:
GROQ_API_KEY=gsk_... # https://console.groq.com
NGROK_TOKEN=2... # https://dashboard.ngrok.comGo to the Ingestion tab and upload one or more files:
| Format | Processing |
|---|---|
.wav / .mp3 / .ogg |
Whisper Large v3 transcription |
.pdf |
Converted to images then OCR via Llama 4 Scout |
.png / .jpg / .webp |
Direct OCR via Llama 4 Scout |
.txt / .md / .csv |
Read as plain text |
Switch to the Logs tab — the pipeline runs sequentially and streams live logs.
The Results tab shows:
- Metrics: actor count, node count, edge count
- The full BPMN JSON (collapsible)
- Download button for
workflow_bpmn.json
Click Generate Diagrams to instantly produce:
workflow.html— SVG swimlane diagram (download and open in browser)workflow.mmd— Mermaid sourceworkflow.dot— Graphviz DOT source
Click Create Workflow in OpenBEE — a real Chrome browser opens and the AI agent navigates the portal to create the workflow automatically.
The pipeline uses a few-shot prompt with explicit BPMN rules to guide llama-4-scout-17b-16e-instruct:
| Shape | Detection Rule |
|---|---|
| Oval | Process start or terminal state (archiving, classification) |
| Rectangle | Action verb + subject (who does what) |
| Diamond | Conditional language: "si", "selon", binary result |
The FIN swimlane is always injected as the last actor to receive all terminal Oval nodes.
| Service | Port | Image base |
|---|---|---|
openbee_backend |
5000 |
Python 3.11-slim + poppler |
openbee_frontend |
8501 |
Python 3.11-slim + Chromium |
docker compose logs -f backend # Watch backend logs
docker compose logs -f frontend # Watch frontend logs
docker compose down # Stop everything
docker compose up --build -d # Rebuild and run in background| Layer | Technology |
|---|---|
| LLM / Vision | llama-4-scout-17b-16e-instruct |
| Audio | whisper-large-v3 |
| Backend | FastAPI + Uvicorn |
| Tunneling | pyngrok (Ngrok free tier) |
| Frontend | Streamlit |
| PDF Processing | pdf2image + Poppler |
| Browser Automation | browser-use + Playwright + Chromium |
| Diagram Rendering | Pure Python (SVG, Mermaid, Graphviz DOT) |
MIT — see LICENSE