FigureWeave is a research-engineering project for turning paper method descriptions into publication-style figures that remain editable as SVG.
This project is inspired by AutoFigure, but it is no longer a mirror of the original system. The current codebase has been reworked into a more practical figure authoring pipeline with:
- local GPU SAM3 segmentation
- split routing for image drafting and SVG reasoning
- multi-candidate end-to-end generation
- figure caption conditioning in addition to method text
- SVG-first reconstruction with template, optimized template, and final assembly stages
- CUDA-accelerated local post-processing for segmentation and background removal
FigureWeave is especially useful for:
- method overviews
- pipeline diagrams
- system schematics
- architecture figures
- editable draft figures for papers, slides, and reports
It is not intended to replace precise plotting tools such as matplotlib, seaborn, ggplot, or Origin for charts driven by exact numeric data.
The project is no longer organized as one large monolithic script.
- figureweave.py is now a thin compatibility entrypoint for CLI execution and top-level imports.
- src/figureweave/config.py stores provider defaults, paths, and shared constants.
- src/figureweave/llm.py contains Gemini, OpenAI, Claude, OpenRouter, and related model-calling logic.
- src/figureweave/vision.py covers image drafting, SAM3 segmentation, and background removal.
- src/figureweave/svg_ops.py handles SVG reconstruction, repair, optimization, and asset replacement.
- src/figureweave/pipeline.py orchestrates the end-to-end pipeline and multi-candidate execution.
- src/figureweave/cli.py defines the command-line interface.
This split makes the codebase easier to extend, debug, and test without changing the public CLI usage.
Compared with the original AutoFigure-style workflow, this project adds several concrete contributions:
- Local SAM3 on GPU: segmentation can run locally on CUDA instead of depending only on hosted APIs. This improves speed, privacy, and reproducibility for the icon-region extraction stage.
- Dual-provider model routing: image drafting and SVG reasoning are now decoupled, so the pipeline can use different providers for different stages, such as Gemini -> Gemini, OpenAI -> OpenAI, Gemini -> Anthropic Claude, or OpenAI -> Anthropic Claude.
- Multi-candidate generation: a single run can generate multiple full candidates, preserve each artifact bundle, write a candidate manifest, and promote a selected result as the default output.
- Figure caption conditioning: the system accepts both method text and a figure caption / figure brief, so the generator and reconstructor can be constrained by explicit stage structure, layout intent, and narrative emphasis.
- SVG-first reconstruction pipeline: instead of treating the raster image as the final result, FigureWeave explicitly reconstructs an editable SVG template, optionally refines that template, and only then assembles the final SVG with extracted assets.
- CUDA-accelerated local post-processing: background removal and other local visual post-processing stages now use GPU-capable PyTorch when available, reducing the CPU bottleneck of the original workflow.
- More robust fallback behavior: the current pipeline includes explicit fallback paths for no-icon cases, placeholder reduction, and provider-side failures, which makes batch generation more practical for real paper figure drafting.
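The multi-candidate flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual API: `run_pipeline` and the manifest fields (`dir`, `score`, `selected`) are hypothetical names chosen for the sketch.

```python
import json
import shutil
from pathlib import Path

def generate_candidates(output_dir, num_candidates, run_pipeline):
    """Illustrative sketch: run the pipeline N times, keep every artifact
    bundle, record a manifest, and promote one candidate as the default."""
    output_dir = Path(output_dir)
    manifest = {"candidates": []}
    for i in range(1, num_candidates + 1):
        cand_dir = output_dir / f"candidate_{i:02d}"
        cand_dir.mkdir(parents=True, exist_ok=True)
        score = run_pipeline(cand_dir)  # writes final.svg into cand_dir, returns a score
        manifest["candidates"].append({"dir": cand_dir.name, "score": score})
    # Promote the highest-scoring candidate as the default output.
    best = max(manifest["candidates"], key=lambda c: c["score"])
    manifest["selected"] = best["dir"]
    shutil.copy(output_dir / best["dir"] / "final.svg", output_dir / "final.svg")
    (output_dir / "candidates_manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

The key point is that every candidate's artifacts survive the run; promotion is a copy, not a deletion, so you can revisit non-selected candidates later.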
The following assets, taken from the multimodal_medical_report run, form the current FigureWeave showcase:
- Draft image: img/case/multimodal_medical_report_draft.png
- Optimized SVG template: img/case/multimodal_medical_report_template.svg
- Final assembled SVG: img/case/multimodal_medical_report_final.svg
This showcase highlights the intended FigureWeave workflow:
- figure.png as the model-generated draft
- optimized_template.svg as the editable structural reconstruction
- final.svg as the assembled showcase result
The browser-based FigureWeave interface is shown below with both the configuration view and the editable SVG canvas.
- Config view: img/UI/UI_1.png
- Editable canvas: img/UI/UI_2.png
FigureWeave currently runs in five major stages:
- Image Draft: generate a scientific-style draft figure from method text, an optional figure caption, and an optional reference image.
- Segmentation: run local SAM3 or an API backend to detect icons and visual regions, producing samed.png and boxlib.json.
- Asset Extraction: crop detected regions and remove backgrounds to create transparent assets.
- SVG Reasoning and Reconstruction: use a multimodal model to reconstruct the draft into an editable SVG template, then optionally refine it.
- Assembly: replace placeholders with extracted assets and emit template.svg, optimized_template.svg, and final.svg.
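At a high level, these five stages chain together, with each stage consuming the previous stage's artifacts. The sketch below illustrates that flow only; the function names and signatures are hypothetical, not FigureWeave's actual internals.

```python
from pathlib import Path

def run_figure_pipeline(method_text, caption, out_dir,
                        draft, segment, extract_assets, reconstruct, assemble):
    """Illustrative five-stage chain; each callable writes its own artifacts."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    draft_png = draft(method_text, caption, out)     # -> figure.png
    boxes = segment(draft_png, out)                  # -> samed.png, boxlib.json
    icons = extract_assets(draft_png, boxes, out)    # -> icons/ (transparent assets)
    template = reconstruct(draft_png, caption, out)  # -> template.svg, optimized_template.svg
    return assemble(template, icons, out)            # -> final.svg
```

Keeping each stage as a separate callable is what makes the dual-provider routing possible: `draft` and `reconstruct` can be backed by different model providers.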
Supported providers by stage:

- Image drafting: Gemini, OpenAI
- SVG reasoning: Gemini, OpenAI, Anthropic Claude
Anthropic Claude is used here for understanding and reconstruction, not for native image generation. In this project, the image drafting stage should use Gemini or OpenAI.
Start the server:
```shell
python server.py
```

Then open:

```
http://127.0.0.1:8000
```
The main configuration page now includes:
- Method Text
- Figure Caption
- Image Draft Provider
- SVG Reasoning Provider
- Candidates
- Generation Mode
- SAM3 Backend
- Reference Image
The canvas page lets you:
- inspect intermediate artifacts
- switch between candidate SVGs
- review logs
- open the result in the embedded SVG editor
```shell
python figureweave.py --method_file paper.txt --output_dir outputs/demo \
  --image_provider gemini --image_api_key YOUR_GEMINI_KEY \
  --svg_provider anthropic --svg_api_key YOUR_ANTHROPIC_KEY
```

If you want to use one provider for both stages, you can still use:

```shell
python figureweave.py --method_file paper.txt --output_dir outputs/demo \
  --provider gemini --api_key YOUR_GEMINI_KEY
```

For multi-candidate generation:

```shell
python figureweave.py --method_file paper.txt --output_dir outputs/demo_multi \
  --image_provider gemini --image_api_key YOUR_GEMINI_KEY \
  --svg_provider openai --svg_api_key YOUR_OPENAI_KEY \
  --num_candidates 3
```

FigureWeave supports local SAM3 execution on GPU.
Typical setup:
```shell
git clone https://github.com/facebookresearch/sam3.git
cd sam3
pip install -e .
```

You also need:
- an available NVIDIA GPU
- CUDA-enabled PyTorch in the current environment
- Hugging Face access to SAM3
If local SAM3 is unavailable, the codebase can still fall back to other segmentation paths depending on your configuration.
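The fallback decision can be sketched as a simple capability probe. This is a minimal illustration, assuming only that CUDA-enabled PyTorch is what makes local SAM3 viable; the function name and backend labels are hypothetical, not FigureWeave's actual configuration keys.

```python
def pick_segmentation_backend(prefer_local=True):
    """Choose local SAM3 on CUDA when available, else an API-based fallback.
    Illustrative sketch, not the project's real selection logic."""
    if prefer_local:
        try:
            import torch
            if torch.cuda.is_available():
                return "sam3-local-cuda"
        except ImportError:
            pass  # PyTorch not installed; fall through to the API backend
    return "api-fallback"
```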
```shell
pip install -r requirements.txt
```

At minimum, you will usually want:

```
HF_TOKEN=your_huggingface_token
ROBOFLOW_API_KEY=your_roboflow_key
```

Depending on your selected routing, you may also need:
- Gemini API key
- OpenAI API key
- Anthropic API key
Build and run:
```shell
docker compose up -d --build
```

Health checks:

```shell
docker compose ps
curl http://127.0.0.1:8000/healthz
```

Logs:

```shell
docker compose logs -f figureweave
```

Restart:

```shell
docker compose restart figureweave
```

Typical outputs include:
- figure.png
- samed.png
- boxlib.json
- icons/
- template.svg
- optimized_template.svg
- final.svg
- candidates_manifest.json
When multi-candidate mode is enabled, each run is stored under:
- candidate_01/
- candidate_02/
- candidate_03/
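Because candidates follow a fixed candidate_NN/ naming scheme, switching between them amounts to pointing at a different subdirectory. A tiny illustrative helper (not part of the project's API):

```python
from pathlib import Path

def list_candidates(run_dir):
    """Return a run's candidate subdirectories, sorted by index.
    Assumes the candidate_NN/ directory layout described above."""
    run = Path(run_dir)
    return sorted(d for d in run.iterdir()
                  if d.is_dir() and d.name.startswith("candidate_"))
```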
FigureWeave is inspired by AutoFigure and builds on the broader idea of converting scientific method descriptions into figure drafts.
The current project extends that direction with:
- local GPU segmentation
- dual-provider routing
- multi-candidate generation
- in-browser SVG refinement
- a more complete engineering workflow
This repository is released under the MIT License in LICENSE.



