Issue: Images are always encoded as PNG → excessive memory usage and file size (need JPEG artifact + conversion operator)
Problem
Currently, all images inside the pipeline are effectively treated as PNG (RGBA) when:
- stacking images (imageList → image)
- converting ImageData to bytes (e.g. before PDF generation)
- exporting intermediate results
This causes extreme memory and size inflation.
Example:
- Input: 5 JPEG files (6 + 3 + 4 + 6 + 4 MB ≈ 23 MB total)
- After stacking / PNG re-encoding: result ~128 MB
This happens because:
- ImageData is uncompressed RGBA (4 bytes per pixel)
- canvas.toDataURL("image/png") produces lossless PNG
- PNG is not appropriate for photographic images
- We always convert to PNG regardless of original format
This is not scalable for large image sets.
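To make the inflation concrete, here is a rough back-of-the-envelope sketch. Decoded pixel data costs 4 bytes per pixel no matter how small the source JPEG was. The function name and the 6000 × 4000 dimensions are assumptions for illustration, not values taken from the pipeline.

```typescript
// Size in bytes of an uncompressed RGBA buffer, as held by ImageData.
function rawRgbaBytes(width: number, height: number): number {
  return width * height * 4;
}

// Hypothetical 24-megapixel photo (dimensions assumed for illustration).
const decodedBytes = rawRgbaBytes(6000, 4000);
console.log(decodedBytes); // 96000000
```

A single photo of that size already occupies about 96 MB once decoded, which is the baseline any lossless re-encoding then has to work from.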
Root Cause
The pipeline currently models images as:
{ type: "image", width, height, image: ImageData }
There is no concept of:
- Original format (jpeg/png/webp)
- Compression quality
- Lossy vs lossless encoding
- Byte representation separate from ImageData
As a result, every export step defaults to PNG.
Required Direction
We need format-aware image artifacts and a conversion operator.
Proposed Solution
1️⃣ Extend Artifact model
Add a new artifact type:
type ImageJpegArtifact = {
  type: "imageJpeg";
  width: number;
  height: number;
  image: ImageData;
  quality?: number; // 0–1
};
Or alternatively:
type ImageArtifact = {
  type: "image";
  width: number;
  height: number;
  image: ImageData;
  format: "png" | "jpeg";
  quality?: number;
};
The second option is cleaner long-term (format as metadata).
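As a minimal sketch of how export steps could consume the second option, the helper below derives encoder arguments from the format metadata instead of hard-coding PNG. `PixelBuffer`, `encoderArgs` and `DEFAULT_JPEG_QUALITY` are hypothetical names introduced here; `PixelBuffer` is only a structural stand-in for ImageData so the snippet type-checks without DOM typings.

```typescript
// Structural stand-in for the browser's ImageData (assumption for this sketch).
interface PixelBuffer {
  width: number;
  height: number;
  data: Uint8ClampedArray;
}

type ImageArtifact = {
  type: "image";
  width: number;
  height: number;
  image: PixelBuffer;
  format: "png" | "jpeg";
  quality?: number; // 0–1, only meaningful for lossy formats
};

// Assumed default; matches the quality default proposed for op.image.toJpeg.
const DEFAULT_JPEG_QUALITY = 0.85;

// Hypothetical helper: what an export step would pass to the encoder.
function encoderArgs(artifact: ImageArtifact): { mimeType: string; quality?: number } {
  if (artifact.format === "jpeg") {
    const q = artifact.quality ?? DEFAULT_JPEG_QUALITY;
    // Clamp to the valid 0–1 range.
    return { mimeType: "image/jpeg", quality: Math.min(1, Math.max(0, q)) };
  }
  // PNG is lossless, so no quality value applies.
  return { mimeType: "image/png" };
}
```

Keeping the format as a field means existing code that switches on `type: "image"` keeps working, and only the export path needs to read the new metadata.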
2️⃣ Add operator: Convert Image → JPEG
New op:
op.image.toJpeg
io: image → image
Parameters:
- quality (default 0.85)
- optional subsampling control (future)
Implementation:
- Use canvas.toBlob(callback, "image/jpeg", quality) (toBlob takes the callback as its first argument)
- Return an artifact flagged as JPEG
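The encoding step could look roughly like the sketch below, which wraps the callback-based toBlob in a Promise. `BlobSource` and `encodeJpeg` are hypothetical names. The interface is a structural view of `HTMLCanvasElement.toBlob`, made generic so the helper can be exercised without a DOM.

```typescript
// Structural view of HTMLCanvasElement.toBlob (assumption: the op has a canvas
// with the ImageData already drawn onto it).
interface BlobSource<B> {
  toBlob(callback: (blob: B | null) => void, type?: string, quality?: number): void;
}

// Request JPEG output and surface the result as a Promise.
function encodeJpeg<B>(canvas: BlobSource<B>, quality: number = 0.85): Promise<B> {
  // Clamp to the 0–1 range that toBlob expects for lossy formats.
  const q = Math.min(1, Math.max(0, quality));
  return new Promise<B>((resolve, reject) => {
    canvas.toBlob(
      (blob) => {
        if (blob === null) {
          // toBlob reports failure by passing null to the callback.
          reject(new Error("JPEG encoding failed"));
        } else {
          resolve(blob);
        }
      },
      "image/jpeg",
      q,
    );
  });
}
```

In a browser this would be called as `encodeJpeg(canvas)` after `ctx.putImageData(artifact.image, 0, 0)`. JPEG has no alpha channel, so images that rely on transparency should stay PNG.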
3️⃣ Improve PDF generation
imagesToPdf should:
- Accept already-compressed JPEG bytes when available
- Avoid re-encoding PNG unnecessarily
- Prefer JPEG for photographic content
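PDF can store JPEG streams as-is (the DCTDecode filter), so bytes that are already JPEG do not need to be decoded and re-encoded. A minimal sketch of the check, assuming the PDF step receives raw bytes: `sniffImageFormat` and `canEmbedWithoutReencoding` are hypothetical helpers, and whether the library behind imagesToPdf accepts raw JPEG bytes is not confirmed here.

```typescript
type SniffedFormat = "jpeg" | "png" | "unknown";

// The fixed 8-byte signature that opens every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Identify the container from its leading bytes rather than trusting a file name.
function sniffImageFormat(bytes: Uint8Array): SniffedFormat {
  // JPEG streams start with the SOI marker FF D8, followed by another FF marker.
  if (bytes.length >= 3 && bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff) {
    return "jpeg";
  }
  if (bytes.length >= PNG_SIGNATURE.length && PNG_SIGNATURE.every((b, i) => bytes[i] === b)) {
    return "png";
  }
  return "unknown";
}

// Hypothetical decision for the PDF step: pass JPEG bytes through untouched,
// fall back to the existing encode path for everything else.
function canEmbedWithoutReencoding(bytes: Uint8Array): boolean {
  return sniffImageFormat(bytes) === "jpeg";
}
```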
4️⃣ Optional future operator
An operator that automatically chooses the output format:
- JPEG for photos
- PNG for masks / flat graphics
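One possible heuristic for that choice, sketched under loud assumptions: any non-opaque pixel forces PNG, and an image with few distinct colours is treated as a mask or flat graphic. `chooseFormat` and the 256-colour threshold are hypothetical and untuned. A real operator would likely sample pixels rather than scan them all.

```typescript
type ChosenFormat = "png" | "jpeg";

// rgba is a flat RGBA buffer such as ImageData.data.
function chooseFormat(rgba: Uint8ClampedArray, maxFlatColors: number = 256): ChosenFormat {
  const colors = new Set<number>();
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    // JPEG cannot store transparency, so any non-opaque pixel forces PNG.
    if (rgba[i + 3] !== 255) return "png";
    // Pack RGB into one integer to count distinct colours.
    colors.add((rgba[i] << 16) | (rgba[i + 1] << 8) | rgba[i + 2]);
    // Many distinct colours suggests photographic content: stop early.
    if (colors.size > maxFlatColors) return "jpeg";
  }
  // Few distinct colours: treat as a mask or flat graphic.
  return "png";
}
```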
Acceptance Criteria
- Stacking 5 JPEG images does not inflate to 100+ MB
- Image artifacts can carry format metadata
- Builder includes an “Image → JPEG” conversion operator
- PDF generation prefers JPEG when available
- Memory usage remains proportional to input size
Notes
This is not just about file size — it affects:
PNG should remain available for: