A self-hosted toolkit for product content and catalog teams. Three independent modules:
- Image Processing — bulk ingest, transform, and export product images through repeatable pipelines.
- Reporting — process Pimcore xlsx exports per country (AU/NZ): produce cleansed versions and compute deltas between uploads.
- Content Generator — AI-powered B2B marketing copy and short product descriptions for individual products or bundles, backed by Google Gemini with optional live search grounding.
- Multi-method ingestion — Upload files directly, fetch from URLs, or scrape images from web pages
- Custom processing pipelines — Chain operations like crop, resize, scale, convert, and bulk rename in any order
- Real-time previews — See processed results before committing to a full batch run
- Bulk export — Download processed images as ZIP or generate XLSX manifests with public image URLs
- Public image serving — Processed images are available at `/images/{workflowId}/{filename}` without authentication
- Pimcore export ingestion — Upload xlsx exports per country (AU/NZ); strict header + country validation rejects malformed files at upload time
- Cleansed downloads — Drop nine junk columns, filter dash-only rows, sort by IMSKU
- Delta downloads — Identify rows whose IMSKU is new vs the previous upload, with the same cleansing rules applied
- Shared org-level state — Everyone sees the same "current" report per country; the most recent two uploads are retained for delta comparison
- Grounded AI copy — Google Gemini with Google Search grounding can verify real SKU specs before writing; refuses with a clear message if the product can't be identified
- Bundle support — Generate consolidated solution copy for up to 5 products from any manufacturer in a single request
- Locale-aware spelling — Choose en-GB (British: optimised, colour, organisation) or en-US (American: optimized, color, organization); defaults to en-GB for AU/NZ/SG markets
- Marketing copy and short descriptions — Generate reseller marketing paragraphs, key selling points, product-grid short descriptions, or both in one workflow
- Inline or split output — Toggle key selling points inline under the marketing copy (default, paste-ready) or as a separate editable section
- Dual copy formats — Plain-text and clean HTML copies for each section, ready for Pimcore's rich-text editor
- Prompt transparency — A "View prompt" button shows users exactly what is sent to Gemini, with current form values substituted live
- Admin-editable LLM settings — Admins can edit prompts, templates, Gemini model, temperature, token limit, and Google Search grounding from Settings → LLM Instructions, with history, reset, and revert
- Role-based access — Admin, pipeline editor, and user roles with per-workflow ownership
- Soft-deleted users — Deleted users are retained for audit/history display while their usernames can be reused
- Docker-ready — Single container deployment with persistent storage and health checks
- Tiered rate limiting — Loose limit on the API surface, strict limit on auth + scrape endpoints; works correctly behind a reverse proxy
Pull and run the latest image from GHCR:

    # docker-compose.yml
    services:
      productlens:
        image: ghcr.io/regalen/productlens:latest
        container_name: productlens
        ports:
          - "3446:3446"
        volumes:
          - app-data:/data
          - app-workspace:/tmp/workspace
        environment:
          - PORT=3446
          - BASE_URL=https://your-domain.com
          - JWT_SECRET=replace-with-a-strong-random-secret
          - CORS_ORIGIN=https://your-domain.com
          # Content Generator: free Gemini API key from https://aistudio.google.com/apikey
          - GEMINI_API_KEY=${GEMINI_API_KEY}
        restart: unless-stopped
        healthcheck:
          test: ["CMD", "wget", "-qO-", "http://localhost:3446/api/health"]
          interval: 30s
          timeout: 10s
          start_period: 30s
          retries: 3
    volumes:
      app-data:
      app-workspace:

Then start the stack:

    # Generate a secure JWT secret
    export JWT_SECRET=$(openssl rand -hex 32)
    docker compose up -d

Behind a reverse proxy? The app trusts one upstream hop and reads the real client IP from `X-Forwarded-For` so rate limiting works per-client. The typical nginx/Traefik/Caddy → container topology works out of the box; if you stack multiple proxies, adjust `app.set("trust proxy", N)` in `server/index.ts`.
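To make the "one upstream hop" idea concrete, here is a minimal sketch of how a numeric trust-proxy setting picks the client address out of `X-Forwarded-For`. It is a simplified model for illustration, not the app's or Express's actual code, and `resolveClientIp` is a hypothetical helper name.

```typescript
// Simplified model of numeric "trust proxy" resolution.
// Assumption: each proxy appends the address it received the request from,
// so the rightmost X-Forwarded-For entry is the one nearest to the app.
function resolveClientIp(
  socketAddress: string,
  forwardedFor: string | undefined,
  trustedHops: number,
): string {
  const forwarded = (forwardedFor ?? "")
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .reverse();

  // Candidate addresses ordered from nearest (the socket peer) to farthest.
  const chain = [socketAddress, ...forwarded];

  // Skip the trusted hops; the first address past them is treated as the client.
  const index = Math.min(Math.max(trustedHops, 0), chain.length - 1);
  return chain[index] ?? socketAddress;
}
```

With one trusted hop, a value the client injects at the left of the header is ignored, because only the entry appended by the trusted proxy is read. Stacking a second proxy in front means raising the hop count to 2, which is what adjusting `N` achieves.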
Prerequisites: Node.js 20+

    npm install
    cp .env.example .env.local
    # Edit .env.local and set JWT_SECRET
    npm run dev

The app will be available at http://localhost:3000.
On first run, a default admin account is created and must change its password at first login:
- Username: `admin`
- Password: `admin`

Change this password immediately after first login.
| Variable | Default | Description |
|---|---|---|
| `PORT` | `3000` (Docker: `3446`) | Server listen port |
| `JWT_SECRET` | Random (regenerated on restart) | Secret for signing JWT tokens |
| `BASE_URL` | `http://localhost:{PORT}` | Public URL for generated image links |
| `CORS_ORIGIN` | `http://localhost:{PORT}` in development; required in production | Allowed CORS origin. Use a comma-separated list for multiple origins. |
| `DATA_DIR` | `./data` (Docker: `/data`) | Persistent storage directory |
| `WORKSPACE_DIR` | `{DATA_DIR}/workspace` (Docker: `/tmp/workspace`) | Temp directory for in-flight processing |
| `GEMINI_API_KEY` | (unset) | Free Gemini API key for the Content Generator — get one at https://aistudio.google.com/apikey |
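
The defaults in the table can be read as a small resolution function. The sketch below is illustrative only: `loadConfig`, `publicImageUrl`, and the `AppConfig` shape are hypothetical names, and detecting production through `NODE_ENV` is an assumption rather than something this README states.

```typescript
interface AppConfig {
  port: number;
  baseUrl: string;
  corsOrigins: string[];
  dataDir: string;
  workspaceDir: string;
  jwtSecret: string | null; // null stands for "generate a random secret at startup"
  geminiApiKey: string | null; // null stands for "Content Generator unavailable"
}

function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error("PORT must be a positive integer");
  }
  const localUrl = `http://localhost:${port}`;

  // Assumption: production mode is signalled by NODE_ENV.
  const isProduction = env["NODE_ENV"] === "production";
  const rawOrigins = env["CORS_ORIGIN"];
  if (isProduction && !rawOrigins) {
    throw new Error("CORS_ORIGIN is required in production");
  }
  const corsOrigins = (rawOrigins ? rawOrigins : localUrl)
    .split(",")
    .map((origin) => origin.trim())
    .filter((origin) => origin.length > 0);

  const dataDir = env["DATA_DIR"] ?? "./data";
  return {
    port,
    baseUrl: env["BASE_URL"] ?? localUrl,
    corsOrigins,
    dataDir,
    workspaceDir: env["WORKSPACE_DIR"] ?? `${dataDir}/workspace`,
    jwtSecret: env["JWT_SECRET"] ?? null,
    geminiApiKey: env["GEMINI_API_KEY"] ?? null,
  };
}

// Builds a public link of the form /images/{workflowId}/{filename}.
function publicImageUrl(config: AppConfig, workflowId: string, filename: string): string {
  const base = config.baseUrl.replace(/\/+$/, "");
  return `${base}/images/${encodeURIComponent(workflowId)}/${encodeURIComponent(filename)}`;
}
```

Passing the environment in as a plain object keeps the function easy to test; in a real entry point it would presumably be called with `process.env`.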
Workflows move through a five-stage pipeline:
- Ingest — Add images via file upload, URL fetch, or web scraping
- Configure — Select or build a processing pipeline (crop, resize, scale, convert, rename)
- Preview — Generate and review processed previews
- Process — Run the full pipeline on selected images
- Export — Download as ZIP or XLSX with public image URLs
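
To show why step order matters in the Configure stage, the following sketch models a pipeline as an ordered list of steps applied to image metadata only. The step shapes and the names `Step`, `applyStep`, and `runPipeline` are assumptions made for illustration; the real app processes pixels with Sharp and its pipeline schema may differ.

```typescript
interface ImageInfo {
  filename: string;
  width: number;
  height: number;
  format: string;
}

// Hypothetical step shapes; the actual pipeline schema is not documented here.
type Step =
  | { kind: "crop"; width: number; height: number }
  | { kind: "resize"; width: number; height: number }
  | { kind: "scale"; factor: number }
  | { kind: "convert"; format: string }
  | { kind: "rename"; prefix: string };

function baseName(filename: string): string {
  const dot = filename.lastIndexOf(".");
  return dot > 0 ? filename.slice(0, dot) : filename;
}

function applyStep(image: ImageInfo, step: Step, index: number): ImageInfo {
  switch (step.kind) {
    case "crop":
      // A crop can only shrink the image, never enlarge it.
      return {
        ...image,
        width: Math.min(image.width, step.width),
        height: Math.min(image.height, step.height),
      };
    case "resize":
      return { ...image, width: step.width, height: step.height };
    case "scale":
      return {
        ...image,
        width: Math.round(image.width * step.factor),
        height: Math.round(image.height * step.factor),
      };
    case "convert":
      return {
        ...image,
        format: step.format,
        filename: `${baseName(image.filename)}.${step.format}`,
      };
    case "rename":
      // Bulk rename: prefix plus a 1-based position in the batch.
      return { ...image, filename: `${step.prefix}-${index + 1}.${image.format}` };
  }
}

// Applies the steps strictly in the order given.
function runPipeline(image: ImageInfo, steps: Step[], index = 0): ImageInfo {
  return steps.reduce((current, step) => applyStep(current, step, index), image);
}
```

For a 2000×1000 source, scaling by 0.5 and then cropping to 800×800 yields 800×500, while cropping first and scaling second yields 400×400, which is the kind of difference a preview is meant to surface before a full batch run.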
- Pick a report type from the Reporting tab (currently `Data_Missing_Report_Web` is visible)
- Select the country (AU or NZ)
- Upload the latest Pimcore xlsx export — it's strict-validated against the canonical 18-column schema and the country code in every row
- Download:
  - Original — your raw upload, untouched
  - Cleansed — junk columns dropped, dash-only rows filtered, sorted by IMSKU
  - Delta — only the IMSKUs new since the previous upload, with cleansing applied (available once a second upload exists)
Every (report, country) pair keeps the most recent two uploads. The state is shared org-wide: any authenticated user sees the same current and previous uploads and can replace the current one by uploading a new file.
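
The Cleansed and Delta downloads can be sketched as two small functions over parsed rows. This is an illustration under assumptions: rows are modelled as plain string records, the list of junk columns is passed in because the nine column names are not listed here, and "dash-only" is taken to mean that every cell other than `IMSKU` is `-` or blank. The names `cleanse`, `delta`, and `isDashOnly` are hypothetical.

```typescript
type Row = Record<string, string>;

// Assumption: a row is "dash-only" when every non-IMSKU cell is "-" or blank.
function isDashOnly(row: Row): boolean {
  return Object.entries(row)
    .filter(([column]) => column !== "IMSKU")
    .every(([, value]) => value.trim() === "-" || value.trim() === "");
}

// Drops junk columns, filters dash-only rows, and sorts by IMSKU.
function cleanse(rows: Row[], junkColumns: string[]): Row[] {
  const junk = new Set(junkColumns);
  const kept: Row[] = [];
  for (const row of rows) {
    const slim: Row = {};
    for (const [column, value] of Object.entries(row)) {
      if (!junk.has(column)) slim[column] = value;
    }
    if (!isDashOnly(slim)) kept.push(slim);
  }
  return kept.sort((a, b) => {
    const left = a["IMSKU"] ?? "";
    const right = b["IMSKU"] ?? "";
    return left < right ? -1 : left > right ? 1 : 0;
  });
}

// Keeps only cleansed rows whose IMSKU did not appear in the previous upload.
function delta(current: Row[], previous: Row[], junkColumns: string[]): Row[] {
  const seen = new Set(previous.map((row) => row["IMSKU"] ?? ""));
  return cleanse(current, junkColumns).filter((row) => !seen.has(row["IMSKU"] ?? ""));
}
```

Comparing on `IMSKU` alone means a row whose other cells changed between uploads does not show up in the delta, which matches the "new since the previous upload" wording.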
- Navigate to Content Generator in the header nav
- Enter a Manufacturer, Part/Model No., and Description — or click + Add product to build a bundle (up to 5 products, any mix of manufacturers)
- Choose whether to generate Marketing copy, Product description, or both
- Choose a Length (Short / Medium / Long) for marketing copy and a Locale (`en-GB` for AU/NZ/SG, `en-US` for American markets)
- Check Show key selling points inline (on by default) to get a single paste-ready block, or uncheck to keep marketing paragraphs and key selling points as separate editable sections
- Click Generate — Gemini uses the configured prompt and grounding settings, then returns the requested artifacts in the chosen locale's spelling
- Edit generated output directly, then copy as plain text or HTML where available (ready to paste into Pimcore's Source view)
- Click Regenerate at any time; if you've edited the output you'll be asked to confirm before it's overwritten
`GEMINI_API_KEY` is required. The feature returns `503` if the key is missing. A free key from Google AI Studio is sufficient.
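
The request rules described above (at most 5 products per bundle, `en-GB` as the default locale, `503` without an API key) could be checked along these lines. This is a sketch, not the app's actual handler: the request shape, the name `checkGenerateRequest`, and the use of `400` for invalid input are all assumptions.

```typescript
type Locale = "en-GB" | "en-US";
type Artifact = "marketing" | "description";

interface ProductInput {
  manufacturer: string;
  partNumber: string;
  description: string;
}

// Hypothetical request shape for illustration.
interface GenerateRequest {
  products: ProductInput[];
  artifacts: Artifact[];
  locale?: Locale;
}

type Check =
  | { ok: true; locale: Locale; isBundle: boolean }
  | { ok: false; status: number; message: string };

const MAX_BUNDLE_SIZE = 5;

function checkGenerateRequest(request: GenerateRequest, apiKey: string | undefined): Check {
  // The README states the feature returns 503 when the key is missing.
  if (!apiKey) {
    return { ok: false, status: 503, message: "GEMINI_API_KEY is not configured" };
  }
  // Assumption: invalid input is rejected with 400.
  if (request.products.length < 1 || request.products.length > MAX_BUNDLE_SIZE) {
    return { ok: false, status: 400, message: `Provide between 1 and ${MAX_BUNDLE_SIZE} products` };
  }
  if (request.artifacts.length === 0) {
    return { ok: false, status: 400, message: "Choose marketing copy, product description, or both" };
  }
  return {
    ok: true,
    locale: request.locale ?? "en-GB",
    isBundle: request.products.length > 1,
  };
}
```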
Admins can open Settings → LLM Instructions to manage Content Generator behavior without redeploying:
- edit marketing and product-description system instructions
- edit single-product and bundle user-message templates
- choose an allowed Gemini model (`gemini-2.5-flash`, `gemini-2.5-pro`, `gemini-2.0-flash`, `gemini-1.5-flash`, `gemini-1.5-pro`)
- tune temperature and max output tokens
- enable or disable Google Search grounding
- reset fields to factory defaults
- view history and revert to previous versions
Settings are stored in SQLite and read fresh for every generation. Prompt history stores full prompt text and is visible to admins, so do not put secrets in prompts.
Server-side validation protects the required template placeholders and output delimiters (===MARKETING===, ===BULLETS===, ===DESCRIPTION===, ===END===) that the response parser depends on.
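
As a rough picture of why those delimiters must survive prompt edits, here is a minimal parser sketch that splits a model response on them. It assumes each delimiter sits on its own line and that everything after `===END===` is ignored; the real response parser may behave differently, and `parseGeneratedCopy` is a hypothetical name.

```typescript
type SectionName = "marketing" | "bullets" | "description";

const SECTION_MARKERS = new Map<string, SectionName>([
  ["===MARKETING===", "marketing"],
  ["===BULLETS===", "bullets"],
  ["===DESCRIPTION===", "description"],
]);
const END_MARKER = "===END===";

// Splits raw model output into named sections.
// Assumption: every marker appears alone on its line.
function parseGeneratedCopy(raw: string): Partial<Record<SectionName, string>> {
  const collected = new Map<SectionName, string[]>();
  let current: SectionName | undefined;

  for (const line of raw.split(/\r?\n/)) {
    const token = line.trim();
    if (token === END_MARKER) break;

    const section = SECTION_MARKERS.get(token);
    if (section !== undefined) {
      current = section;
      collected.set(section, []);
    } else if (current !== undefined) {
      collected.get(current)?.push(line);
    }
  }

  const result: Partial<Record<SectionName, string>> = {};
  collected.forEach((lines, name) => {
    result[name] = lines.join("\n").trim();
  });
  return result;
}
```

Sections that were not requested simply stay absent from the result, so a description-only generation yields an object with a single `description` key.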
Image-processing workflows (and all associated data — uploaded source images, generated previews, processed outputs, and database rows) are automatically deleted 7 days after creation, regardless of status. This keeps the SQLite database and /data volume lean.
Export your results before the 7-day window closes. Use the ZIP or XLSX export on the output stage to download everything you need. The purge runs hourly in the background and on server startup.
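
The retention rule can be sketched as a pure selection function plus an hourly timer. This is an assumption-level illustration: `findExpiredWorkflows` and `startPurgeTimer` are hypothetical names, and treating a workflow as expired at exactly 7 days is a guess about the boundary.

```typescript
const RETENTION_MS = 7 * 24 * 60 * 60 * 1000; // 7 days
const PURGE_INTERVAL_MS = 60 * 60 * 1000; // hourly

interface WorkflowRecord {
  id: string;
  createdAt: Date;
  status: string;
}

// Selects workflows older than the retention window, regardless of status.
function findExpiredWorkflows(workflows: WorkflowRecord[], now: Date): WorkflowRecord[] {
  return workflows.filter(
    (workflow) => now.getTime() - workflow.createdAt.getTime() >= RETENTION_MS,
  );
}

// Runs the purge once at startup and then every hour.
function startPurgeTimer(purge: () => void): ReturnType<typeof setInterval> {
  purge();
  return setInterval(purge, PURGE_INTERVAL_MS);
}
```

In this model the `purge` callback would delete the files and database rows for every workflow returned by `findExpiredWorkflows`.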
Reports are exempt from this purge. Each (report, country) holds the most recent and previous upload indefinitely; uploading a new version evicts the old previous and rotates the slots.
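
The two-slot rotation for each (report, country) pair can be pictured as follows. The `ReportSlots` shape and `rotateSlots` name are hypothetical, chosen only to illustrate the eviction order.

```typescript
interface Upload {
  id: string;
  uploadedAt: string;
}

interface ReportSlots {
  current: Upload | null;
  previous: Upload | null;
}

// A new upload becomes current, the old current becomes previous,
// and the old previous (if any) is evicted.
function rotateSlots(
  slots: ReportSlots,
  incoming: Upload,
): { slots: ReportSlots; evicted: Upload | null } {
  return {
    slots: { current: incoming, previous: slots.current },
    evicted: slots.previous,
  };
}
```

A delta only becomes available after the second call, because the first upload leaves `previous` empty.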
LLM settings and history are also exempt from workflow purge. They remain in SQLite until changed by an admin or manually removed from the database.
Users are soft-deleted. Deleting a user disables login and hides the account from user management, but keeps the row so historical report uploads and LLM setting edits can still show who performed the action. Deleted users appear as Name (Deleted) in historical metadata.
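
A minimal sketch of the soft-delete behaviour, assuming a nullable `deletedAt` column marks deleted rows. The `UserRow` shape and function names are hypothetical and not taken from the app's schema.

```typescript
interface UserRow {
  id: number;
  username: string;
  name: string;
  deletedAt: string | null; // assumption: null means the account is active
}

// Marks the row as deleted instead of removing it.
function softDelete(user: UserRow, now: Date): UserRow {
  return { ...user, deletedAt: now.toISOString() };
}

// Historical metadata shows deleted users with a suffix.
function displayName(user: UserRow): string {
  return user.deletedAt === null ? user.name : `${user.name} (Deleted)`;
}

// Only active accounts block a username, so deleted usernames can be reused.
function isUsernameAvailable(users: UserRow[], username: string): boolean {
  return !users.some((user) => user.deletedAt === null && user.username === username);
}

// Deleted accounts are hidden from user management.
function listManageableUsers(users: UserRow[]): UserRow[] {
  return users.filter((user) => user.deletedAt === null);
}
```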
- Frontend: React 19, React Router v7, Tailwind CSS v4, shadcn/ui, Lucide icons, Motion
- Backend: Express, better-sqlite3, Sharp (image processing), ExcelJS (xlsx read/write), JSZip, Multer, Cheerio (web scraping)
- Auth + safety: JWT cookies (httpOnly, SameSite=lax, secure in prod), bcryptjs, express-rate-limit (tiered), SSRF protection on outbound fetches
- Build: Vite, TypeScript (strict mode, `noUncheckedIndexedAccess`), Vitest
- Deploy: Docker (node:20-slim), GitHub Actions CI/CD, GHCR

Development commands:

    npm run dev    # Start dev server (Express + Vite HMR)
    npm run lint   # Type-check with tsc --noEmit
    npm test       # Run tests
    npm run build  # Production build

Released under the MIT License.