ProductLens

A self-hosted toolkit for product content and catalog teams. Three independent modules:

Image Processing — bulk ingest, transform, and export product images through repeatable pipelines.
Reporting — process Pimcore xlsx exports per country (AU/NZ): produce cleansed versions and compute deltas between uploads.
Content Generator — AI-powered B2B marketing copy and short product descriptions for individual products or bundles, backed by Google Gemini with optional live search grounding.

Features

Image Processing

Multi-method ingestion — Upload files directly, fetch from URLs, or scrape images from web pages
Custom processing pipelines — Chain operations like crop, resize, scale, convert, and bulk rename in any order
Real-time previews — See processed results before committing to a full batch run
Bulk export — Download processed images as ZIP or generate XLSX manifests with public image URLs
Public image serving — Processed images are available at /images/{workflowId}/{filename} without authentication
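
As a rough illustration of the public URL shape, a link for an XLSX manifest could be assembled like this. The helper name `publicImageUrl` and the choice to URL-encode each path segment are assumptions for the sketch, not the app's actual code; only the `/images/{workflowId}/{filename}` route comes from this README.

```typescript
// Hypothetical helper: builds the public URL for a processed image.
// Route shape /images/{workflowId}/{filename} is taken from the README;
// the function name and the per-segment encoding are assumptions.
function publicImageUrl(baseUrl: string, workflowId: string, filename: string): string {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes from BASE_URL
  return `${root}/images/${encodeURIComponent(workflowId)}/${encodeURIComponent(filename)}`;
}

console.log(publicImageUrl("https://your-domain.com/", "wf-123", "front view.png"));
// prints "https://your-domain.com/images/wf-123/front%20view.png"
```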

Reporting

Pimcore export ingestion — Upload xlsx exports per country (AU/NZ); strict header + country validation rejects malformed files at upload time
Cleansed downloads — Drop nine junk columns, filter dash-only rows, sort by IMSKU
Delta downloads — Identify rows whose IMSKU is new vs the previous upload, with the same cleansing rules applied
Shared org-level state — Everyone sees the same "current" report per country; the most recent two uploads are retained for delta comparison

Content Generator

Grounded AI copy — Google Gemini with Google Search grounding can verify real SKU specs before writing; refuses with a clear message if the product can't be identified
Bundle support — Generate consolidated solution copy for up to 5 products from any manufacturer in a single request
Locale-aware spelling — Choose en-GB (British: optimised, colour, organisation) or en-US (American: optimized, color, organization); defaults to en-GB for AU/NZ/SG markets
Marketing copy and short descriptions — Generate reseller marketing paragraphs, key selling points, product-grid short descriptions, or both in one workflow
Inline or split output — Toggle key selling points inline under the marketing copy (default, paste-ready) or as a separate editable section
Dual copy formats — Plain-text and clean HTML copies for each section, ready for Pimcore's rich-text editor
Prompt transparency — A "View prompt" button shows users exactly what is sent to Gemini, with current form values substituted live
Admin-editable LLM settings — Admins can edit prompts, templates, Gemini model, temperature, token limit, and Google Search grounding from Settings → LLM Instructions, with history, reset, and revert

Platform

Role-based access — Admin, pipeline editor, and user roles with per-workflow ownership
Soft-deleted users — Deleted users are retained for audit/history display while their usernames can be reused
Docker-ready — Single container deployment with persistent storage and health checks
Tiered rate limiting — Loose limit on the API surface, strict limit on auth + scrape endpoints; works correctly behind a reverse proxy

Quick Start

Docker (recommended)

Pull and run the latest image from GHCR:

# docker-compose.yml
services:
  productlens:
    image: ghcr.io/regalen/productlens:latest
    container_name: productlens
    ports:
      - "3446:3446"
    volumes:
      - app-data:/data
      - app-workspace:/tmp/workspace
    environment:
      - PORT=3446
      - BASE_URL=https://your-domain.com
      # Read from the shell environment; generate it with the export command shown after this file
      - JWT_SECRET=${JWT_SECRET}
      - CORS_ORIGIN=https://your-domain.com
      # Content Generator: free Gemini API key from https://aistudio.google.com/apikey
      - GEMINI_API_KEY=${GEMINI_API_KEY}
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3446/api/health"]
      interval: 30s
      timeout: 10s
      start_period: 30s
      retries: 3

volumes:
  app-data:
  app-workspace:

# Generate a secure JWT secret
export JWT_SECRET=$(openssl rand -hex 32)

docker compose up -d

Behind a reverse proxy? The app trusts one upstream hop and reads the real client IP from X-Forwarded-For so rate limiting works per-client. The typical nginx/Traefik/Caddy → container topology works out of the box; if you stack multiple proxies, adjust app.set("trust proxy", N) in server/index.ts.
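
To make the hop count concrete, here is a minimal sketch of how a numeric trust setting picks the client address out of X-Forwarded-For. It is loosely modelled on the documented behaviour of Express's numeric "trust proxy" setting and is not ProductLens code; `resolveClientIp` and its parameters are hypothetical names.

```typescript
// Sketch only: resolves the client IP given a number of trusted proxy hops.
// The chain is ordered nearest hop first: the socket peer, then the
// X-Forwarded-For entries read right to left.
function resolveClientIp(
  socketAddr: string,
  xForwardedFor: string | undefined,
  trustedHops: number,
): string {
  const forwarded = (xForwardedFor ?? "")
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .reverse();
  const chain = [socketAddr, ...forwarded];
  // Skip the trusted hops; the next address in the chain is treated as the client.
  const index = Math.min(Math.max(trustedHops, 0), chain.length - 1);
  return chain[index] ?? socketAddr;
}

// One proxy in front of the container: the proxy is the socket peer.
console.log(resolveClientIp("10.0.0.2", "203.0.113.7", 1));
// prints "203.0.113.7"

// A client-supplied (spoofed) left-most entry is ignored with one trusted hop.
console.log(resolveClientIp("10.0.0.2", "1.2.3.4, 203.0.113.7", 1));
// prints "203.0.113.7"
```

With two stacked proxies and the hop count still at 1, the resolved address would be the outer proxy rather than the client, which is why the count has to match the topology.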

Local Development

Prerequisites: Node.js 20+

npm install
cp .env.example .env.local
# Edit .env.local and set JWT_SECRET
npm run dev

The app will be available at http://localhost:3000.

Default Credentials

On first run, a default admin account is created. It must change its password at first login:

Username: admin
Password: admin

Change this immediately after first login.

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| `PORT` | `3000` (Docker: `3446`) | Server listen port |
| `JWT_SECRET` | Random (regenerated on restart) | Secret for signing JWT tokens |
| `BASE_URL` | `http://localhost:{PORT}` | Public URL for generated image links |
| `CORS_ORIGIN` | `http://localhost:{PORT}` in development; required in production | Allowed CORS origin. Use a comma-separated list for multiple origins. |
| `DATA_DIR` | `./data` (Docker: `/data`) | Persistent storage directory |
| `WORKSPACE_DIR` | `{DATA_DIR}/workspace` (Docker: `/tmp/workspace`) | Temp directory for in-flight processing |
| `GEMINI_API_KEY` | (unset) | Free Gemini API key for the Content Generator; get one at https://aistudio.google.com/apikey |
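
The defaults above depend on one another (`BASE_URL` on `PORT`, `WORKSPACE_DIR` on `DATA_DIR`), so a sketch of the resolution order may help. This is an illustration under stated assumptions, not the server's real loader: `resolveConfig`, `AppConfig`, and the `isProduction` flag are hypothetical, and `JWT_SECRET` is left out for brevity.

```typescript
interface AppConfig {
  port: number;
  baseUrl: string;
  corsOrigins: string[];
  dataDir: string;
  workspaceDir: string;
  geminiApiKey: string | undefined;
}

// Hypothetical loader: applies the documented defaults to an env-like object.
// How the real app detects production mode is not stated here, so it is a parameter.
function resolveConfig(
  env: Record<string, string | undefined>,
  isProduction: boolean,
): AppConfig {
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid PORT: ${String(env["PORT"])}`);
  }
  const localhost = `http://localhost:${port}`;

  const rawCors = env["CORS_ORIGIN"];
  if (isProduction && !rawCors) {
    throw new Error("CORS_ORIGIN is required in production");
  }
  // A comma-separated list becomes several allowed origins.
  const corsOrigins = (rawCors ?? localhost)
    .split(",")
    .map((origin) => origin.trim())
    .filter((origin) => origin.length > 0);

  const dataDir = env["DATA_DIR"] ?? "./data";
  return {
    port,
    baseUrl: env["BASE_URL"] ?? localhost,
    corsOrigins,
    dataDir,
    workspaceDir: env["WORKSPACE_DIR"] ?? `${dataDir}/workspace`,
    geminiApiKey: env["GEMINI_API_KEY"],
  };
}

console.log(resolveConfig({}, false).baseUrl);
// prints "http://localhost:3000"
```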

How It Works

Image Processing

Workflows move through a five-stage pipeline:

Ingest — Add images via file upload, URL fetch, or web scraping
Configure — Select or build a processing pipeline (crop, resize, scale, convert, rename)
Preview — Generate and review processed previews
Process — Run the full pipeline on selected images
Export — Download as ZIP or XLSX with public image URLs
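
Because operations can be chained in any order, the order changes the result. The sketch below models a pipeline on image dimensions only (no pixel data) to show that effect. The `Step` shapes, `applyStep`, and `runPipeline` are hypothetical and much simpler than a real Sharp-based pipeline.

```typescript
// Simplified, dimension-only model of a processing pipeline (assumption:
// the real step options are richer than these).
type Step =
  | { op: "crop"; width: number; height: number }
  | { op: "resize"; width: number; height: number }
  | { op: "scale"; factor: number }
  | { op: "convert"; format: "png" | "jpeg" | "webp" };

interface ImageInfo {
  width: number;
  height: number;
  format: string;
}

function applyStep(img: ImageInfo, step: Step): ImageInfo {
  switch (step.op) {
    case "crop":
      // A crop can shrink the image but never enlarge it.
      return {
        ...img,
        width: Math.min(img.width, step.width),
        height: Math.min(img.height, step.height),
      };
    case "resize":
      return { ...img, width: step.width, height: step.height };
    case "scale":
      return {
        ...img,
        width: Math.round(img.width * step.factor),
        height: Math.round(img.height * step.factor),
      };
    case "convert":
      return { ...img, format: step.format };
  }
}

// Steps run strictly in the order given.
function runPipeline(img: ImageInfo, steps: Step[]): ImageInfo {
  return steps.reduce(applyStep, img);
}

const source: ImageInfo = { width: 2000, height: 1000, format: "jpeg" };
const crop: Step = { op: "crop", width: 1000, height: 1000 };
const half: Step = { op: "scale", factor: 0.5 };

console.log(runPipeline(source, [crop, half])); // width 500, height 500
console.log(runPipeline(source, [half, crop])); // width 1000, height 500
```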

Reporting

Pick a report type from the Reporting tab (currently Data_Missing_Report_Webvisible)
Select the country (AU or NZ)
Upload the latest Pimcore xlsx export. It is strictly validated: the headers must match the canonical 18-column schema, and the country code is checked in every row
Download:
- Original — your raw upload, untouched
- Cleansed — junk columns dropped, dash-only rows filtered, sorted by IMSKU
- Delta — only the IMSKUs new since the previous upload, with cleansing applied (available once a second upload exists)

Every (report, country) pair keeps the most recent two uploads. The system is shared org-wide: any authenticated user sees the same current and previous uploads and can replace the current one by uploading a new file.
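
The cleansing and delta rules can be sketched on plain row objects. Several details here are assumptions: the junk column names are placeholders (this README does not list the nine columns), a "dash-only" row is taken to mean every column other than IMSKU contains just "-", and the function names are hypothetical.

```typescript
type Row = Record<string, string>;

// Placeholder names only; the real list of nine junk columns is not given here.
const JUNK_COLUMNS: readonly string[] = ["JunkA", "JunkB"];

function dropColumns(row: Row, junk: readonly string[]): Row {
  const kept: Row = {};
  for (const key of Object.keys(row)) {
    if (!junk.includes(key)) kept[key] = row[key] ?? "";
  }
  return kept;
}

// Assumption: "dash-only" means every non-IMSKU cell is exactly "-".
function isDashOnly(row: Row): boolean {
  const others = Object.keys(row).filter((key) => key !== "IMSKU");
  return others.length > 0 && others.every((key) => (row[key] ?? "").trim() === "-");
}

function cleanse(rows: Row[], junk: readonly string[] = JUNK_COLUMNS): Row[] {
  return rows
    .map((row) => dropColumns(row, junk))
    .filter((row) => !isDashOnly(row))
    .sort((a, b) => {
      const left = a["IMSKU"] ?? "";
      const right = b["IMSKU"] ?? "";
      return left < right ? -1 : left > right ? 1 : 0;
    });
}

// Delta: cleansed rows whose IMSKU did not appear in the previous upload.
function delta(current: Row[], previous: Row[]): Row[] {
  const previousSkus = new Set(previous.map((row) => row["IMSKU"]));
  return cleanse(current).filter((row) => !previousSkus.has(row["IMSKU"]));
}
```

Comparing on IMSKU alone means a row whose other fields changed between uploads does not count as new, which matches the "IMSKU is new" wording above.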

Content Generator

Navigate to Content Generator in the header nav
Enter a Manufacturer, Part/Model No., and Description — or click + Add product to build a bundle (up to 5 products, any mix of manufacturers)
Choose whether to generate Marketing copy, Product description, or both
Choose a Length (Short / Medium / Long) for marketing copy and a Locale (en-GB for AU/NZ/SG, en-US for American markets)
Check Show key selling points inline (on by default) to get a single paste-ready block, or uncheck to keep marketing paragraphs and key selling points as separate editable sections
Click Generate — Gemini uses the configured prompt and grounding settings, then returns the requested artifacts in the chosen locale's spelling
Edit generated output directly, then copy as plain text or HTML where available (ready to paste into Pimcore's Source view)
Click Regenerate at any time; if you've edited the output you'll be asked to confirm before it's overwritten

GEMINI_API_KEY is required. The feature returns HTTP 503 if the key is missing. A free key from Google AI Studio is sufficient.

Admin LLM Instructions

Admins can open Settings → LLM Instructions to manage Content Generator behavior without redeploying:

edit marketing and product-description system instructions
edit single-product and bundle user-message templates
choose an allowed Gemini model (gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash, gemini-1.5-flash, gemini-1.5-pro)
tune temperature and max output tokens
enable or disable Google Search grounding
reset fields to factory defaults
view history and revert to previous versions

Settings are stored in SQLite and read fresh for every generation. Prompt history stores full prompt text and is visible to admins, so do not put secrets in prompts.

Server-side validation protects the required template placeholders and output delimiters (===MARKETING===, ===BULLETS===, ===DESCRIPTION===, ===END===) that the response parser depends on.
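
To show why the delimiters matter, here is a minimal sketch of splitting a delimited model response into sections. It is not the app's parser: `parseSections` is a hypothetical name, and treating a missing ===END=== as "read to the end of the text" is an assumption made for the example.

```typescript
type SectionName = "MARKETING" | "BULLETS" | "DESCRIPTION";
type ParsedSections = Partial<Record<SectionName, string>>;

// Sketch: splits a response on the ===NAME=== markers listed above.
function parseSections(raw: string): ParsedSections {
  // Ignore anything after the terminator, if one is present.
  const endAt = raw.indexOf("===END===");
  const body = endAt === -1 ? raw : raw.slice(0, endAt);

  const marker = /===(MARKETING|BULLETS|DESCRIPTION)===/g;
  const hits: { name: SectionName; markerStart: number; contentStart: number }[] = [];
  let match: RegExpExecArray | null;
  while ((match = marker.exec(body)) !== null) {
    hits.push({
      name: match[1] as SectionName,
      markerStart: match.index,
      contentStart: marker.lastIndex, // first character after the marker
    });
  }

  const sections: ParsedSections = {};
  hits.forEach((hit, i) => {
    const next = hits[i + 1];
    const contentEnd = next ? next.markerStart : body.length;
    sections[hit.name] = body.slice(hit.contentStart, contentEnd).trim();
  });
  return sections;
}

const sample = "===MARKETING===\nCopy text\n===BULLETS===\n- a\n- b\n===END===\nignored";
console.log(parseSections(sample));
// MARKETING is "Copy text", BULLETS is "- a\n- b", DESCRIPTION is absent
```

If an edited prompt stopped asking the model for these markers, a parser like this would return no sections, so rejecting such edits at save time is the safer design.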

Data Retention

Image-processing workflows (and all associated data — uploaded source images, generated previews, processed outputs, and database rows) are automatically deleted 7 days after creation, regardless of status. This keeps the SQLite database and /data volume lean.

Export your results before the 7-day window closes. Use the ZIP or XLSX export on the output stage to download everything you need. The purge runs hourly in the background and on server startup.
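
The purge rule reduces to a cutoff on the creation timestamp. The sketch below selects workflows that are due for deletion; the `Workflow` shape and `expiredWorkflows` name are hypothetical, and treating a workflow that is exactly 7 days old as expired is an assumption.

```typescript
const RETENTION_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

interface Workflow {
  id: string;
  createdAt: Date;
  status: string; // deliberately ignored: the purge applies regardless of status
}

// Returns the workflows whose age has reached the retention window.
function expiredWorkflows(all: Workflow[], now: Date): Workflow[] {
  const cutoff = now.getTime() - RETENTION_MS;
  return all.filter((workflow) => workflow.createdAt.getTime() <= cutoff);
}

const now = new Date("2025-01-10T00:00:00Z");
const workflows: Workflow[] = [
  { id: "old", createdAt: new Date("2025-01-02T00:00:00Z"), status: "processing" },
  { id: "new", createdAt: new Date("2025-01-09T00:00:00Z"), status: "done" },
];
console.log(expiredWorkflows(workflows, now).map((workflow) => workflow.id));
// prints [ 'old' ]
```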

Reports are exempt from this purge. Each (report, country) holds the most recent and previous upload indefinitely; uploading a new version evicts the old previous and rotates the slots.
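
The two-slot rotation can be pictured as follows. This is a sketch with hypothetical names (`ReportSlots`, `rotateSlots`); how the app actually stores the slots is not described here.

```typescript
interface Upload {
  id: string;
  uploadedAt: string;
}

interface ReportSlots {
  current: Upload | null;
  previous: Upload | null;
}

// A new upload becomes current, the old current becomes previous,
// and the old previous (if any) is evicted.
function rotateSlots(
  slots: ReportSlots,
  incoming: Upload,
): { slots: ReportSlots; evicted: Upload | null } {
  return {
    slots: { current: incoming, previous: slots.current },
    evicted: slots.previous,
  };
}

let state: ReportSlots = { current: null, previous: null };
for (const id of ["u1", "u2", "u3"]) {
  const result = rotateSlots(state, { id, uploadedAt: "2025-01-01" });
  state = result.slots;
  console.log(id, "evicts", result.evicted ? result.evicted.id : "nothing");
}
// prints "u1 evicts nothing", "u2 evicts nothing", "u3 evicts u1"
```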

LLM settings and history are also exempt from workflow purge. They remain in SQLite until changed by an admin or manually removed from the database.

Users are soft-deleted. Deleting a user disables login and hides the account from user management, but keeps the row so historical report uploads and LLM setting edits can still show who performed the action. Deleted users appear as Name (Deleted) in historical metadata.
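
A minimal sketch of the soft-delete behaviour, assuming a nullable `deletedAt` marker on the user row. The field names, function names, and the exact-match username comparison are hypothetical; only the Name (Deleted) label and username reuse come from the text above.

```typescript
interface User {
  username: string;
  displayName: string;
  deletedAt: string | null; // null means the account is active (assumed schema)
}

// Label used when showing who performed a historical action.
function historyLabel(user: User): string {
  return user.deletedAt === null ? user.displayName : `${user.displayName} (Deleted)`;
}

// Only active accounts block a username, so a deleted user's name can be reused.
function isUsernameAvailable(users: User[], username: string): boolean {
  return !users.some((user) => user.deletedAt === null && user.username === username);
}

const users: User[] = [
  { username: "sam", displayName: "Sam", deletedAt: "2025-01-05T10:00:00Z" },
  { username: "alex", displayName: "Alex", deletedAt: null },
];
console.log(users.map(historyLabel)); // [ 'Sam (Deleted)', 'Alex' ]
console.log(isUsernameAvailable(users, "sam")); // true
console.log(isUsernameAvailable(users, "alex")); // false
```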

Tech Stack

Frontend: React 19, React Router v7, Tailwind CSS v4, shadcn/ui, Lucide icons, Motion
Backend: Express, better-sqlite3, Sharp (image processing), ExcelJS (xlsx read/write), JSZip, Multer, Cheerio (web scraping)
Auth + safety: JWT cookies (httpOnly, SameSite=lax, secure in prod), bcryptjs, express-rate-limit (tiered), SSRF protection on outbound fetches
Build: Vite, TypeScript (strict mode, noUncheckedIndexedAccess), Vitest
Deploy: Docker (node:20-slim), GitHub Actions CI/CD, GHCR

Development

npm run dev          # Start dev server (Express + Vite HMR)
npm run lint         # Type-check with tsc --noEmit
npm test             # Run tests
npm run build        # Production build

License

Released under the MIT License.
