farhatraiyan/elastic-artifact-engine

Elastic Artifact Engine

Synchronous browser-rendering services tend to incur high idle costs and crash with out-of-memory (OOM) errors during request bursts. This engine renders artifacts asynchronously and is designed for high-scale automation workflows. It combines KEDA with the Asynchronous Request-Reply pattern: requests are acknowledged immediately and queued, so sudden high-concurrency bursts are absorbed by the queue while workers scale with queue length and drop back to zero when idle. This avoids the thundering herd problem common in resource-intensive cloud browser hosting and keeps idle costs near zero.

🏗️ Architecture

Implements the Asynchronous Request-Reply pattern with queue-based load leveling to manage headless browser clusters at scale.

  • Decoupled Ingestion: Immediate acknowledgment via HTTP Ingress; background processing via Queue Storage.
  • Stable Execution: Containerized Playwright ensures identical rendering environments.
  • Scale-to-Zero: Deployed on Azure Container Apps (ACA) with KEDA queue-length scaling.
  • Markdown Extraction: Mozilla Readability integration for clean, LLM-optimized content.
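
The decoupled ingestion flow described above can be sketched in a few lines. This is a minimal in-memory illustration, not the repository's implementation: the array and `Map` stand in for Azure Queue and Azure Table, and every name (`submitJob`, `getJob`, `processNext`, `JobRecord`) is hypothetical.

```typescript
// In-memory stand-ins for Azure Queue (jobs) and Azure Table (metadata).
// All names below are hypothetical and only illustrate the pattern.
type JobStatus = "queued" | "processing" | "completed" | "failed";

interface JobRecord {
  id: string;
  url: string;
  status: JobStatus;
  artifact?: string; // stand-in for a reference to the rendered output
}

const jobQueue: string[] = [];                 // the queue holds job ids only
const jobTable = new Map<string, JobRecord>(); // status lookups by job id
let nextId = 1;

// Ingress: record the job, enqueue it, and acknowledge immediately (HTTP 202).
function submitJob(url: string): { httpStatus: number; jobId: string } {
  const id = `job-${nextId++}`;
  jobTable.set(id, { id, url, status: "queued" });
  jobQueue.push(id);
  return { httpStatus: 202, jobId: id };
}

// Polling endpoint: report the current state without waiting for the render.
function getJob(jobId: string): JobRecord | undefined {
  return jobTable.get(jobId);
}

// Worker: take one message at a time, so a burst waits in the queue
// instead of starting one browser per request.
async function processNext(
  render: (url: string) => Promise<string>,
): Promise<boolean> {
  const id = jobQueue.shift();
  if (id === undefined) return false; // queue is empty
  const job = jobTable.get(id);
  if (!job) return true; // message without metadata: skip it
  job.status = "processing";
  try {
    job.artifact = await render(job.url);
    job.status = "completed";
  } catch {
    job.status = "failed";
  }
  return true;
}
```

A caller would submit a URL, receive the job id right away, and poll `getJob` until the status is `completed` or `failed`. Because the worker pulls from the queue at its own pace, the number of concurrent browser sessions is bounded by the number of workers rather than by the number of incoming requests.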

[Figure: System Architecture Diagram]
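
To make the queue-length scaling idea concrete, the arithmetic can be approximated as follows. This is a simplified sketch under stated assumptions: the function and parameter names are illustrative and are not KEDA or ACA configuration keys, and the real scaler also applies polling intervals and cooldown periods that are omitted here.

```typescript
// Approximates how a per-replica queue-length target maps to a replica count.
// Names are hypothetical; this is not the actual scaler implementation.
function desiredReplicas(
  queueLength: number,
  targetPerReplica: number,
  maxReplicas: number,
  minReplicas = 0,
): number {
  if (targetPerReplica <= 0) {
    throw new RangeError("targetPerReplica must be positive");
  }
  // Empty queue: fall back to the minimum, which is zero for scale-to-zero.
  if (queueLength <= 0) return minReplicas;
  // Enough replicas that each one handles at most targetPerReplica messages.
  const wanted = Math.ceil(queueLength / targetPerReplica);
  // Clamp to the configured bounds so a burst cannot exhaust resources.
  return Math.min(maxReplicas, Math.max(minReplicas, wanted));
}
```

With a target of 5 messages per replica and a cap of 10 replicas, 23 queued jobs would call for 5 workers, while a burst of 500 jobs is held at the cap of 10 and the remainder waits in the queue.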

📂 Structure

/elastic-artifact-engine
├── .github/workflows/         # CI/CD: QA pipelines
├── infrastructure/            # Azure Bicep IaC modules
├── packages/                  # Shared Logic/Types
│   ├── azure-adapters/        # Shared Azure infrastructure logic
│   └── shared-types/          # Shared job and status schemas
├── scripts/                   # Dev tooling
├── services/                  # Microservices
│   ├── browser-orchestrator/  # Playwright render worker (ACA)
│   └── ingress-api/           # HTTP gateway (Azure Functions, AFA)
└── web/                       # Manual submission UI (planned)

🛠️ Tech Stack

  • Language: TypeScript (Node.js 20+)
  • Automation: Playwright (Chromium)
  • Compute: Azure Container Apps (Worker), Azure Functions (Ingress)
  • Storage: Azure Blob (Output), Azure Queue (Jobs), Azure Table (Metadata)
  • DevOps: Docker, Azure Bicep, GitHub Actions

🚦 Status

  • Shared Type System: Unified Zod contracts.
  • Core Worker Engine: Playwright + Azure Storage adapters.
  • Containerization: Playwright Docker image.
  • HTTP Ingress (AFA): Job submission and polling.
  • IaC: Bicep modules for Identity, Storage, ACR, Functions, and ACA.
  • Identity: DefaultAzureCredential adapter migration.
  • Web UI: Dashboard for submission/inspection (planned).

💻 Local Development

Prerequisites

  • Node.js v20+
  • Docker
  • Playwright: npx playwright install chromium

Setup

npm install --legacy-peer-deps
npm run build

🏃 Commands

| Command | Description |
| --- | --- |
| npm run azurite:up | Starts Azurite and initializes storage resources. |
| npm run start | Starts background services (Ingress + Worker) via PM2. Requires Azurite. |
| npm run ingress --workspace @elastic-artifact-engine/browser-orchestrator -- <url> [type] | Submits a render job via CLI. |
| npx pm2 status | Shows service status. |
| npx pm2 logs | Tails service logs. |
| npm run teardown | Stops PM2 services and Azurite. |
| npm test --workspace <name> | Runs isolated workspace tests. |
| npm run test:engine | Runs E2E integration tests. |

☁️ Cloud Deployment

See infrastructure/README.md for Azure deployment instructions.
