minvoker/procwatch

procwatch

A process supervisor written in Python. Define a set of services in a YAML config, and procwatch will start them, watch them, restart them when they crash, and stream their logs to disk. There's a live terminal dashboard and a small HTTP API for controlling things at runtime.

Linux only — uses /proc/{pid}/stat for metrics and POSIX signals for shutdown.

[Screenshot: the live terminal dashboard (TUI)]

install

requires Python 3.11+ and Linux

pip install -e .

run

procwatch examples/sample.yaml

From another terminal:

curl localhost:8080/status

Press Ctrl+C to stop. procwatch sends SIGTERM to all child processes, waits up to 5 seconds, then sends SIGKILL to anything still running.

config

log_dir: /tmp/procwatch/logs
api_port: 8080

services:
  - name: web
    command: "python3 server.py"
    restart: always
    max_restarts: 5
    backoff_base: 1.0
    env:
      PORT: "9000"
    cwd: /opt/myapp

  - name: worker
    command: "python3 worker.py"
    restart: on-failure
    max_restarts: 3

  - name: migrate
    command: "python3 manage.py migrate"
    restart: never

restart options:

  • always — restart regardless of exit code
  • on-failure — only restart if exit code != 0
  • never — run once
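
The three policies boil down to one decision made each time a process exits. A minimal sketch, assuming a hypothetical `should_restart` helper and assuming the max_restarts check comes before the policy check (the order inside procwatch may differ):

```python
def should_restart(policy: str, exit_code: int, restart_count: int, max_restarts: int) -> bool:
    """Decide whether an exited service should be spawned again."""
    if restart_count >= max_restarts:
        return False              # restart budget used up
    if policy == "always":
        return True               # exit code is ignored
    if policy == "on-failure":
        return exit_code != 0     # a clean exit stays stopped
    return False                  # "never": run once
```

A process killed by a signal reports a negative return code under asyncio, so it counts as a failure here.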

Backoff doubles after each restart (1s, 2s, 4s, ...) and is capped at 30s. The sequence shown is for backoff_base: 1.0.
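
The doubling schedule can be written as a one-line function. This is a sketch: the `backoff_delay` name and the zero-based `attempt` argument are assumptions, and it assumes backoff_base is the first delay.

```python
def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay in seconds before restart number `attempt` (0-based)."""
    return min(base * 2 ** attempt, cap)


delays = [backoff_delay(n) for n in range(7)]
# -> [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```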

logs

Each service gets its own log files:

/tmp/procwatch/logs/
  web.stdout.log
  web.stderr.log

Internal procwatch logs go to /tmp/procwatch/procwatch.log.

API

GET  /routes                        list all endpoints
GET  /status                        all services + state, pid, cpu, memory, uptime
POST /services/{name}/start         start a stopped service
POST /services/{name}/stop          stop a running service
POST /services/{name}/restart       restart a service

Example:

curl localhost:8080/status
curl -X POST localhost:8080/services/web/restart
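
The same calls can be made from Python with only the standard library. A minimal sketch: `action_request` and `get_status` are hypothetical helper names, and the base URL assumes api_port: 8080 as in the sample config.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:8080"  # assumes api_port: 8080


def action_request(name: str, action: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for start/stop/restart without sending it."""
    if action not in {"start", "stop", "restart"}:
        raise ValueError(f"unknown action {action!r}")
    url = f"{base_url}/services/{quote(name, safe='')}/{action}"
    return urllib.request.Request(url, method="POST")


def get_status(base_url: str = BASE_URL) -> list[dict]:
    """Fetch and decode GET /status."""
    with urllib.request.urlopen(f"{base_url}/status", timeout=5) as resp:
        return json.load(resp)
```

With procwatch running, `urllib.request.urlopen(action_request("web", "restart"), timeout=5)` sends the restart and `get_status()` returns the decoded list.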

/status response:

[
  {
    "name": "web",
    "state": "running",
    "pid": 12345,
    "restart_count": 1,
    "uptime_seconds": 42.3,
    "memory_kb": 18432
  }
]

States: pending, running, restarting, stopped, failed

failed means the service hit max_restarts and won't be retried.
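
A monitoring script might group the /status payload by state to spot services that need attention. A small sketch, with `group_by_state` as a hypothetical helper that relies only on the name and state fields shown above:

```python
import json

STATES = ("pending", "running", "restarting", "stopped", "failed")


def group_by_state(status_json: str) -> dict[str, list[str]]:
    """Map each state to the names of the services currently in it."""
    groups: dict[str, list[str]] = {state: [] for state in STATES}
    for svc in json.loads(status_json):
        if svc["state"] not in groups:
            raise ValueError(f"unexpected state {svc['state']!r}")
        groups[svc["state"]].append(svc["name"])
    return groups
```

Anything listed under `failed` will stay down until it is started again by hand, for example through POST /services/{name}/start.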

how it works

Each service runs in its own asyncio.Task. The supervisor sits in the same event loop and waits on a stop event that gets set when SIGTERM or SIGINT arrives.
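
The task-per-service layout with a shared stop event might look roughly like this. It is a sketch of the pattern, not procwatch's code: `worker`, `supervise`, and `main` are hypothetical names, and the worker body is a placeholder for the real spawn-and-watch loop.

```python
import asyncio
import signal


async def worker(name: str) -> None:
    """Placeholder for a per-service task that spawns and watches a process."""
    while True:
        await asyncio.sleep(1)


async def supervise(names: list[str], stop: asyncio.Event) -> list[str]:
    """Run one task per service until `stop` is set, then cancel them all."""
    tasks = [asyncio.create_task(worker(n), name=n) for n in names]
    await stop.wait()
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return [task.get_name() for task in tasks if task.cancelled()]


async def main() -> None:
    stop = asyncio.Event()
    loop = asyncio.get_running_loop()
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, stop.set)  # Unix event loops only
    await supervise(["web", "worker"], stop)
```

Calling `asyncio.run(main())` blocks until a SIGTERM or SIGINT arrives. Because the handler only sets an event, the shutdown work itself runs as ordinary coroutine code rather than inside a signal handler.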

When a process exits, its worker checks the restart policy — if it should restart, it waits the backoff period and spawns again. stdout and stderr are drained concurrently with asyncio.gather alongside proc.wait(), so a process writing a lot of output doesn't block anything else.
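
Draining both pipes while waiting for the exit could be sketched as follows. `drain` and `run_once` are hypothetical names, the file names follow the layout from the logs section, and the command is passed as an argument list rather than the single string used in the config.

```python
import asyncio
from pathlib import Path


async def drain(stream: asyncio.StreamReader, path: Path) -> None:
    """Copy a pipe to a log file until the child closes it."""
    with path.open("ab") as log:
        while chunk := await stream.read(65536):
            log.write(chunk)
            log.flush()


async def run_once(name: str, argv: list[str], log_dir: Path) -> int:
    """Spawn a process, stream its output to disk, and return its exit code."""
    log_dir.mkdir(parents=True, exist_ok=True)
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # Reading both pipes alongside wait() keeps the child from stalling
    # when one of its pipe buffers fills up.
    _, _, code = await asyncio.gather(
        drain(proc.stdout, log_dir / f"{name}.stdout.log"),
        drain(proc.stderr, log_dir / f"{name}.stderr.log"),
        proc.wait(),
    )
    return code
```

If only `proc.wait()` were awaited, a chatty child could block forever on a full pipe, which is why the readers run in the same gather call.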

Shutdown sends SIGTERM to the whole process group (not just the top-level PID) so child-of-child processes get cleaned up too. If they don't exit within 5 seconds they get SIGKILL.
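
Signalling a whole group requires each service to lead its own process group. One way to get that, assumed here rather than taken from procwatch's source, is to spawn with `start_new_session=True` and then use `os.killpg`. `stop_group` and `demo` are hypothetical names.

```python
import asyncio
import os
import signal
import sys


async def stop_group(proc: asyncio.subprocess.Process, grace: float = 5.0) -> int:
    """SIGTERM the child's process group, then SIGKILL it after `grace` seconds."""
    try:
        os.killpg(proc.pid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # group already gone
    try:
        return await asyncio.wait_for(proc.wait(), timeout=grace)
    except asyncio.TimeoutError:
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass
        return await proc.wait()


async def demo() -> int:
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "import time; time.sleep(60)",
        start_new_session=True,  # child leads a new group, so pgid == pid
    )
    return await stop_group(proc)
```

A child terminated by a signal reports the negated signal number as its return code, so `asyncio.run(demo())` is expected to return `-signal.SIGTERM` for a child that does not handle SIGTERM.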

Memory and CPU come from /proc/{pid}/stat — no psutil. CPU is calculated by diffing the cumulative tick counters between samples, which is the same approach top uses.
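
Reading those counters by hand is short. The sketch below uses the field positions documented in proc(5): utime and stime are fields 14 and 15, rss (in pages) is field 24. `parse_stat`, `cpu_percent`, and `sample` are hypothetical names, not procwatch's own.

```python
import os


def parse_stat(line: str) -> tuple[int, int]:
    """Return (cpu_ticks, rss_pages) from one /proc/{pid}/stat line."""
    # Field 2 (comm) is parenthesised and may contain spaces or ')',
    # so split after the last ')' instead of on whitespace.
    rest = line[line.rindex(")") + 2:].split()  # rest[0] is field 3
    utime, stime = int(rest[11]), int(rest[12])  # fields 14 and 15
    rss_pages = int(rest[21])                    # field 24
    return utime + stime, rss_pages


def cpu_percent(prev_ticks: int, cur_ticks: int, elapsed_s: float, clk_tck: int) -> float:
    """CPU usage over the interval, as a percentage of one core."""
    return 100.0 * (cur_ticks - prev_ticks) / clk_tck / elapsed_s


def sample(pid: int) -> tuple[int, int]:
    """Read (cpu_ticks, memory_kb) for a live process. Linux only."""
    with open(f"/proc/{pid}/stat") as f:
        ticks, pages = parse_stat(f.read())
    return ticks, pages * os.sysconf("SC_PAGE_SIZE") // 1024
```

Two calls to `sample` a known interval apart, fed to `cpu_percent` with `os.sysconf("SC_CLK_TCK")`, give the usage over that interval.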
