Skip to content
This repository was archived by the owner on May 12, 2026. It is now read-only.
This repository was archived by the owner on May 12, 2026. It is now read-only.

Implemented client-side connection pooling to eliminate per-request TCP overhead to pgBouncer #6

@kangelopoulos

Description

@kangelopoulos

Rationale

pgBouncer pools connections on the Postgres side, but each HTTP request still opens a new TCP socket from the Python worker to pgBouncer (psycopg2.connect() in frappe/database/database.py:119). On k8s this crosses pod/service boundaries, adding 5-20ms per request baseline and much worse under load when max_client_conn saturates and requests queue. This is a constant tax on every wsgi.start_response and a multiplier on our p95s.

Details

  • Modify frappe/database/postgres/database.py — replace raw psycopg2.connect() in get_connection() with a per-worker connection pool (psycopg2.pool.ThreadedConnectionPool or SimpleConnectionPool)
  • Modify frappe/database/database.py — update connect() to pull from pool, update teardown path to return connections to pool instead of closing
  • Ensure frappe.destroy() in frappe/app.py returns the connection to the pool rather than closing it
  • Preserve ISOLATION_LEVEL_READ_COMMITTED on pooled connections
  • Pool size configurable via site_config.json (e.g. "db_pool_size": 5) with a sensible default
  • Remove the Datahenge TODO at database.py:121

Testing

  • Unit: Verify pooled connections are reused across sequential requests within a worker (connection ID should repeat)
  • Unit: Verify isolation level is correctly set on pooled connections
  • Integration: Load test before/after — measure wsgi.start_response p50/p95/p99 under concurrency
  • Integration: Verify no connection leaks under error conditions (exceptions mid-request should still return connection to pool)
  • Integration: Confirm pgBouncer connection count drops via pg_stat_activity

Requirements for Done

  • frappe/database/postgres/database.py and frappe/database/database.py updated
  • Connection reuse confirmed via pg_stat_activity (fewer connections, longer-lived)
  • p95 wsgi.start_response measurably reduced under load
  • No connection leaks after sustained traffic
  • frappe.destroy() properly returns connections
  • Pool size configurable via site_config.json
  • Deployed to staging, soaked under production-like load before prod deploy

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions