This repository was archived by the owner on May 12, 2026. It is now read-only.
Rationale
pgBouncer pools connections on the Postgres side, but each HTTP request still opens a new TCP socket from the Python worker to pgBouncer (psycopg2.connect() in frappe/database/database.py:119). On k8s this connection crosses pod/service boundaries, which adds a baseline of 5-20ms per request and considerably more under load, when max_client_conn saturates and requests queue. This is a constant tax on every wsgi.start_response and a multiplier on our p95s.
Details
Modify frappe/database/postgres/database.py: replace the raw psycopg2.connect() call in get_connection() with a per-worker connection pool (psycopg2.pool.ThreadedConnectionPool, or SimpleConnectionPool if a worker never shares the pool across threads)
Modify frappe/database/database.py: update connect() to pull a connection from the pool, and update the teardown path to return connections to the pool instead of closing them
Ensure frappe.destroy() in frappe/app.py returns the connection to the pool rather than closing it
Preserve ISOLATION_LEVEL_READ_COMMITTED on pooled connections
Pool size configurable via site_config.json (e.g. "db_pool_size": 5) with a sensible default
Remove the Datahenge TODO at database.py:121
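A minimal sketch of the pool described above, not the actual Frappe change. The names `get_pool`, `pooled_connection`, `pool_size_from_config` and the default of 5 are assumptions for illustration; only `ThreadedConnectionPool`, `getconn()`, `putconn()`, `set_isolation_level()` and `rollback()` are real psycopg2 calls.

```python
import threading
from contextlib import contextmanager

DEFAULT_POOL_SIZE = 5  # assumed default; the issue only asks for "a sensible default"

_pool = None  # one pool per worker process (hypothetical module-level holder)
_pool_lock = threading.Lock()


def pool_size_from_config(site_config):
    """Read the proposed "db_pool_size" key from a site_config dict."""
    size = int(site_config.get("db_pool_size", DEFAULT_POOL_SIZE))
    if size < 1:
        raise ValueError("db_pool_size must be at least 1")
    return size


def get_pool(site_config, **connect_kwargs):
    """Create this worker's pool on first use, then keep returning it."""
    global _pool
    with _pool_lock:
        if _pool is None:
            # Imported lazily so the helpers can be exercised without psycopg2.
            from psycopg2.pool import ThreadedConnectionPool

            # minconn=1, maxconn=configured size; connect_kwargs are passed
            # through to psycopg2.connect() by the pool.
            _pool = ThreadedConnectionPool(
                1, pool_size_from_config(site_config), **connect_kwargs
            )
        return _pool


@contextmanager
def pooled_connection(pool, isolation_level):
    """Borrow a connection for one request and always hand it back."""
    conn = pool.getconn()
    try:
        # Re-apply the isolation level on every checkout so it survives reuse.
        conn.set_isolation_level(isolation_level)
        yield conn
    except Exception:
        conn.rollback()  # do not hand back a connection in a failed transaction
        raise
    finally:
        pool.putconn(conn)  # return to the pool instead of calling conn.close()
```

In real code the second argument to `pooled_connection` would be `psycopg2.extensions.ISOLATION_LEVEL_READ_COMMITTED`. Committing is left to the caller, on the assumption that the existing request lifecycle already decides when to commit.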
Testing
Unit: Verify pooled connections are reused across sequential requests within a worker (connection ID should repeat)
Unit: Verify isolation level is correctly set on pooled connections
Integration: Load test before/after — measure wsgi.start_response p50/p95/p99 under concurrency
Integration: Verify no connection leaks under error conditions (exceptions mid-request should still return connection to pool)
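Integration: Confirm the connection count drops: pg_stat_activity shows the Postgres-side connections held by pgBouncer, and SHOW CLIENTS on the pgBouncer admin console shows the worker-to-pgBouncer connections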
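The reuse and leak checks above could be driven by a small wrapper around whatever pool object the worker uses. This is a sketch under assumptions: `TrackingPool`, `StubPool` and `handle_request` are hypothetical names, and the stub stands in for a real psycopg2 pool so the idea can be shown without a database.

```python
import threading


class TrackingPool:
    """Test helper: wraps a pool and records checkouts and returns."""

    def __init__(self, pool):
        self._pool = pool
        self._lock = threading.Lock()
        self._out = []        # connections currently checked out
        self.handed_out = []  # every connection handed out, in order

    def getconn(self, *args, **kwargs):
        conn = self._pool.getconn(*args, **kwargs)
        with self._lock:
            self._out.append(conn)
            self.handed_out.append(conn)
        return conn

    def putconn(self, conn, *args, **kwargs):
        self._pool.putconn(conn, *args, **kwargs)
        with self._lock:
            # Compare by identity so distinct connections are never confused.
            self._out = [c for c in self._out if c is not conn]

    @property
    def checked_out(self):
        with self._lock:
            return len(self._out)


class StubPool:
    """Stand-in for a real pool holding a single idle 'connection'."""

    def __init__(self):
        self._idle = [object()]

    def getconn(self):
        return self._idle.pop()

    def putconn(self, conn):
        self._idle.append(conn)


def handle_request(pool, fail=False):
    """Hypothetical request handler that returns its connection even on error."""
    conn = pool.getconn()
    try:
        if fail:
            raise RuntimeError("simulated mid-request failure")
    finally:
        pool.putconn(conn)
```

A unit test would run `handle_request` twice and check that `handed_out[0] is handed_out[1]` (reuse), then run it with `fail=True` and check that `checked_out` is back to 0 (no leak).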
Requirements for Done
frappe/database/postgres/database.py and frappe/database/database.py updated
Connection reuse confirmed via pg_stat_activity (fewer connections, longer-lived)
p95 wsgi.start_response measurably reduced under load
No connection leaks after sustained traffic
frappe.destroy() properly returns connections
Pool size configurable via site_config.json
Deployed to staging, soaked under production-like load before prod deploy
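To make "measurably reduced" checkable, the before/after load-test samples can be summarised the same way each time. The sketch below assumes latency samples are already collected as lists of milliseconds; `latency_percentiles` and `p95_improvement` are hypothetical helper names, and the choice of the inclusive quantile method is an assumption, not something the issue specifies.

```python
from statistics import quantiles


def latency_percentiles(samples_ms):
    """Return p50/p95/p99 for a list of latency samples (needs >= 2 samples)."""
    # n=100 yields 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def p95_improvement(before_ms, after_ms):
    """Fractional p95 reduction: 0.25 means the new p95 is 25% lower."""
    before = latency_percentiles(before_ms)["p95"]
    after = latency_percentiles(after_ms)["p95"]
    return (before - after) / before
```

Agreeing on a threshold for `p95_improvement` before the staging soak would give the sign-off a concrete pass/fail criterion.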