Capacity planning

Capacity in NexisOmni is bounded by database connections, not disk. Because every tenant lives in its own database (or its own schema on the shared tier), the platform’s headroom comes down to two things: how many Postgres backends peak trading concurrency demands, and how well PgBouncer consolidates them. This page orients you to the levers; the full procedure, load test, and acceptance checks live in NexisOmni/docs/ops/tenancy-capacity-runbook.md.

The real constraint: connections, not disk

PostgreSQL is process-per-connection, so the binding limit is max_connections (default 100, typically tuned to 200-500). Under database-per-tenant, each distinct tenant connection string gets its own client-side connection pool of up to MaxPoolSize connections, so without a pooler the ceiling is:

peak concurrent active tenants x MaxPoolSize  <=  Postgres max_connections

With the defaults (MaxPoolSize=20, max_connections=200) that is at most 200 / 20 = 10, so roughly 8-10 fully busy tenants. Idle tenants hold zero backends, so far more can be registered. The wall is peak concurrency, and for a POS that concurrency is correlated because shops trade in the same hours.
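
To make the arithmetic above concrete, here is a minimal Python sketch of the unpooled ceiling. The function name and the `reserved` parameter are illustrative and not part of NexisOmni; `reserved` stands in for whatever backends you hold back for superuser or maintenance sessions, and the value 25 is an arbitrary example.

```python
def unpooled_tenant_ceiling(max_connections: int, max_pool_size: int, reserved: int = 0) -> int:
    """Fully busy tenants one server can hold with no pooler in front.

    Implements: peak concurrent active tenants x MaxPoolSize <= max_connections.
    `reserved` (hypothetical) is subtracted from max_connections first.
    """
    if max_pool_size <= 0:
        raise ValueError("max_pool_size must be positive")
    usable = max(max_connections - reserved, 0)
    return usable // max_pool_size


print(unpooled_tenant_ceiling(200, 20))               # 10
print(unpooled_tenant_ceiling(200, 20, reserved=25))  # 8
```

The two results bracket the 8-10 range quoted above: 10 if every backend is available to tenants, fewer once some are held back.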

How the tiers and the pooler change capacity

NexisOmni tenants sit on one of two tiers, and only the shared tier goes through PgBouncer.

Path	Routes to	Why
Shared-tier runtime requests	PgBouncer (transaction pooling)	Many tenants share a small set of backends; the server connection returns to the pool at the end of each transaction
Dedicated-tier runtime requests	Direct to Postgres	Sole occupant of its own database keeps session-level features; its backend footprint is already bounded per tenant
Provisioning, migrations, advisory locks	Direct to Postgres	A transaction pooler cannot hold a session lock or run `CREATE DATABASE`
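
The routing table above can be read as a small decision function. The sketch below is an illustration only: the `Tier` and `Workload` enums and the `route` function are hypothetical names, not NexisOmni types, and the returned strings stand in for whatever endpoints your deployment uses.

```python
from enum import Enum


class Tier(Enum):
    SHARED = "shared"
    DEDICATED = "dedicated"


class Workload(Enum):
    RUNTIME = "runtime"
    ADMIN = "admin"  # provisioning, migrations, advisory locks


def route(tier: Tier, workload: Workload) -> str:
    """Return 'pgbouncer' or 'postgres' following the routing table."""
    # Session-level work cannot go through a transaction pooler,
    # regardless of which tier the tenant is on.
    if workload is Workload.ADMIN:
        return "postgres"
    # Only shared-tier runtime traffic is pooled.
    if tier is Tier.SHARED:
        return "pgbouncer"
    return "postgres"


print(route(Tier.SHARED, Workload.RUNTIME))     # pgbouncer
print(route(Tier.DEDICATED, Workload.RUNTIME))  # postgres
print(route(Tier.SHARED, Workload.ADMIN))       # postgres
```

Checking the workload before the tier keeps the invariant in one place: nothing that needs a session lock or `CREATE DATABASE` can reach the pooler by accident.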

With PgBouncer in front of the shared tier, the model becomes:

active tenant DBs x PgBouncer default_pool_size  <=  Postgres max_connections

Because server connections are transaction-scoped, a backend is tied up only for the length of one transaction, so effective client concurrency far exceeds the backend count. This takes one server from tens to hundreds of active shared tenants. Beyond that, the ceiling is CPU and IO.
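
As a rough illustration of the pooled model, the Python sketch below evaluates the formula for a few per-database pool sizes. The function name is illustrative, and the pool sizes are hypothetical values chosen to show the shape of the curve; this page does not state which default_pool_size NexisOmni runs with.

```python
def pooled_tenant_ceiling(max_connections: int, default_pool_size: int) -> int:
    """Active tenant databases one server can back through PgBouncer.

    Implements: active tenant DBs x default_pool_size <= max_connections.
    """
    if default_pool_size <= 0:
        raise ValueError("default_pool_size must be positive")
    return max_connections // default_pool_size


# Hypothetical pool sizes against max_connections=200.
for pool_size in (20, 5, 2):
    print(pool_size, pooled_tenant_ceiling(200, pool_size))
# 20 10
# 5 40
# 2 100
```

The gain comes from being able to run a small pool per database: because each backend is released when its transaction ends, a handful of server connections can serve many more client connections than an unpooled MaxPoolSize would allow.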

What to watch

The health signals that tell you whether capacity is holding:

Postgres client backend count stays under max_connections, and far below tenants x MaxPoolSize, throughout load.
p95 request latency holds under load, with no connection-exhaustion errors.
Idle tenants drop their backends within server_idle_timeout.
A shared tenant’s queries only ever touch its own schema, with no cross-tenant rows. This is the search_path isolation check.
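
The first two signals above lend themselves to an automated check. The sketch below is one possible shape, not the runbook's acceptance test: `LoadSample`, `capacity_problems`, the `p95_budget_ms` threshold, and the `pooled_fraction` cutoff used to approximate "far below" are all assumptions. It does not cover the idle-timeout or search_path checks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoadSample:
    # One way to read client_backends from Postgres:
    #   SELECT count(*) FROM pg_stat_activity WHERE backend_type = 'client backend';
    client_backends: int
    max_connections: int
    tenants: int
    max_pool_size: int
    p95_ms: float
    exhaustion_errors: int


def capacity_problems(s: LoadSample, p95_budget_ms: float, pooled_fraction: float = 0.5) -> List[str]:
    """Return a list of failed signals; an empty list means the sample looks healthy."""
    problems = []
    if s.client_backends >= s.max_connections:
        problems.append("backend count reached max_connections")
    # "Far below tenants x MaxPoolSize" is approximated by a hypothetical fraction.
    if s.client_backends >= pooled_fraction * s.tenants * s.max_pool_size:
        problems.append("backend count is close to the unpooled total")
    if s.p95_ms > p95_budget_ms:
        problems.append("p95 latency over budget")
    if s.exhaustion_errors > 0:
        problems.append("connection-exhaustion errors seen")
    return problems


healthy = LoadSample(client_backends=60, max_connections=200, tenants=100,
                     max_pool_size=20, p95_ms=120.0, exhaustion_errors=0)
print(capacity_problems(healthy, p95_budget_ms=250.0))  # []
```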

When to scale or graduate

Three moves, in order of what saturates first:

Put PgBouncer in front of the shared tier once a single server approaches its unpooled ceiling of a few dozen active tenants. This is the largest single jump in headroom.
Graduate a hot tenant to the dedicated tier when one tenant outgrows sharing. Graduation builds a dedicated database for that tenant, on its current server by default or on a named target server passed as ?server= on the graduate call. See NexisOmni/docs/ops/tenant-graduation.md.
Add servers to the placement pool when one server’s CPU, IO, or operational windows saturate (low hundreds of databases, hardware-dependent). New tenants spread across the pool by LeastLoaded or RoundRobin; existing tenants stay where they are. The connection-budget formula then applies per server.
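
To illustrate the two placement policies named above, here is a minimal Python sketch. It assumes "load" means the number of tenant databases already on each server, which this page does not confirm; the function names and server names are hypothetical.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def least_loaded(db_counts: Dict[str, int]) -> str:
    """Pick the server with the fewest tenant databases (ties go to the first name alphabetically)."""
    if not db_counts:
        raise ValueError("placement pool is empty")
    return min(sorted(db_counts), key=lambda name: db_counts[name])


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers in a fixed rotation, ignoring current load."""
    if not servers:
        raise ValueError("placement pool is empty")
    return cycle(servers)


counts = {"pg-a": 120, "pg-b": 80, "pg-c": 80}
print(least_loaded(counts))  # pg-b

rr = round_robin(["pg-a", "pg-b", "pg-c"])
print([next(rr) for _ in range(4)])  # ['pg-a', 'pg-b', 'pg-c', 'pg-a']
```

Either policy only decides where a new tenant lands. Existing tenants do not move, so a freshly added server fills faster under a least-loaded rule than under rotation.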

Where the full runbook lives

This page is an orientation. The authoritative procedure is NexisOmni/docs/ops/tenancy-capacity-runbook.md, which covers bringing PgBouncer up, the auth_query and per-tenant-role setup, the k6 load test, the acceptance checklist, and the multi-server placement rules. Read it alongside ADR-0020 (per-tenant pool bounds) and ADR-0022 (the scaling ladder).