How do proxies handle queries that span multiple partitions?

Proxies typically forward multi-partition queries to a dedicated aggregation node or federated execution engine, as they lack native cross-shard join capabilities. SQL-aware proxies like ProxySQL can rewrite or fan out queries, but the merge step must happen in an upstream coordinator.

What is the latency overhead of adding a proxy routing layer?

Typically 0.5–2ms per request, depending on payload parsing complexity, TLS termination, and connection pool efficiency. This overhead is generally offset by improved routing accuracy and failover speed, particularly under topology changes.

Proxy Routing Architectures for Partitioned Databases

Q: When should I use a database proxy instead of application-level routing?

Use a proxy when you need centralized connection pooling, dynamic topology updates without client redeployment, or unified observability across heterogeneous database engines. Application-level routing is preferable when you need sub-millisecond routing decisions or when the application already owns the partition key extraction logic.

Proxy routing architectures sit between application servers and partitioned database backends, centralising connection management, shard selection, and failover logic in a dedicated network tier. This page is part of Cross-Partition Querying & Aggregation Strategies and covers when and how to operate a proxy layer, contrasting it with application-level sharding logic where the client owns routing decisions.

Problem Framing

At modest scale a single write primary handles routing implicitly. Once you cross roughly 10 M rows per table or 5 k concurrent connections, the routing problem becomes operational: you need topology-aware connection pooling, zero-downtime shard rebalancing, and a single place to enforce read/write splitting — without redeploying application code every time a shard moves.

The three scenarios where a proxy tier consistently pays for itself:

Dynamic topology changes. Adding or retiring shards without forcing client-side config pushes.
Heterogeneous engine mix. Routing writes to PostgreSQL primaries and reads to read replicas or analytics nodes behind a single connection string.
Centralised observability. Capturing per-shard latency, error rates, and connection saturation metrics without instrumenting every service.

Architecture Overview

The diagram below shows the proxy tier as a stateless layer between app servers and sharded backends. Read replicas receive analytical traffic via a separate backend pool; the admin API allows hot-loading of routing maps without restarting the proxy.

Proxy Tool Comparison

Not all proxies offer the same capabilities. Choosing the wrong one for your query patterns is the most common architectural mistake.

Proxy	Protocol awareness	SQL-aware routing	Connection pooling	Hot config reload
HAProxy	TCP only	No (needs Lua)	No	Yes (via socket)
PgBouncer	PostgreSQL	No	Yes	Partial (SIGHUP)
ProxySQL	MySQL	Yes (query rules)	Yes	Yes (admin port)
Pgpool-II	PostgreSQL	Yes (read/write split)	Yes	Yes
Odyssey	PostgreSQL	Limited	Yes	Yes

Implementation Walkthrough

Step 1 — Configure the Connection Pool

Stateless proxy tiers must be tuned to backend capacity first. Under-provisioned pools cause connection storms; over-provisioned ones exhaust backend file descriptors. The following PgBouncer configuration targets a 3-shard cluster with 25 app-side connections per shard:

[databases]
shard1 = host=10.0.1.10 port=5432 dbname=app_db
shard2 = host=10.0.1.11 port=5432 dbname=app_db
shard3 = host=10.0.1.12 port=5432 dbname=app_db

[pgbouncer]
pool_mode = transaction
max_client_conn = 500
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3
server_idle_timeout = 600
server_lifetime = 3600
tcp_keepalive = 1
tcp_keepidle = 60
tcp_keepintvl = 10
log_connections = 0
log_disconnections = 0

The reserve_pool_size = 5 entries absorb burst traffic without opening net-new backend connections under load.

Step 2 — Define Read/Write Routing Rules (ProxySQL)

For MySQL-backed sharded clusters, ProxySQL’s query rules engine routes at the SQL level without requiring application changes. Insert rules via the admin port:

-- Route writes to hostgroup 0 (primaries)
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES (1, 1, '^(INSERT|UPDATE|DELETE|REPLACE)', 0, 1);

-- Route reads to hostgroup 1 (replicas)
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES (2, 1, '^SELECT', 1, 1);

-- Reload the rules without restarting
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;

ProxySQL’s match_pattern field accepts POSIX regular expressions, so tenant-scoped routing (WHERE tenant_id = 'acme') can route to a dedicated hostgroup for that tenant’s primary.

Step 3 — Inject a Shard Routing Map via the Admin API

Shard-aware middleware parses incoming SQL payloads to extract partition keys before forwarding. When a new shard goes live, push an updated routing map without restarting the proxy:

curl -X POST http://proxy-admin:8080/api/v1/routing-map \
  -H 'Content-Type: application/json' \
  -d '{
    "partition_key_range": "1000-2000",
    "target_backend": "shard_3",
    "weight": 100,
    "ttl": 3600
  }'

Hot-reload sequence to avoid connection drops:

Validate the new partition key range against the existing distribution — overlap causes duplicate-write bugs.
Push the updated routing map via the administrative API.
Wait for health checks on the new target node to reach UP status (typically 3 consecutive passing checks).
Drain legacy connections gracefully: set old route weight to 0, wait for active connection count to reach zero, then remove the entry.

Step 4 — Configure HAProxy for TCP-Level Shard Routing

When SQL parsing is not needed — for example, when the application pre-selects a port per shard — HAProxy in TCP mode provides low-overhead connection routing with robust health checking:

global
    maxconn 50000
    log /dev/log local0

defaults
    mode tcp
    timeout connect 5s
    timeout client 60s
    timeout server 60s
    option tcplog

frontend db_proxy
    bind *:5432
    mode tcp
    use_backend db_shard_1 if { dst_port 5432 }
    use_backend db_shard_2 if { dst_port 5433 }
    default_backend db_shard_1

backend db_shard_1
    mode tcp
    balance leastconn
    option tcp-check
    default-server inter 5s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1a 10.0.1.10:5432 check
    server node1b 10.0.1.13:5432 check backup

backend db_shard_2
    mode tcp
    balance leastconn
    option tcp-check
    default-server inter 5s fall 3 rise 2 on-marked-down shutdown-sessions
    server node2a 10.0.1.11:5432 check
    server node2b 10.0.1.14:5432 check backup

The on-marked-down shutdown-sessions directive closes existing connections to a failed backend immediately, preventing the proxy from silently routing to a dead node.

Step 5 — Instrument the ORM Connection String

Application servers point at the proxy tier, not individual shards. Configure SQLAlchemy (or any ORM) to treat the proxy as the single database endpoint:

# SQLAlchemy connection pool targeting PgBouncer in transaction mode
DATABASE_URL: "postgresql://app_user:[email protected]:6432/app_db"
POOL_SIZE: 5
MAX_OVERFLOW: 2
POOL_TIMEOUT: 30
POOL_RECYCLE: 1800
POOL_PRE_PING: true

POOL_SIZE must be well below default_pool_size in PgBouncer. If every application process opens 25 connections and you have 20 processes, PgBouncer’s max_client_conn needs headroom above 500 or connections queue.

Configuration Reference

Parameter	Recommended value	Rationale
`pool_mode` (PgBouncer)	`transaction`	Multiplexes connections per statement; safe for stateless apps
`default_pool_size`	25 per shard	Matches typical PostgreSQL `max_connections / num_shards` target
`reserve_pool_size`	5	Absorbs burst without exhausting backend file descriptors
`server_lifetime`	3600 s	Prevents idle connections from accumulating on the backend
`tcp_keepidle`	60 s	Detects dead connections before they cause silent query hangs
`tcp_keepintvl`	10 s	Rapid re-probe once idle threshold is hit
HAProxy `inter`	5 s	Health check interval — balance sensitivity vs. check overhead
HAProxy `fall`	3	Requires 3 consecutive failures before marking a backend down
HAProxy `rise`	2	Requires 2 consecutive passes before restoring a recovered node

Operational Contrast

Compared to application-level sharding logic, where the client owns the routing decision and holds direct connections to each shard, a proxy tier trades a small latency cost (0.5–2 ms) for:

Centralised pool governance — one place to tune max_connections rather than every service instance.
Zero-downtime topology changes — routing map updates propagate from the proxy outward, not from a config push to dozens of application pods.
Cross-engine compatibility — PgBouncer can sit in front of Postgres; ProxySQL in front of MySQL; HAProxy in front of either, using the same operational runbook.

Federated query execution handles the harder case where a single query must aggregate results from multiple shards. The proxy does not merge results — it routes to the shard that owns the data, or to a dedicated aggregation node for cross-partition queries. Treat cross-shard aggregation patterns as the upstream concern; proxy routing as the connection management layer beneath it.

Failure Modes

1. Connection Storm on Proxy Restart

Root cause: All application connections attempt reconnection simultaneously after a proxy restart or failover, exhausting the backend’s max_connections within seconds.

Detection:

rate(pgbouncer_client_wait_seconds_sum[1m]) / rate(pgbouncer_client_wait_seconds_count[1m]) > 2

Mitigation: Enable server_reset_query_always = 1 in PgBouncer and stagger restart timing using haproxy -sf $(pidof haproxy) for in-place reload. Add exponential backoff with jitter in the ORM reconnect logic.

2. Stale Routing Map After Shard Rebalancing

Root cause: The proxy’s in-memory routing table still references the old shard for a key range that has been migrated. Writes land on the previous owner and are silently accepted, causing a dual-write state.

Detection:

increase(proxy_routing_map_staleness_total[5m]) > 0

Mitigation: Enforce a two-phase routing update: first mark the old route as read-only (writes rejected), migrate the data, then promote the new route. Never delete the old route entry before migration completes.

3. Read Replica Lag Causing Dirty Reads

Root cause: The proxy routes a read immediately after a write to the replica pool. The replica has not yet applied the write, returning stale data. This is especially pronounced with PgBouncer in session mode when synchronous_commit = off.

Detection:

pg_replication_lag_seconds{role="replica"} > 5

Mitigation: For session-critical reads, use SET SESSION CHARACTERISTICS AS TRANSACTION READ WRITE to force routing to the primary, or implement sticky routing that pins a session to the primary for a configurable window after any write.

4. Health Check False Positives Cycling Backends

Root cause: A busy backend briefly exceeds the TCP connection timeout during a traffic spike, triggering the fall threshold. The proxy marks it down, shifts load to remaining backends, which then also spike — cascading failure.

Detection:

increase(haproxy_backend_status{state="DOWN"}[5m]) > 2

Mitigation: Increase inter from 2 s to 5 s and fall from 2 to 3. Add slowstart 30s to HAProxy backends so recovering nodes receive a linearly increasing share of traffic rather than an immediate full load.

Common Mistakes

Routing by full table scan instead of partition key. Proxies that broadcast a query to all shards when the partition key is absent in the WHERE clause create O(n) load. Enforce partition key extraction before dispatch; reject or alert on table-scan-class queries.
No POOL_PRE_PING on the ORM side. Without a pre-ping, the ORM reuses stale connections after a proxy restart, leading to cryptic SSL SYSCALL error: EOF detected errors. Enable POOL_PRE_PING = true in SQLAlchemy or equivalent in your ORM.
Running PgBouncer in session mode for OLTP workloads. Session mode holds a backend connection for the duration of a client session, negating connection multiplexing. Use transaction mode for stateless microservices.
Missing backup server entries in HAProxy. Without check backup on standby nodes, a failed primary takes down the entire backend pool rather than failing over to the replica.

FAQ

When should I use a proxy instead of application-level routing?

Use a proxy when you need centralised connection pooling, dynamic topology updates without client redeployment, or unified observability across heterogeneous database engines. Application-level routing is preferable when you need sub-millisecond routing decisions or when the application already owns the partition key extraction logic and direct shard access simplifies the operational surface.

How do proxies handle queries that span multiple shards?

Most proxies cannot merge results from multiple backends. HAProxy and PgBouncer route a connection to exactly one backend per session. ProxySQL has limited fan-out via its multiplex option, but the merge step still requires an upstream coordinator. For true multi-shard aggregation, route the query to a federated query engine and treat the proxy as the connection pool in front of it.

What is the realistic latency overhead of the proxy tier?

Expect 0.5–2 ms per round trip, covering TLS termination, connection pool lookup, and routing table evaluation. This is typically offset by reduced backend connection overhead — a single proxy with 25 backend connections can front hundreds of client connections, lowering the total PostgreSQL max_connections pressure significantly.

Cross-Partition Querying & Aggregation Strategies — parent overview covering the full query execution landscape
Application-Level Sharding Logic — client-owned routing as an alternative to the proxy tier
Federated Query Execution — how multi-shard queries are coordinated upstream of the proxy
Cross-Shard Aggregation Patterns — scatter-gather and coordinator placement for aggregation workloads
Building a Custom Query Router with HAProxy — step-by-step HAProxy configuration with Lua-based SQL inspection

Articles in This Section