Introduction
In 2026, PostgreSQL remains the top choice for scalable applications thanks to its robustness and advanced features. But in production, default settings often lead to bottlenecks: slow queries, excessive memory use, and limited scalability. This expert tutorial guides you step by step through optimizing a PostgreSQL 16+ database: server parameter tuning, table partitioning, specialized indexes (GIN for JSONB, BRIN for time-series data), advanced window functions, and precise analysis with EXPLAIN ANALYZE.
Why it matters: successful tuning can deliver order-of-magnitude speedups on million-row datasets, cut cloud costs, and support high availability. We provide complete, functional SQL scripts ready to copy-paste into psql. Ideal for DB architects managing mixed OLTP/OLAP workloads. Ready to supercharge your PostgreSQL?
Prerequisites
- PostgreSQL 16+ installed and running (via Docker or native).
- psql client accessible.
- Advanced SQL knowledge (complex JOINs, CTEs).
- Server with at least 4 GB RAM for testing.
- Optional: pgAdmin or DBeaver for visualization.
Initial Tuning of postgresql.conf
shared_buffers = 256MB
work_mem = 64MB
maintenance_work_mem = 256MB
effective_cache_size = 768MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
max_wal_size = 2GB
min_wal_size = 1GB
max_connections = 100
shared_preload_libraries = 'pg_stat_statements'

These settings optimize memory and checkpoints for a mid-sized production server (4-8 GB RAM). Set shared_buffers to roughly 25% of physical RAM; work_mem is allocated per sort or hash operation, so a moderate per-connection value prevents disk spills without exhausting memory. Most of these parameters apply after a reload, but shared_buffers and shared_preload_libraries require a full restart (pg_ctl restart).
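As an alternative to editing postgresql.conf by hand, the same parameters can be set from psql. A minimal sketch using ALTER SYSTEM, with the values from the listing above:

-- Writes to postgresql.auto.conf, which overrides postgresql.conf
ALTER SYSTEM SET work_mem = '64MB';              -- applied on reload
ALTER SYSTEM SET random_page_cost = 1.1;         -- applied on reload
ALTER SYSTEM SET shared_buffers = '256MB';       -- requires a full restart
SELECT pg_reload_conf();                         -- apply the reloadable changes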
Why Tune the Configuration?
Like a car engine, PostgreSQL needs fine-tuning to perform at its best. shared_buffers caches hot data in RAM, often cutting disk I/O dramatically on read-heavy workloads. work_mem governs memory-intensive operations (GROUP BY, ORDER BY, hash joins); too high a value, multiplied across connections, causes OOM kills. Inspect live values via the pg_settings view and adjust with rules of thumb such as effective_cache_size ≈ 75% of total RAM. Enable pg_stat_statements to monitor slow queries, as shown below.
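A quick way to verify the live values and surface the heaviest statements, assuming pg_stat_statements is preloaded as configured above:

-- Current values of key memory parameters
SELECT name, setting, unit FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size');
-- Top 5 statements by cumulative execution time
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
SELECT query, calls, round(total_exec_time::numeric, 1) AS total_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;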
Creating the Database and Partitioned Table
CREATE DATABASE prod_opti;
\c prod_opti
CREATE TABLE sales (
id SERIAL,
region TEXT NOT NULL,
sale_date DATE NOT NULL,
amount NUMERIC(10,2),
metadata JSONB
) PARTITION BY RANGE (sale_date);
CREATE TABLE sales_2024_01 PARTITION OF sales
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE sales_2024_02 PARTITION OF sales
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
CREATE INDEX idx_sales_date ON sales (sale_date);
CREATE INDEX idx_sales_region ON sales (region);

RANGE partitioning on sale_date splits the table into physical sub-tables, so queries on recent dates only touch the relevant partitions (automatic pruning). JSONB holds flexible metadata. Standard B-tree indexes on date/region cover equality and range predicates; declared on the partitioned parent, they cascade to every partition. Run in psql for a ready database.
Partitioning: Scaling Big Volumes
Think of a highway: without exits (partitions), traffic jams everywhere. PostgreSQL prunes irrelevant partitions at plan or execution time, often slashing scan times by an order of magnitude over a year of data. Add partitions ahead of the data via a scheduled (cron) script, as sketched below. There is no hard cap on partition count, but planner overhead grows with it, so keep it in the low thousands; use HASH partitioning for even distribution when RANGE doesn't fit.
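A minimal sketch of that scheduled partition creation, using this tutorial's sales table; run it monthly (e.g., from cron) to stay ahead of incoming data:

-- Create next month's partition if it does not exist yet
DO $$
DECLARE
start_date DATE := date_trunc('month', CURRENT_DATE + INTERVAL '1 month')::DATE;
end_date DATE := (start_date + INTERVAL '1 month')::DATE;
part_name TEXT := 'sales_' || to_char(start_date, 'YYYY_MM');
BEGIN
EXECUTE format(
'CREATE TABLE IF NOT EXISTS %I PARTITION OF sales FOR VALUES FROM (%L) TO (%L)',
part_name, start_date, end_date
);
END $$;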
GIN Index for JSONB and Bulk Inserts
CREATE INDEX idx_sales_metadata_gin ON sales USING GIN (metadata);
-- Insert 10k test rows
DO $$
DECLARE
i INTEGER := 0;
BEGIN
WHILE i < 10000 LOOP
INSERT INTO sales (region, sale_date, amount, metadata)
VALUES (
(ARRAY['EU', 'US', 'ASIA'])[1 + (i % 3)],
('2024-01-' || LPAD((1 + (i % 31))::TEXT, 2, '0'))::DATE,
(1000 + (i % 1000))::NUMERIC(10,2),
jsonb_build_object('user_id', i, 'tags', ARRAY['sale', 'promo'] || CASE WHEN i % 2 = 0 THEN ARRAY['vip'] ELSE ARRAY[]::TEXT[] END)
);
i := i + 1;
END LOOP;
END $$;
VACUUM ANALYZE sales;

The GIN index on JSONB enables fast containment searches like metadata @> '{"tags": ["vip"]}'. The DO block inserts 10k realistic rows with some variety (note the parentheses around the array constructor and the date concatenation, both required for correct parsing). VACUUM ANALYZE updates statistics for the planner. For these lookups, GIN is typically orders of magnitude faster than scanning serialized JSON with LIKE.
Specialized Indexes: GIN and BRIN
GIN shines on unordered structures (arrays, JSONB) thanks to its inverted-index design. For naturally ordered time-series data (like sales), add BRIN, as the next section does: it stores min/max summaries per block range (128 pages by default), is perfect for append-mostly sequential data, and is often 10-100x smaller than a B-tree.
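Once the BRIN index from the next section exists, you can measure the gap yourself. Keep in mind that indexes declared on the partitioned parent have no storage of their own, so measure the child indexes on a partition:

-- Sizes of every index on one partition (child indexes hold the data)
SELECT indexrelid::regclass AS index_name,
pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_index
WHERE indrelid = 'sales_2024_01'::regclass;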
BRIN Index and Optimized Window Query
CREATE INDEX idx_sales_brin_date ON sales USING BRIN (sale_date);
-- Advanced query: top sales per region with window ranking
SELECT
region,
sale_date,
amount,
metadata->>'user_id' AS user_id,
RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank,
SUM(amount) OVER (PARTITION BY region) AS total_region
FROM sales
WHERE sale_date >= '2024-01-01'
AND metadata @> '{"tags": ["vip"]}'
ORDER BY region, rank
LIMIT 10;

BRIN accelerates the WHERE on naturally ordered dates. Window functions (RANK, SUM OVER) compute rankings and totals without costly subqueries, and @> leverages the GIN index for the JSONB filter. Note that CREATE INDEX CONCURRENTLY, which avoids blocking writes in production, is not supported on partitioned tables; to avoid locks here, build the index concurrently on each partition and attach it to a parent index instead.
Complex Queries and EXPLAIN ANALYZE
Window functions avoid self-join rewrites that can degrade toward quadratic work; the shape they replace is sketched below. Prefix queries with EXPLAIN (ANALYZE, BUFFERS) to get real timings (beware: ANALYZE actually executes the statement). Look for Seq Scan on large tables (usually bad), Index Scan or Bitmap Index Scan (good), and suspiciously high cost estimates. The shared hit/read buffer counts show cache efficiency.
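For contrast, here is the correlated-subquery shape that RANK() replaces: each output row rescans sales, which is the quadratic pattern to avoid (a sketch on this tutorial's table):

-- Anti-pattern: per-row rescan instead of one windowed pass
SELECT s.region, s.amount,
(SELECT COUNT(*) + 1 FROM sales s2
WHERE s2.region = s.region AND s2.amount > s.amount) AS rank
FROM sales s
WHERE s.sale_date >= '2024-01-01';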
Performance Analysis with EXPLAIN and Triggers
EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM sales WHERE metadata @> '{"tags": ["vip"]}';
-- Trigger for automatic auditing
CREATE TABLE sales_audit (id INT, operation TEXT, old_amount NUMERIC, new_amount NUMERIC, changed_at TIMESTAMP);
CREATE OR REPLACE FUNCTION audit_sales()
RETURNS TRIGGER AS $$
BEGIN
-- OLD is not available on INSERT, nor NEW on DELETE, so branch on TG_OP
IF TG_OP = 'INSERT' THEN
INSERT INTO sales_audit VALUES (NEW.id, TG_OP, NULL, NEW.amount, NOW());
RETURN NEW;
ELSIF TG_OP = 'UPDATE' THEN
INSERT INTO sales_audit VALUES (NEW.id, TG_OP, OLD.amount, NEW.amount, NOW());
RETURN NEW;
ELSE
INSERT INTO sales_audit VALUES (OLD.id, TG_OP, OLD.amount, NULL, NOW());
RETURN OLD;
END IF;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_audit_sales
AFTER INSERT OR UPDATE OR DELETE ON sales
FOR EACH ROW EXECUTE FUNCTION audit_sales();

EXPLAIN confirms whether GIN/BRIN is used (Bitmap Index Scan or Index Scan instead of Seq Scan). The PL/pgSQL trigger audits every change without application logic; FOR EACH ROW gives per-row granularity. sales_audit is created first so the trigger has somewhere to write. Row-level triggers roughly double write volume on hot tables, so keep the audit table lean.
Triggers and PL/pgSQL for Business Logic
PL/pgSQL is a full procedural language: loops, conditions, exceptions. Triggers enforce rules (auditing, deduplication) at the database level, so they hold even when application code is bypassed or crashes. Avoid recursive triggers, where a trigger's own writes re-fire it until the nesting limit is hit; one guard is sketched below.
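One common guard, sketched against the audit trigger above: pg_trigger_depth() reports the current trigger nesting level, so a WHEN clause can keep a trigger from re-firing through its own writes:

-- Recreate the trigger so it fires only at the top nesting level
DROP TRIGGER IF EXISTS trg_audit_sales ON sales;
CREATE TRIGGER trg_audit_sales
AFTER INSERT OR UPDATE OR DELETE ON sales
FOR EACH ROW
WHEN (pg_trigger_depth() < 1)
EXECUTE FUNCTION audit_sales();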
Custom Function and Performance Cleanup
CREATE OR REPLACE FUNCTION get_top_sales(p_region TEXT, p_date DATE)
RETURNS TABLE (user_id TEXT, amount NUMERIC, rank BIGINT) AS $$
BEGIN
RETURN QUERY
SELECT
s.metadata->>'user_id',
s.amount,
ROW_NUMBER() OVER (ORDER BY s.amount DESC)
FROM sales s
WHERE s.region = p_region AND s.sale_date = p_date
AND s.metadata ? 'tags'; -- key must exist
END;
$$ LANGUAGE plpgsql STABLE;
-- Performance cleanup
SELECT pg_stat_reset(); -- Reset statistics counters
ANALYZE VERBOSE sales; -- Detailed planner statistics
REINDEX INDEX CONCURRENTLY idx_sales_metadata_gin;

A parameterized, reusable function using ? (key-exists) on JSONB. Columns are qualified (s.amount) because the RETURNS TABLE parameters would otherwise collide with the table's column names, and rank is BIGINT to match ROW_NUMBER()'s return type. STABLE tells the planner the function has no side effects, enabling better optimization. pg_stat_reset gives clean counters for benchmarks; REINDEX CONCURRENTLY rebuilds without blocking writes. Call it with: SELECT * FROM get_top_sales('EU', '2024-01-15');
Best Practices
- Index selectively: drop indexes that pg_stat_user_indexes shows are rarely or never scanned (see the sketch after this list).
- Tune autovacuum so write-heavy tables are vacuumed aggressively (e.g., autovacuum_vacuum_scale_factor = 0.05 per table, sketched below).
- Monitor with pgBadger on logs for slow queries.
- Use connection pooling (PgBouncer) in front of the database and keep max_connections modest (e.g., 100).
- Incremental backups with pg_basebackup + WAL archiving.
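Sketches for the index-usage and autovacuum items above; the threshold and target table are illustrative:

-- Indexes that have never been scanned (removal candidates)
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
-- Autovacuum processes leaf partitions, so tune the hot partition itself
ALTER TABLE sales_2024_01 SET (autovacuum_vacuum_scale_factor = 0.05);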
Common Mistakes to Avoid
- Setting work_mem too high: each sort or hash can claim that much, causing swapping or OOM; override per session with SET when needed.
- Forget ANALYZE after bulk inserts: stale stats → bad plans.
- Indexes on low-cardinality columns: poor selectivity and wasted write overhead; prefer partial indexes (see the sketch after this list).
- Row-level triggers on hot tables: every write spawns an audit write, roughly doubling write volume; keep audit tables lean.
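A partial-index sketch for the low-cardinality point; the name idx_sales_vip is hypothetical, and the predicate must match the query's WHERE clause for the planner to use it:

-- Index only the VIP-tagged rows instead of every row
CREATE INDEX idx_sales_vip ON sales (sale_date)
WHERE metadata @> '{"tags": ["vip"]}';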
Next Steps
- Official docs: PostgreSQL 16 Performance.
- Tools: pgHero for dashboards.
- Extensions: TimescaleDB for time-series, PostGIS for geo.
- Learni Group Training on advanced PostgreSQL and DB architecture.