How to Optimize PostgreSQL for Production in 2026

Introduction

PostgreSQL, the leading open-source RDBMS, shines in production thanks to its robustness and advanced features. In 2026, with data volumes exploding from IoT, AI, and logs, optimization is critical to avoid bottlenecks. This expert tutorial walks you step-by-step through temporal partitioning, GIN/BRIN indexes for JSONB, full-text search, materialized views, and query-plan analysis with EXPLAIN ANALYZE. Imagine slashing query times by 90% on tables with millions of rows, like tuning a V12 engine. Every step includes complete, ready-to-run SQL for PostgreSQL 17+. By the end, your database will be ready for horizontal scaling and intensive workloads.

Prerequisites

  • PostgreSQL 17+ installed (Docker: docker run -p 5432:5432 -e POSTGRES_PASSWORD=pass postgres:17)
  • SQL client: psql or pgAdmin
  • Advanced SQL knowledge (complex JOINs, CTEs)
  • Test dataset: 1M+ simulated rows
  • Extensions enabled: CREATE EXTENSION IF NOT EXISTS pg_trgm; (optional)

Initialize the database and create a partitioned table

01_init_partitions.sql
CREATE DATABASE perf_test;
\c perf_test

-- Parent table partitioned by range (date)
CREATE TABLE sales (
  id SERIAL,
  sale_date DATE NOT NULL,
  amount DECIMAL(10,2),
  customer_id INT,
  metadata JSONB,
  PRIMARY KEY (id, sale_date)
) PARTITION BY RANGE (sale_date);

-- Partitions for 2025-2026
CREATE TABLE sales_2025 PARTITION OF sales
  FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
CREATE TABLE sales_2026 PARTITION OF sales
  FOR VALUES FROM ('2026-01-01') TO ('2027-01-01');

-- Insert 100k test rows (simulated)
INSERT INTO sales (sale_date, amount, customer_id, metadata)
SELECT
  ('2025-01-01'::DATE + (random() * 365)::INT) AS sale_date,
  random() * 1000 AS amount,
  (random() * 1000)::INT AS customer_id,
  jsonb_build_object('category', 'tech', 'tags', array['laptop', 'sale']) AS metadata
FROM generate_series(1, 100000);

This script creates a perf_test DB and a sales table partitioned by sale_date, so date-filtered queries scan only the relevant partition. The 100k test rows simulate a realistic workload with JSONB. Pitfall: a primary key on a partitioned table must include the partition key, hence PRIMARY KEY (id, sale_date); it also means id alone is not enforced unique across partitions.
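
As a quick sanity check, a primary key that omits the partition key is rejected outright. A minimal illustration (hypothetical bad_sales table):

-- Rejected: unique constraints on a partitioned table must include all partition key columns
CREATE TABLE bad_sales (
  id SERIAL PRIMARY KEY,  -- sale_date is missing from the PK
  sale_date DATE NOT NULL
) PARTITION BY RANGE (sale_date);
-- ERROR: unique constraint on partitioned table must include all partitioning columns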

Why and how partitioning boosts performance

Partitions physically divide data like organized drawers in a giant desk. For date-filtered queries (the vast majority of production cases), partition pruning skips irrelevant partitions entirely, slashing I/O. Pro tip: maintenance gets cheaper too, since an old partition can be detached and dropped without a global lock. Test pruning with EXPLAIN SELECT * FROM sales WHERE sale_date BETWEEN '2025-06-01' AND '2025-12-31'; – only sales_2025 should appear in the plan (note that an open-ended >= '2025-06-01' would still hit sales_2026 as well).
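
Retiring old data illustrates the maintenance win. A minimal sketch, assuming sales_2025 is no longer needed:

-- DETACH ... CONCURRENTLY (PostgreSQL 14+) avoids blocking concurrent queries;
-- it must run outside a transaction block
ALTER TABLE sales DETACH PARTITION sales_2025 CONCURRENTLY;
DROP TABLE sales_2025;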

Advanced indexes: GIN for JSONB and BRIN for dates

02_indexes.sql
CREATE INDEX idx_sales_date_brin ON sales USING BRIN (sale_date);
CREATE INDEX idx_sales_amount_btree ON sales (amount) WHERE amount > 500;
CREATE INDEX idx_sales_metadata_gin ON sales USING GIN (metadata);
CREATE INDEX idx_sales_category_trgm ON sales USING GIN ((metadata->>'category') gin_trgm_ops); -- Requires pg_trgm; trigram ops work on text, not on integer columns like customer_id

-- ANALYZE for accurate stats
ANALYZE sales;

-- Vacuum to reclaim space
VACUUM ANALYZE sales_2025;

BRIN is ideal for naturally ordered data like dates: it stores one tiny summary per block range, making it orders of magnitude smaller than a B-tree. GIN excels on JSONB operators such as ?| and @>. The partial index only covers rows with amount > 500, keeping it compact. pg_trgm enables trigram similarity searches on text. Pitfall: over-indexing bloats storage and slows writes; monitor usage with pg_stat_user_indexes.
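
To exercise the GIN index and spot dead weight, queries along these lines help (illustrative):

-- Containment and key-existence predicates can use the GIN index on metadata
SELECT count(*) FROM sales WHERE metadata @> '{"category": "tech"}';
SELECT count(*) FROM sales WHERE metadata ?| array['category', 'brand'];

-- Indexes never scanned since the last stats reset are removal candidates
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0;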

Strategic index choices for varied workloads

  • BRIN: correlated/increasing columns (dates, sequential IDs), very low overhead.
  • GIN: arrays, JSONB, full-text; heavier to maintain but fast on containment.
  • B-tree: the standard choice for equality and range predicates.

Measure impact with EXPLAIN (ANALYZE, BUFFERS) before and after each index. In production, partial indexes can cut index size by 50% for filtered queries.
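
A minimal before/after measurement, assuming the indexes from 02_indexes.sql:

-- Run once before and once after creating an index, then compare
-- actual time and the shared buffers hit/read counts
EXPLAIN (ANALYZE, BUFFERS)
SELECT sale_date, amount
FROM sales
WHERE amount > 500
  AND sale_date BETWEEN '2025-06-01' AND '2025-12-31';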

Full-Text Search with ts_vector and ranking

03_fts.sql
-- Add FTS columns
ALTER TABLE sales ADD COLUMN tsv tsvector;
CREATE INDEX idx_sales_tsv_gin ON sales USING GIN(tsv);

-- Trigger for auto-update (french config for French)
CREATE OR REPLACE FUNCTION sales_fts_trigger() RETURNS trigger AS $$
BEGIN
  NEW.tsv := to_tsvector('french', coalesce(metadata::text, ''));
  RETURN NEW;
END$$ LANGUAGE plpgsql;

CREATE TRIGGER sales_fts_update BEFORE INSERT OR UPDATE ON sales
  FOR EACH ROW EXECUTE FUNCTION sales_fts_trigger();

-- Backfill existing rows: this no-op update fires the trigger, which populates tsv
UPDATE sales SET metadata = metadata WHERE id <= 100000;

-- FTS query with ranking
SELECT id, sale_date, ts_rank(tsv, query) AS rank
FROM sales, plainto_tsquery('french', 'laptop vente') query
WHERE tsv @@ query
ORDER BY rank DESC
LIMIT 10;

The tsvector column indexes the text (French stemming via the 'french' configuration), and the trigger keeps it in sync on every write. ts_rank orders results by relevance, and the GIN index accelerates the @@ operator. Pitfall: without the trigger, tsv goes stale; also test stemming behavior with accented French terms.
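
On PostgreSQL 12+, a stored generated column is a lighter alternative to the trigger; a minimal sketch (hypothetical tsv_gen column):

-- Recomputed automatically on every INSERT/UPDATE; no trigger or backfill needed
ALTER TABLE sales
  ADD COLUMN tsv_gen tsvector
  GENERATED ALWAYS AS (to_tsvector('french', coalesce(metadata::text, ''))) STORED;
CREATE INDEX idx_sales_tsv_gen ON sales USING GIN (tsv_gen);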

Full-Text Search: Beyond LIKE for relevance

Replace slow, non-stemming LIKE '%laptop%' with FTS: it adds ranking, stemming, and per-field weights. Analogy: Google versus a raw Ctrl+F. For fuzzy matching at scale, combine it with pg_trgm or pg_bigm.
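
A side-by-side on the test data (illustrative):

-- A leading-wildcard LIKE cannot use a B-tree index: sequential scan every time
SELECT count(*) FROM sales WHERE metadata::text LIKE '%laptop%';

-- The FTS version uses the GIN index on tsv and applies French stemming
SELECT count(*) FROM sales WHERE tsv @@ plainto_tsquery('french', 'laptop');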

Materialized views for fast analytics

04_materialized.sql
-- Materialized view: Monthly revenue
CREATE MATERIALIZED VIEW sales_monthly_mv AS
SELECT
  DATE_TRUNC('month', sale_date) AS month,
  SUM(amount) AS total,
  COUNT(*) AS count,
  AVG(amount) AS avg
FROM sales
GROUP BY month;

CREATE UNIQUE INDEX idx_mv_month ON sales_monthly_mv (month);

-- Fast query
SELECT * FROM sales_monthly_mv WHERE month >= '2025-06-01';

-- Periodic refresh (cron/job)
REFRESH MATERIALIZED VIEW CONCURRENTLY sales_monthly_mv;

Matviews precompute aggregates, easily 100x faster than aggregating live. CONCURRENTLY avoids blocking readers during the refresh; it requires a unique index, hence idx_mv_month. Pitfall: refreshes are costly, so schedule them off-peak, for example via pg_cron (note that cron specs need five fields, e.g. '0 2 * * *' for 2 a.m. daily, not '0 2 *').
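
Assuming the pg_cron extension is installed (it must be listed in shared_preload_libraries), a nightly job could look like this sketch:

-- Runs daily at 02:00; the job name 'refresh_sales_monthly' is arbitrary
SELECT cron.schedule(
  'refresh_sales_monthly',
  '0 2 * * *',
  'REFRESH MATERIALIZED VIEW CONCURRENTLY sales_monthly_mv'
);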

Window functions and CTEs for expert analytics queries

05_window_cte.sql
WITH ranked_sales AS (
  SELECT
    sale_date,
    amount,
    customer_id,
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY sale_date DESC) as rn,
    LAG(amount) OVER (PARTITION BY customer_id ORDER BY sale_date) as prev_amount
  FROM sales
  WHERE sale_date >= '2025-01-01'
)
SELECT
  customer_id,
  SUM(amount) FILTER (WHERE rn = 1) as latest_amount,  -- each customer's most recent sale
  AVG(amount - COALESCE(prev_amount, 0)) as growth
FROM ranked_sales
GROUP BY customer_id
HAVING COUNT(*) > 5
ORDER BY growth DESC
LIMIT 20;

-- To validate the plan, prefix the entire query above with:
--   EXPLAIN (ANALYZE, BUFFERS)

ROW_NUMBER and LAG rank and compare rows per customer without correlated subqueries. FILTER restricts an aggregate to matching rows. The CTE keeps the windowed result reusable. EXPLAIN shows whether the planner chose a sequential or an index scan. Pitfall: omitting PARTITION BY makes each window span the whole table and tanks performance; the BUFFERS option reveals cache hits versus disk reads.

EXPLAIN Analysis: The Bottleneck Detector

EXPLAIN (ANALYZE, BUFFERS, VERBOSE) dissects plans: a Seq Scan over a large table is a red flag; an Index Only Scan is the ideal. The cost figures are planner estimates of CPU and I/O; actual time measures reality, and a large gap between the two usually means stale statistics. Tip: SET work_mem = '256MB'; lets big sorts and hashes run in memory instead of spilling to disk.
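
A minimal session sketch; on large sorts, watch the Sort Method line flip from 'external merge' (disk) to 'quicksort' (memory) after the bump:

EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, SUM(amount) AS total
FROM sales
GROUP BY customer_id
ORDER BY total DESC;

-- Raise per-operation memory for this session only, then re-run the EXPLAIN
SET work_mem = '256MB';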

Essential Best Practices

  • Autovacuum tuning: autovacuum_vacuum_scale_factor = 0.05 for active tables.
  • Connection pooling: PgBouncer or Pgpool-II (max 100 connections).
  • Monitoring: pg_stat_statements + Prometheus exporter.
  • Backups: pg_basebackup + WAL (point-in-time recovery).
  • Prod config: shared_buffers ≈ 25% of RAM, effective_cache_size ≈ 75% of RAM, and work_mem sized per sort/hash operation so that work_mem × expected concurrent operations fits in the remaining RAM (see the sketch below).
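
One hedged starting point for a dedicated 64 GB server (values are illustrative, not prescriptive):

-- ALTER SYSTEM writes postgresql.auto.conf; shared_buffers needs a restart to apply
ALTER SYSTEM SET shared_buffers = '16GB';        -- ~25% of RAM
ALTER SYSTEM SET effective_cache_size = '48GB';  -- planner hint, not an allocation
ALTER SYSTEM SET work_mem = '64MB';              -- per sort/hash operation, not per connection
SELECT pg_reload_conf();                         -- applies the reloadable settings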

Common Mistakes to Avoid

  • Skipping ANALYZE: Stale stats → bad execution plans.
  • Indexing low-cardinality columns (<10% distinct): Useless overhead.
  • Forgetting CONCURRENTLY on matview refreshes and index builds: exclusive locks block production traffic (see the sketch after this list).
  • Unindexed JSONB: Sequential scans on TB-scale data.
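
For reference, a non-blocking index build. Note that CREATE INDEX CONCURRENTLY works on individual partitions but not on the partitioned parent itself:

-- Builds without taking a write-blocking lock; run it per partition
CREATE INDEX CONCURRENTLY idx_sales_2026_customer ON sales_2026 (customer_id);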

Next Steps

Dive into advanced PostgreSQL training at Learni. Resources: the PostgreSQL 17 docs on partitioning and EXPLAIN, Citus for sharding, and the open-source pgBadger for log analysis.