How to Optimize Elasticsearch for Production in 2026

18 min read · Advanced

Introduction

Elasticsearch has become the go-to solution for real-time search and analytics. In 2026, scalability and latency requirements demand precise control over cluster configuration, mappings, and indexing strategies. This tutorial walks you through setting up a robust production environment, from cluster configuration and index templates to bulk indexing, aggregations, and JVM heap monitoring. You will learn how to avoid common memory pitfalls and optimize performance on multi-terabyte datasets.

Prerequisites

  • Elasticsearch 8.15+ (the distribution bundles its own JDK; install Java 21+ separately only if you choose to override it)
  • Solid knowledge of Linux and YAML
  • Access to a cluster with at least 3 nodes
  • curl or the official Elasticsearch client

Cluster Configuration

elasticsearch.yml
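cluster.name: prod-cluster-2026
node.name: ${HOSTNAME}
# Binds to every interface; restrict this to a private address on hosts reachable from untrusted networks.
network.host: 0.0.0.0
# Static seed list of master-eligible nodes used for discovery.
discovery.seed_hosts: ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# Entries must match each node's node.name exactly; remove this setting once the cluster has formed.
cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
# The default is 10% of the heap; 30% favors indexing-heavy nodes.
indices.memory.index_buffer_size: 30%
thread_pool.write.queue_size: 1000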

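This file configures a three-node, high-availability cluster whose nodes find each other through a static seed list (multicast discovery no longer exists in Elasticsearch, so there is nothing to disable). The indexing buffer is raised from the default 10% of the heap to 30% for write-heavy workloads. Two details deserve attention. First, the names in cluster.initial_master_nodes must match each node's node.name, which here is the value of ${HOSTNAME}, and the setting should be removed once the cluster has formed for the first time. Second, thread_pool.write.queue_size: 1000 is lower than the default of 10000 in recent releases, so write requests are rejected sooner under load; keep it only if early back-pressure is what you want.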
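
The choice of three master-eligible nodes comes down to majority arithmetic. The sketch below is a simplified illustration: the function names are hypothetical, and Elasticsearch manages its voting configuration itself, so treat this as a sizing aid rather than a description of the election algorithm.

```python
def master_quorum(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - master_quorum(master_eligible)


for nodes in (1, 2, 3, 5):
    print(nodes, master_quorum(nodes), tolerated_failures(nodes))
```

Under this arithmetic, two nodes tolerate no failure at all, which is why three is the practical minimum for high availability.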

Advanced Index Template

index-template.json
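{
  "index_patterns": ["logs-2026-*"],
  "priority": 500,
  "template": {
    "settings": {
      "number_of_shards": 5,
      "number_of_replicas": 1,
      "refresh_interval": "30s"
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "message": { "type": "text", "analyzer": "standard" },
        "level": { "type": "keyword" }
      }
    }
  }
}

The template applies 5 primary shards, 1 replica, and a 30-second refresh interval to every new index matching logs-2026-*. Refreshing every 30 seconds instead of the default 1 second produces fewer, larger segments, which reduces merge and disk I/O; the trade-off is that new documents can take up to 30 seconds to become searchable. The body uses the composable format expected by PUT _index_template in Elasticsearch 8.x, with settings and mappings nested under template. The priority is set above 100 so that the built-in logs-*-* template, which also matches these index names, does not take precedence.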
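
As a rough sketch of how such a template could be registered from code, the snippet below builds the HTTP request without sending it. It assumes the composable template API (PUT _index_template), which nests settings and mappings under a template key; the template name logs-2026, the base URL, and the priority value are illustrative assumptions.

```python
import json
import urllib.request

# Composable template body; the priority value is an illustrative choice.
TEMPLATE = {
    "index_patterns": ["logs-2026-*"],
    "priority": 500,
    "template": {
        "settings": {
            "number_of_shards": 5,
            "number_of_replicas": 1,
            "refresh_interval": "30s",
        },
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "message": {"type": "text", "analyzer": "standard"},
                "level": {"type": "keyword"},
            }
        },
    },
}


def build_template_request(base_url: str, name: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT _index_template request."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_index_template/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical endpoint and template name, for illustration only.
request = build_template_request("http://localhost:9200", "logs-2026", TEMPLATE)
print(request.get_method(), request.full_url)
```

Sending is left to the caller, for example with urllib.request.urlopen(request), once authentication and TLS are configured for the cluster.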

Optimized Bulk Indexing

bulk-index.sh
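#!/bin/bash
# Assumes an unsecured local endpoint; use https:// and credentials when security is enabled.
# NDJSON body: each action line is followed by its document line, and the body must end with a newline.
curl -X POST "localhost:9200/_bulk?pretty" -H 'Content-Type: application/json' --data-binary @- << 'EOF'
{ "index" : { "_index" : "logs-2026-01" } }
{ "@timestamp": "2026-01-15T10:00:00Z", "message": "Critical error", "level": "ERROR" }
{ "index" : { "_index" : "logs-2026-01" } }
{ "@timestamp": "2026-01-15T10:00:01Z", "message": "Request processed", "level": "INFO" }
EOF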

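The example sends only two documents for readability. In production, group documents into batches; 1000 documents per request is a reasonable starting point, to be tuned by measurement against your document size and hardware. Batching amortizes per-request overhead and can improve throughput by up to 10x compared with indexing documents one at a time. Always check the errors field of the response: a bulk request can return HTTP 200 even when individual items failed.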
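
A minimal sketch of the batching step, assuming documents arrive as Python dictionaries. The function name bulk_bodies is hypothetical; it only builds the NDJSON request bodies and leaves sending them to the client of your choice.

```python
import json
from typing import Dict, Iterable, Iterator


def bulk_bodies(docs: Iterable[Dict], index: str, batch_size: int = 1000) -> Iterator[str]:
    """Yield NDJSON bodies for the _bulk API, at most batch_size documents each."""
    lines = []
    count = 0
    for doc in docs:
        # Action line first, then the document source on its own line.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
        count += 1
        if count == batch_size:
            yield "\n".join(lines) + "\n"  # the bulk API requires a trailing newline
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"


# Illustrative data: 2500 synthetic log documents.
sample_docs = (
    {"@timestamp": "2026-01-15T10:00:00Z", "message": f"event {i}", "level": "INFO"}
    for i in range(2500)
)
bodies = list(bulk_bodies(sample_docs, "logs-2026-01"))
print(len(bodies))
```

Because the function is a generator, only one batch is held in memory at a time, which matters when the input is a large file or stream.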

Query with Aggregations

advanced-search.json
{
  "size": 0,
  "query": { "range": { "@timestamp": { "gte": "now-1h" } } },
  "aggs": {
    "errors_per_minute": {
      "date_histogram": { "field": "@timestamp", "calendar_interval": "1m" },
      "aggs": { "error_count": { "filter": { "term": { "level": "ERROR" } } } }
    }
  }
}

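Setting size to 0 returns only the aggregation results, not the matching documents. The date_histogram groups the last hour of data into one-minute buckets, and the nested filter aggregation counts the ERROR-level documents in each bucket; the bucket's own doc_count is the total across all levels.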
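
To show how the result of this query might be consumed, here is a small sketch that flattens the buckets into rows. The response below is a hand-written illustration of the documented bucket layout, not output captured from a real cluster, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple


def errors_per_minute(response: Dict) -> List[Tuple[str, int, int]]:
    """Return (minute, total_docs, error_docs) for each histogram bucket."""
    buckets = response["aggregations"]["errors_per_minute"]["buckets"]
    return [
        (bucket["key_as_string"], bucket["doc_count"], bucket["error_count"]["doc_count"])
        for bucket in buckets
    ]


# Hand-written sample response, for illustration only.
sample_response = {
    "aggregations": {
        "errors_per_minute": {
            "buckets": [
                {
                    "key_as_string": "2026-01-15T10:00:00.000Z",
                    "key": 1768471200000,
                    "doc_count": 120,
                    "error_count": {"doc_count": 3},
                },
                {
                    "key_as_string": "2026-01-15T10:01:00.000Z",
                    "key": 1768471260000,
                    "doc_count": 95,
                    "error_count": {"doc_count": 0},
                },
            ]
        }
    }
}

for minute, total, errors in errors_per_minute(sample_response):
    print(minute, total, errors)
```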

JVM Monitoring Script

monitor.sh
#!/bin/bash
curl -s localhost:9200/_nodes/stats/jvm | jq '.nodes[].jvm.mem.heap_used_percent'

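The script prints each node's heap usage percentage once. Run it on a schedule (cron or a monitoring agent) and alert when usage stays high, so that you can react before long garbage-collection pauses or circuit-breaker trips degrade the cluster. Garbage collection itself runs all the time and is not something to prevent.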
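
The alerting decision can be sketched as a small function over the parsed _nodes/stats/jvm response. The 85% threshold and the function name are assumptions chosen for illustration, and the sample payload is hand-written; pick a threshold that matches your own heap behavior.

```python
from typing import Dict, List, Tuple


def nodes_over_threshold(stats: Dict, threshold: int = 85) -> List[Tuple[str, int]]:
    """Return (node_name, heap_used_percent) for nodes at or above the threshold."""
    hot = []
    # "nodes" is an object keyed by node ID, each value holding that node's stats.
    for node_id, node in stats["nodes"].items():
        percent = node["jvm"]["mem"]["heap_used_percent"]
        if percent >= threshold:
            hot.append((node.get("name", node_id), percent))
    # Most loaded nodes first.
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Hand-written sample payload, for illustration only.
sample_stats = {
    "nodes": {
        "a1": {"name": "node-1", "jvm": {"mem": {"heap_used_percent": 62}}},
        "b2": {"name": "node-2", "jvm": {"mem": {"heap_used_percent": 91}}},
        "c3": {"name": "node-3", "jvm": {"mem": {"heap_used_percent": 86}}},
    }
}

print(nodes_over_threshold(sample_stats))
```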

Best Practices

  • Always explicitly define the number of shards and replicas
  • Use index templates to standardize mappings
  • Configure refresh_interval based on data volume
  • Monitor JVM heap and circuit breakers
  • Prefer filter context (for example, bool.filter) over scoring queries when relevance scores are not needed
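
As an illustration of the last point, the sketch below builds a search body whose clauses all run in filter context, so no relevance score is computed and frequently used filters can be cached. The function name and default values are hypothetical; the field names come from the template in this tutorial.

```python
from typing import Dict


def recent_level_query(level: str = "ERROR", since: str = "now-1h") -> Dict:
    """Build a search body that matches by level and time range without scoring."""
    return {
        "query": {
            "bool": {
                # Every clause under "filter" is a yes/no match; no score is calculated.
                "filter": [
                    {"term": {"level": level}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        }
    }


body = recent_level_query()
print(body)
```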

Common Mistakes to Avoid

  • Forgetting to limit result size: from + size is capped at 10000 by default (index.max_result_window), so use search_after for deep pagination
  • Creating too many shards on small indexes
  • Ignoring circuit breaker warnings
  • Using custom analyzers without testing relevance
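
For the pagination pitfall, one common alternative to large from + size values is search_after combined with a point in time (PIT). The sketch below only builds the request bodies; the function name, page size, and PIT ID are placeholders, and it assumes a PIT was opened beforehand with POST /logs-2026-01/_pit?keep_alive=1m.

```python
from typing import Dict, List, Optional


def page_body(pit_id: str, last_sort: Optional[List] = None, page_size: int = 500) -> Dict:
    """Build the body for one page of a search_after scan over a point in time."""
    body = {
        "size": page_size,
        "query": {"match_all": {}},
        # With a PIT, the index is not named in the URL; the PIT ID identifies the target.
        "pit": {"id": pit_id, "keep_alive": "1m"},
        "sort": [{"@timestamp": "asc"}],
    }
    if last_sort is not None:
        # Pass the "sort" array of the last hit from the previous page.
        body["search_after"] = last_sort
    return body


# "example-pit-id" and the sort values are placeholders, not real values.
first_page = page_body("example-pit-id")
second_page = page_body("example-pit-id", last_sort=[1768471200000, 42])
print(second_page["search_after"])
```

Each page costs roughly the same regardless of depth, whereas from + size forces every shard to collect and discard all preceding hits.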
