How to Measure DORA Metrics in Production 2026

Introduction

DORA metrics remain a widely used standard in 2026 for evaluating DevOps performance. The four metrics (deployment frequency, lead time for changes, change failure rate, and time to restore service) relate delivery speed to system stability. This tutorial walks through collecting deployment events from your pipelines, computing the metrics from a database, detecting incidents with Prometheus, and visualizing the results in Grafana. You will learn to instrument your CI/CD pipelines and observability systems to obtain reliable, actionable data.

Prerequisites

  • Kubernetes 1.29+ or equivalent with Prometheus
  • GitLab CI or GitHub Actions
  • TimescaleDB or ClickHouse database
  • Strong knowledge of observability and scripting
  • Access to deployment logs and incidents

Deployment Collection

collect-deployments.sh
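#!/bin/bash
# Abort on errors, unset variables, and failed pipeline stages.
set -euo pipefail
DEPLOY_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
COMMIT_SHA=$(git rev-parse HEAD)
# Committer date in strict ISO 8601; lead time needs the commit time as well.
COMMIT_TIME=$(git show -s --format=%cI HEAD)
SERVICE_NAME="api-gateway"
# --fail makes curl exit non-zero on an HTTP error, so a rejected event is not lost silently.
echo "{\"timestamp\": \"$DEPLOY_TIME\", \"commit\": \"$COMMIT_SHA\", \"commit_time\": \"$COMMIT_TIME\", \"service\": \"$SERVICE_NAME\", \"status\": \"success\"}" | curl --fail -X POST -H "Content-Type: application/json" http://metrics-collector:8080/deployments -d @-

This script sends metadata for each successful deployment to a centralized collector. It records the deployment time in UTC, the commit SHA, and the commit time, which together support the lead time and frequency calculations.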
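
On the receiving side, the collector should reject malformed events before they reach the database. The following is a minimal sketch of that check, assuming the collector is written in Python; `parse_deployment_event` and `REQUIRED_FIELDS` are hypothetical names, and it validates only the `timestamp`, `commit`, `service`, and `status` fields the script sends.

```python
import json
from datetime import datetime, timezone

# Fields the collection script is expected to send (assumed contract).
REQUIRED_FIELDS = ("timestamp", "commit", "service", "status")


def parse_deployment_event(raw: str) -> dict:
    """Parse one JSON deployment event and attach UTC to its timestamp."""
    event = json.loads(raw)
    missing = [field for field in REQUIRED_FIELDS if field not in event]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # The script emits timestamps such as 2026-01-15T10:30:00Z.
    parsed = datetime.strptime(event["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    event["timestamp"] = parsed.replace(tzinfo=timezone.utc)
    return event
```

Extra fields are passed through unchanged, so the payload can grow without breaking this check.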

Deployment Frequency Calculation

dora_calculator.py
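from datetime import datetime, timedelta, timezone
import psycopg2

def calculate_deployment_frequency(service: str, days: int = 30) -> float:
    """Return the average number of deployments per day over the window."""
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC value.
    since = datetime.now(timezone.utc) - timedelta(days=days)
    conn = psycopg2.connect("dbname=metrics user=metrics")
    try:
        with conn.cursor() as cur:
            cur.execute("""SELECT COUNT(*) FROM deployments
                           WHERE service = %s AND timestamp > %s""",
                        (service, since))
            count = cur.fetchone()[0]
    finally:
        conn.close()  # always release the connection
    return count / days  # deployments per day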

This function calculates the average number of deployments per day over the last `days` days (30 by default). It queries the deployments table directly, so the result reflects the full stored history rather than a sampled metric.
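
The same arithmetic can be checked without a database. The sketch below is an illustration only: `deployments_per_day` is a hypothetical helper that takes timezone-aware timestamps already loaded in memory and an explicit `now`, which makes the window boundaries easy to test.

```python
from datetime import datetime, timedelta, timezone


def deployments_per_day(timestamps, now, days=30):
    """Average daily deployments in the `days`-day window ending at `now`."""
    if days <= 0:
        raise ValueError("days must be positive")
    window_start = now - timedelta(days=days)
    # Count only deployments strictly after the window start, up to `now`.
    count = sum(1 for ts in timestamps if window_start < ts <= now)
    return count / days
```

For example, three deployments inside a 30-day window give 3 / 30 = 0.1 deployments per day.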

Lead Time Measurement

lead_time.sql
SELECT 
  service,
  PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM (deploy_time - commit_time))/3600) AS p95_lead_time_hours
FROM deployments
WHERE deploy_time > NOW() - INTERVAL '30 days'
GROUP BY service;

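This SQL query calculates the 95th-percentile lead time, in hours, for each service over the last 30 days. Lead time is measured from commit to production deployment, so a high value points to a bottleneck in the delivery pipeline. The query expects `commit_time` and `deploy_time` columns in the deployments table.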
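
To make the percentile concrete, here is a small sketch of the linear interpolation that a continuous percentile performs, written in Python. `percentile_cont` is a hypothetical helper that mirrors the SQL function's name, and the sample lead times are made up for illustration.

```python
import math


def percentile_cont(values, fraction):
    """Continuous percentile with linear interpolation between neighbors."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    ordered = sorted(values)
    # Fractional index into the sorted list (0-based).
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


# Illustrative lead times in hours (not real data).
lead_times_hours = [1.0, 2.0, 3.0, 4.0, 5.0]
p95 = percentile_cont(lead_times_hours, 0.95)  # about 4.8
```

With five samples, the 95th percentile falls at index 3.8, so the result sits 80% of the way from 4.0 to 5.0.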

MTTR Instrumentation

incident-tracking.yaml
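apiVersion: v1
kind: ConfigMap
metadata:
  name: dora-mttr-config
data:
  rules.yaml: |
    groups:
      - name: dora-mttr
        rules:
          # A single alert is enough: when routed through Alertmanager it
          # carries startsAt (fired) and endsAt (resolved) timestamps.
          # {{ $value }} is the sample value (0 or 1), not a timestamp.
          - alert: ServiceDown
            expr: up{service="api-gateway"} == 0
            for: 1m
            labels:
              severity: critical
            annotations:
              summary: "api-gateway is down"

This Prometheus rule detects incidents automatically. When the alert is routed through Alertmanager, each notification carries `startsAt` and `endsAt` timestamps, and the difference between them gives the Time to Restore Service for that incident.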
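
Once the start and end timestamps of each incident are available, time to restore is a simple difference. The sketch below assumes incidents arrive as `(started_at, resolved_at)` pairs, with `None` for incidents that are still open; `restore_times_minutes` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from statistics import median


def restore_times_minutes(incidents):
    """Return time to restore, in minutes, for each resolved incident."""
    durations = []
    for started_at, resolved_at in incidents:
        if resolved_at is None:
            continue  # still open, so there is no restore time yet
        if resolved_at < started_at:
            raise ValueError("incident resolved before it started")
        durations.append((resolved_at - started_at).total_seconds() / 60)
    return durations
```

Passing the result to `median` gives a summary that is less sensitive to a single long outage than the mean.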

DORA Grafana Dashboard

dora-dashboard.json
{
  "dashboard": {
    "title": "DORA Metrics 2026",
    "panels": [
      {
        "title": "Deployment Frequency",
        "targets": [{"expr": "sum(rate(deployments_total[24h]))"}]
      },
      {
        "title": "Change Failure Rate",
        "targets": [{"expr": "sum(failed_deployments)/sum(total_deployments)"}]
      }
    ]
  }
}

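This JSON file defines a starting point for a Grafana dashboard. It covers two of the four DORA metrics, deployment frequency and change failure rate; lead time and time to restore need additional panels. Note that `rate(deployments_total[24h])` returns deployments per second averaged over 24 hours, so multiply it by 86400 or use `increase()` to read it as deployments per day.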
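
The change failure rate panel divides failed deployments by total deployments. As a sanity check outside Grafana, the same ratio can be sketched in Python; `change_failure_rate` is a hypothetical helper, and returning 0.0 when there are no deployments is an assumption you may want to change.

```python
def change_failure_rate(total_deployments: int, failed_deployments: int) -> float:
    """Fraction of deployments that failed, between 0.0 and 1.0."""
    if total_deployments < 0 or failed_deployments < 0:
        raise ValueError("counts must not be negative")
    if failed_deployments > total_deployments:
        raise ValueError("failed deployments cannot exceed total deployments")
    if total_deployments == 0:
        return 0.0  # assumption: no deployments means no failures to report
    return failed_deployments / total_deployments
```

For example, 6 failures out of 40 deployments give a rate of 0.15, or 15%.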

Best Practices

  • Always store raw data with a minimum 90-day retention
  • Use percentiles rather than averages for lead time and MTTR
  • Separate environments (prod vs staging) in calculations
  • Automate alerts when metrics fall below elite thresholds
  • Correlate DORA metrics with business objectives
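
To illustrate why percentiles are preferred over averages, the sketch below compares the mean and the median of a small, made-up set of lead times containing one outlier. `summarize` is a hypothetical helper.

```python
from statistics import mean, median


def summarize(lead_times_hours):
    """Return the mean and median of a list of lead times."""
    return {"mean": mean(lead_times_hours), "median": median(lead_times_hours)}


# Illustrative values: four fast changes and one that waited five days.
summary = summarize([2, 3, 3, 4, 120])
```

Here the single 120-hour change pulls the mean to 26.4 hours, while the median stays at 3 hours, which is closer to what most changes experienced.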

Common Mistakes

  • Calculating frequency only on manual deployments and ignoring automated pipelines
  • Forgetting to filter hotfix deployments in change failure rate
  • Using local timestamps instead of UTC for international comparisons
  • Not versioning metric calculation scripts
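
The timestamp mistake above can be avoided by normalizing every value to UTC before storing it. This is a minimal sketch using only the standard library; `to_utc` is a hypothetical helper, and rejecting naive timestamps outright is a design assumption rather than a requirement.

```python
from datetime import datetime, timedelta, timezone


def to_utc(ts: datetime) -> datetime:
    """Convert a timezone-aware timestamp to UTC; reject naive ones."""
    if ts.tzinfo is None:
        raise ValueError("naive timestamp: attach a timezone before storing")
    return ts.astimezone(timezone.utc)


# Illustrative value: 11:30 at a fixed UTC+1 offset is 10:30 UTC.
utc_plus_one = timezone(timedelta(hours=1))
local = datetime(2026, 1, 15, 11, 30, tzinfo=utc_plus_one)
normalized = to_utc(local)
```

Refusing naive timestamps forces the caller to state the source timezone explicitly instead of letting the server's local time leak into the data.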

Further Reading

Explore our advanced training on observability and DevOps excellence: https://learni-group.com/formations. You will learn how to build scalable internal metrics platforms.