How to Deploy Cloud SQL in High Availability with Terraform in 2026

Introduction

Cloud SQL is Google Cloud's managed relational database service, supporting PostgreSQL, MySQL, and SQL Server. In 2026, with critical workloads running cloud-native, deploying an instance in high availability (HA) is essential for surviving zonal outages and meeting availability SLAs of up to 99.99%. This tutorial guides you step by step through creating a PostgreSQL HA instance with Terraform, including a private VPC, automated backups, monitoring via Cloud Monitoring, and secure connections over private IP without a public proxy.

Why is this crucial? A misconfigured instance means costly downtime (without HA, a zonal outage takes the database offline until the zone recovers; with HA, failover typically completes within a minute or two) and avoidable network exposure. We use Terraform for infrastructure as code (IaC), making deployments reproducible and auditable. By the end, you'll have a production-ready stack with complete code examples for provisioning, monitoring, and queries. Ideal for DevOps architects managing critical workloads.

Prerequisites

  • Active Google Cloud Platform (GCP) account with billing enabled.
  • gcloud CLI installed (a recent release).
  • Terraform 1.9+ installed.
  • Advanced knowledge of IaC, VPC networking, and PostgreSQL.
  • Dedicated GCP project with APIs enabled: Cloud SQL Admin, Compute Engine, Cloud Monitoring.
  • Editor like VS Code with HashiCorp Terraform extension.

Initialize GCP Environment and Terraform

init.sh
#!/bin/bash

gcloud auth login
gcloud config set project YOUR_PROJECT_ID
gcloud services enable sqladmin.googleapis.com compute.googleapis.com monitoring.googleapis.com

mkdir cloudsql-ha-terraform && cd cloudsql-ha-terraform

cat > versions.tf << EOF
terraform {
  required_version = ">= 1.9.0"
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 6.0"
    }
  }
}
EOF

terraform init

This script authenticates gcloud, enables the required APIs, and initializes Terraform with the Google provider pinned to the 6.x series. Replace YOUR_PROJECT_ID with your project ID. Common pitfall: forgetting to enable an API blocks resource creation; verify your project with gcloud projects describe YOUR_PROJECT_ID.

Set Up VPC and Private Network

Before Cloud SQL, create a VPC with a private subnet to isolate the database. This enables private-IP connectivity (and, for serverless apps, Serverless VPC Access), so the instance needs no public IP at all, significantly reducing the attack surface.

Define VPC and Subnets

vpc.tf
provider "google" {
  project = var.project_id
  region  = var.region
}

resource "google_compute_network" "vpc" {
  name                    = "cloudsql-vpc"
  auto_create_subnetworks = false
  mtu                     = 1460
}

resource "google_compute_subnetwork" "private" {
  name                     = "cloudsql-subnet"
  ip_cidr_range            = "10.0.0.0/20"
  region                   = var.region
  network                  = google_compute_network.vpc.id
  private_ip_google_access = true
  description              = "Private subnet for Cloud SQL"
}

variable "project_id" {
  description = "GCP Project ID"
  type        = string
}

variable "region" {
  description = "Region GCP"
  type        = string
  default     = "europe-west1"
}

output "vpc_id" {
  value = google_compute_network.vpc.id
}

This file creates a custom-mode VPC with a private subnet and Private Google Access enabled; 1460 is GCP's default MTU. Pitfall: without private_ip_google_access, resources with no external IP cannot reach Google APIs. Apply with terraform apply -var='project_id=your-id' -var='region=europe-west1'.
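
Private IP for Cloud SQL also requires private services access: a reserved internal range peered with Google's service producer network. The instance in the next section assumes a connection like the following sketch (file and resource names are illustrative; enable servicenetworking.googleapis.com first).

psa.tf
# Reserve an internal range for Google-managed services, then peer it
# with the VPC so Cloud SQL can allocate a private IP from it.
resource "google_compute_global_address" "private_ip_range" {
  name          = "cloudsql-private-ip-range"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.vpc.id
}

resource "google_service_networking_connection" "private_vpc_connection" {
  network                 = google_compute_network.vpc.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.private_ip_range.name]
}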

Deploy the HA Cloud SQL Instance

cloudsql.tf
resource "google_sql_database_instance" "ha_postgres" {
  name             = "ha-postgres"
  database_version = "POSTGRES_16"
  region           = var.region

  settings {
    tier              = "db-custom-4-15360"  # 4 vCPU, 15 GB RAM for prod
    availability_type = "REGIONAL"           # HA with automatic failover
    disk_autoresize   = true
    disk_size         = 100
    disk_type         = "PD_SSD"

    ip_configuration {
      ipv4_enabled    = false               # no public IP
      private_network = google_compute_network.vpc.id
      ssl_mode        = "ENCRYPTED_ONLY"    # replaces require_ssl, removed in provider 6.x
    }

    backup_configuration {
      enabled                        = true
      start_time                     = "02:00"
      location                       = "EU"
      point_in_time_recovery_enabled = true
      backup_retention_settings {
        retained_backups = 7   # keep the 7 most recent automated backups
      }
    }

    maintenance_window {
      day  = 7  # Sunday
      hour = 3
    }

    database_flags {
      name  = "cloudsql.iam_authentication"
      value = "on"
    }
  }

  deletion_protection = true
}

resource "google_sql_database" "app_db" {
  name     = "appdb"
  instance = google_sql_database_instance.ha_postgres.name
}

resource "google_sql_user" "app_user" {
  name     = "appuser"
  instance = google_sql_database_instance.ha_postgres.name
  password = "SecurePass123!"  # use Secret Manager in prod (see Best Practices)
}

Deploys a regional PostgreSQL 16 HA instance with private IP, point-in-time recovery (PITR) backups, and IAM authentication. The custom tier can be resized later as load grows. Pitfalls: deletion_protection = true prevents accidental destroys; private_network requires the private services access peering set up in the previous section; the cloudsql.iam_authentication flag enables passwordless IAM logins.
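
With the IAM authentication flag on, you can also declare IAM-backed database users that log in without a password. A minimal sketch, assuming a Google account user@example.com that holds the Cloud SQL Instance User role (both are hypothetical):

iam-user.tf
# IAM database user: authentication goes through IAM, no password stored.
resource "google_sql_user" "iam_user" {
  name     = "user@example.com"  # hypothetical Google account
  instance = google_sql_database_instance.ha_postgres.name
  type     = "CLOUD_IAM_USER"
}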

Set Up Monitoring and Alerts

Integrate Cloud Monitoring to track CPU utilization (alerting above 80%), active connections, and latency. Email or Slack alerts let you react before incidents escalate.

Enable Monitoring and Dashboards

monitoring.tf
resource "google_monitoring_dashboard" "cloudsql_dashboard" {
  dashboard_json = jsonencode({
    displayName = "Cloud SQL HA Dashboard"
    row = [{
      title = "Métriques Clés"
      charts = [{
        displayName = "CPU Utilisation"
        metricType = "cloudsql.googleapis.com/database/cpu/utilization"
        resourceType = "cloud_sql_database"
      }, {
        displayName = "Connexions Actives"
        metricType = "cloudsql.googleapis.com/database/postgresql/connections"
        resourceType = "cloud_sql_database"
      }]
    }]
  })
}

resource "google_monitoring_alert_policy" "high_cpu" {
  display_name = "Alerte CPU Cloud SQL >80%"
  combiner     = "OR"

  conditions {
    display_name = "cpu_high"
    condition_threshold {
      filter     = "resource.type=\"cloud_sql_database\" AND metric.type=\"cloudsql.googleapis.com/database/cpu/utilization\" AND resource.label.\"database_id\"=\"${google_sql_database_instance.ha_postgres.name}\""
      threshold_value = 0.8
      duration   = "60s"
      comparison = "COMPARISON_GT"
    }
  }

  notification_channels = []  # Ajoutez vos channels
}

Creates a dashboard and a high-CPU alert; jsonencode keeps the dashboard definition in HCL. Pitfalls: the monitored resource type is cloudsql_database (not cloud_sql_database), and the database_id label has the form project:instance. After terraform apply, verify both in GCP Console > Monitoring.
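
The notification_channels list above is empty. A minimal sketch of an email channel to wire in (the address is hypothetical):

channels.tf
# Email notification channel; reference its ID from the alert policy:
# notification_channels = [google_monitoring_notification_channel.email.id]
resource "google_monitoring_notification_channel" "email" {
  display_name = "On-call email"
  type         = "email"
  labels = {
    email_address = "oncall@example.com"  # hypothetical address
  }
}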

Create a Table and Insert Sample Data

init.sql
-- Connect via: gcloud sql connect ha-postgres --user=postgres --database=appdb

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  email VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO users (email) VALUES 
  ('user1@example.com'),
  ('user2@example.com');

-- Verify
SELECT * FROM users;

-- Note: the UNIQUE constraint already indexes email; index created_at for range queries
CREATE INDEX idx_users_created_at ON users (created_at);

Initialize the database with a users table, sample data, and an index. Run it via gcloud sql connect from a machine inside the VPC (the instance has no public IP). Pitfalls: without indexes, lookups don't scale; SERIAL is the Postgres auto-increment type.

Connect from a Python App with Private IP

connect.py
from google.cloud.sql.connector import Connector, IPTypes
import sqlalchemy

# Instance connection name format: PROJECT:REGION:INSTANCE
INSTANCE_CONNECTION_NAME = "YOUR_PROJECT:REGION:ha-postgres"

# Route the connection over the instance's private IP inside the VPC
connector = Connector(ip_type=IPTypes.PRIVATE)

def getconn():
    # pg8000 is the connector's PostgreSQL driver
    conn = connector.connect(
        INSTANCE_CONNECTION_NAME,
        "pg8000",
        user="appuser",
        password="SecurePass123!",  # read from Secret Manager in prod
        db="appdb",
    )
    return conn

engine = sqlalchemy.create_engine(
    "postgresql+pg8000://",
    creator=getconn,
)

with engine.connect() as conn:
    result = conn.execute(sqlalchemy.text("SELECT * FROM users"))
    print([dict(row._mapping) for row in result])

connector.close()

Uses the Cloud SQL Python Connector over private IP, with no proxy process to run. Replace INSTANCE_CONNECTION_NAME (format project:region:instance). Pitfalls: IPTypes.PRIVATE only works from inside the VPC; install dependencies with pip install "cloud-sql-python-connector[pg8000]" sqlalchemy.

Apply Terraform and Clean Up

apply.sh
#!/bin/bash

terraform validate
terraform plan -var="project_id=YOUR_PROJECT_ID" -var="region=europe-west1"
terraform apply -auto-approve -var="project_id=YOUR_PROJECT_ID" -var="region=europe-west1"

# Outputs
terraform output

# Cleanup
# terraform destroy -auto-approve

Validates, plans, and applies the stack; terraform output prints the exported values. Pitfalls: always review the plan before applying (drop -auto-approve outside of labs); terraform destroy is for lab teardown only, and deletion_protection safeguards production.
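
To surface the values the Python client needs, you can export them from Terraform. A small sketch using the instance's exported attributes:

outputs.tf
# Connection name consumed by the Cloud SQL Python Connector
output "instance_connection_name" {
  value = google_sql_database_instance.ha_postgres.connection_name
}

# Private IP of the instance inside the VPC
output "private_ip" {
  value = google_sql_database_instance.ha_postgres.private_ip_address
}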

Best Practices

  • Secret Manager: Store passwords in Secret Manager and reference them via google_secret_manager_secret_version (see the sketch after this list).
  • HA + Multi-Zone: availability_type = "REGIONAL" provisions a synchronously replicated standby in a second zone; failover is automatic and typically completes within a minute or two.
  • Audit Logs: Enable database_flags { name = "log_min_duration_statement" value = "1000" } to trace slow queries.
  • Scaling: Monitor utilization and upscale the tier via Terraform variables.
  • CI/CD: Integrate with GitHub Actions, using HCP Terraform (formerly Terraform Cloud) for plan approvals.
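
A minimal sketch of the Secret Manager reference from the first bullet, assuming a secret named db-password already exists in the project; it replaces the hard-coded password in cloudsql.tf:

secrets.tf
# Read the latest version of an existing secret (created outside Terraform)
data "google_secret_manager_secret_version" "db_password" {
  secret = "db-password"  # hypothetical secret name
}

resource "google_sql_user" "app_user" {
  name     = "appuser"
  instance = google_sql_database_instance.ha_postgres.name
  password = data.google_secret_manager_secret_version.db_password.secret_data
}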

Common Errors to Avoid

  • Public IP left enabled: ipv4_enabled = true exposes the instance to the internet; keep it false and connect via private_network.
  • Non-EU backups: Set location = "EU" if your compliance requirements (e.g., GDPR data-residency policies) mandate EU storage.
  • IAM not enabled: Without cloudsql.iam_authentication = on, you're limited to password-based SQL authentication.
  • Missing peering or connector: Private IP requires the private services access peering; for serverless apps (Cloud Run, Cloud Functions), also add a Serverless VPC Access connector, as sketched below.
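
A sketch of the Serverless VPC Access connector from the last bullet, for serverless workloads that must reach the private IP (the CIDR is illustrative and must not overlap existing subnets; requires vpcaccess.googleapis.com):

connector.tf
# Dedicated /28 range for the connector, distinct from the Cloud SQL subnet
resource "google_vpc_access_connector" "serverless" {
  name          = "cloudsql-connector"
  region        = var.region
  network       = google_compute_network.vpc.name
  ip_cidr_range = "10.8.0.0/28"
}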