Introduction
Cloud SQL is Google Cloud's managed relational database service, supporting PostgreSQL, MySQL, and SQL Server. In 2026, with the rise of cloud-native applications, deploying an instance in high availability (HA) is essential: a regional instance carries a 99.99% availability SLA and fails over automatically. This expert tutorial guides you step by step through creating a PostgreSQL HA instance with Terraform, including a private VPC, automated backups, monitoring via Cloud Monitoring, and secure connections without a public proxy.
Why is this crucial? Poor configuration leads to costly downtime (a regional instance fails over in about a minute, while recovering a non-HA instance from backup can take far longer) and network vulnerabilities. We use Terraform for infrastructure as code (IaC), making deployments reproducible and auditable. By the end, you'll have a production-ready stack with complete code examples for provisioning and queries. Ideal for DevOps architects managing critical workloads.
Prerequisites
- Active Google Cloud Platform (GCP) account with billing enabled.
- gcloud CLI installed (a recent release).
- Terraform 1.9+ installed.
- Advanced knowledge of IaC, VPC networking, and PostgreSQL.
- Dedicated GCP project with APIs enabled: Cloud SQL Admin, Compute Engine, Cloud Monitoring.
- Editor like VS Code with HashiCorp Terraform extension.
Initialize GCP Environment and Terraform
#!/bin/bash
gcloud auth login
gcloud config set project YOUR_PROJECT_ID
gcloud services enable sqladmin.googleapis.com compute.googleapis.com monitoring.googleapis.com
mkdir cloudsql-ha-terraform && cd cloudsql-ha-terraform
cat > versions.tf << EOF
terraform {
  required_version = ">= 1.9.0"
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 6.0"
    }
  }
}
EOF
terraform init
This script authenticates gcloud, enables the required APIs, and initializes Terraform with a recent Google provider. Replace YOUR_PROJECT_ID with your project ID. Common pitfall: forgetting to enable APIs blocks resource creation; use gcloud projects describe to verify the project.
Set Up VPC and Private Network
Before creating the Cloud SQL instance, set up a VPC with a private subnet to isolate the database. This enables private IPs (and, for serverless workloads, Serverless VPC Access) so connections never traverse the public internet, significantly reducing the attack surface.
Define VPC and Subnets
provider "google" {
  project = var.project_id
  region  = var.region
}

resource "google_compute_network" "vpc" {
  name                    = "cloudsql-vpc"
  auto_create_subnetworks = false
  mtu                     = 1460
}

resource "google_compute_subnetwork" "private" {
  name                     = "cloudsql-subnet"
  ip_cidr_range            = "10.0.0.0/20"
  region                   = var.region
  network                  = google_compute_network.vpc.id
  private_ip_google_access = true
  description              = "Private subnet for Cloud SQL"
}

variable "project_id" {
  description = "GCP project ID"
  type        = string
}

variable "region" {
  description = "GCP region"
  type        = string
  default     = "europe-west1"
}

output "vpc_id" {
  value = google_compute_network.vpc.id
}
This module creates a custom VPC with a private subnet and Private Google Access enabled. MTU 1460 is the GCP default and fine for Cloud SQL traffic. Pitfall: without private_ip_google_access, workloads with no external IP cannot reach Google APIs. Apply with terraform apply -var='project_id=your-id' -var='region=europe-west1'.
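Cloud SQL private IP also requires a private services access (VPC peering) connection between your VPC and Google's service producer network; without it, instance creation fails. A minimal sketch, assuming the servicenetworking.googleapis.com API is enabled (resource names are illustrative):

```hcl
# Reserved internal range for Google-managed services (name is illustrative)
resource "google_compute_global_address" "private_service_range" {
  name          = "cloudsql-psa-range"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.vpc.id
}

# Peering between the VPC and Google's service networking
resource "google_service_networking_connection" "private_vpc_connection" {
  network                 = google_compute_network.vpc.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.private_service_range.name]
}
```

Make the Cloud SQL instance depend on this connection (for example via depends_on) so Terraform orders the resources correctly.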
Deploy the HA Cloud SQL Instance
resource "google_sql_database_instance" "ha_postgres" {
  name             = "ha-postgres"
  database_version = "POSTGRES_16"
  region           = var.region

  settings {
    tier              = "db-custom-4-15360" # 4 vCPU, 15 GB RAM for production
    availability_type = "REGIONAL"          # HA with automatic failover
    disk_autoresize   = true
    disk_size         = 100
    disk_type         = "PD_SSD"

    ip_configuration {
      ipv4_enabled    = false # No public IP
      private_network = google_compute_network.vpc.id
      ssl_mode        = "ENCRYPTED_ONLY" # replaces require_ssl in provider 6.x
    }

    backup_configuration {
      enabled                        = true
      start_time                     = "02:00"
      location                       = "EU"
      point_in_time_recovery_enabled = true
      backup_retention_settings {
        retained_backups = 7
      }
    }

    maintenance_window {
      day  = 7 # Sunday
      hour = 3
    }

    database_flags {
      name  = "cloudsql.iam_authentication"
      value = "on"
    }
  }

  deletion_protection = true
}

resource "google_sql_database" "app_db" {
  name     = "appdb"
  instance = google_sql_database_instance.ha_postgres.name
}

resource "google_sql_user" "app_user" {
  name     = "appuser"
  instance = google_sql_database_instance.ha_postgres.name
  password = "SecurePass123!" # Use Secret Manager in production
}
Deploys a regional PostgreSQL 16 HA instance with a private IP, point-in-time recovery (PITR) backups, and IAM authentication. The custom tier scales with your workload. Pitfalls: deletion_protection = true prevents accidental deletion; the IAM flag enables passwordless authentication.
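To wire applications to the instance, it helps to export the connection name and private IP as Terraform outputs (output names are illustrative; the attributes are standard on the resource):

```hcl
output "instance_connection_name" {
  description = "Used by Cloud SQL connectors, in the form PROJECT:REGION:INSTANCE"
  value       = google_sql_database_instance.ha_postgres.connection_name
}

output "instance_private_ip" {
  description = "Private IP address of the instance inside the VPC"
  value       = google_sql_database_instance.ha_postgres.private_ip_address
}
```

After terraform apply, terraform output instance_connection_name gives the string the Python connector below expects.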
Set Up Monitoring and Alerts
Integrate Cloud Monitoring to track CPU >80%, connections, and latency. Slack/Email alerts proactively prevent incidents.
Enable Monitoring and Dashboards
resource "google_monitoring_dashboard" "cloudsql_dashboard" {
  dashboard_json = jsonencode({
    displayName = "Cloud SQL HA Dashboard"
    gridLayout = {
      widgets = [
        {
          title = "CPU utilization"
          xyChart = {
            dataSets = [{
              timeSeriesQuery = {
                timeSeriesFilter = {
                  filter = "resource.type=\"cloudsql_database\" AND metric.type=\"cloudsql.googleapis.com/database/cpu/utilization\""
                }
              }
            }]
          }
        },
        {
          title = "Active connections"
          xyChart = {
            dataSets = [{
              timeSeriesQuery = {
                timeSeriesFilter = {
                  filter = "resource.type=\"cloudsql_database\" AND metric.type=\"cloudsql.googleapis.com/database/postgresql/num_backends\""
                }
              }
            }]
          }
        }
      ]
    }
  })
}

resource "google_monitoring_alert_policy" "high_cpu" {
  display_name = "Cloud SQL CPU alert >80%"
  combiner     = "OR"
  conditions {
    display_name = "cpu_high"
    condition_threshold {
      filter          = "resource.type=\"cloudsql_database\" AND metric.type=\"cloudsql.googleapis.com/database/cpu/utilization\" AND resource.label.\"database_id\"=\"${var.project_id}:${google_sql_database_instance.ha_postgres.name}\""
      threshold_value = 0.8
      duration        = "60s"
      comparison      = "COMPARISON_GT"
    }
  }
  notification_channels = [] # Add your channels
}
Creates a dashboard and a high-CPU alert; jsonencode keeps the dashboard definition in HCL. Pitfalls: the monitored resource type is cloudsql_database (not cloud_sql_database), and the database_id label has the form project:instance; verify after terraform apply in GCP Console > Monitoring.
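The empty notification_channels list can be populated with, for example, an email channel (the address is a placeholder):

```hcl
resource "google_monitoring_notification_channel" "email" {
  display_name = "DBA email"
  type         = "email"
  labels = {
    email_address = "dba@example.com" # placeholder
  }
}
```

Then reference it in the alert policy as notification_channels = [google_monitoring_notification_channel.email.id].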
Create a Table and Insert Sample Data
-- Connect via: gcloud sql connect ha-postgres --user=postgres --database=appdb
CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  email VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO users (email) VALUES
  ('user1@example.com'),
  ('user2@example.com');

-- Verify
SELECT * FROM users;

-- Secondary index for query performance (email is already indexed by its UNIQUE constraint)
CREATE INDEX idx_users_created_at ON users(created_at);
Initializes the database with a users table, sample data, and an index. Run via gcloud sql connect. Pitfalls: without indexes, lookups don't scale; SERIAL provides auto-increment in Postgres.
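Because email carries a UNIQUE constraint, re-running the seed statements above would fail on duplicates. An idempotent variant can use a Postgres upsert; a sketch:

```sql
-- Insert, silently skipping rows whose email already exists
INSERT INTO users (email) VALUES ('user1@example.com')
ON CONFLICT (email) DO NOTHING;
```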
Connect from a Python App with Private IP
import os
from google.cloud.sql.connector import Connector, IPTypes
import sqlalchemy

# Instance connection name, in the form PROJECT:REGION:INSTANCE
INSTANCE_CONNECTION_NAME = "YOUR_PROJECT:REGION:ha-postgres"

connector = Connector(ip_type=IPTypes.PRIVATE)

def getconn():
    # Returns a DB-API connection through the pg8000 driver
    return connector.connect(
        INSTANCE_CONNECTION_NAME,
        "pg8000",
        user="appuser",
        password=os.environ["DB_PASS"],  # avoid hardcoding credentials
        db="appdb",
    )

engine = sqlalchemy.create_engine(
    "postgresql+pg8000://",
    creator=getconn,
)

with engine.connect() as conn:
    result = conn.execute(sqlalchemy.text("SELECT * FROM users"))
    print([dict(row._mapping) for row in result])

connector.close()
Uses the Cloud SQL Python Connector over private IP, with no proxy sidecar. Replace INSTANCE_CONNECTION_NAME and export DB_PASS. Pitfall: IPTypes.PRIVATE requires network connectivity to the VPC; install with pip install "cloud-sql-python-connector[pg8000]" sqlalchemy.
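During a regional failover, in-flight connections drop for roughly a minute, so applications should retry transient errors with backoff. A minimal, generic retry helper (pure Python, independent of Cloud SQL; names are illustrative):

```python
import time

def run_with_retry(operation, retries=5, base_delay=0.5, retryable=(ConnectionError,)):
    """Retry `operation` with exponential backoff on transient errors."""
    for attempt in range(retries):
        try:
            return operation()
        except retryable:
            if attempt == retries - 1:
                raise  # give up after the last attempt
            time.sleep(base_delay * (2 ** attempt))  # 0.5 s, 1 s, 2 s, ...

# Example: a fake operation that fails twice before succeeding
calls = {"n": 0}
def flaky_query():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("failover in progress")
    return "ok"

print(run_with_retry(flaky_query, base_delay=0.01))  # prints "ok" after two retries
```

In practice you would wrap the engine.connect()/execute call from the previous snippet and include your driver's operational errors in the retryable tuple.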
Apply Terraform and Clean Up
#!/bin/bash
terraform validate
terraform plan -var="project_id=YOUR_PROJECT_ID" -var="region=europe-west1"
terraform apply -auto-approve -var="project_id=YOUR_PROJECT_ID" -var="region=europe-west1"
# Outputs
terraform output
# Cleanup
# terraform destroy -auto-approve
Validates, plans, and applies the stack; the outputs show resource IDs. Pitfall: always plan before apply; use destroy for labs, while deletion_protection safeguards production.
Best Practices
- Secrets: Store passwords in Secret Manager and reference them via google_secret_manager_secret_version instead of hardcoding.
- HA and failover: availability_type = "REGIONAL" gives automatic failover with an RTO typically under 60 s; add read replicas for read scaling.
- Slow-query logging: Enable database_flags { name = "log_min_duration_statement" value = "1000" } to log statements slower than one second.
- Scaling: Monitor load and upscale the tier through Terraform variables.
- CI/CD: Integrate with GitHub Actions, using Terraform Cloud (or another remote backend) for plan approvals.
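The Secret Manager practice can be sketched as follows, assuming a secret has already been created (the secret name and user resource name are illustrative):

```hcl
# Read the latest version of an existing secret
data "google_secret_manager_secret_version" "db_password" {
  secret = "cloudsql-app-password" # illustrative secret name
}

# Database user whose password never appears in the Terraform code
resource "google_sql_user" "app_user_secure" {
  name     = "appuser"
  instance = google_sql_database_instance.ha_postgres.name
  password = data.google_secret_manager_secret_version.db_password.secret_data
}
```

Note that the password still lands in the Terraform state, so the state backend must itself be encrypted and access-controlled.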
Common Errors to Avoid
- Public IP left enabled: ipv4_enabled = true exposes the instance to the internet; enforce private_network instead.
- Non-EU backups: Choose location = "EU" for GDPR compliance.
- IAM auth disabled: Without cloudsql.iam_authentication = on, you are limited to password-based authentication.
- Missing network plumbing: Private IP requires a private services access (VPC peering) connection; serverless apps additionally need a Serverless VPC Access connector.
Next Steps
Dive deeper with our advanced GCP DevOps training. Resources: Cloud SQL HA Docs, Terraform GCP Registry, Cloud SQL Proxy vs Connector.