How to Master Azure Cosmos DB in 2026

Introduction

Azure Cosmos DB is Microsoft Azure's multi-model NoSQL database, designed for highly scalable applications that need low latency worldwide. Launched in 2017, by 2026 it is a mainstream choice for distributed applications, handling billions of daily requests for large companies such as Coca-Cola and Verizon.

Why choose it? Unlike traditional relational databases such as SQL Server, Cosmos DB shines in big data, IoT, and e-commerce scenarios with heterogeneous data and unpredictable traffic spikes. It delivers turnkey global distribution (including multi-region writes), an availability SLA of up to 99.999% for multi-region accounts (99.99% for a single region), and usage-based billing measured in Request Units (RUs).

This beginner tutorial is concept-first: it takes you from the theoretical basics to best practices. By the end, you'll know how to model data, select sensible settings, and sidestep expensive mistakes. Bookmark it for quick reference.

Prerequisites

  • A free Azure account (USD 200 credit, or the local-currency equivalent, to get started).
  • Basic knowledge of databases (relational or NoSQL).
  • Understanding of JSON and flexible schemas.
  • Access to the Azure portal (portal.azure.com).

What is Azure Cosmos DB? The Fundamentals

Azure Cosmos DB is a globally distributed, fully managed database that abstracts away the complexity of replication and partitioning. It is offered in two capacity modes: provisioned throughput (fixed or autoscale) and serverless.

Analogy: Picture a massive supermarket chain with identical aisles in every city. Cosmos DB is that chain: your data is automatically replicated across multiple Azure regions, with strong or eventual read/write consistency based on your needs.

Key features:

  • Multi-model: NoSQL (formerly called SQL or Core), MongoDB, Cassandra, Gremlin (graph), Table (Azure Table).
  • Horizontal scalability: Add RU/s or containers without downtime.
  • Automatic indexing: Everything indexed by default, but fully customizable.

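Real-world example: An e-commerce app stores its product catalog as JSON documents in an API for NoSQL account, serves an existing user service from an API for MongoDB account, and keeps recommendation relationships in an API for Gremlin account. Each account exposes exactly one API, so mixing models means using several accounts.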

Supported Data Models

Cosmos DB stands out for its versatility: pick your API at the account level.

| Model | Ideal Use | Example |
|---|---|---|
| SQL (Core) | Flexible JSON documents | Product catalogs, IoT logs |
| MongoDB | Existing Mongo apps | CMS, mobile apps |
| Cassandra | High-write tabular workloads | Time-series, sensors |
| Gremlin | Complex graphs | Social networks, fraud detection |
| Table | Simple key-value data | Configs, sessions |

Strategic choice: For beginners, start with the SQL (Core) API, now named API for NoSQL, which covers the large majority of new projects. It supports SQL-like queries on JSON: SELECT * FROM c WHERE c.age > 30. Because the API is fixed per account, plan one account per API rather than expecting to mix them.
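
To make the query above concrete, here is a minimal Python sketch of what `SELECT * FROM c WHERE c.age > 30` returns over a handful of schema-free JSON documents. It runs entirely in memory and does not call Cosmos DB; the sample documents and the `filter_age_over` helper are invented for illustration. The last two lines show the parameterized form of the same query, in the name/value shape that the Python SDK's `query_items` call accepts.

```python
# In-memory illustration of: SELECT * FROM c WHERE c.age > 30
# The documents below are invented sample data, not a real container.
docs = [
    {"id": "1", "name": "Ana", "age": 42},
    {"id": "2", "name": "Ben", "age": 25},
    {"id": "3", "name": "Chloe"},                  # no "age" property at all
    {"id": "4", "name": "Dev", "age": "unknown"},  # "age" is a string
]


def filter_age_over(items, min_age):
    """Return items whose 'age' is a number greater than min_age.

    Items with a missing or non-numeric 'age' are skipped, mirroring how a
    comparison against an undefined or differently typed value does not match.
    """
    result = []
    for item in items:
        age = item.get("age")
        is_number = isinstance(age, (int, float)) and not isinstance(age, bool)
        if is_number and age > min_age:
            result.append(item)
    return result


matches = filter_age_over(docs, 30)
print([d["id"] for d in matches])

# Parameterized form of the same query (avoids string concatenation).
query = "SELECT * FROM c WHERE c.age > @minAge"
parameters = [{"name": "@minAge", "value": 30}]
```

Note that documents 3 and 4 are silently excluded rather than raising an error, which is what makes heterogeneous data workable in a single container.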

Key Concepts: Containers, Partitions, and RU/s

Object hierarchy:

  1. Account: Global entry point (regions, API keys).
  2. Database: Logical grouping of containers.
  3. Container: Scalable storage unit (like a table).
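
The hierarchy can be pictured as three nested objects. The sketch below is a plain-Python model for orientation only: the class names, the `shop-account` account, and the region list are hypothetical, and nothing here talks to Azure.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    partition_key_path: str  # e.g. "/customerId"


@dataclass
class Database:
    name: str
    containers: dict = field(default_factory=dict)

    def add_container(self, name, partition_key_path):
        # A container always declares its partition key path up front.
        self.containers[name] = Container(name, partition_key_path)
        return self.containers[name]


@dataclass
class Account:
    name: str
    regions: list
    databases: dict = field(default_factory=dict)

    def add_database(self, name):
        self.databases[name] = Database(name)
        return self.databases[name]


# Hypothetical names, chosen only for the example.
account = Account("shop-account", regions=["West Europe", "East US"])
shop_db = account.add_database("shop")
orders = shop_db.add_container("Orders", "/customerId")

print(account.name, "->", shop_db.name, "->", orders.name, orders.partition_key_path)
```

Regions and keys live on the account, while the partition key is a property of each container, which is why the two decisions are made at different levels.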

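Logical partitioning: Items that share a partition key value (e.g., the same /userId) form one logical partition, and Cosmos DB spreads logical partitions across physical partitions. Golden rule: Pick a key with high cardinality (millions of unique values) so that storage and requests spread evenly.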

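Request Units (RUs): The unit in which every operation is charged. A point read of a 1 KB item costs about 1 RU; writes and queries cost more. Provisioned throughput is expressed in RUs per second (RU/s). Use autoscale for traffic spikes: it scales automatically between 10% of the maximum RU/s you configure and that maximum (a 10x range).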
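
A back-of-the-envelope RU/s estimate can be sketched as follows. The 1 RU point-read figure comes from the text above; the write cost of 5 RU per 1 KB item is an assumed planning figure (real costs depend on indexing and item shape), and the function names are made up for this example.

```python
import math

# Assumed per-operation costs for 1 KB items.
RU_PER_POINT_READ_1KB = 1.0
RU_PER_WRITE_1KB = 5.0  # assumption: varies with indexing policy and item shape


def required_ru_per_second(reads_per_sec, writes_per_sec):
    """Rough RU/s needed for a workload of 1 KB point reads and writes."""
    return reads_per_sec * RU_PER_POINT_READ_1KB + writes_per_sec * RU_PER_WRITE_1KB


def round_up_to_1000(ru):
    """Autoscale maximums are configured in steps of 1000 RU/s."""
    return int(math.ceil(ru / 1000.0)) * 1000


def autoscale_range(max_ru):
    """Autoscale moves between 10% of the configured maximum and the maximum."""
    return (max_ru // 10, max_ru)


peak = required_ru_per_second(reads_per_sec=500, writes_per_sec=100)
max_ru = round_up_to_1000(peak)
low, high = autoscale_range(max_ru)
print(f"peak need: {peak} RU/s, autoscale range: {low}-{high} RU/s")
```

For real sizing, measure the request charge of your own operations rather than relying on these constants.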

Example: In an 'Orders' container, a /customerId key puts all of a customer's orders in the same logical partition → fast customer-specific queries that touch a single partition.
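
The effect of key choice on data spread can be simulated locally. This is a sketch under stated assumptions: four simulated physical partitions, invented order data, and MD5 as a stand-in hash (Cosmos DB uses its own internal hashing, which this does not reproduce).

```python
import hashlib
from collections import Counter

PHYSICAL_PARTITIONS = 4  # hypothetical number chosen for the demo


def physical_partition(key_value, partitions=PHYSICAL_PARTITIONS):
    """Map a partition key value to a simulated partition with a stable hash."""
    digest = hashlib.md5(str(key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def distribution(items, key):
    """Count how many items land on each simulated physical partition."""
    return Counter(physical_partition(item[key]) for item in items)


def hottest_share(counts):
    """Fraction of all items held by the busiest partition."""
    return max(counts.values()) / sum(counts.values())


# 10,000 invented orders: 2,000 distinct customers, 90% with status "active".
orders = [
    {
        "id": str(i),
        "customerId": f"cust-{i % 2000}",
        "status": "active" if i % 10 else "closed",
    }
    for i in range(10_000)
]

by_customer = distribution(orders, "customerId")
by_status = distribution(orders, "status")

print("customerId:", dict(by_customer), round(hottest_share(by_customer), 2))
print("status:    ", dict(by_status), round(hottest_share(by_status), 2))
```

With /customerId the busiest partition holds roughly a quarter of the items, while /status concentrates at least 90% of them in one place, which is the hotspot pattern to avoid.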

Global Scalability and Consistency

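Cosmos DB excels at multi-region setups: Enable two or more regions so that each user is served by the nearest replica. The latency SLA is under 10 ms at the 99th percentile for reads and writes within a region; replication between regions still takes network time, so latency is low rather than zero.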

Consistency models (5 levels):

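  • Strong: Reads always return the most recent committed write (highest latency and cost; not available with multi-region writes).
  • Bounded Staleness: Reads may lag behind writes by at most a configured number of versions or a time interval (great for e-commerce).
  • Session: Read-your-own-writes within a client session (the default level, well suited to web apps).
  • Consistent Prefix: Reads never see writes out of order.
  • Eventual: No ordering guarantee; replicas converge over time (fine for non-critical data such as chat presence or counters).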

Analogy: Like a global orchestra: 'Session' keeps rhythm per musician, 'Strong' syncs everyone perfectly.
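
The difference between Eventual, Session, and Strong reads can be illustrated with a toy two-replica store. This is a deliberately simplified model, not the replication protocol Cosmos DB actually uses; `ToyStore`, `Replica`, and the integer session token are hypothetical constructs for this sketch.

```python
class Replica:
    """A toy replica that applies writes in order and may lag behind."""

    def __init__(self):
        self.log = []  # applied writes, in commit order

    @property
    def version(self):
        return len(self.log)

    def latest(self):
        return self.log[-1] if self.log else None


class ToyStore:
    """One write replica plus one read replica that replicates on demand."""

    def __init__(self):
        self.primary = Replica()
        self.secondary = Replica()

    def write(self, value):
        self.primary.log.append(value)
        return self.primary.version  # plays the role of a session token

    def replicate(self):
        # Copy all committed writes, in order, to the secondary.
        self.secondary.log = list(self.primary.log)

    def read_eventual(self):
        # May be stale: returns whatever the secondary has applied so far.
        return self.secondary.latest()

    def read_session(self, session_token):
        # Read-your-writes: use the secondary only if it has caught up to the
        # caller's own last write; otherwise fall back to the primary.
        if self.secondary.version >= session_token:
            return self.secondary.latest()
        return self.primary.latest()

    def read_strong(self):
        # Always reflects the most recent committed write.
        return self.primary.latest()


store = ToyStore()
store.write("cart: 1 item")
store.replicate()
token = store.write("cart: 2 items")  # not yet replicated

print("eventual:", store.read_eventual())
print("session: ", store.read_session(token))
print("strong:  ", store.read_strong())
```

The writer's own session sees its latest cart, while an eventual read from the lagging replica can still return the older one.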

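Illustrative scenario: A streaming service that stores user profiles in Cosmos DB can replicate them between EU and US regions, so each user reads a profile from the closest region while updates propagate to the other in the background.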

Essential Best Practices

  • Model for queries: Denormalize data (duplicate it so one read returns everything, since there are no cross-document JOINs). E.g., embed the 'user details' an order screen needs in every 'order'.
  • Pick the ideal partition key: High cardinality + no hotspots (avoid /status if 90% are 'active').
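  • Use autoscale RU/s: For variable workloads it can cost far less than provisioning for peak around the clock, because each hour is billed on the highest RU/s actually reached; the saving depends on how spiky your traffic is.
  • Enable Free Tier: 1000 RU/s + 25 GB free for prototyping (one free-tier account per Azure subscription).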
  • Monitor with Metrics: Azure portal → Charts for RU consumption, latency, throttles.
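
The "model for queries" practice above can be sketched as a small document builder. The customer record, field names, and `build_order_document` helper are invented for illustration; the point is only that the order item carries a copy of the customer fields it needs.

```python
import copy

# Invented sample data: a customer record kept in its own item.
customer = {
    "id": "cust-42",
    "name": "Ana Silva",
    "email": "ana@example.com",
    "loyaltyTier": "gold",
}


def build_order_document(order_id, customer, lines):
    """Create an order item that embeds a snapshot of the customer fields
    the order screens need, so a single point read returns everything."""
    return {
        "id": order_id,
        "customerId": customer["id"],  # also the partition key value
        "customer": {                  # embedded (duplicated) details
            "name": customer["name"],
            "email": customer["email"],
        },
        "lines": copy.deepcopy(lines),
        "total": round(sum(line["qty"] * line["unitPrice"] for line in lines), 2),
    }


order = build_order_document(
    "order-1001",
    customer,
    [
        {"sku": "A-1", "qty": 2, "unitPrice": 9.50},
        {"sku": "B-7", "qty": 1, "unitPrice": 20.00},
    ],
)
print(order["customer"]["name"], order["total"])
```

The trade-off is that the embedded copy is a snapshot: if the customer later changes their email, existing orders keep the old value unless you update them too.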

Common Mistakes to Avoid

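  • Poor partition key: Creates hot partitions → HTTP 429 (request rate too large) throttling. Fix: Check per-partition RU consumption in the metrics and test candidate keys against realistic data before going live.
  • Over-indexing: Every property is indexed by default → higher RU cost on writes. Set an indexing policy that excludes paths you never filter or sort on.
  • Suboptimal provisioning: Too low → throttling; too high → waste. Use the Azure Cosmos DB capacity calculator to estimate RU/s.
  • Ignoring consistency: The Session default is fine for most apps, but verify for critical ones (banking → Strong).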
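
As an example of trimming the index, the sketch below builds an opt-out indexing policy: index everything, then exclude specific paths. The `/description` and `/rawPayload` properties are hypothetical, and `build_indexing_policy` is a helper written for this example; the resulting dictionary follows the JSON shape used for container indexing policies, where `/?` targets a single scalar property and `/*` targets a whole subtree.

```python
import json


def build_indexing_policy(excluded_paths):
    """Return an opt-out policy: index every path except the ones listed."""
    return {
        "indexingMode": "consistent",
        "automatic": True,
        "includedPaths": [{"path": "/*"}],
        "excludedPaths": [{"path": p} for p in excluded_paths],
    }


# Hypothetical properties that are stored but never filtered or sorted on.
policy = build_indexing_policy(["/description/?", "/rawPayload/*"])
print(json.dumps(policy, indent=2))
```

Excluding large free-text or blob-like properties is usually where the write savings come from; measure the request charge before and after to confirm.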

Next Steps

Master Cosmos DB through hands-on practice. Check out our Learni trainings on Azure: complete courses with practical labs on Cosmos DB, real code examples, and DP-420 certification prep.