Introduction
Azure OpenAI Service is the managed version of OpenAI on Azure, providing enterprise scalability, GDPR compliance, advanced security, and dedicated quotas. Unlike the public OpenAI API, Azure enables private deployments, monitoring via Azure Monitor, and native integration with Azure AI Search for RAG (Retrieval-Augmented Generation).
This advanced tutorial is aimed at senior developers: we cover CLI deployment, the Node.js SDK API for streaming chat, tool calling (function execution), embeddings, and vector indexing. Imagine an app that queries a knowledge base in real time—that's what we'll build.
Why 2026? GPT-4o-mini and o1-preview models dominate, with native support for multi-tool agents. By the end, you'll have a complete, bookmark-worthy project with 100% functional code. Estimated time: 2 hours to implement.
Prerequisites
- Free Azure account (with $200 credit)
- Azure CLI 2.60+ installed
- Node.js 20+ and npm
- Advanced knowledge of TypeScript and async/await
- VS Code with Azure and TypeScript extensions
- Admin access to an Azure subscription (to create resources)
Install Azure CLI and Log In
#!/bin/bash
# Install Azure CLI if not present (Debian/Ubuntu; on macOS use: brew install azure-cli)
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
# Interactive login
az login
# Set the default subscription (replace YOUR_SUBSCRIPTION_ID)
# az account list --output table
az account set --subscription "YOUR_SUBSCRIPTION_ID"
# Verify
az account show --query name -o tsv

This script installs the Azure CLI, signs in to your account, and selects the subscription. Replace YOUR_SUBSCRIPTION_ID with your actual ID (get it via az account list). Pitfall: without logging in first, every subsequent command fails with 401 Unauthorized.
Creating the Azure OpenAI Resource
We'll create an OpenAI resource in the learni-openai-rg resource group. Use a nearby location (e.g., francecentral) to minimize latency. The resource provisions in 5-10 minutes.
Create Resource and GPT-4o Deployment
#!/bin/bash
RESOURCE_GROUP="learni-openai-rg"
LOCATION="francecentral"
RESOURCE_NAME="learni-openai-$(date +%s)"
# Create the resource group
az group create --name $RESOURCE_GROUP --location $LOCATION
# Create the OpenAI resource (S0 for production; quotas vary by region)
az cognitiveservices account create \
  --name $RESOURCE_NAME \
  --resource-group $RESOURCE_GROUP \
  --kind OpenAI \
  --sku S0 \
  --location $LOCATION
# Retrieve the endpoint and key
ENDPOINT=$(az cognitiveservices account show --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP --query properties.endpoint -o tsv)
KEY=$(az cognitiveservices account keys list --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP --query key1 -o tsv)
echo "Endpoint: $ENDPOINT"
echo "Key: $KEY"
# Deploy GPT-4o-mini (cheap, fast)
az cognitiveservices account deployment create \
  --name $RESOURCE_NAME \
  --resource-group $RESOURCE_GROUP \
  --deployment-name gpt-4o-mini \
  --model-name gpt-4o-mini \
  --model-version "2024-07-18" \
  --model-format OpenAI \
  --sku-name Standard \
  --sku-capacity 1

Creates the resource, retrieves the endpoint/key (store them in a secret manager for production), and deploys GPT-4o-mini. Use gpt-4o for more power. Pitfall: the OpenAI kind only supports the S0 SKU (there is no free F0 tier), and --model-format must be OpenAI, not ChatCompletion. Check your quotas in the Portal.
Installing the Node.js SDK
Use the official @azure/openai package, which targets Azure endpoints and key-based (or Entra ID) authentication directly, unlike the generic openai client. Create a Node.js project with dotenv for secrets.
Initialize Project and Install Dependencies
#!/bin/bash
mkdir azure-openai-app && cd azure-openai-app
npm init -y
npm install @azure/openai dotenv
npm install -D typescript @types/node ts-node
cat > .env << EOF
AZURE_OPENAI_ENDPOINT=https://YOUR_RESOURCE.openai.azure.com/
AZURE_OPENAI_KEY=your_key_here
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
EOF
cat > tsconfig.json << 'EOF'
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "strict": true,
    "esModuleInterop": true
  }
}
EOF

Initializes the project with the Azure OpenAI SDK and a strict TS config (tooling such as typescript, ts-node, and @types/node belongs in devDependencies). Copy the endpoint/key from the previous step into .env. Pitfall: forget dotenv.config() and the env vars will be undefined, causing 401 errors.
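To fail fast instead of debugging opaque 401 errors later, a small guard can validate the required variables at startup. A minimal sketch (requireEnv is a hypothetical helper; the variable names match the .env above):

```typescript
// env-check.ts - fail fast when a required environment variable is missing.
export function requireEnv(name: string, env: NodeJS.ProcessEnv = process.env): string {
  const value = env[name];
  if (!value || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage: call once at startup, after dotenv.config().
// const endpoint = requireEnv('AZURE_OPENAI_ENDPOINT');
```

Calling it for all three variables right after dotenv.config() turns a confusing runtime 401 into an explicit configuration error.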
Basic Client for Chat Completions
import { OpenAIClient, AzureKeyCredential } from '@azure/openai';
import dotenv from 'dotenv';

dotenv.config();

const client = new OpenAIClient(
  process.env.AZURE_OPENAI_ENDPOINT!,
  new AzureKeyCredential(process.env.AZURE_OPENAI_KEY!),
  { apiVersion: '2024-10-21' }
);

async function chatBasic() {
  const response = await client.getChatCompletions(
    process.env.AZURE_OPENAI_DEPLOYMENT!,
    [
      { role: 'system', content: 'You are an Azure expert.' },
      { role: 'user', content: 'Explain tool calling in 3 points.' },
    ],
    {
      temperature: 0.7,
      maxTokens: 500,
    }
  );
  console.log(response.choices[0]?.message?.content);
}

chatBasic().catch(console.error);

Creates an OpenAIClient (the @azure/openai SDK's client class) and sends a simple chat; note the choices array, not choice. The apiVersion is critical for the newest features. Pitfall: without the exact deployment name, you'll get a 404 error. Run with npx ts-node chat-basic.ts.
Advanced Streaming and Tool Calling
Let's move to streaming for reactive UX, then tool calling to execute real code (e.g., a weather API query). Analogy: streaming is like a Netflix live stream vs. a full video.
Chat with Streaming and Tools
import { OpenAIClient, AzureKeyCredential } from '@azure/openai';
import dotenv from 'dotenv';

dotenv.config();

const client = new OpenAIClient(
  process.env.AZURE_OPENAI_ENDPOINT!,
  new AzureKeyCredential(process.env.AZURE_OPENAI_KEY!),
  { apiVersion: '2024-10-21' }
);

// Example tool: mock weather function
const tools = [
  {
    type: 'function' as const,
    function: {
      name: 'get_weather',
      description: 'Get the weather for a city',
      parameters: {
        type: 'object',
        properties: {
          city: { type: 'string' },
        },
        required: ['city'],
      },
    },
  },
];

function getWeather(city: string): string {
  return `Weather in ${city}: 22°C, sunny.`;
}

async function chatStreamTools() {
  const stream = await client.streamChatCompletions(
    process.env.AZURE_OPENAI_DEPLOYMENT!,
    [{ role: 'user', content: 'What is the weather in Paris?' }],
    { tools, temperature: 0 }
  );
  // Tool-call arguments arrive in fragments: accumulate them by index.
  const toolCalls: { name: string; arguments: string }[] = [];
  for await (const event of stream) {
    const delta = event.choices[0]?.delta;
    for (const tc of delta?.toolCalls ?? []) {
      const i = (tc as { index?: number }).index ?? 0;
      toolCalls[i] ??= { name: '', arguments: '' };
      if (tc.function?.name) toolCalls[i].name = tc.function.name;
      toolCalls[i].arguments += tc.function?.arguments ?? '';
    }
    if (delta?.content) process.stdout.write(delta.content);
  }
  // Execute tools if the model called them
  for (const toolCall of toolCalls) {
    if (toolCall.name === 'get_weather') {
      const args = JSON.parse(toolCall.arguments);
      console.log(`\nTool result: ${getWeather(args.city)}`);
    }
  }
}

chatStreamTools().catch(console.error);

Uses streamChatCompletions for incremental responses; tool calls are accumulated from the streamed deltas, then parsed and executed. Implement real auth for production tools. Pitfall: each delta carries only a fragment of the arguments JSON; concatenate the fragments before JSON.parse, or tool calls will be ignored or fail to parse.
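Because the argument JSON is split across several delta events, it is worth isolating the accumulation logic in a pure, testable function. A sketch, where ToolCallDelta is a simplified assumption of the shape the SDK streams:

```typescript
// Simplified shape of one streamed tool-call fragment (illustrative assumption).
interface ToolCallDelta {
  index: number;
  function?: { name?: string; arguments?: string };
}

interface AccumulatedCall {
  name: string;
  arguments: string;
}

// Merge streamed fragments into complete tool calls, keyed by index:
// the name arrives once, the arguments string arrives in pieces.
export function accumulateToolCalls(deltas: ToolCallDelta[]): AccumulatedCall[] {
  const calls: AccumulatedCall[] = [];
  for (const d of deltas) {
    calls[d.index] ??= { name: '', arguments: '' };
    if (d.function?.name) calls[d.index].name = d.function.name;
    calls[d.index].arguments += d.function?.arguments ?? '';
  }
  return calls;
}
```

Feeding every delta through this function and parsing only at the end avoids the classic bug of calling JSON.parse on a half-received fragment.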
Embeddings and Vector Search with AI Search
For RAG: generate embeddings, index in Azure AI Search, and perform vector queries. Requires an AI Search resource (create it via CLI).
Create AI Search for Embeddings
#!/bin/bash
RESOURCE_GROUP="learni-openai-rg"
SEARCH_NAME="learni-search-$(date +%s)"
LOCATION="francecentral"
# Create AI Search (Basic tier for testing)
az search service create --name $SEARCH_NAME --resource-group $RESOURCE_GROUP --sku Basic --location $LOCATION
# The endpoint follows a fixed pattern: https://<service>.search.windows.net
SEARCH_ENDPOINT="https://$SEARCH_NAME.search.windows.net"
SEARCH_KEY=$(az search admin-key show --service-name $SEARCH_NAME --resource-group $RESOURCE_GROUP --query primaryKey -o tsv)
echo "Search Endpoint: $SEARCH_ENDPOINT"
echo "Search Key: $SEARCH_KEY"
# Append to .env
cat >> .env << EOF
AZURE_SEARCH_ENDPOINT=$SEARCH_ENDPOINT
AZURE_SEARCH_KEY=$SEARCH_KEY
EOF

Creates Azure AI Search for vector indexing. The Basic SKU is fine for development. Pitfall: admin keys grant write access; use a query key in client-facing apps.
Generate Embeddings and RAG Query
import { OpenAIClient } from '@azure/openai';
import { SearchClient, AzureKeyCredential } from '@azure/search-documents';
import dotenv from 'dotenv';

dotenv.config();

const openai = new OpenAIClient(
  process.env.AZURE_OPENAI_ENDPOINT!,
  new AzureKeyCredential(process.env.AZURE_OPENAI_KEY!),
  { apiVersion: '2024-10-21' }
);

// Shape of the documents stored in the rag-index index
interface RagDoc {
  id: string;
  pageContent: string;
  contentVector: number[];
}

const searchClient = new SearchClient<RagDoc>(
  process.env.AZURE_SEARCH_ENDPOINT!,
  'rag-index',
  new AzureKeyCredential(process.env.AZURE_SEARCH_KEY!)
);

async function generateEmbedding(text: string): Promise<number[]> {
  const emb = await openai.getEmbeddings('text-embedding-3-small', [text]);
  return emb.data[0].embedding;
}

async function ragQuery(query: string) {
  const queryEmb = await generateEmbedding(query);
  const results = await searchClient.search('*', {
    vectorSearchOptions: {
      queries: [
        { kind: 'vector', vector: queryEmb, kNearestNeighborsCount: 3, fields: ['contentVector'] },
      ],
    },
  });
  const docs: RagDoc[] = [];
  for await (const r of results.results) docs.push(r.document);
  console.log('Top docs:', docs.map(d => d.id));
  // RAG: chat with the retrieved context
  const context = docs.map(d => d.pageContent).join('\n');
  const chatResponse = await openai.getChatCompletions(
    process.env.AZURE_OPENAI_DEPLOYMENT!,
    [
      { role: 'system', content: 'Answer based on this context:' },
      { role: 'user', content: `${query}\nContext: ${context}` },
    ]
  );
  console.log(chatResponse.choices[0]?.message?.content);
}

ragQuery('What are the prerequisites for Azure OpenAI?').catch(console.error);

Generates embeddings with text-embedding-3-small (deploy that model first, and install the search client with npm install @azure/search-documents), runs a vector query against the rag-index index (create it beforehand in the Portal), and feeds the top documents to the chat as context. Pitfall: embedding dimensions must match the index schema (1536 for text-embedding-3-small); index your docs first.
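Before the query above can return anything, your documents must be chunked, embedded, and uploaded to rag-index. A naive fixed-size chunker with overlap is enough to start; a sketch (the default sizes are illustrative, tune them to your content):

```typescript
// Split a document into overlapping chunks so each fits a single embedding call.
// Overlap preserves context that would otherwise be cut at chunk boundaries.
export function chunkText(text: string, chunkSize = 800, overlap = 100): string[] {
  if (chunkSize <= overlap) throw new Error('chunkSize must exceed overlap');
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

Each chunk then becomes one RagDoc: embed it with generateEmbedding and upload the batch via the search client's document-upload API before running ragQuery.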
Best Practices
- Secrets management: Use Azure Key Vault, never .env in production.
- Rate limiting: Implement retries with exponential backoff using the SDK's retry options.
- Monitoring: Enable Application Insights for traces and costs.
- Quotas: Monitor TPM/RPM via Metrics; scale deployments.
- Security: Use RBAC on the OpenAI resource; private endpoints.
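The retry recommendation can be sketched as a small wrapper, useful around any call that may return 429 (a hypothetical helper, independent of the SDK's built-in retry options):

```typescript
// Retry an async call with exponential backoff, e.g. around 429 responses.
export async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait 200ms, 400ms, 800ms, ... before the next attempt.
      await new Promise(res => setTimeout(res, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}

// Usage: const response = await withRetry(() => client.getChatCompletions(...));
```

In production, prefer inspecting the error's status code and honoring the Retry-After header rather than retrying blindly.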
Common Errors to Avoid
- 404 Deployment not found: Check the exact name with az cognitiveservices account deployment list.
- 401 Unauthorized: Expired key or wrong endpoint (it must end with /).
- 429 Too Many Requests: Batch or throttle your calls and back off; use streaming for long outputs.
- Vector mismatch: Embedding dimensions don't match the index schema.
Next Steps
- Official docs: Azure OpenAI
- JS SDK: npm @azure/openai
- Advanced: Agents with Assistants API and fine-tuning (GPT-4o).
- Learni Azure AI Training for DP-600 certification.