How to Master OpenRouter for AI APIs in 2026


Introduction

OpenRouter streamlines access to AI models in 2026 with a unified, OpenAI-compatible API that routes to hundreds of models across providers such as Anthropic, OpenAI, Google, and Mistral. Unlike calling each provider's API directly, OpenRouter handles fallbacks when a model fails, helps optimize costs through its price/latency leaderboards, and supports server-sent events (SSE) streaming for real-time chat.

Why this expert tutorial? For senior devs: implement custom routing (e.g., prefer Claude 3.5 Sonnet while it stays under $0.50/M tokens), model scouting (automatically testing for the best model), usage monitoring, and resilience through exponential retries. The result: AI APIs with near-perfect uptime and significant cost savings. Ideal for SaaS, AI agents, or scalable RAG. We start with a Next.js 15 project and build a production-ready chat API.

Prerequisites

  • Node.js 20+ and npm/yarn/pnpm
  • Next.js 15+ with TypeScript (npx create-next-app@latest)
  • OpenRouter account: create an API key at openrouter.ai/keys (free initial credits)
  • Advanced knowledge: OpenAI SDK, async/await, Zod validation, Vercel deployment
  • Tools: .env.local, curl for testing

Initialize the Next.js Project

terminal
npx create-next-app@latest openrouter-app --typescript --tailwind --eslint --app --src-dir --import-alias "@/*"
cd openrouter-app
npm install openai@latest zod@latest
touch .env.local

Creates a Next.js 15 App Router project with TypeScript and Tailwind, then installs the OpenAI SDK (OpenRouter-compatible via baseURL) and Zod for input validation (@types/node already ships with the TypeScript template). touch .env.local creates the secrets file, since create-next-app does not generate a .env.example. Pitfall: use --app for the modern App Router, not Pages.

Configure .env and OpenRouter Client

lib/openrouter.ts
import OpenAI from 'openai';

// Next.js loads .env.local automatically; no dotenv import needed.
if (!process.env.OPENROUTER_API_KEY) {
  throw new Error('OPENROUTER_API_KEY is missing from .env.local');
}

const openrouter = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: 'https://openrouter.ai/api/v1', // route every call through OpenRouter
});

export default openrouter;

// Reusable alias: OpenRouter messages are identical to the OpenAI SDK's.
export type OpenRouterMessage = OpenAI.Chat.ChatCompletionMessageParam;

Sets up a customized OpenAI client for OpenRouter: baseURL points at the proxy API, and the key comes from .env.local, which Next.js loads automatically (dotenv is unnecessary and was never installed). The exported type alias keeps message typing consistent across routes. Pitfall: the client fails fast if OPENROUTER_API_KEY is missing (e.g., OPENROUTER_API_KEY=sk-or-...); no default model is set, for flexibility.
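
For reference, .env.local only needs one line for now (the sk-or-... value comes from your OpenRouter dashboard):

.env.local
OPENROUTER_API_KEY=sk-or-...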

First Call: Basic Chat Completion

Let's test a simple call to Claude 3.5 Sonnet, the 2026 leaderboard model for reasoning. OpenRouter adds traceability headers (X-Provider, X-Requested-Model). Validate with curl before writing any code.
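
A minimal smoke test against the OpenRouter endpoint (assumes OPENROUTER_API_KEY is exported in your shell):

terminal
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "anthropic/claude-3.5-sonnet", "messages": [{"role": "user", "content": "Hello"}]}'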

Implement Basic Chat API Route

src/app/api/chat/route.ts
import { NextRequest, NextResponse } from 'next/server';
import openrouter from '@/lib/openrouter';
import { z } from 'zod';

const schema = z.object({
  model: z.string().default('anthropic/claude-3.5-sonnet'),
  messages: z.array(z.object({ role: z.enum(['user', 'system', 'assistant']), content: z.string() })),
});

export async function POST(req: NextRequest) {
  try {
    const { model, messages } = schema.parse(await req.json());
    const completion = await openrouter.chat.completions.create({
      model,
      messages,
      temperature: 0.7,
    });
    return NextResponse.json({ result: completion.choices[0].message });
  } catch (error) {
    if (error instanceof z.ZodError) {
      return NextResponse.json({ error: error.issues }, { status: 400 });
    }
    return NextResponse.json({ error: 'OpenRouter call failed' }, { status: 502 });
  }
}

The POST /api/chat route validates inputs with Zod, then calls chat.completions and returns the AI message. Analogy: Zod is a gatekeeper against malformed payloads (it validates shape, not prompt-injection content). Pitfall: without the try/catch, an invalid body throws an unhandled 500; test with curl -X POST http://localhost:3000/api/chat -H 'Content-Type: application/json' -d '{"messages":[{"role":"user","content":"Hello"}]}'.

Add Server-Sent Events Streaming

src/app/api/chat/stream/route.ts
import { NextRequest, NextResponse } from 'next/server';
import openrouter from '@/lib/openrouter';
import { z } from 'zod';

const schema = z.object({
  model: z.string().default('anthropic/claude-3.5-sonnet'),
  messages: z.array(z.object({ role: z.enum(['user', 'system', 'assistant']), content: z.string() })),
});

export async function POST(req: NextRequest) {
  try {
    const { model, messages } = schema.parse(await req.json());
    const stream = await openrouter.chat.completions.create({
      model,
      messages,
      stream: true,
      temperature: 0.7,
    });
    const encoder = new TextEncoder();
    const streamResponse = new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of stream) {
            const delta = chunk.choices[0]?.delta?.content;
            if (!delta) continue; // skip empty deltas (e.g., role-only chunks)
            // JSON-encode so newlines in the content cannot break SSE framing
            controller.enqueue(encoder.encode(`data: ${JSON.stringify(delta)}\n\n`));
          }
          controller.enqueue(encoder.encode('data: [DONE]\n\n'));
          controller.close();
        } catch (err) {
          // error() settles the stream; calling close() afterwards would throw
          controller.error(err);
        }
      },
    });
    return new Response(streamResponse, {
      headers: { 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache' },
    });
  } catch (error) {
    return NextResponse.json({ error: 'Stream failed' }, { status: 500 });
  }
}

Enables stream: true for SSE. The ReadableStream forwards each delta in real time, JSON-encoded so multi-line content survives the data: framing. The SSE headers are essential for browsers. Pitfall: forgetting [DONE] leaves clients hanging, and close() must only run on the success path, since calling it after error() throws.
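
On the client side, a minimal sketch for consuming this stream with fetch (the file path is illustrative; it assumes the JSON-encoded framing above):

src/lib/streamClient.ts
// Reads the SSE stream from /api/chat/stream and feeds each delta to a callback.
export async function streamChat(
  messages: { role: string; content: string }[],
  onDelta: (text: string) => void,
) {
  const res = await fetch('/api/chat/stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const events = buffer.split('\n\n'); // SSE events end with a blank line
    buffer = events.pop() ?? ''; // keep any trailing partial event
    for (const event of events) {
      const data = event.replace(/^data: /, '');
      if (data === '[DONE]') return;
      onDelta(JSON.parse(data)); // undo the route's JSON encoding
    }
  }
}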

Advanced Routing and Fallbacks

OpenRouter excels at intelligent routing: specify provider preferences or a models array for automatic fallbacks (e.g., fall back to Gemini if Sonnet is down), and use sort: 'price' to optimize costs. The models-array form is sketched just below; the provider-preferences form is the full route that follows.
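
A minimal sketch of the models-array fallback. The array is an OpenRouter extension, so the body is cast past the OpenAI SDK's types; the Gemini id is an illustrative choice:

sketch: models-array fallback
// OpenRouter tries each model in order until one succeeds.
const completion = await openrouter.chat.completions.create({
  models: ['anthropic/claude-3.5-sonnet', 'google/gemini-pro-1.5'],
  messages,
} as any); // `models` is OpenRouter-specific, hence the cast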

API with Fallbacks and Custom Routing

src/app/api/chat/fallback/route.ts
import { NextRequest, NextResponse } from 'next/server';
import openrouter from '@/lib/openrouter';
import { z } from 'zod';

const schema = z.object({
  messages: z.array(z.object({ role: z.enum(['user', 'system', 'assistant']), content: z.string() })),
});

export async function POST(req: NextRequest) {
  try {
    const { messages } = schema.parse(await req.json());
    // withResponse() exposes the raw HTTP response so OpenRouter's headers are readable
    const { data: completion, response } = await openrouter.chat.completions
      .create(
        {
          model: 'anthropic/claude-3.5-sonnet',
          messages,
          provider: {
            allow_fallbacks: true, // reroute if the primary provider is down
            order: ['anthropic', 'openai'],
            sort: 'price', // cheapest provider first; 'latency' and 'throughput' also work
          },
          temperature: 0.7,
        } as any, // `provider` is OpenRouter-specific, unknown to the OpenAI SDK types
        { headers: { 'X-Title': 'My AI App' } }, // attribution header goes in request options
      )
      .withResponse();
    return NextResponse.json({
      result: completion.choices[0].message,
      usage: completion.usage,
      provider: response.headers.get('x-provider'),
    });
  } catch (error: any) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
}

The provider object enables ordered fallbacks, with sort: 'price' selecting the cheapest provider ('latency' and 'throughput' are the other options). Attribution headers like X-Title travel in the request options rather than the body, and withResponse() exposes the x-provider response header for debugging. Parse usage for billing. Pitfall: without allow_fallbacks: true, there is no resilience.

Advanced Monitoring and Model Scouting

src/app/api/models/route.ts
import { NextResponse } from 'next/server';
import openrouter from '@/lib/openrouter';

export async function GET() {
  try {
    // The SDK's models.list() takes no sort option; filter client-side instead
    const models = await openrouter.models.list();
    const filtered = models.data.filter(
      (m) => m.id.includes('claude') || m.id.includes('gpt-4o'),
    );
    return NextResponse.json({ models: filtered.slice(0, 10) });
  } catch (error) {
    return NextResponse.json({ error: 'Models fetch failed' }, { status: 500 });
  }
}

export async function POST() {
  try {
    // Model scouting: run the same prompt on each candidate and compare
    const testPrompt = [{ role: 'user' as const, content: 'Summarize this JSON' }];
    const results = await Promise.all([
      openrouter.chat.completions.create({ model: 'anthropic/claude-3.5-sonnet', messages: testPrompt }),
      openrouter.chat.completions.create({ model: 'openai/gpt-4o', messages: testPrompt }),
    ]);
    // "Best" here = fewest total tokens for the same prompt (a cost proxy)
    const best = results.reduce((acc, res) =>
      (res.usage?.total_tokens ?? Infinity) < (acc.usage?.total_tokens ?? Infinity) ? res : acc,
    );
    return NextResponse.json({ bestModel: best.model });
  } catch (error) {
    return NextResponse.json({ error: 'Scouting failed' }, { status: 500 });
  }
}

GET lists the available models and filters the leaders client-side (the SDK's models.list() takes no sort option). POST scouts models by running the same prompt on each candidate and comparing token usage as a cost proxy. Analogy: a leaderboard, like F1's top 10 drivers. Pitfall: rate limits (60 req/min on the free tier); cache results with Redis in production, as sketched below.
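
A minimal caching sketch with Upstash Redis (the key scheme and one-hour TTL are illustrative; assumes UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN in env):

src/lib/cache.ts
import { Redis } from '@upstash/redis';
import openrouter from '@/lib/openrouter';

const redis = Redis.fromEnv(); // reads UPSTASH_REDIS_REST_URL / UPSTASH_REDIS_REST_TOKEN

export async function cachedCompletion(prompt: string) {
  const key = `or:completion:${prompt}`; // illustrative key scheme; hash long prompts
  const cached = await redis.get<string>(key);
  if (cached) return cached;

  const completion = await openrouter.chat.completions.create({
    model: 'anthropic/claude-3.5-sonnet',
    messages: [{ role: 'user', content: prompt }],
  });
  const text = completion.choices[0].message.content ?? '';
  await redis.set(key, text, { ex: 3600 }); // expire after one hour
  return text;
}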

Best Practices

  • Always validate with Zod: protects against malformed payloads and API abuse.
  • Implement exponential retries with p-retry for resilience (see the sketch after this list).
  • Cache responses (Upstash Redis) for repeated prompts; this can cut costs roughly in half.
  • Monitor usage metadata and the x-provider header via Vercel Logs/Prometheus.
  • Secure your key: Vercel env vars, never commit .env; rotate keys monthly.
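
A minimal retry sketch with p-retry (npm install p-retry; the retry count and backoff timings are illustrative):

src/lib/retry.ts
import pRetry, { AbortError } from 'p-retry';
import openrouter from '@/lib/openrouter';
import type { OpenRouterMessage } from '@/lib/openrouter';

export async function completionWithRetry(messages: OpenRouterMessage[]) {
  return pRetry(
    async () => {
      try {
        return await openrouter.chat.completions.create({
          model: 'anthropic/claude-3.5-sonnet',
          messages,
        });
      } catch (err: any) {
        // Don't retry non-429 client errors: they will fail identically every time
        if (err?.status && err.status !== 429 && err.status < 500) throw new AbortError(err);
        throw err;
      }
    },
    { retries: 3, factor: 2, minTimeout: 1000 }, // waits ~1s, 2s, 4s between attempts
  );
}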

Common Errors to Avoid

  • Forgetting baseURL: calls go straight to api.openai.com, where an sk-or- key fails authentication.
  • Ignoring rate limits: 20k req/day free; implement queues (BullMQ).
  • Not parsing the stream: true response: frontend clients hang on a missing [DONE] or choke on empty deltas.
  • Fallbacks without allow_fallbacks: total downtime when the primary provider fails (e.g., an Anthropic outage).

Next Steps

Dive deeper with the OpenRouter docs. Implement RAG with Pinecone + OpenRouter. Check out our advanced AI trainings at Learni for autonomous agents and multi-model fine-tuning.