
How to Get Started with the Mistral API in 2026

Introduction

The Mistral API, developed by Mistral AI, provides a powerful interface to access cutting-edge language models like Mistral Nemo or Mistral Large. In 2026, it's a top choice for developers seeking a performant open-source alternative to OpenAI, with optimized costs and reduced latency. This tutorial walks you through integrating it into a Node.js app, from your first API call to real-time streaming.

Why Mistral? It excels in multilingual tasks (including French), supports complex reasoning, and offers embeddings for semantic search. Imagine building a conversational chatbot or code assistant: in just a few lines, you can generate smart responses. We cover the basics for beginners with 100% working code. By the end, you'll be ready to scale your AI projects seamlessly.

Prerequisites

  • Node.js 20+ installed
  • A free account on console.mistral.ai to get an API key
  • Basic JavaScript knowledge (async/await)
  • A code editor like VS Code

Initialize the project and install dependencies

terminal
mkdir mistral-api-demo
cd mistral-api-demo
npm init -y
npm install @mistralai/mistralai dotenv
npm install -D nodemon

These commands create an empty Node.js project, install the official Mistral SDK for simplified API calls, and dotenv for secure API key management. nodemon is optional for development (auto-restarts). Run them in your terminal for a ready setup in 30 seconds.

Configure the API key

Create a .env file in the root directory:

MISTRAL_API_KEY=your_api_key_here

Replace your_api_key_here with the key from console.mistral.ai. Add .env to .gitignore so the key is never committed. Note that it is the require('dotenv').config() call at the top of each script, not the SDK, that loads this variable into process.env.
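For example, from the project root (assuming the project is tracked with git):

```shell
# append .env to .gitignore so the key stays out of version control
echo ".env" >> .gitignore
```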

First call: Simple chat completion

chat-simple.js
require('dotenv').config();
const { Mistral } = require('@mistralai/mistralai');

const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });

async function chatSimple() {
  // v1 SDK: chat completions are exposed as client.chat.complete()
  const response = await client.chat.complete({
    model: 'mistral-tiny',
    messages: [
      {
        role: 'user',
        // "Explain AI to me in 3 simple sentences."
        content: 'Explique-moi l\'IA en 3 phrases simples.'
      }
    ]
  });

  console.log(response.choices[0].message.content);
}

chatSimple().catch(console.error);

This complete script sends a user message to the fast, cost-effective 'mistral-tiny' model and prints the response. Use async/await to handle promises. Run with node chat-simple.js. Common pitfall: Ensure .env is loaded before creating the client.

Add a system prompt

chat-system.js
require('dotenv').config();
const { Mistral } = require('@mistralai/mistralai');

const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });

async function chatAvecSystem() {
  const response = await client.chat.complete({
    model: 'mistral-tiny',
    messages: [
      {
        role: 'system',
        // "You are an expert programming assistant. Answer concisely and technically."
        content: 'Tu es un assistant expert en programmation. Réponds de manière concise et technique.'
      },
      {
        role: 'user',
        // "How do I optimize a for loop in JavaScript?"
        content: 'Comment optimiser une boucle for en JavaScript ?'
      }
    ],
    // note: the JavaScript SDK uses camelCase (maxTokens), not max_tokens
    maxTokens: 200
  });

  console.log(response.choices[0].message.content);
}

chatAvecSystem().catch(console.error);

The 'system' role sets the model's behavior, like a technical coach here. Add 'maxTokens' to limit response length and control costs. Think of it as briefing an employee before a task. Test with node chat-system.js.

Managing conversation history

For a realistic chatbot, pass an array of messages with alternating roles (system, user, assistant). This maintains context, improving response coherence.

Conversation with history

chat-historique.js
require('dotenv').config();
const { Mistral } = require('@mistralai/mistralai');

const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });

async function conversation() {
  const messages = [
    {
      role: 'system',
      // "You are a French tour guide."
      content: 'Tu es un guide touristique français.'
    },
    {
      role: 'user',
      // "What are the best sites in Paris?"
      content: 'Quels sont les meilleurs sites à Paris ?'
    },
    {
      role: 'assistant',
      // "The Eiffel Tower, the Louvre and Notre-Dame are must-sees."
      content: 'La Tour Eiffel, le Louvre et Notre-Dame sont incontournables.'
    },
    {
      role: 'user',
      // "More details on the Louvre?"
      content: 'Plus de détails sur le Louvre ?'
    }
  ];

  const response = await client.chat.complete({
    model: 'open-mistral-nemo',
    messages: messages
  });

  console.log('Assistant:', response.choices[0].message.content);
}

conversation().catch(console.error);

History simulates a real conversation: the model 'sees' previous exchanges. Use 'open-mistral-nemo' for more power. Update 'messages' dynamically in a real app. Avoid overly long histories (>10k tokens) to stay within limits.
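In a real app, updating 'messages' dynamically means appending each new turn and trimming old ones before they blow past the limit. A minimal sketch: the addTurn helper and the ~4-characters-per-token estimate are illustrative assumptions, not part of the SDK.

```javascript
// Rough token estimate: ~4 characters per token. A heuristic, not an exact count.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Append a turn, then drop the oldest non-system turns until the
// history fits under maxTokens (the system prompt is always kept).
function addTurn(messages, role, content, maxTokens = 10000) {
  const history = [...messages, { role, content }];
  const system = history.filter((m) => m.role === 'system');
  let turns = history.filter((m) => m.role !== 'system');

  const total = (msgs) =>
    msgs.reduce((sum, m) => sum + estimateTokens(m.content), 0);

  while (turns.length > 1 && total(system) + total(turns) > maxTokens) {
    turns = turns.slice(1); // drop the oldest turn first
  }
  return [...system, ...turns];
}

// Usage: feed each assistant reply back into the history.
let messages = [{ role: 'system', content: 'Tu es un guide touristique.' }];
messages = addTurn(messages, 'user', 'Quels sont les meilleurs sites à Paris ?');
// const response = await client.chat.complete({ model: 'open-mistral-nemo', messages });
// messages = addTurn(messages, 'assistant', response.choices[0].message.content);
```

Dropping whole turns from the front is the simplest strategy; summarizing old turns is a common refinement.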

Stream responses in real time

chat-stream.js
require('dotenv').config();
const { Mistral } = require('@mistralai/mistralai');

const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });

async function chatStream() {
  // v1 SDK: streaming has its own method, client.chat.stream()
  const stream = await client.chat.stream({
    model: 'mistral-tiny',
    // "Tell a short, scary story."
    messages: [{ role: 'user', content: 'Raconte une histoire courte et effrayante.' }]
  });

  for await (const chunk of stream) {
    // each event wraps a completion chunk under chunk.data
    process.stdout.write(chunk.data.choices[0]?.delta?.content || '');
  }
  console.log('\n');
}

chatStream().catch(console.error);

Use client.chat.stream() to receive tokens as they are generated, perfect for reactive UIs. Iterate with a 'for await' loop; each event exposes the completion chunk under chunk.data. It's like watching a movie load token by token. Run node chat-stream.js and watch the magic.

Generate embeddings

embeddings.js
require('dotenv').config();
const { Mistral } = require('@mistralai/mistralai');

const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });

async function genererEmbedding() {
  const response = await client.embeddings.create({
    model: 'mistral-embed',
    // note: the v1 SDK parameter is 'inputs' (a string or array of strings)
    // "Semantic search for AI in French."
    inputs: 'Recherche sémantique pour IA en français.'
  });

  const embedding = response.data[0].embedding;
  console.log('Embedding (first 10 dims):', embedding.slice(0, 10));
  console.log('Total dimension:', embedding.length);
}

genererEmbedding().catch(console.error);

Embeddings convert text to vectors for similarity tasks (e.g., search). 'mistral-embed' produces 1024-dimensional vectors. Great for RAG or clustering. Keep each input within the model's token limit.
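Once you have vectors, comparing two texts comes down to a dot product. A minimal cosine similarity helper in plain JavaScript (no SDK involved):

```javascript
// Cosine similarity between two equal-length embedding vectors:
// 1.0 = same direction, 0 = unrelated, -1 = opposite.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Usage sketch: embed two texts with 'mistral-embed', then compare.
// const [e1, e2] = (await client.embeddings.create({
//   model: 'mistral-embed',
//   inputs: ['chat', 'chaton']
// })).data.map((d) => d.embedding);
// console.log(cosineSimilarity(e1, e2)); // closer to 1 = more similar
```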

Best practices

  • Always use environment variables for the API key: never hardcode it.
  • Limit tokens with max_tokens and monitor quotas in the Mistral console.
  • Handle errors with try/catch and retries for rate limits (e.g., exponential backoff).
  • Choose the right model: 'mistral-tiny' for quick tests, 'mistral-large' for precision.
  • Cache repetitive responses with Redis to scale.
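The retry advice above can be sketched as a generic wrapper. The withRetry helper and its defaults are illustrative, not part of the Mistral SDK:

```javascript
// Retry an async call with exponential backoff. This version retries on
// any error; in production you would check for HTTP 429 / 5xx specifically.
async function withRetry(fn, { retries = 3, baseDelayMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // out of attempts, rethrow
      const delay = baseDelayMs * 2 ** attempt; // 500, 1000, 2000 ms...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage sketch (hypothetical call):
// const response = await withRetry(() =>
//   client.chat.complete({ model: 'mistral-tiny', messages })
// );
```

Adding a little random jitter to the delay further reduces the chance that retrying clients hammer the API in lockstep.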

Common errors to avoid

  • Forgetting dotenv.config(): API key is undefined, 401 error.
  • Overly large history: exceeding the context window (128k tokens on recent models) makes the request fail or drops older context.
  • Ignoring rate limits: 100 req/min, implement queues.
  • Not streaming for interactive apps: high perceived latency.
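The queue idea above can be sketched in a few lines. createQueue is an illustrative helper, not an SDK feature:

```javascript
// Minimal sequential queue: guarantees API calls run one at a time.
// A real rate limiter would also space calls to stay under the per-minute quota.
function createQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    const run = tail.then(task, task); // start task once the previous one settles
    tail = run.catch(() => {}); // keep the chain alive after failures
    return run;
  };
}

// Usage sketch:
// const enqueue = createQueue();
// const reply = await enqueue(() =>
//   client.chat.complete({ model: 'mistral-tiny', messages })
// );
```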

Next steps

Master fine-tuning and agents with our Learni trainings. Check the official Mistral docs, try the Mistral playground for quick tests, or explore LangChain for complex chains.