Introduction
Groq delivers very fast LLM inference thanks to its LPU hardware. For intermediate developers, mastering its API enables building responsive applications such as chatbots, agents, or RAG pipelines. This tutorial covers installing the official SDK, basic calls, real-time streaming, tool usage, and error handling. The examples are short, working starting points that you can adapt for production. The focus is on TypeScript for type safety and maintainability.
Prerequisites
- Node.js 20+
- Groq account with an API key
- Basic knowledge of TypeScript and async/await
- npm or pnpm
Installation and Configuration
npm install groq-sdk dotenv
# .env
GROQ_API_KEY=gsk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The official groq-sdk handles authentication and retries. dotenv securely loads the API key without exposing it in source code.
Client Initialization
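import Groq from 'groq-sdk';
import 'dotenv/config';

// Shared client instance, imported by the other examples.
export const groq = new Groq({
  apiKey: process.env.GROQ_API_KEY!,
});

Create a reusable singleton client. The TypeScript non-null assertion (!) only tells the compiler to treat the value as defined. It performs no check at runtime, so a missing key surfaces later as an authentication error unless you validate the variable yourself.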
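Because the non-null assertion does nothing at runtime, a small guard can make a missing key fail fast with a clear message. The following is a minimal sketch; `requireEnv` is a hypothetical helper introduced here for illustration and is not part of groq-sdk or dotenv.

```typescript
// Hypothetical helper (not part of groq-sdk or dotenv): return a required
// variable from the given environment map, or throw a descriptive error.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

With this in place, the client could be created with `apiKey: requireEnv('GROQ_API_KEY', process.env)` instead of the non-null assertion.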
Simple Completion Call
import { groq } from './groqClient';

async function simpleChat() {
  const completion = await groq.chat.completions.create({
    model: 'llama-3.3-70b-versatile',
    messages: [{ role: 'user', content: 'Explain Groq in one sentence' }],
    temperature: 0.7,
    max_tokens: 200,
  });
  console.log(completion.choices[0].message.content);
}

simpleChat();

Basic non-streaming call to Llama 3.3: the promise resolves once the full response has been generated. Pick the model that meets the latency and quality needs of your use case.
Streaming Responses
import { groq } from './groqClient';

async function streamChat() {
  const stream = await groq.chat.completions.create({
    model: 'llama-3.3-70b-versatile',
    messages: [{ role: 'user', content: 'Tell a short story' }],
    stream: true,
  });
  for await (const chunk of stream) {
    // Some chunks carry no text, so fall back to an empty string.
    const content = chunk.choices[0]?.delta?.content || '';
    process.stdout.write(content);
  }
}

streamChat();

Streaming displays text as it is generated, which makes the response feel faster to the user. Some chunks carry no content, so the fallback to an empty string avoids passing undefined to process.stdout.write.
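Applications often need the complete text as well as the live output, for example to store the answer once the stream ends. The sketch below accumulates the fragments while forwarding each one to a callback. `DeltaChunk` and `collectStream` are hypothetical names introduced here; the interface only mirrors the fields read in the loop above and is not the SDK's own chunk type.

```typescript
// Minimal structural type for the chunk fields read in the streaming loop.
// This is an assumption for the sketch, not the SDK's exported type.
interface DeltaChunk {
  choices: Array<{ delta?: { content?: string | null } }>;
}

// Consume a stream, forward each non-empty fragment to onToken,
// and return the concatenated text once the stream ends.
async function collectStream(
  stream: AsyncIterable<DeltaChunk>,
  onToken: (token: string) => void = () => {},
): Promise<string> {
  let fullText = '';
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta?.content ?? '';
    if (token === '') continue; // skip chunks that carry no text
    onToken(token);
    fullText += token;
  }
  return fullText;
}
```

A call such as `await collectStream(stream, (t) => process.stdout.write(t))` would print fragments as they arrive and still return the whole response, assuming the stream's chunks are structurally compatible with `DeltaChunk`.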
Advanced Function Calling
import { groq } from './groqClient';

async function toolsChat() {
  // Tool definition: a JSON Schema describes the expected arguments.
  const tools = [{
    type: 'function' as const,
    function: {
      name: 'getWeather',
      description: 'Get the weather',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  }];
  const completion = await groq.chat.completions.create({
    model: 'llama-3.3-70b-versatile',
    messages: [{ role: 'user', content: 'What is the weather in Paris?' }],
    tools,
    tool_choice: 'auto',
  });
  console.log(completion.choices[0].message.tool_calls);
}

toolsChat();

Groq natively supports tools. The model only proposes a call: it returns the tool name and its arguments as a JSON string, and your code decides whether to run it. Always parse and validate received arguments before execution.
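To illustrate the validation step, here is a minimal sketch that parses the arguments of a getWeather call and rejects anything malformed. `ToolCallLike`, `WeatherArgs`, and `parseWeatherArgs` are hypothetical names introduced for this example, and the interface only models the fields used here. The check is written by hand to keep the block dependency-free; a schema library such as Zod could replace it.

```typescript
// Minimal structural type for the fields used below. This is an assumption
// for the sketch, modeled on the tool_calls entries logged above.
interface ToolCallLike {
  function: { name: string; arguments: string };
}

interface WeatherArgs {
  city: string;
}

// Parse and validate the JSON arguments of a getWeather tool call.
// Returns null instead of throwing so the caller decides how to recover.
function parseWeatherArgs(call: ToolCallLike): WeatherArgs | null {
  if (call.function.name !== 'getWeather') return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(call.function.arguments);
  } catch {
    return null; // the arguments were not valid JSON
  }
  if (typeof parsed !== 'object' || parsed === null) return null;
  const city = (parsed as Record<string, unknown>).city;
  if (typeof city !== 'string' || city.trim() === '') return null;
  return { city: city.trim() };
}
```

Only when the result is non-null would the application call its own weather function with `args.city`.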
Best Practices
- Use currently supported models and measure latency on your own workload
- Implement retry logic with exponential backoff where the SDK's built-in retries are not enough
- Set max_tokens explicitly and log token usage to track costs
- Validate tool JSON outputs with Zod
- Cache frequent responses with Redis
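The retry advice above can be sketched as a generic wrapper. This is one possible implementation of exponential backoff with jitter, not the SDK's internal mechanism. `withRetry`, `RetryOptions`, and `isRateLimitOrServerError` are hypothetical names, and the assumption that a thrown error exposes a numeric `status` field should be checked against the errors your client actually throws.

```typescript
interface RetryOptions {
  maxAttempts: number; // total tries, including the first one
  baseDelayMs: number; // delay before the second attempt
  isRetryable: (error: unknown) => boolean;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run fn, retrying retryable failures with exponentially growing delays.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt >= opts.maxAttempts || !opts.isRetryable(error)) throw error;
      // Delay doubles each attempt: base, 2x base, 4x base, ...
      const backoff = opts.baseDelayMs * 2 ** (attempt - 1);
      // Up to 50% random jitter spreads out simultaneous retries.
      await sleep(backoff + Math.random() * backoff * 0.5);
    }
  }
}

// Assumption: the thrown error carries a numeric HTTP `status` field.
function isRateLimitOrServerError(error: unknown): boolean {
  const status = (error as { status?: unknown } | null)?.status;
  return typeof status === 'number' && (status === 429 || status >= 500);
}
```

A request could then be wrapped as `withRetry(() => groq.chat.completions.create({ ... }), { maxAttempts: 4, baseDelayMs: 500, isRetryable: isRateLimitOrServerError })`, where the attempt count and delay are illustrative values.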
Common Errors to Avoid
- Forgetting to handle rate limits (429) and timeouts
- Leaving tool_calls untyped in TypeScript
- Using overly long prompts without truncation
- Ignoring parsing errors from streamed responses
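For the prompt-length pitfall, one simple approach is to drop the oldest conversation turns once a size budget is exceeded. The sketch below keeps every system message plus the most recent other messages that fit. `ChatMessage` and `truncateHistory` are hypothetical names, and character count is used only as a rough stand-in for tokens; a real tokenizer would give a more accurate budget.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Keep all system messages plus the newest other messages within the budget.
// Assumption: character count approximates prompt size well enough here.
function truncateHistory(
  messages: ChatMessage[],
  maxChars: number,
): ChatMessage[] {
  const system = messages.filter((m) => m.role === 'system');
  const rest = messages.filter((m) => m.role !== 'system');
  let used = system.reduce((sum, m) => sum + m.content.length, 0);
  const kept: ChatMessage[] = [];
  // Walk from newest to oldest so recent turns are preserved first.
  for (const message of [...rest].reverse()) {
    if (used + message.content.length > maxChars) break;
    kept.unshift(message); // restore chronological order
    used += message.content.length;
  }
  return [...system, ...kept];
}
```

The truncated array can be passed as `messages` in place of the full history. System messages are always kept, so the result can exceed the budget if they alone are larger than `maxChars`.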
Going Further
Explore our advanced training on LLM agents and inference optimization: https://learni-group.com/formations