Introduction
Corrective RAG is a technique that enhances classic RAG systems by adding a critique and correction step. Instead of blindly accepting retrieved documents, the system evaluates their relevance and corrects the retrieved set when necessary, here by discarding the documents the critic rejects. This approach reduces hallucinations and improves response reliability. In 2026, it has become essential for professional applications that demand high precision. This tutorial shows you how to implement a complete, working Corrective RAG pipeline in Python.
Prerequisites
- Python 3.10 or higher
- OpenAI API key or compatible local model
- Basic knowledge of LLMs and embeddings
- Virtual environment recommended
Install Dependencies
pip install langchain langchain-openai langchain-community chromadb
This command installs LangChain and ChromaDB, which are required to build the Corrective RAG pipeline. Make sure you are using a clean virtual environment.
Configure the LLM and Embeddings
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
embeddings = OpenAIEmbeddings()
We configure the language model and embeddings. Setting temperature=0 makes responses far more repeatable, although it does not guarantee identical output on every call.
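Both classes read the API key from the OPENAI_API_KEY environment variable by default. A minimal sketch of a fail-fast check, assuming that variable is how you supply the key; the helper name require_api_key is hypothetical and not part of LangChain:

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY", env=None) -> str:
    """Return the API key stored under `name`, or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    (Hypothetical helper, not a LangChain function.)
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the LLM.")
    return key
```

Calling require_api_key() once at startup surfaces a missing key before any model call is attempted.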
Create the Vector Store
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document

docs = [
    # "Paris is the capital of France."
    Document(page_content="Paris est la capitale de la France."),
    # "The Eiffel Tower is 324 meters tall."
    Document(page_content="La Tour Eiffel mesure 324 mètres."),
]
vectorstore = Chroma.from_documents(docs, embeddings)
We create a simple vector database with Chroma to store the documents. This vector store serves as the retrieval source in the pipeline.
Implement the Critic
from langchain_core.prompts import ChatPromptTemplate

# In English: "Assess whether this document answers the question: {question} /
# Document: {document} / Answer only with OUI (yes) or NON (no)."
critique_prompt = ChatPromptTemplate.from_template(
    "Évalue si ce document répond à la question : {question}\n"
    "Document : {document}\n"
    "Réponds uniquement par OUI ou NON."
)
critic = critique_prompt | llm
The critic evaluates the relevance of each retrieved document. It answers OUI (yes) or NON (no), and that verdict decides whether the document is kept.
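Because the critic replies in free text, it helps to normalize its answer before acting on it. A minimal sketch, assuming the critic was asked to answer OUI or NON as in the prompt above; parse_verdict is a hypothetical helper, and treating any unrecognized reply as a rejection is a design assumption rather than part of the original pipeline:

```python
import re
import unicodedata


def parse_verdict(text: str) -> bool:
    """Return True only when the reply's first word is OUI (or YES)."""
    # Strip accents and case so "Oui.", " OUI" and "oui !" are treated alike.
    normalized = unicodedata.normalize("NFKD", text)
    normalized = "".join(c for c in normalized if not unicodedata.combining(c))
    words = re.findall(r"[a-z]+", normalized.lower())
    # Assumption: anything other than a leading yes-word counts as a rejection.
    return bool(words) and words[0] in {"oui", "yes"}
```

Rejecting by default errs on the side of dropping a document rather than passing an unvetted one to the generator.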
Complete Corrective RAG Pipeline
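from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

def correct_documents(docs, question):
    """Keep only the documents the critic validates for this question."""
    corrected = []
    for doc in docs:
        result = critic.invoke({"question": question, "document": doc.page_content})
        # The critic is asked to answer only OUI or NON.
        if "OUI" in result.content.upper():
            corrected.append(doc)
    return corrected

def retrieve_and_correct(question):
    """Retrieve candidates, filter them with the critic, and join them as text."""
    docs = correct_documents(retriever.invoke(question), question)
    return "\n\n".join(doc.page_content for doc in docs)

# In English: "Answer the question using these documents: ... Question: ..."
prompt = ChatPromptTemplate.from_template(
    "Réponds à la question en utilisant ces documents : {context}\n"
    "Question : {question}"
)

# The chain input is the question string; it is passed to both branches.
chain = (
    {"context": RunnableLambda(retrieve_and_correct), "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
The full pipeline retrieves documents, has the critic evaluate each one, and passes only the validated ones to the generator. The correct_documents function is the core of Corrective RAG.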
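To see the control flow without any API calls, the same sequence of retrieving, critiquing, and filtering can be sketched with plain functions. This is an illustration under stated assumptions: fake_retriever and fake_critic are hypothetical stand-ins (word overlap instead of embeddings and an LLM), not LangChain components.

```python
def fake_retriever(question, corpus, k=3):
    """Hypothetical stand-in for a vector search: rank by shared words."""
    q_words = set(question.lower().split())
    ranked = sorted(
        corpus,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def fake_critic(question, document):
    """Hypothetical stand-in for the LLM critic: OUI if a long word is shared."""
    q_words = {w.strip("?.,!") for w in question.lower().split() if len(w) > 4}
    d_words = {w.strip("?.,!") for w in document.lower().split()}
    return "OUI" if q_words & d_words else "NON"


def corrective_retrieve(question, corpus, k=3):
    """Retrieve up to k candidates, then keep only those the critic validates."""
    candidates = fake_retriever(question, corpus, k)
    return [d for d in candidates if fake_critic(question, d) == "OUI"]


corpus = [
    "Paris est la capitale de la France.",
    "La Tour Eiffel mesure 324 mètres.",
]
kept = corrective_retrieve("Quelle est la capitale de la France ?", corpus)
# kept == ["Paris est la capitale de la France."]
```

Both documents are retrieved, but only the first survives the critic, which is the filtering behavior the real chain applies with an LLM.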
Best Practices
- Always test the critic with edge cases
- Limit the number of documents sent to the critic to control costs
- Use a lightweight model for critique and a powerful model for final generation
- Log critic decisions for analysis
- Add a minimum similarity threshold before critique
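Several of these practices can be combined in a small pre-filter that runs before the critic. A minimal sketch, assuming each candidate arrives as a (text, similarity) pair where a higher score means a closer match; the function names, the 0.3 threshold, and the cap of 3 are illustrative assumptions to tune on your own data:

```python
import logging

logger = logging.getLogger("corrective_rag")


def prefilter(scored_docs, min_similarity=0.3, max_docs=3):
    """Drop weak matches and cap how many documents reach the critic."""
    # Assumption: a higher score means a more similar document.
    strong = [(doc, score) for doc, score in scored_docs if score >= min_similarity]
    strong.sort(key=lambda pair: pair[1], reverse=True)
    selected = strong[:max_docs]
    logger.info("prefilter kept %d of %d candidates", len(selected), len(scored_docs))
    return [doc for doc, _ in selected]


def log_decision(question, doc, accepted):
    """Record one critic decision so rejections can be analyzed later."""
    logger.info("critic question=%r accepted=%s doc=%r", question, accepted, doc[:60])
```

If your vector store reports distances rather than similarities, the comparison and the sort order need to be inverted.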
Common Mistakes to Avoid
- Forgetting to handle the case when no documents are validated by the critic
- Using the same temperature for both critic and generator
- Not testing the pipeline on out-of-domain questions
- Ignoring the extra LLM call costs generated by the critique step
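The first and last mistakes are easy to guard against in code. A minimal sketch, assuming validated documents arrive as a list of strings; build_context, NO_CONTEXT_MESSAGE, and llm_calls_per_query are hypothetical names, and returning a fixed message is only one possible fallback:

```python
NO_CONTEXT_MESSAGE = "No relevant document was found for this question."


def build_context(validated_docs):
    """Return (context, has_context).

    Callers can skip generation, or answer with the fallback message,
    when has_context is False.
    """
    if not validated_docs:
        return NO_CONTEXT_MESSAGE, False
    return "\n\n".join(validated_docs), True


def llm_calls_per_query(k):
    """One critic call per retrieved document plus one final generation call."""
    return k + 1
```

With k=3 as in the retriever above, each question costs four LLM calls, which is worth budgeting for before raising k.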
Going Further
Explore related approaches such as Self-RAG and the full CRAG (Corrective Retrieval Augmented Generation) method. Check out our Learni courses to master modern RAG architectures.