How to Automate PII Redaction in Production 2026

Introduction

Redacting PII (Personally Identifiable Information) is a core safeguard for any application that handles personal data, and it supports GDPR principles such as data minimization. This tutorial walks step by step through a regex-based redaction module for emails, phone numbers, and IBANs, applies it to nested JSON payloads and API traffic, and points to NLP services for contextual data that patterns cannot catch. The code is meant as a production starting point with low per-request overhead; validate the patterns against your own data before relying on them.

Prerequisites

  • Node.js 20+ and TypeScript 5.4+
  • Advanced knowledge of regex and text processing
  • AWS or GCP account for optional NLP services
  • Familiarity with Express middleware or Fastify hooks

Project Initialization

terminal
npm init -y
npm install fastify
npm install --save-dev typescript @types/node tsx
npx tsc --init

Initialize a strict TypeScript project. tsx enables direct execution without compilation during development.

Solution Architecture

We will build a small redaction module with a simple interface. Each PII type is handled by its own detector, which keeps maintenance and unit testing straightforward.

PII Detector Definitions

src/detectors.ts
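export interface PIIDetector {
  name: string;
  regex: RegExp; // must carry the g flag so that every occurrence is replaced
  replacement: string;
}

export const detectors: PIIDetector[] = [
  // Email: the TLD class is [A-Za-z]; a "|" inside a character class matches a literal pipe.
  { name: 'email', regex: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g, replacement: '[EMAIL_REDACTED]' },
  // French phone numbers: 0X XX XX XX XX or +33 X XX XX XX XX. There is no \b before "+"
  // because "+" is not a word character, so \b would fail after a space or at line start.
  { name: 'phone', regex: /(?:\+33[\s.-]?|\b0)[1-9](?:[\s.-]?\d{2}){4}\b/g, replacement: '[PHONE_REDACTED]' },
  // IBAN: country code, check digits, groups of four, then an optional shorter final group
  // (a French IBAN has 27 characters, so its last group holds only three).
  { name: 'iban', regex: /\b[A-Z]{2}\d{2}(?:\s?[A-Z0-9]{4}){4,7}(?:\s?[A-Z0-9]{1,3})?\b/g, replacement: '[IBAN_REDACTED]' }
];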

The email pattern is generic, the phone pattern targets French numbers (national and +33 forms), and the IBAN pattern covers common European lengths. Each detector is independent, which makes it simple to add new PII types.
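As an illustration of adding a new PII type, the sketch below defines an extra detector for IPv4 addresses. The ipv4Detector name, its pattern, and the applyDetectors helper are assumptions made for this example and are not part of the tutorial's detector set; the interface is repeated so the snippet runs on its own.

```typescript
// Same shape as the PIIDetector interface in src/detectors.ts, repeated to keep the sketch standalone.
interface PIIDetector {
  name: string;
  regex: RegExp;
  replacement: string;
}

// Hypothetical extra detector: dotted-quad IPv4 addresses with each octet limited to 0-255.
const ipv4Detector: PIIDetector = {
  name: 'ipv4',
  regex: /\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b/g,
  replacement: '[IP_REDACTED]',
};

// Illustrative helper that applies detectors in order, as a redaction loop would.
function applyDetectors(text: string, list: PIIDetector[]): string {
  return list.reduce((acc, d) => acc.replace(d.regex, d.replacement), text);
}

console.log(applyDetectors('Login from 192.168.1.10 accepted', [ipv4Detector]));
// prints "Login from [IP_REDACTED] accepted"
```

A pattern like this also matches four-part version strings such as 1.2.3.4, so it is worth checking against real log samples before enabling it.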

Redaction Engine Implementation

src/redactor.ts
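import { detectors, type PIIDetector } from './detectors';

// Apply every detector in order; built-in detectors run before custom ones.
export function redactPII(text: string, customDetectors: PIIDetector[] = []): string {
  let result = text;
  for (const detector of [...detectors, ...customDetectors]) {
    result = result.replace(detector.regex, detector.replacement);
  }
  return result;
}

// Only plain objects are traversed. Date, Buffer, Error, and stream instances are
// returned untouched instead of being flattened into empty objects.
function isPlainObject(value: unknown): value is Record<string, unknown> {
  if (value === null || typeof value !== 'object') return false;
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

// Recursively redact every string value; keys and non-string values are left as-is.
export function redactObject(value: unknown, customDetectors: PIIDetector[] = []): unknown {
  if (typeof value === 'string') return redactPII(value, customDetectors);
  // An explicit arrow function is needed: map would otherwise pass the index as the second argument.
  if (Array.isArray(value)) return value.map((item) => redactObject(item, customDetectors));
  if (isPlainObject(value)) {
    const redacted: Record<string, unknown> = {};
    for (const key of Object.keys(value)) {
      redacted[key] = redactObject(value[key], customDetectors);
    }
    return redacted;
  }
  return value;
}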

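The redactObject function walks JSON-like values recursively and redacts every string it finds; keys and non-string values (numbers, booleans, null) are left unchanged. This matters for REST payloads and structured logs, where PII is often nested several levels deep.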

Fastify Middleware for APIs

src/middleware.ts
import type { FastifyRequest, FastifyReply } from 'fastify';
import { redactObject } from './redactor';

export async function piiRedactionMiddleware(request: FastifyRequest, reply: FastifyReply): Promise<void> {
  // Inbound: handlers and loggers further down the chain only see redacted values.
  if (request.body) {
    request.body = redactObject(request.body);
  }
  if (request.query) {
    request.query = redactObject(request.query);
  }
  // Outbound: wrap reply.send so payloads are redacted before they are serialized.
  const originalSend = reply.send;
  reply.send = function (payload: any) {
    return originalSend.call(this, redactObject(payload));
  };
}

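Despite the name, this function is written as a Fastify hook: register it with fastify.addHook('preHandler', piiRedactionMiddleware) so that it runs after the body has been parsed. It redacts string values in the request body, the query string, and any payload passed to reply.send. Error responses and binary or streamed payloads are not reliably covered by this wrapper and should be tested separately.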

Complete Unit Tests

src/redactor.test.ts
import { describe, it } from 'node:test';
import assert from 'node:assert/strict';
import { redactPII, redactObject } from './redactor';

describe('PII Redaction', () => {
  it('redacts email correctly', () => {
    assert.equal(redactPII('Contact: test@example.com'), 'Contact: [EMAIL_REDACTED]');
  });
  it('redacts nested objects', () => {
    const input = { user: { email: 'john@doe.fr', phone: '0612345678' } };
    assert.deepEqual(redactObject(input), { user: { email: '[EMAIL_REDACTED]', phone: '[PHONE_REDACTED]' } });
  });
});

The tests cover a simple string and a nested structure. They use Node's built-in test runner (node:test) and assert module, so no extra dependency is needed and the file can be executed directly. Run with npx tsx src/redactor.test.ts.

Production Configuration

redactor.config.json
{
  "detectors": ["email", "phone", "iban"],
  "performance": {
    "maxTextLength": 100000,
    "timeoutMs": 50
  },
  "logging": {
    "redactionCount": true,
    "sampleRate": 0.01
  }
}

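Keeping these settings in an external file lets you enable or disable detectors and adjust limits without touching code. The engine shown earlier does not read this file yet, and picking up changes without a restart requires reloading it at runtime.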
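One way to wire this configuration into the engine is sketched below. The RedactorConfig type, selectDetectors, and redactWithConfig are hypothetical names, and the registry array stands in for the detectors exported from src/detectors.ts. Rejecting oversized input instead of truncating it is an assumption about the desired fail-closed behaviour.

```typescript
interface PIIDetector {
  name: string;
  regex: RegExp;
  replacement: string;
}

// Hypothetical type mirroring the fields of redactor.config.json.
interface RedactorConfig {
  detectors: string[];
  performance: { maxTextLength: number; timeoutMs: number };
  logging: { redactionCount: boolean; sampleRate: number };
}

// Stand-in for the detectors array from src/detectors.ts.
const registry: PIIDetector[] = [
  { name: 'email', regex: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g, replacement: '[EMAIL_REDACTED]' },
  { name: 'phone', regex: /(?:\+33[\s.-]?|\b0)[1-9](?:[\s.-]?\d{2}){4}\b/g, replacement: '[PHONE_REDACTED]' },
];

// Resolve configured names to detectors; an unknown name is treated as a configuration error.
function selectDetectors(config: RedactorConfig, available: PIIDetector[]): PIIDetector[] {
  return config.detectors.map((name) => {
    const detector = available.find((d) => d.name === name);
    if (!detector) throw new Error(`Unknown detector in config: ${name}`);
    return detector;
  });
}

// Enforce maxTextLength before running any regex, then apply only the enabled detectors.
function redactWithConfig(text: string, config: RedactorConfig, available: PIIDetector[]): string {
  if (text.length > config.performance.maxTextLength) {
    throw new Error(`Input of ${text.length} characters exceeds maxTextLength`);
  }
  return selectDetectors(config, available).reduce(
    (acc, d) => acc.replace(d.regex, d.replacement),
    text,
  );
}

// In a real service the string would come from reading redactor.config.json.
const rawConfig = '{"detectors":["email"],"performance":{"maxTextLength":100000,"timeoutMs":50},"logging":{"redactionCount":true,"sampleRate":0.01}}';
const config = JSON.parse(rawConfig) as RedactorConfig;
console.log(redactWithConfig('Write to a@b.co or call 0612345678', config, registry));
// prints "Write to [EMAIL_REDACTED] or call 0612345678" because only "email" is enabled
```

Note that timeoutMs is not enforced here: a synchronous regex replace cannot be interrupted from the same thread, so a real timeout would need a worker or a length limit like the one above.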

Best Practices

  • Always test regex patterns on real datasets before deployment
  • Implement logging of redaction statistics (without storing the data)
  • Provide a dry-run mode for audits
  • Version detector sets for traceability
  • Combine with NLP solutions (Presidio, AWS Comprehend) for complex cases
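The statistics and dry-run practices from the list above can be sketched together. The redactWithStats function and the RedactionReport type are hypothetical names; the idea is to count matches per detector without ever storing the matched values, and to return the input untouched when dryRun is set.

```typescript
interface PIIDetector {
  name: string;
  regex: RegExp;
  replacement: string;
}

// Hypothetical report shape: the resulting text plus a match count per detector name.
interface RedactionReport {
  text: string;
  counts: Record<string, number>;
}

function redactWithStats(text: string, detectors: PIIDetector[], dryRun = false): RedactionReport {
  const counts: Record<string, number> = {};
  let result = text;
  for (const detector of detectors) {
    let hits = 0;
    // The callback only counts; the matched value itself is never kept.
    result = result.replace(detector.regex, () => {
      hits += 1;
      return detector.replacement;
    });
    counts[detector.name] = hits;
  }
  // In dry-run mode the counts reflect what would be redacted, but the input is returned as-is.
  return { text: dryRun ? text : result, counts };
}

const emailDetector: PIIDetector = {
  name: 'email',
  regex: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g,
  replacement: '[EMAIL_REDACTED]',
};

const report = redactWithStats('From a@b.co to c@d.fr', [emailDetector], true);
console.log(report.counts); // { email: 2 }
```

Only the counts should be logged; in dry-run mode the returned text still contains the original PII and must be handled accordingly.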

Common Mistakes to Avoid

  • Forgetting to handle arrays and nested objects
  • Using overly broad regex that breaks legitimate data
  • Failing to manage timeouts on very large text volumes
  • Storing logs before redaction
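One way to limit the damage from overly broad patterns is to validate a candidate before replacing it. The sketch below applies the standard IBAN mod-97 check to each regex match and redacts only the candidates that pass. The names isValidIban, redactValidIbans, and ibanCandidate are hypothetical, and the candidate pattern is an assumption that accepts shorter IBANs than the tutorial's detector.

```typescript
// Standard IBAN check: move the first four characters to the end, map letters to
// 10..35, and verify that the resulting number is congruent to 1 modulo 97.
function isValidIban(candidate: string): boolean {
  const iban = candidate.replace(/\s+/g, '').toUpperCase();
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(iban)) return false;
  const rearranged = iban.slice(4) + iban.slice(0, 4);
  let remainder = 0;
  for (let i = 0; i < rearranged.length; i++) {
    const code = rearranged.charCodeAt(i);
    // 'A' is char code 65 and maps to 10; digits map to their own value.
    const value = code >= 65 ? code - 55 : code - 48;
    // Letters contribute two decimal digits, digits contribute one.
    remainder = (remainder * (value > 9 ? 100 : 10) + value) % 97;
  }
  return remainder === 1;
}

// Assumed candidate pattern: groups of four with an optional shorter final group.
const ibanCandidate = /\b[A-Z]{2}\d{2}(?:\s?[A-Z0-9]{4}){2,7}(?:\s?[A-Z0-9]{1,3})?\b/g;

// Redact only candidates that pass the checksum; other matches are left untouched.
function redactValidIbans(text: string): string {
  return text.replace(ibanCandidate, (match) => (isValidIban(match) ? '[IBAN_REDACTED]' : match));
}

console.log(redactValidIbans('Pay GB82 WEST 1234 5698 7654 32 now'));
// prints "Pay [IBAN_REDACTED] now"
```

The trade-off is that a real IBAN immediately followed by an uppercase word can be captured together with that word, fail the checksum, and stay unredacted. Where leaking is worse than over-redacting, keep the plain detector or retry the check on shorter prefixes of the match.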

Going Further

Integrate advanced NLP models and explore our Learni training programs on data compliance and security.
