Presidio Compatibility

PII Eraser provides drop-in compatibility with Microsoft Presidio Analyzer, so you can upgrade your PII detection accuracy and performance without rewriting your existing application logic, while continuing to use Microsoft Presidio Anonymizer.

PII Eraser Benefits

If your application currently relies on Presidio Analyzer, you can use PII Eraser instead and benefit from:

  • Higher detection accuracy from PII Eraser's highly optimized models, particularly on real-world text that doesn't fit rigid patterns, on diverse country-specific entity types, and on long inputs.
  • Automatic multilingual support without needing to specify a language parameter.
  • Better long input support via a 1M-token context window.
  • Faster processing via PII Eraser's extensive inference optimizations on the latest x86 CPUs.
  • Easier security compliance and fewer CVEs, as PII Eraser is built on a Chainguard distroless base image and doesn't use libraries such as cryptography.

Compatibility Endpoints

Enable Presidio Aliases

For a true drop-in replacement (where your application calls /analyze directly), set enable_presidio_aliases: true in your config.yaml file.

PII Eraser provides the following Presidio-compatible routes:

| PII Eraser Route | PII Eraser Alias | Presidio Equivalent | Method | Description |
|---|---|---|---|---|
| /compatibility/presidio/analyze | /analyze | /analyze | POST | Detect PII entities in a single text string. |
| /compatibility/presidio/recognizers | /recognizers | /recognizers | GET | List available recognizers. |
| /compatibility/presidio/supportedentities | /supportedentities | /supportedentities | GET | List all supported entity types. |
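The relationship between the full routes and their aliases can be sketched as a small URL helper. This is only an illustration: the function name presidio_route and the use_alias flag are hypothetical and not part of PII Eraser; the paths and HTTP methods are the ones listed in the table above.

```python
# Route names and HTTP methods taken from the table above.
PREFIX = "/compatibility/presidio"
ROUTES = {"analyze": "POST", "recognizers": "GET", "supportedentities": "GET"}


def presidio_route(base_url: str, name: str, use_alias: bool = False) -> str:
    """Build the URL for a Presidio-compatible route (hypothetical helper).

    use_alias=True assumes enable_presidio_aliases: true is set in config.yaml.
    """
    if name not in ROUTES:
        raise ValueError(f"unknown Presidio-compatible route: {name}")
    base = base_url.rstrip("/")
    if use_alias:
        return f"{base}/{name}"
    return f"{base}{PREFIX}/{name}"


if __name__ == "__main__":
    print(presidio_route("http://pii-eraser:8000", "analyze"))
    print(presidio_route("http://pii-eraser:8000", "analyze", use_alias=True))
```

The full /compatibility/presidio/... routes are the ones listed without reference to the alias setting, so the helper defaults to them.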

Parameter Compatibility

| Presidio Parameter Name | PII Eraser Behavior |
|---|---|
| language | Accepted but ignored. Detection is automatic. |
| correlation_id | Accepted but ignored. |
| return_decision_process | Accepted but ignored. analysis_explanation is always omitted. |
| ad_hoc_recognizers | Accepted but ignored. Use PII Eraser's block list instead. |
| entities | Fully supported. Maps to PII Eraser's entity_types. |
| score_threshold | Fully supported. Maps to PII Eraser's score_threshold. |

Migration Guide

Step 1: Enable Aliases (Optional)

To minimize changes or to use Presidio Anonymizer, add the following to your config.yaml:

enable_presidio_aliases: true

Step 2: Update the Endpoint URL

Change the base URL in your application from your Presidio Analyzer instance to your PII Eraser instance:

# Before (Presidio)
PRESIDIO_URL = "http://presidio-analyzer:3000"

# After (PII Eraser)
PRESIDIO_URL = "http://pii-eraser:8000"
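If the base URL is hard-coded, one option is to read it from the environment so that switching between the two services becomes a deployment setting rather than a code change. A minimal sketch, assuming an environment variable named PRESIDIO_URL (the variable name and the helper analyzer_base_url are illustrative choices, not something PII Eraser requires):

```python
import os

# Default taken from the "After" example above.
DEFAULT_URL = "http://pii-eraser:8000"


def analyzer_base_url(env=None) -> str:
    """Return the analyzer base URL, preferring the PRESIDIO_URL variable."""
    env = os.environ if env is None else env
    # Strip a trailing slash so callers can append "/analyze" safely.
    return env.get("PRESIDIO_URL", DEFAULT_URL).rstrip("/")


if __name__ == "__main__":
    print(analyzer_base_url() + "/analyze")
```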

Step 3: Review Entity Types

PII Eraser uses its own entity type taxonomy, which covers a broader range of types than Presidio's defaults. Review the supported entity types, then either update the entities parameter in your requests or set entity_types in your config file. The Presidio compatibility endpoints respect the entity_types configured there.
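If your requests currently pass Presidio's default entity names, a small translation step can ease the review. The sketch below is an assumption-laden illustration: the mapping pairs well-known Presidio names with PII Eraser names that appear elsewhere on this page, but the correspondences are guesses and should be checked against the /supportedentities output of your own instance. The names ASSUMED_TYPE_MAP and translate_entities are hypothetical.

```python
# ASSUMPTION: these pairings are illustrative guesses, not an official mapping.
# Verify each one against your PII Eraser instance before relying on it.
ASSUMED_TYPE_MAP = {
    "PERSON": "NAME",
    "EMAIL_ADDRESS": "EMAIL",
    "PHONE_NUMBER": "PHONE",
}


def translate_entities(entities):
    """Map Presidio entity names to PII Eraser names, preserving order.

    Unknown names pass through unchanged; duplicates are dropped.
    """
    translated = []
    for name in entities:
        mapped = ASSUMED_TYPE_MAP.get(name, name)
        if mapped not in translated:
            translated.append(mapped)
    return translated


if __name__ == "__main__":
    print(translate_entities(["PERSON", "EMAIL_ADDRESS", "HEALTHCARE_ID"]))
```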

Step 4: Configure Allow/Block Lists (Optional)

Recreate any custom detection logic, such as ad hoc recognizers, using PII Eraser's allow list and block list functionality.

Step 5: Remove Unnecessary Parameters (Optional)

You can safely remove ignored parameters from your requests, but leaving them in will not cause errors.
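For codebases that build the request payload in one place, the cleanup can be a simple filter. This is a sketch only: the set of ignored names comes from the Parameter Compatibility table, while the helper name strip_ignored is hypothetical.

```python
# Parameters the compatibility endpoints accept but ignore
# (from the Parameter Compatibility table).
IGNORED_PARAMS = {
    "language",
    "correlation_id",
    "return_decision_process",
    "ad_hoc_recognizers",
}


def strip_ignored(payload: dict) -> dict:
    """Return a copy of a Presidio analyze payload without ignored keys."""
    return {key: value for key, value in payload.items() if key not in IGNORED_PARAMS}


if __name__ == "__main__":
    legacy = {
        "text": "My name is Jean-Pierre Dupont.",
        "language": "en",
        "entities": ["NAME"],
        "score_threshold": 0.6,
    }
    print(strip_ignored(legacy))
```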

Example Presidio Analyze Request

Request:

This example works with both Presidio Analyzer and PII Eraser, apart from the different port.

import json
import requests

payload = {
    "text": "My name is Jean-Pierre Dupont and my medicare number is 3278851195.",
    "entities": ["NAME", "HEALTHCARE_ID"],
    "score_threshold": 0.6
}

r = requests.post("http://localhost:8000/analyze", json=payload)
print(json.dumps(r.json(), indent=4))

Response:

[
    {
        "start": 11,
        "end": 29,
        "score": 0.9943283796310425,
        "entity_type": "NAME"
    },
    {
        "start": 56,
        "end": 66,
        "score": 0.9866441488265991,
        "entity_type": "HEALTHCARE_ID"
    }
]

The response format matches Presidio Analyzer's output structure, with one exception: the analysis_explanation field is omitted.
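To show how the start and end offsets in the response relate to the input text, the sketch below replaces each detected span with a placeholder. It is a plain-Python illustration, not Presidio Anonymizer and not a PII Eraser feature; the function name redact is hypothetical and the code assumes the spans do not overlap.

```python
def redact(text: str, results: list) -> str:
    """Replace each detected span with <ENTITY_TYPE>.

    Spans are applied from the end of the text backwards so that earlier
    offsets stay valid. Assumes non-overlapping spans.
    """
    redacted = text
    for item in sorted(results, key=lambda r: r["start"], reverse=True):
        placeholder = f"<{item['entity_type']}>"
        redacted = redacted[: item["start"]] + placeholder + redacted[item["end"] :]
    return redacted


if __name__ == "__main__":
    text = "My name is Jean-Pierre Dupont and my medicare number is 3278851195."
    results = [
        {"start": 11, "end": 29, "score": 0.9943283796310425, "entity_type": "NAME"},
        {"start": 56, "end": 66, "score": 0.9866441488265991, "entity_type": "HEALTHCARE_ID"},
    ]
    print(redact(text, results))
    # prints: My name is <NAME> and my medicare number is <HEALTHCARE_ID>.
```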

Example Config File

Here is a complete config.yaml example for a team migrating from Presidio Analyzer:

# Enable /analyze, /recognizers, /supportedentities short routes
enable_presidio_aliases: true

# Use the corresponding entity types the Presidio setup was configured for
entity_types:
  - NAME
  - EMAIL
  - PHONE
  - ADDRESS
  - ORGANIZATION
  - CUSTOMER

# Allow through the company's name (previously handled by Presidio's allow list)
allow_list:
  - "TechNova GmbH"
  - "TechNova"

# Block customer names (previously handled by Presidio's ad-hoc recognizers)
block_list:
  CUSTOMER:
    - "ACME"
    - "Roadrunner Enterprises"

See the French Legal Tech example for a full Presidio migration configuration in a real-world scenario.