
Running with Docker

This guide covers deploying PII Eraser as a standalone Docker container — ideal for local development, testing, and single-host production deployments. For orchestrated, auto-scaling production deployments, see AWS Deployment or Other Platforms.

Prerequisites

  • Docker Engine 20.10+ installed and running.
  • PII Eraser container image available in a registry accessible from your host. See Getting Started for distribution options.
  • Minimum 7 GB RAM available for the container (16 GB recommended for production).

Quickstart

Run the Container

Run the container, exposing the API on port 8000:

docker run -p 8000:8000 --name pii-eraser \
  --read-only \
  --tmpfs /tmp \
  <your-registry>/pii-eraser:latest

The container will load models at startup. This typically takes 30–60 seconds depending on hardware. Once the health check passes, the API begins accepting requests.

Completely Air-Gapped Operation

PII Eraser functions completely offline and does not phone home. There are no usage analytics, no telemetry, and no external API calls made by the container. It is safe to run in air-gapped environments.

Verify the Deployment

Wait for the container to finish loading, then check the health endpoint:

curl http://localhost:8000/health

Expected response:

"healthy"
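
Because model loading takes a while, scripts that start the container usually need to wait before sending requests. The following is a minimal polling sketch, assuming the endpoint returns the JSON string "healthy" as shown above. The function name, the 120-second default timeout, and the injectable fetch parameter are illustrative choices, not part of PII Eraser. It uses only the standard library, so it runs without installing requests.

```python
import json
import time
import urllib.request


def _http_get(url):
    """Fetch a URL and decode its JSON body (the health endpoint returns a JSON string)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(url="http://localhost:8000/health",
                       timeout_s=120.0, interval_s=2.0, fetch=None):
    """Poll the health endpoint until it reports "healthy" or the timeout expires.

    Returns True if the API became healthy in time, False otherwise.
    `fetch` is a hypothetical hook for swapping in another HTTP client.
    """
    fetch = fetch or _http_get
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if fetch(url) == "healthy":
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, or a non-JSON body: not ready yet.
            pass
        time.sleep(interval_s)
    return False
```

Calling wait_until_healthy() right after docker run blocks until the API is ready, or returns False so the caller can inspect the container logs.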

Send a test request:

curl -X 'POST' \
    'http://localhost:8000/text/transform' \
    -H 'Content-Type: application/json' \
    -d '{
    "text": ["Hello Max Mustermann"],
    "operator": "redact"
}'
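
The same request in Python:

import json

import requests

# Send one text for redaction; "text" is a list of strings.
response = requests.post(
    "http://localhost:8000/text/transform",
    json={
        "text": ["Hello Max Mustermann"],
        "operator": "redact",
    },
    timeout=30,  # fail instead of hanging if the container is not ready
)
response.raise_for_status()  # surface HTTP errors before parsing the body
print(json.dumps(response.json(), indent=4))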

Expected response:

{
    "text": [
        "Hello <NAME>"
    ],
    "entities": [
        [
            {
                "entity_type": "NAME",
                "output_start": 6,
                "output_end": 12
            }
        ]
    ],
    "stats": {
        "total_tokens": 7,
        "tps": 4718.14
    }
}
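
To show how the entities array relates to the transformed text, here is a small sketch that slices each placeholder out of the output. It assumes that output_start and output_end are end-exclusive character offsets into the corresponding output string, and that entities[i] belongs to text[i]. Both assumptions are consistent with the sample response above but are not confirmed beyond it. The function name redacted_spans is illustrative.

```python
def redacted_spans(response):
    """Return (text_index, entity_type, placeholder) for each entity in a response.

    Assumes output_start/output_end are end-exclusive character offsets
    into the matching entry of response["text"].
    """
    spans = []
    for i, (text, entities) in enumerate(zip(response["text"], response["entities"])):
        for ent in entities:
            placeholder = text[ent["output_start"]:ent["output_end"]]
            spans.append((i, ent["entity_type"], placeholder))
    return spans


# The sample response from this guide (stats omitted).
sample = {
    "text": ["Hello <NAME>"],
    "entities": [[{"entity_type": "NAME", "output_start": 6, "output_end": 12}]],
}

print(redacted_spans(sample))  # [(0, 'NAME', '<NAME>')]
```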

Built-in Health Checks

The PII Eraser image includes a built-in HEALTHCHECK instruction. Docker runs this check automatically and reports the container's health status (visible in docker ps and docker inspect), so orchestrators and load balancers that honor it can route traffic only once the API is fully ready.
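
The health status that Docker records can also be read programmatically. The sketch below shells out to docker inspect and reads State.Health.Status from its JSON output, which is where Docker stores the result of a HEALTHCHECK (starting, healthy, or unhealthy). The function names are illustrative, and the container name pii-eraser is taken from the quickstart command.

```python
import json
import subprocess


def health_from_inspect(inspect_output):
    """Extract the health status from `docker inspect` JSON output.

    Returns "starting", "healthy", or "unhealthy", or None when the
    container has no health check configured.
    """
    containers = json.loads(inspect_output)  # docker inspect prints a JSON array
    health = containers[0]["State"].get("Health")
    return health["Status"] if health else None


def container_health(name="pii-eraser"):
    """Run `docker inspect <name>` and return the container's health status."""
    result = subprocess.run(
        ["docker", "inspect", name],
        capture_output=True, text=True, check=True,
    )
    return health_from_inspect(result.stdout)
```

container_health() raises subprocess.CalledProcessError if the container does not exist, which keeps a missing container distinct from an unhealthy one.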

Configuration File

In addition to the REST API parameters shown above, PII Eraser can be configured with a config.yaml file, which is either mounted into the container or passed as an environment variable. See the Config File Reference for all available parameters and the example configurations for real-world usage patterns.

Production Hardening

The quickstart command above already includes the two most important hardening flags (--read-only and --tmpfs /tmp). For production deployments, apply the full set of recommended flags:

docker run -p 8000:8000 --name pii-eraser \
  --read-only \
  --tmpfs /tmp \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --memory 8g \
  --cpus 4 \
  -v "$(pwd)/config.yaml:/app/config.yaml:ro" \
  <your-registry>/pii-eraser:latest

| Flag | Purpose |
|------|---------|
| --read-only | Makes the container's root filesystem read-only, preventing malware persistence or binary tampering at runtime. |
| --tmpfs /tmp | Provides a writable temporary directory in memory. Required because the root filesystem is read-only. |
| --cap-drop ALL | Drops all Linux kernel capabilities, minimizing container breakout risk. |
| --security-opt no-new-privileges | Prevents processes inside the container from gaining additional privileges via setuid or setgid binaries. |
| --memory 8g | (Optional) Sets a hard memory limit. Adjust based on your workload (minimum 7 GB). |
| --cpus 4 | (Optional) Limits CPU usage. Set this to match your intended allocation. |
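
For deployments launched from a script rather than a shell, the hardened command can be assembled as an argument list. This is a sketch only: the function name and its parameters are hypothetical, the flags mirror the command above, and the image name must be supplied by the caller. The config path is resolved to an absolute path because Docker bind mounts expect one.

```python
from pathlib import Path


def hardened_run_args(image, config_path=None, memory="8g", cpus="4",
                      name="pii-eraser", port=8000):
    """Build the argv for a hardened `docker run`, mirroring the flags in this guide."""
    args = [
        "docker", "run",
        "-p", f"{port}:8000",
        "--name", name,
        "--read-only",                          # read-only root filesystem
        "--tmpfs", "/tmp",                      # writable scratch space in memory
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block setuid/setgid escalation
        "--memory", memory,
        "--cpus", str(cpus),
    ]
    if config_path is not None:
        host_path = Path(config_path).resolve()  # bind mounts need an absolute path
        args += ["-v", f"{host_path}:/app/config.yaml:ro"]
    args.append(image)  # the image must come after all options
    return args
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems when the config path contains spaces.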

See Security for a comprehensive overview of PII Eraser's security architecture.

Dedicate the Host to PII Eraser

PII Eraser's inference engine is optimized for dedicated CPU access. Running other CPU-intensive workloads on the same machine — or running multiple PII Eraser containers on the same host — will significantly degrade throughput. For optimal performance, dedicate the host to a single PII Eraser container. If co-location is unavoidable, use --cpuset-cpus to pin PII Eraser to specific cores (e.g., --cpuset-cpus="0-3"). See the Resource Isolation section for details.

Docker Compose

For environments that use Docker Compose, here is a reference docker-compose.yaml:

services:
  pii-eraser:
    image: <your-registry>/pii-eraser:latest
    ports:
      - "8000:8000"
    read_only: true
    tmpfs:
      - /tmp
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges
    deploy:
      resources:
        limits:
          memory: 8g
          cpus: "4"
    volumes:
      - ./config.yaml:/app/config.yaml:ro
    healthcheck:
      test: ["CMD", "/venv/bin/python", "/app/healthcheck.py"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 120s
    restart: unless-stopped

Start the service:

docker compose up -d

Check the logs:

docker compose logs -f pii-eraser