Running with Docker
This guide covers deploying PII Eraser as a standalone Docker container — ideal for local development, testing, and single-host production deployments. For orchestrated, auto-scaling production deployments, see AWS Deployment or Other Platforms.
Prerequisites
- Docker Engine 20.10+ installed and running.
- PII Eraser container image available in a registry accessible from your host. See Getting Started for distribution options.
- Minimum 7 GB RAM available for the container (16 GB recommended for production).
Quickstart
Run the Container
Run the container, exposing the API on port 8000:
docker run -p 8000:8000 --name pii-eraser \
  --read-only \
  --tmpfs /tmp \
  <your-registry>/pii-eraser:latest
The container will load models at startup. This typically takes 30–60 seconds depending on hardware. Once the health check passes, the API begins accepting requests.
Fully Air-Gapped Operation
PII Eraser functions completely offline and does not phone home. There are no usage analytics, no telemetry, and no external API calls made by the container. It is safe to run in air-gapped environments.
Verify the Deployment
Wait for the container to finish loading, then check the health endpoint:
Expected response:
Send a test request:
Expected response:
{
  "text": [
    "Hello <NAME>"
  ],
  "entities": [
    [
      {
        "entity_type": "NAME",
        "output_start": 6,
        "output_end": 12
      }
    ]
  ],
  "stats": {
    "total_tokens": 7,
    "tps": 4718.14
  }
}
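In the sample response, output_start and output_end appear to mark a half-open character range within the redacted text (0-based start, exclusive end). That reading is inferred from the example, not taken from the API reference. A minimal sketch under that assumption, using a hypothetical helper named extract_span:

```shell
# Hypothetical helper: print the characters of TEXT in the range [START, END).
# Assumes output_start is 0-based and output_end is exclusive, as the sample
# response suggests. cut(1) counts from 1 and its end is inclusive, hence
# START + 1 and END unchanged. Only checked against ASCII text, since some
# cut implementations count bytes rather than characters.
extract_span() {
  text="$1"
  start="$2"
  end="$3"
  printf '%s\n' "$text" | cut -c "$((start + 1))-$end"
}
```

With the values from the response above, `extract_span 'Hello <NAME>' 6 12` prints `<NAME>`.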
Built-in Health Checks
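The PII Eraser image includes a built-in HEALTHCHECK instruction. Docker runs the check automatically and reports the result (starting, healthy, or unhealthy) in docker ps and docker inspect. Standalone Docker does not route traffic based on this status, so have your load balancer, orchestrator, or startup script wait for healthy before sending requests.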
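Because the image defines a HEALTHCHECK, a startup script can poll Docker for the health status instead of sleeping for a fixed time. The following is a minimal sketch, not part of PII Eraser: the function name wait_healthy and its default retry count and delay are illustrative assumptions, and the container name matches the quickstart.

```shell
# Hypothetical helper: block until Docker reports the container as "healthy".
# Arguments (all optional): container name, number of polls, seconds between polls.
wait_healthy() {
  name="${1:-pii-eraser}"
  tries="${2:-60}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # .State.Health.Status is populated only for images that define a HEALTHCHECK.
    status="$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)"
    if [ "$status" = "healthy" ]; then
      echo "healthy"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $name (last status: ${status:-unknown})" >&2
  return 1
}
```

Calling `wait_healthy pii-eraser` after docker run returns 0 once the container is healthy and a non-zero status if the polls run out, so it can gate a smoke test or a deployment step.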
Configuration File
In addition to the per-request REST API parameters, PII Eraser can be configured using a config.yaml file that is mounted into the container or passed as an environment variable. See the Config File Reference for all available parameters and the example configurations for real-world usage patterns.
Production Hardening
The quickstart command above already includes the two most important hardening flags (--read-only and --tmpfs /tmp). For production deployments, apply the full set of recommended flags:
docker run -p 8000:8000 --name pii-eraser \
  --read-only \
  --tmpfs /tmp \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --memory 8g \
  --cpus 4 \
  -v "$(pwd)/config.yaml:/app/config.yaml:ro" \
  <your-registry>/pii-eraser:latest
| Flag | Purpose |
|---|---|
| `--read-only` | Makes the container's root filesystem read-only, preventing malware persistence or binary tampering at runtime. |
| `--tmpfs /tmp` | Provides a writable temporary directory in memory. Required because the root filesystem is read-only. |
| `--cap-drop ALL` | Drops all Linux kernel capabilities, minimizing container breakout risk. |
| `--security-opt no-new-privileges` | Prevents processes inside the container from gaining additional privileges via setuid or setgid binaries. |
| `--memory 8g` | (Optional) Sets a hard memory limit. Adjust based on your workload (minimum 7 GB). |
| `--cpus 4` | (Optional) Limits CPU usage. Set this to match your intended allocation. |
See Security for a comprehensive overview of PII Eraser's security architecture.
Dedicate the Host to PII Eraser
PII Eraser's inference engine is optimized for dedicated CPU access. Running other CPU-intensive workloads on the same machine — or running multiple PII Eraser containers on the same host — will significantly degrade throughput. For optimal performance, dedicate the host to a single PII Eraser container. If co-location is unavoidable, use --cpuset-cpus to pin PII Eraser to specific cores (e.g., --cpuset-cpus="0-3"). See the Resource Isolation section for details.
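If co-location cannot be avoided, the pinning described above can be sketched as a small wrapper around docker run. This is an illustration only: the function name run_pinned and the PII_ERASER_IMAGE variable are hypothetical, and the default range 0-3 is simply the example from the text, so pick cores that match your host's topology.

```shell
# Hypothetical helper: start PII Eraser pinned to a fixed set of CPU cores.
# Argument (optional): a cpuset list such as "0-3" or "0,2,4,6".
# PII_ERASER_IMAGE is an assumed variable holding your image reference.
run_pinned() {
  cores="${1:-0-3}"
  docker run -p 8000:8000 --name pii-eraser \
    --read-only \
    --tmpfs /tmp \
    --cpuset-cpus="$cores" \
    "${PII_ERASER_IMAGE:?set PII_ERASER_IMAGE to your image reference}"
}
```

For example, `run_pinned 4-7` leaves cores 0 to 3 free for the other workload. Pinning restricts which cores the container may use but does not stop other processes from being scheduled on those cores.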
Docker Compose
For environments that use Docker Compose, here is a reference docker-compose.yaml:
services:
  pii-eraser:
    image: <your-registry>/pii-eraser:latest
    ports:
      - "8000:8000"
    read_only: true
    tmpfs:
      - /tmp
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges
    deploy:
      resources:
        limits:
          memory: 8g
          cpus: "4"
    volumes:
      - ./config.yaml:/app/config.yaml:ro
    healthcheck:
      # In list form, the first item must be CMD, CMD-SHELL, or NONE.
      test: ["CMD", "/venv/bin/python", "/app/healthcheck.py"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 120s
    restart: unless-stopped
Start the service:
docker compose up -d
Check the logs:
docker compose logs -f pii-eraser