Benchmarks

Processing Throughput

This section provides processing speed benchmarks. The benchmarks were conducted against the service endpoint created by the CloudFormation reference implementation using a single EC2 instance inserted into the same Security Group to act as a load generator. The stack was configured to run a single instance of the specified type.
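The measurement described above can be approximated with a small load generator. The following is a minimal sketch, not the harness used for these benchmarks: `measure_throughput`, `send`, and `count_tokens` are hypothetical names, and how tokens were actually counted is an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def measure_throughput(
    send: Callable[[str], None],
    texts: Sequence[str],
    count_tokens: Callable[[str], int],
    concurrency: int,
) -> float:
    """Return tokens per second with up to `concurrency` requests in flight.

    `send` is a hypothetical callable that submits ONE text per request
    and blocks until the service responds. `count_tokens` is an assumed
    token counter; a real benchmark would likely use the model's tokenizer.
    """
    total_tokens = sum(count_tokens(t) for t in texts)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() waits for every request to finish and re-raises any error.
        list(pool.map(send, texts))
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed
```

In a real run, `send` would wrap an HTTP POST to the service endpoint created by the stack; the URL and payload depend on the deployment and are not shown here. Calling the function once with `concurrency=1` and once with `concurrency=4` yields the two columns reported in the table below.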

Notes and Observations

m and c instance types (e.g. c8i and m8i) have nearly identical performance, so we don't show m benchmarks other than m8i.xlarge.
The benchmarks were conducted with 1 text per request. Sending multiple texts per API request had only a very minor impact on processing throughput.
Instance types with modern CPUs, like the c8i and c8a, that support special ML instruction sets such as AVX512 VNNI and AMX perform far better than older instance types like the c5 that don't feature these instruction sets.
Bigger instances aren't always better. For example, the c8a.4xlarge is slower than the c8a.2xlarge at both concurrency levels. This is likely due to the larger instance being split over multiple CCDs (Core Complex Dies).
PII Eraser currently doesn't support AWS Graviton ARM64 instance types, but we are planning to add support in the future.
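To check whether a given x86 Linux instance exposes the instruction sets mentioned in these notes, the CPU flags can be inspected. This is a rough sketch: `ml_instruction_sets` is a hypothetical helper, and it assumes the Linux `/proc/cpuinfo` flag names `avx512_vnni` and `amx_*` (such as `amx_tile`).

```python
def ml_instruction_sets(cpuinfo_text: str) -> dict:
    """Report AVX512 VNNI and AMX support from /proc/cpuinfo contents."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # On x86 Linux each core has a line of the form "flags : fpu vme ..."
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
            break  # every core reports the same flags; the first is enough
    return {
        "avx512_vnni": "avx512_vnni" in flags,
        "amx": any(f.startswith("amx_") for f in flags),
    }
```

On an instance, pass in the file contents, for example `ml_instruction_sets(open("/proc/cpuinfo").read())`. The function only reports what the kernel exposes; it says nothing about whether the service actually uses those instructions.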

EC2

Instance Type	1 Concurrent Req (tok/s)	4 Concurrent Reqs (tok/s)
c7a.xlarge	1739	1634
c7i.xlarge	2000	2190
c8a.xlarge	3515	3430
c8i.xlarge	2204	2456
m8i.xlarge	2157	2415
c5.2xlarge	747	805
c7a.2xlarge	2875	2932
c7i.2xlarge	3130	3823
c8a.2xlarge	5676	5837
c8i.2xlarge	3497	4444
c7a.4xlarge	2327	3064
c7i.4xlarge	4543	6648
c8a.4xlarge	3549	4615
c8i.4xlarge	4833	7545
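
The observation that bigger instances aren't always better can be checked directly against the table. The sketch below copies four rows from the table above and computes the 4xlarge-to-2xlarge throughput ratio; `RESULTS` and `size_scaling` are illustrative names, not part of the product.

```python
# (1 concurrent req, 4 concurrent reqs) in tok/s, copied from the table above.
RESULTS = {
    "c8a.2xlarge": (5676, 5837),
    "c8a.4xlarge": (3549, 4615),
    "c8i.2xlarge": (3497, 4444),
    "c8i.4xlarge": (4833, 7545),
}


def size_scaling(family: str, small: str = "2xlarge", large: str = "4xlarge") -> tuple:
    """Return large/small throughput ratios for each concurrency level."""
    small_row = RESULTS[f"{family}.{small}"]
    large_row = RESULTS[f"{family}.{large}"]
    return tuple(round(b / a, 2) for a, b in zip(small_row, large_row))


for family in ("c8a", "c8i"):
    print(family, size_scaling(family))
```

A ratio below 1 means the larger instance is slower. With these rows, both c8a ratios fall below 1, while both c8i ratios are above 1.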