Benchmarking Kafka vs. Akka brokerless pub/sub
With the 22.10 release, and specifically Akka Projections 1.3.0, we introduced Brokerless Pub/Sub over gRPC. When we benchmarked it against Kafka we were very happy with the results, seeing great reductions in latency.
Methodology
The hardware
We run this benchmark on typical cloud hardware, in this case a GKE cluster in Google Cloud. We provision the hardware in advance and clear the database and message broker between runs.
- App Servers (8 x n2-standard-4 node GKE cluster)
- 4 x producer pods with requested memory 2Gi, cpu 3500m
- 4 x consumer pods with requested memory 2Gi, cpu 3500m
- Database
- Google CloudSQL Database: Postgres 14.4 (8 vCPU, 52 GB memory)
- Message Broker
- Confluent Cloud Kafka, free tier, topic with 32 partitions, 3 replicas, single zone, in the same GCP region as the database and app servers
The software
We are using the latest version of Akka and a custom application for running the benchmark and collecting the measurements.
The procedure
We run the software in two passes: Kafka and Akka Brokerless. We ignore the first messages until both producer and consumer are live, since we want to benchmark stable operation.
For the Kafka pass:
1. We use the same producer and consumer applications for both scenarios and switch the mode by overriding configuration properties through the JAVA_TOOL_OPTIONS environment variable. Kafka is enabled with -Dshopping-cart-service.kafka.enabled=on.
2. Create the Postgres tables by executing ddl-scripts/create_tables.sql.
3. Provide the Kafka broker connection information in resources/kafka.conf.
4. Provide the Postgres connection details in resources/persistence.conf (a sketch of both files follows this list).
5. Run both the producer and consumer applications.
6. The log output contains the histogram percentiles of the message latencies.
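For reference, here is a minimal sketch of what those two configuration files might contain, assuming the standard Alpakka Kafka and Akka Persistence R2DBC configuration keys; the exact keys and values in the sample application may differ, and all credentials below are placeholders.

# resources/kafka.conf (sketch, placeholder values; the consumer service uses the
# corresponding akka.kafka.consumer.kafka-clients block)
akka.kafka.producer.kafka-clients {
  bootstrap.servers = "<confluent-cloud-bootstrap-server>:9092"
  security.protocol = "SASL_SSL"
  sasl.mechanism = "PLAIN"
  sasl.jaas.config = "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"<api-key>\" password=\"<api-secret>\";"
}

# resources/persistence.conf (sketch, placeholder values)
akka.persistence.r2dbc.connection-factory {
  driver = "postgres"
  host = "<cloudsql-instance-ip>"
  port = 5432
  database = "shopping_cart"
  user = "<db-user>"
  password = "<db-password>"
}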
For the Akka Brokerless pass:
1. Update the JAVA_TOOL_OPTIONS container environment variable to switch Kafka off with -Dshopping-cart-service.kafka.enabled=off (see the sketch after this list).
2. Kafka broker connection configuration is not required in this case; it can be left as it is.
3. The rest of the steps are the same as above (from step 5).
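In the Kubernetes deployment this is just a change to the container environment, along these lines (a sketch; the actual manifests in the repo may differ):

env:
  - name: JAVA_TOOL_OPTIONS
    value: "-Dshopping-cart-service.kafka.enabled=off"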
Benchmark findings
Max sustained throughput
Consumer lag (end-to-end latency) should stay below a few seconds and not increase.
8 pods producer service (shopping-cart-service)
4 pods consumer service (shopping-analytics-service)
Kafka parameters:
-Dshopping-cart-service.kafka.enabled=on
-Dakka.kafka.producer.kafka-clients.linger.ms=200
-Dakka.persistence.r2dbc.journal.publish-events=off
-Dshopping-cart-service.simulator-count=100
-Dshopping-cart-service.simulator-delay=100ms
Akka Brokerless parameters:
-Dshopping-cart-service.kafka.enabled=off
-Dakka.persistence.r2dbc.journal.publish-events=off
-Dshopping-cart-service.simulator-count=100
-Dshopping-cart-service.simulator-delay=100ms
This generated a stable load of 16,000 events/s. For both Kafka and Akka Brokerless the consumer lag (end-to-end latency) was below a few seconds and didn’t increase. The bottleneck was writing the events to the database on the producer side. CPU utilization of the database was close to 100%.
Throughput was measured by counting rows inserted into the event_journal table during a period of 60 seconds.
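A query along the following lines can approximate that count, assuming the default Akka Persistence R2DBC schema in which event_journal has a db_timestamp column; the benchmark application may count the rows differently.

-- events persisted during the last 60 seconds, reported as events per second
SELECT count(*) / 60.0 AS events_per_second
FROM event_journal
WHERE db_timestamp >= now() - interval '60 seconds';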
Latency
4 pods producer service (shopping-cart-service)
4 pods consumer service (shopping-analytics-service)
Kafka parameters:
-Dshopping-cart-service.kafka.enabled=on
-Dakka.kafka.producer.kafka-clients.linger.ms=0
-Dakka.persistence.r2dbc.journal.publish-events=on
-Dshopping-cart-service.simulator-count=8
-Dshopping-cart-service.simulator-delay=200ms
Akka Brokerless parameters:
-Dshopping-cart-service.kafka.enabled=off
-Dakka.persistence.r2dbc.journal.publish-events=on
-Dshopping-cart-service.simulator-count=8
-Dshopping-cart-service.simulator-delay=200ms
This configuration generated a stable load of 940 events/s with the following latency distribution.
View the full latency results in the repo.
Consumer lag (end-to-end latency) was measured by the difference between the timestamp when the consumer processed the event and the timestamp when the producer wrote the event. Note that this is comparing wall clock timestamps on different machines so clock skew may have introduced measurement errors, but it is assumed that clocks are fairly well synchronized in this Google Cloud GKE environment.
Consumer lag percentiles were collected with HdrHistogram over 3 minutes.
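As an illustration, consumer lag can be recorded with HdrHistogram roughly as sketched below; the names ConsumerLagStats, recordLag and report are hypothetical and the real benchmark application may structure this differently.

import java.time.{ Duration, Instant }
import org.HdrHistogram.Histogram

object ConsumerLagStats {
  // auto-sizing histogram with 3 significant value digits, values recorded in microseconds
  val histogram = new Histogram(3)

  // called for every consumed event: writeTimestamp is the timestamp the producer
  // stored with the event, the consumer's wall clock supplies "now"
  def recordLag(writeTimestamp: Instant): Unit = {
    val lagMicros = Duration.between(writeTimestamp, Instant.now()).toNanos / 1000
    histogram.recordValue(math.max(0L, lagMicros)) // guard against clock skew producing negative values
  }

  // after the 3 minute window, log the percentiles
  def report(): Unit = {
    println(s"p50 ${histogram.getValueAtPercentile(50.0)} µs")
    println(s"p90 ${histogram.getValueAtPercentile(90.0)} µs")
    println(s"p99 ${histogram.getValueAtPercentile(99.0)} µs")
  }
}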
Conclusions
In our throughput tests both Kafka and Akka Brokerless performed well, and the database was the bottleneck in both cases. The latency results, however, showed that Akka Brokerless Pub/Sub is vastly superior.
We use benchmarks as a way of validating ideas and looking for potential undiscovered issues under load. We want to avoid benchmarking purely to improve performance under very specific scenarios, as this can lead to optimization fallacies. This test, for example, only validates one message size, and we may see different results for tiny and large messages.
We also find benchmarks to be a useful tool for development. While developing this benchmark we were able to diagnose and fix issues in an early version of this software. It enabled us to validate our hypothesis that latency would be positively affected by this change, and we were able to verify the benefit of this feature both in overhead savings and performance gains. We are pleased with this result, and it helps pave the way for more Akka features that will allow you to ensure your data is where it needs to be when it needs to be there.