Configure Kafka for Low Latency

Latencies in Kafka

  • Production Latency (between Producer and Leader Broker)
  • Consumption latency (between Consumer and Leader Broker)
  • End-to-end latency
    • End-to-end latency is measured from the time a message is produced until it is available for consumers to read [1].
    • Apache Kafka provides very low end-to-end latency even for large volumes of data. According to Confluent, "adding additional Confluent Kafka Units (CKUs) can reduce latency", which in simple terms amounts to adding more Kafka brokers.
    • Relevant variables other than configuration parameters that also affect end-to-end latency:
      1. Implementation of the Kafka client applications, i.e., the Producer and Consumer code
      2. Partitioning and keying strategy
      3. Produce and consume patterns
      4. Network latency and QoS, and more.
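To illustrate why the keying strategy matters for latency, the sketch below maps record keys to partitions. Kafka's default partitioner actually uses murmur2 hashing; the toy hash here is a simplified stand-in, and the key names are made up:

```python
def pick_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition (simplified; Kafka uses murmur2)."""
    return sum(key) % num_partitions  # stable toy hash, not Kafka's real one

# All records with the same key land on the same partition, so a single
# hot key can concentrate load (and therefore latency) on one partition.
p = pick_partition(b"order-42", 6)
assert all(pick_partition(b"order-42", 6) == p for _ in range(10))
```

Because keyed records are pinned to one partition, an uneven key distribution creates hot partitions whose queues, and latencies, grow faster than the rest.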

The default values for the configurations discussed in Configuring Kafka for High Throughput - Orderbox already optimize Kafka for latency, so those configurations typically do not require adjustment.

There is also an inherent trade-off between throughput and latency, so most of the content here is closely linked and similar to, if not exactly the same as, the above article (Configuring Kafka for High Throughput - Orderbox).

Therefore, I recommend going through Configuring Kafka for High Throughput - Orderbox before proceeding with the content that follows in this article.

Number of Partitions

A topic partition is the unit of parallelism in Kafka, so increasing the number of partitions of a given topic can increase throughput.

However, increasing the number of partitions involves a trade-off between throughput and latency: more partitions means higher latency. Increasing the number of partitions raises the average number of leader partitions per broker, so each follower has more partitions to replicate. By default, replication from a given source broker is carried out by a single thread on each follower (the number of fetcher threads is controlled by the num.replica.fetchers broker config). As a result, a message takes longer to be considered "committed", which happens only after it has been replicated to all in-sync replicas.

Messages are available for reads only after they are "committed". Thus, increasing the number of partitions increases end-to-end latency. This has been discussed in more detail here -
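As a rough illustration of the replication fan-out described above (the broker count, partition counts, and replication factor below are made-up numbers), the work each follower's fetcher threads must cover grows linearly with the partition count:

```python
def partitions_per_fetcher(total_partitions: int,
                           replication_factor: int,
                           brokers: int,
                           num_replica_fetchers: int = 1) -> float:
    """Average number of partition replicas each fetcher thread on a
    broker must keep in sync (a simplified, evenly-balanced model)."""
    replicas_per_broker = total_partitions * replication_factor / brokers
    # A broker leads some partitions and follows the rest; only the
    # followed replicas need fetching. Leaders ~= total_partitions / brokers.
    followed = replicas_per_broker - total_partitions / brokers
    return followed / num_replica_fetchers

# Doubling the partitions doubles the replication load per fetcher thread,
# lengthening the time until a message is committed.
light = partitions_per_fetcher(60, 3, 6)    # 10 leaders + 20 followed per broker
heavy = partitions_per_fetcher(120, 3, 6)
assert heavy == 2 * light
```

Raising num.replica.fetchers spreads that load across more threads, which is the usual mitigation when a higher partition count is unavoidable.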

Batching Messages

Batching messages improves the throughput of a Kafka cluster, but it trades off against higher latency. Batching on the Producer and Consumer side, along with its latency trade-offs, has been detailed here -
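As a sketch, a latency-oriented producer keeps batching minimal. The keys below are the standard Kafka producer configuration names; the values and the broker address are illustrative placeholders:

```python
# Producer settings biased toward low latency rather than throughput.
low_latency_producer_config = {
    "bootstrap.servers": "broker:9092",  # placeholder address
    "linger.ms": 0,        # send immediately; don't wait to fill a batch
    "batch.size": 16384,   # the default; a full batch is still sent at once
}

# A throughput-oriented producer would instead raise both settings so
# that more records are grouped into each request, at the cost of delay.
high_throughput_overrides = {"linger.ms": 100, "batch.size": 131072}
```

With linger.ms=0 the producer sends each record as soon as the sender thread is free, so no artificial batching delay is added to production latency.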

Compressing Messages

Enabling compression on the Producer side costs extra CPU cycles but reduces network bandwidth usage. Disabling compression (compression.type=none) spares the CPU, but this also means more network usage. Depending on the performance of the compression codec, we may choose to enable or disable compression; a good codec can also lower latency, since smaller payloads spend less time on the network.

Latency/Throughput and CPU/Network trade-offs described in more detail here -
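To make the CPU-versus-bandwidth trade-off concrete, the sketch below compresses a repetitive sample payload with Python's standard-library codecs (gzip and zlib here merely stand in for Kafka's gzip/snappy/lz4/zstd options; the actual ratios and CPU cost depend on your data and codec):

```python
import gzip
import zlib

# A repetitive payload, like a batch of JSON event records, compresses well.
payload = b'{"event":"click","user":12345,"page":"/home"}' * 200

gz = gzip.compress(payload)
zl = zlib.compress(payload, level=1)  # lower level = fewer CPU cycles

# Compressed payloads use far less network bandwidth than the raw bytes.
assert len(gz) < len(payload)
assert len(zl) < len(payload)
print(f"raw={len(payload)}B gzip={len(gz)}B zlib-level1={len(zl)}B")
```

The same shape of experiment, run on your real record batches with Kafka's actual codecs, is the right way to decide whether compression helps or hurts your latency budget.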

Producer acks for 'Production' latency and 'end-to-end' Latency

Production latency increases when we configure acks for greater durability (e.g. acks=all). End-to-end latency, however, is unaffected by the acks setting: a message becomes readable by consumers only after it is committed, i.e. replicated to all in-sync replicas, regardless of how many acknowledgments the producer waited for.

More details on acks and related trade-offs here -
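A minimal summary of the trade-off as a lookup table (the keys are the standard acks values; the descriptions condense the point above):

```python
# Producer acks setting and its effect on *produce* latency.
acks_tradeoff = {
    "0":   "no acknowledgment: lowest produce latency, weakest durability",
    "1":   "leader ack only: moderate produce latency and durability",
    "all": "all in-sync replicas ack: highest produce latency, strongest durability",
}

# End-to-end latency is the same in every case: a message is readable
# only after it is committed (replicated to all in-sync replicas),
# no matter how many acks the producer chose to wait for.
```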

Optimizing Latency of Kafka Streams Application

This topic requires a separate article of its own, which I plan to write soon.
