<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Elad Eldor</title>
    <description>The latest articles on Forem by Elad Eldor (@eeldor).</description>
    <link>https://forem.com/eeldor</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3939934%2F2931f13c-b15c-4d4e-942e-bfb467f64827.png</url>
      <title>Forem: Elad Eldor</title>
      <link>https://forem.com/eeldor</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/eeldor"/>
    <language>en</language>
    <item>
      <title>Kafka's Real Compression Problem Is Batch Depth</title>
      <dc:creator>Elad Eldor</dc:creator>
      <pubDate>Thu, 21 May 2026 22:09:32 +0000</pubDate>
      <link>https://forem.com/eeldor/kafkas-real-compression-problem-is-batch-depth-515k</link>
      <guid>https://forem.com/eeldor/kafkas-real-compression-problem-is-batch-depth-515k</guid>
      <description>&lt;h3&gt;
  
  
  Kafka compression waste is usually a batch depth problem, not a codec problem. Better batching improves producer compression, which reduces consumer CPU and cross-AZ cost downstream.
&lt;/h3&gt;

&lt;p&gt;In one production deployment, changing batch sizing and linger settings cut the consumer fleet in half and raised compression savings from under 10% to over 50%, with no codec change. The bottleneck was batch depth, not the codec.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why batch depth controls what the codec sees
&lt;/h3&gt;

&lt;p&gt;Kafka producers compress batches, not individual messages. The compression codec sees whatever the producer has accumulated by the time it flushes. &lt;code&gt;linger.ms&lt;/code&gt; sets how long the producer waits to accumulate records. &lt;code&gt;batch.size&lt;/code&gt; caps how large that accumulation can grow.&lt;/p&gt;

&lt;p&gt;Both settings are conservative by default. When per-producer throughput is low - because traffic is light, or because it's spread across too many producer instances - the linger window closes before much data has arrived.&lt;/p&gt;

&lt;p&gt;That matters because compression ratio is a function of (1) how much data the codec can see at once and (2) how much redundancy exists across that data. A compressor working on a single JSON record finds repetition only within that record. Working on a hundred records from the same schema, it finds the same field names, the same value patterns, and the same structural redundancy repeated across every record. &lt;/p&gt;
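
&lt;p&gt;A minimal sketch of that effect, using Python's built-in &lt;code&gt;zlib&lt;/code&gt; as a stand-in for Kafka's codecs (an assumption; exact numbers will differ with zstd or lz4). The record shape and field names are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
import zlib


def make_record(i):
    # Hypothetical event shape; the field names are illustrative only.
    return {
        "user_id": 1000 + i,
        "event_type": "page_view",
        "country": "US",
        "latency_ms": 20 + (i % 7),
    }


def encode(records):
    # Newline-delimited JSON stands in for a producer batch payload.
    return "\n".join(json.dumps(r) for r in records).encode("utf-8")


def compressed_ratio(payload):
    # Compressed size divided by raw size; lower means better compression.
    return len(zlib.compress(payload)) / len(payload)


single_ratio = compressed_ratio(encode([make_record(0)]))
batched_ratio = compressed_ratio(encode([make_record(i) for i in range(100)]))
print(f"1 record: {single_ratio:.2f}   100 records: {batched_ratio:.2f}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;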

&lt;p&gt;The codec is identical in both cases; what changes is its input. A shallow batch offers only the redundancy inside a single record, while a deep batch offers the redundancy shared across all of its records - a qualitatively different input. This batch shape problem doesn't stay at the producer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzifygvaaixakw37we6jl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzifygvaaixakw37we6jl.png" alt=" " width="800" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Small producer batches create a consumer CPU tax
&lt;/h3&gt;

&lt;p&gt;When producer batches are small, the broker stores small compressed record batches. Consumers fetching from that topic receive small responses, so to get more data they issue more fetch requests to Kafka brokers. &lt;/p&gt;

&lt;p&gt;Each fetch request carries fixed overhead: a network round trip, broker-side processing, client-side dispatch, metadata handling, bookkeeping. When responses are small, that overhead is paid repeatedly on little data. The consumer fleet burns CPU on round-trip mechanics rather than on processing records.&lt;/p&gt;

&lt;p&gt;In one production deployment, a high-throughput topic had &lt;code&gt;batch.size&lt;/code&gt; at 16KB (the default) and &lt;code&gt;fetch.min.bytes&lt;/code&gt; at 1 byte (also the default). Tuning &lt;code&gt;batch.size&lt;/code&gt; to 80KB and &lt;code&gt;fetch.min.bytes&lt;/code&gt; to 512KB cut the consumer fleet from 60 to 30 pods. Per-pod CPU increased by roughly 30%, but the fleet was processing the same volume of data with half the pods - it had stopped spending the majority of its time on fetch overhead. Compression savings on the same topic rose from under 10% to over 50% with no codec change.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkzbr0wderbk15o8lj8vy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkzbr0wderbk15o8lj8vy.png" alt=" " width="800" height="552"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The overhead is fixed per fetch. What changes is how much data it buys you.&lt;/p&gt;
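
&lt;p&gt;To put rough numbers on that, here is a back-of-the-envelope sketch: at a fixed consumption rate, the number of fetch requests falls in proportion to the average response size. The 50 MiB/s throughput is a hypothetical figure, and treating every response as one fixed-size chunk is a simplification.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;KIB = 1024
MIB = 1024 * KIB


def fetches_per_second(throughput_bytes_per_sec, avg_response_bytes):
    # Requests needed per second to sustain the given consumption rate.
    return throughput_bytes_per_sec / avg_response_bytes


throughput = 50 * MIB  # hypothetical per-consumer throughput

small = fetches_per_second(throughput, 16 * KIB)   # responses about one default batch in size
large = fetches_per_second(throughput, 512 * KIB)  # responses gated by fetch.min.bytes

print(f"16 KiB responses: {small:.0f} fetches/s")
print(f"512 KiB responses: {large:.0f} fetches/s")
print(f"fixed overhead paid {small / large:.0f}x less often")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;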

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl3vk41g8akq0cq5u4ltc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl3vk41g8akq0cq5u4ltc.png" alt=" " width="800" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The producer's batch decision bills every consumer group
&lt;/h3&gt;

&lt;p&gt;In cloud deployments, data crossing availability zone boundaries is billed per byte - producer-to-broker, inter-broker replication, and broker-to-consumer are all billable paths. Batch depth affects all three paths simultaneously:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Smaller wire size from better compression reduces the bytes in the producer-to-broker path. &lt;/li&gt;
&lt;li&gt;Replication copies those same bytes, so smaller compressed batches reduce replication traffic proportionally. &lt;/li&gt;
&lt;li&gt;Every consumer group fetches its own copy of those bytes - fan-out multiplies the savings across every downstream reader automatically.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A meaningful reduction in compressed batch size propagates through producer ingress, replication, and every consumer fan-out stream.&lt;/p&gt;

&lt;p&gt;The prioritization rule follows directly: rank topics by throughput × fan-out. At comparable throughput, a 20% wire-size reduction on a topic with 8× fan-out matters more than a 50% reduction on a topic with 1× fan-out. The highest ROI comes from fixing the topics where the multiplier is largest.&lt;/p&gt;
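
&lt;p&gt;The comparison can be checked with a small sketch. It reuses the worst-case transfer count from Part 1 of this series (one producer write, RF − 1 replica copies, one read per consumer group) and assumes both topics carry the same 100 GB/hour at RF=3; those inputs are illustrative, not measurements.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def billable_transfers(rf, fan_out):
    # Worst case: producer write + (RF - 1) replica copies + one read per group.
    return 1 + (rf - 1) + fan_out


def saved_gb_per_hour(throughput_gb, rf, fan_out, reduction):
    # Cross-AZ gigabytes avoided per hour when wire size shrinks by `reduction`.
    return throughput_gb * billable_transfers(rf, fan_out) * reduction


high_fan_out = saved_gb_per_hour(100, rf=3, fan_out=8, reduction=0.20)
low_fan_out = saved_gb_per_hour(100, rf=3, fan_out=1, reduction=0.50)

print(f"20% cut at 8x fan-out saves about {high_fan_out:.0f} GB/hour")
print(f"50% cut at 1x fan-out saves about {low_fan_out:.0f} GB/hour")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;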

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5gr6tbvr72b4qqcld0ah.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5gr6tbvr72b4qqcld0ah.png" alt=" " width="799" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Diagnosing the problem
&lt;/h3&gt;

&lt;p&gt;The following queries use metric names common to the standard JMX exporter - verify names against your specific client library and exporter version before relying on them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Batch fill rate:&lt;/strong&gt; &lt;br&gt;
&lt;code&gt;kafka_producer_batch_size_avg / kafka_producer_batch_size_max&lt;/code&gt;&lt;/p&gt;

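&lt;p&gt;Values consistently below 0.3 indicate that batches are flushing before they are meaningfully filled. Note that the producer's &lt;code&gt;batch-size-max&lt;/code&gt; metric reports the largest batch observed, not the configured &lt;code&gt;batch.size&lt;/code&gt;, so dividing the average by the configured value gives a stricter check.&lt;/p&gt;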

&lt;p&gt;&lt;strong&gt;Compression ratio by topic:&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;avg_over_time(kafka_producer_compression_rate_avg[5m])&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This metric reports the ratio of compressed to uncompressed size - lower is better. A value near 1.0 means the codec is doing nothing. On a zstd-configured producer with structured data, sustained values well below 1.0 are achievable with proper batch depth - if you're seeing values near 1.0 consistently, batches are too shallow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consumer fetch size:&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;avg_over_time(kafka_consumer_fetch_size_avg[5m])&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Consistently small values indicate consumers are issuing many small fetches - a downstream symptom of small producer batches.&lt;/p&gt;

&lt;p&gt;These three metrics, read together, identify whether the problem is at the producer (batch fill), at the codec (compression rate), or propagated to the consumer (fetch size). They also identify which topics to fix first: sort by &lt;em&gt;bytes_out_per_sec × consumer_group_count&lt;/em&gt;.&lt;/p&gt;
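
&lt;p&gt;Read together, the three checks can be sketched as a small triage function. Only the 0.3 fill-rate threshold comes from the text above; the 0.9 compression cutoff and the 64 KiB fetch-size cutoff are placeholder assumptions to tune per workload, and the function name is hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;FILL_RATE_FLOOR = 0.3            # from the batch fill rate guidance
COMPRESSION_RATE_CEILING = 0.9   # assumed cutoff for "near 1.0"
FETCH_BYTES_FLOOR = 64 * 1024    # assumed cutoff for "small" fetches


def triage(batch_fill_rate, compression_rate, avg_fetch_bytes):
    # Returns the symptoms suggested by the three metrics.
    findings = []
    if FILL_RATE_FLOOR > batch_fill_rate:
        findings.append("producer: batches flush before they fill")
    if compression_rate > COMPRESSION_RATE_CEILING:
        findings.append("codec: compression is doing almost nothing")
    if FETCH_BYTES_FLOOR > avg_fetch_bytes:
        findings.append("consumer: many small fetches")
    return findings


for finding in triage(batch_fill_rate=0.12, compression_rate=0.95, avg_fetch_bytes=9000):
    print(finding)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;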

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fglas5q4t7b53mvaupk4q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fglas5q4t7b53mvaupk4q.png" alt=" " width="800" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What to fix, in order
&lt;/h3&gt;

&lt;p&gt;For each prioritized topic:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Batch depth:&lt;/strong&gt; Increase &lt;code&gt;linger.ms&lt;/code&gt; to 20–50ms. This adds latency: a record can wait up to that window before it is sent, unless its batch fills first. On latency-sensitive paths - fraud detection, ad bidding, synchronous request-reply over Kafka - this is unacceptable. Apply only where end-to-end latency tolerance is measured in seconds, not milliseconds.&lt;/p&gt;

&lt;p&gt;Increase &lt;code&gt;batch.size&lt;/code&gt; to 64–256KB depending on message size and throughput, and measure batch fill rate before and after.&lt;/p&gt;

&lt;p&gt;One constraint before raising &lt;code&gt;batch.size&lt;/code&gt;: Kafka producers allocate batch buffers per partition from a shared &lt;code&gt;buffer.memory&lt;/code&gt; budget (default 32MB). On a producer writing to many partitions simultaneously, large &lt;code&gt;batch.size&lt;/code&gt; values can exhaust this budget under load, causing &lt;code&gt;send()&lt;/code&gt; calls to block for up to &lt;code&gt;max.block.ms&lt;/code&gt; and then fail with a timeout exception. Check partition count per producer instance and raise &lt;code&gt;buffer.memory&lt;/code&gt; proportionally before making the change.&lt;/p&gt;
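
&lt;p&gt;A rough way to check this before the change, assuming the worst case where every partition has one full batch open at once (in-flight and retried batches would add to it). The partition count is a hypothetical input and the helper name is illustrative.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;KIB = 1024
MIB = 1024 * KIB

DEFAULT_BUFFER_MEMORY = 32 * MIB  # Kafka producer default for buffer.memory


def worst_case_batch_memory(partitions, batch_size_bytes):
    # One open batch per partition, each allowed to grow to batch.size.
    return partitions * batch_size_bytes


partitions = 500  # hypothetical: partitions this producer instance writes to

for batch_size in (16 * KIB, 64 * KIB, 256 * KIB):
    needed = worst_case_batch_memory(partitions, batch_size)
    verdict = "exceeds" if needed > DEFAULT_BUFFER_MEMORY else "fits in"
    print(f"batch.size={batch_size // KIB}KiB needs {needed / MIB:.1f} MiB, {verdict} 32 MiB")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;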

&lt;p&gt;&lt;strong&gt;Codec:&lt;/strong&gt; Switch to &lt;code&gt;compression.type=zstd&lt;/code&gt; with &lt;code&gt;compression.zstd.level=1&lt;/code&gt; rather than level 3, the Kafka default. If the topic is already on zstd, check the level - the default is not optimal for structured data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consumer fetch settings:&lt;/strong&gt; Align &lt;code&gt;fetch.min.bytes&lt;/code&gt; and &lt;code&gt;fetch.max.wait.ms&lt;/code&gt; with the new batch sizes. Without this, consumers issue small fetches against larger broker batches, negating part of the gain.&lt;/p&gt;
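
&lt;p&gt;Put together, the settings discussed in this section might look like the sketch below, written as plain dictionaries of Kafka config keys. The values are illustrative picks from the ranges above, not recommendations; &lt;code&gt;fetch.max.wait.ms=500&lt;/code&gt; is simply the Kafka default, and the 128 MiB &lt;code&gt;buffer.memory&lt;/code&gt; is an assumed value that depends on partition count.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;KIB = 1024
MIB = 1024 * KIB

producer_config = {
    "linger.ms": 20,               # low end of the 20-50 ms range
    "batch.size": 128 * KIB,       # within the 64-256 KB range
    "buffer.memory": 128 * MIB,    # assumed; size it to partitions x batch.size
    "compression.type": "zstd",
    "compression.zstd.level": 1,
}

consumer_config = {
    "fetch.min.bytes": 512 * KIB,  # value from the deployment described earlier
    "fetch.max.wait.ms": 500,      # Kafka default; caps the wait for fetch.min.bytes
}

for name, config in (("producer", producer_config), ("consumer", consumer_config)):
    for key, value in sorted(config.items()):
        print(f"{name}: {key}={value}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;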

&lt;p&gt;Broker disk usage drops as a side effect - Kafka stores compressed record batches on disk, so whatever reduces wire size reduces storage without additional work.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfgfa8wy446luq624nk3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfgfa8wy446luq624nk3.png" alt=" " width="800" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Closing
&lt;/h3&gt;

&lt;p&gt;Kafka compression waste is usually a batch depth problem. Once the batch is deep enough, the codec does its job; until then, the producer is starving it of useful input.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is part 2 of the Kafka Network Cost series. Part 1: Kafka Compute Is Cheap. Network Is Not. Part 3: Fix the Codec Before You Touch the Schema. Part 4: the S3 indirection pattern for analytical consumers.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Keywords: Kafka batch tuning, Kafka compression zstd, linger.ms batch.size optimization, Kafka producer tuning, cross-AZ network cost, fetch.min.bytes.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>distributedsystems</category>
      <category>infrastructure</category>
      <category>performance</category>
    </item>
    <item>
      <title>Kafka Compute Is Cheap. Network Is Not</title>
      <dc:creator>Elad Eldor</dc:creator>
      <pubDate>Wed, 20 May 2026 14:28:42 +0000</pubDate>
      <link>https://forem.com/eeldor/kafka-compute-is-cheap-network-is-not-2bdh</link>
      <guid>https://forem.com/eeldor/kafka-compute-is-cheap-network-is-not-2bdh</guid>
      <description>&lt;h3&gt;
  
  
  Cross-AZ network transfer often costs more than compute. Here's why it's invisible and what to do about it
&lt;/h3&gt;

&lt;p&gt;Your most expensive Kafka topic probably isn't the one with the most data. It's the one with the most consumers, because cross-AZ network transfer often costs more than compute in real Kafka deployments - sometimes by 5–10x when fan-out is high and pod placement is unlucky.&lt;/p&gt;

&lt;p&gt;While the Data Transfer cost shows up in cloud billing, the line items don't point back to Kafka topics. The AWS CUR (Cost and Usage Report) shows EC2 / Data Transfer, the Kafka dashboard shows producer and consumer metrics, and nobody looks at both at once. That gap is why Data Transfer cost persists at companies that are otherwise rigorous about infrastructure spend.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkqgwb5ztabh71m3ki3az.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkqgwb5ztabh71m3ki3az.png" alt=" " width="799" height="291"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This article is about that hidden cost and what to do about it. It applies when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Kafka brokers span multiple AZs (Availability Zones)&lt;/li&gt;
&lt;li&gt;Producers and consumers run in different AZs than the brokers&lt;/li&gt;
&lt;li&gt;You run Kafka on AWS or GCP (Azure doesn't charge for cross-AZ networking)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Quick Diagnostic
&lt;/h3&gt;

&lt;p&gt;If you already have bytes_in and bytes_out metrics per topic, you can estimate fan-out. The arithmetic below assumes both metrics include replication traffic.&lt;br&gt;
For a topic with 200 GB/hour in and 600 GB/hour out at RF=2:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Producer throughput ≈ 200 ÷ 2 = 100 GB/hour
Replication outbound ≈ 100 GB/hour
Consumer outbound ≈ 600 − 100 = 500 GB/hour
Fan-out ≈ 500 ÷ 100 = 5x
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That estimate is enough to rank topics by cost impact. It's not exact enough for chargeback, because compression, retries, and internal broker traffic can distort the numbers.&lt;br&gt;
If bytes_out is much larger than bytes_in, the gap is usually fan-out.&lt;/p&gt;
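
&lt;p&gt;The same arithmetic as a reusable sketch. It follows the estimate above and so inherits its assumption that bytes_in and bytes_out both include replication traffic; the function name is illustrative.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def estimate_fan_out(bytes_in, bytes_out, rf):
    # bytes_in is assumed to count each write RF times (producer + replicas).
    producer = bytes_in / rf
    replication = producer * (rf - 1)
    # Whatever leaves the brokers and is not replication goes to consumers.
    consumer = bytes_out - replication
    return consumer / producer


fan_out = estimate_fan_out(bytes_in=200, bytes_out=600, rf=2)
print(f"fan-out is roughly {fan_out:.1f}x")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;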

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5riy95rs48uvvv7cpuut.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5riy95rs48uvvv7cpuut.png" alt=" " width="800" height="269"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  How Data Transfer Billing Works
&lt;/h3&gt;

&lt;p&gt;In AWS CUR, AZ-to-AZ traffic in the same region appears under DataTransfer-Regional-Bytes. On AWS, this is typically about $0.01 per GB (before discount) for data leaving an AZ within a region. GCP is similar, but exact rates vary by region and discount agreement.&lt;br&gt;
This means a single GB can be charged multiple times as it moves through Kafka:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;producer to leader broker&lt;/li&gt;
&lt;li&gt;leader to follower broker&lt;/li&gt;
&lt;li&gt;broker to each consumer group&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Kafka also generates extra bidirectional traffic from fetches, acknowledgments, heartbeats, and recovery activity, so the effective cost of a topic is usually a bit higher than the raw payload size suggests.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Three Paths
&lt;/h3&gt;

&lt;p&gt;Kafka traffic has three cost-bearing paths.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Producer to broker:&lt;/strong&gt; Producers write to partition leaders. If the producer pod and the leader live in different AZs, that write crosses an AZ boundary. Producers must reach the leader, so this cannot be avoided by configuration alone.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replication between brokers:&lt;/strong&gt; Leaders replicate to followers. RF=2 copies each write once. RF=3 copies it twice. In a multi-AZ cluster, followers are placed in other AZs for durability, so these copies cross AZ boundaries by design.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Broker to consumer:&lt;/strong&gt; Consumers fetch data from brokers. Each consumer group reads the topic independently, so this path scales with fan-out.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The mental model is:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;billable transfers = 1 + (RF − 1) + fan-out
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5v74i7xeqqkg4h3t0op.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5v74i7xeqqkg4h3t0op.png" alt=" " width="800" height="675"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a worst-case upper bound, since it assumes every hop crosses an AZ boundary, but it is a useful one. It explains why a topic with many consumers can cost far more than a topic with more writes.&lt;/p&gt;
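
&lt;p&gt;As a worked sketch, the model turns into a monthly estimate once a per-GB rate is attached. The $0.01 per GB figure is the approximate AWS rate quoted earlier; the 100 GB/hour topic, RF=3, five consumer groups, and 730 hours per month are hypothetical inputs.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;RATE_PER_GB = 0.01     # approximate cross-AZ rate in USD, before discounts
HOURS_PER_MONTH = 730  # assumed average month


def billable_transfers(rf, fan_out):
    # Worst case: every hop crosses an AZ boundary.
    return 1 + (rf - 1) + fan_out


def monthly_cost(throughput_gb_per_hour, rf, fan_out):
    gb_per_hour = throughput_gb_per_hour * billable_transfers(rf, fan_out)
    return gb_per_hour * RATE_PER_GB * HOURS_PER_MONTH


cost = monthly_cost(100, rf=3, fan_out=5)
print(f"worst-case transfer cost: about ${cost:,.0f} per month")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;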

&lt;p&gt;Kafka transfers compressed batches, so Data Transfer cost is based on bytes on the wire, not logical message size - better compression and batching reduce every term in the model. Consumer-side filtering doesn't reduce network cost, since Kafka still ships the full records - filtering saves CPU, not bandwidth.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fan-Out Drives Cost
&lt;/h3&gt;

&lt;p&gt;The main surprise is that consumer count often matters more than producer throughput - a topic with one consumer group has far less cross-AZ cost than the same topic with five consumer groups, even when producer traffic is identical. The extra cost comes entirely from the broker-to-consumer path.&lt;/p&gt;

&lt;p&gt;Fan-out is often understated in steady-state measurements. Rebalances, pod restarts, backfills, and offset resets can replay old data and temporarily amplify Data Transfer cost. That's why optimization should target &lt;em&gt;throughput×fan-out&lt;/em&gt; instead of throughput alone.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzn5bt3mn3rbfa0lhg8uu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzn5bt3mn3rbfa0lhg8uu.png" alt=" " width="800" height="406"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Placement Matters
&lt;/h3&gt;

&lt;p&gt;Compute optimization and network cost optimization pull in opposite directions, since Kubernetes autoscalers usually optimize for compute, not Kafka topology. When pods are rescheduled, they land wherever capacity is available, not wherever Kafka brokers happen to be.&lt;/p&gt;

&lt;p&gt;That matters because a pod in a non-broker AZ pays extra on every Kafka interaction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Producer-heavy services are affected on writes&lt;/li&gt;
&lt;li&gt;Consumer-heavy services are affected on fetches&lt;/li&gt;
&lt;li&gt;Mixed services pay on both&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In a three-AZ cluster with brokers in two of them, a randomly placed pod has a baseline ~67% chance of sitting in a different AZ than the partition leader it talks to. K8s autoscalers can push that higher since they bin-pack into whatever AZ has spare capacity, so in practice the effective cross-AZ exposure for consumer pods can run 73–90%+ on some clusters.&lt;/p&gt;
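
&lt;p&gt;One way to reproduce a ~67% baseline is to read it as the chance that a pod and the partition leader it talks to sit in different AZs. The sketch below assumes pods are spread evenly over three AZs and leaders evenly over the two broker AZs; the AZ names and the helper are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from fractions import Fraction


def cross_az_probability(pod_share, leader_share):
    # Probability that a pod and the leader it talks to are in different AZs.
    same_az = sum(share * leader_share.get(az, 0) for az, share in pod_share.items())
    return 1 - same_az


pods = {"az-a": Fraction(1, 3), "az-b": Fraction(1, 3), "az-c": Fraction(1, 3)}
leaders = {"az-a": Fraction(1, 2), "az-b": Fraction(1, 2)}  # no brokers in az-c

baseline = cross_az_probability(pods, leaders)
print(f"cross-AZ share: {float(baseline):.0%}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Skewing the pod shares toward the AZ without brokers pushes the result above that baseline, which is the autoscaler effect described above.&lt;/p&gt;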

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc97qklqggmw7uazyj8is.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc97qklqggmw7uazyj8is.png" alt=" " width="800" height="494"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  RF=3 Can Be Cheaper Than RF=2
&lt;/h3&gt;

&lt;p&gt;Looking only at storage and the replication path, RF=2 is cheaper than RF=3. Counter-intuitively, that is not always true for total network cost. On high fan-out topics in a three-AZ cluster, RF=3 can reduce cross-AZ consumer traffic because each AZ has a replica available for local reads. The extra replication cost is fixed per write, while the read-side savings scale with fan-out.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq6eikjovpcnlhrcfb4gk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq6eikjovpcnlhrcfb4gk.png" alt=" " width="800" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

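&lt;p&gt;This requires &lt;code&gt;broker.rack&lt;/code&gt; and a rack-aware &lt;code&gt;replica.selector.class&lt;/code&gt; on brokers, &lt;code&gt;client.rack&lt;/code&gt; on consumers, a rack-aware assignor, and reasonably balanced AZ distribution.&lt;/p&gt;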

&lt;p&gt;&lt;a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica" rel="noopener noreferrer"&gt;KIP-392&lt;/a&gt; enables consumers to fetch from the closest replica when rack-aware selection is configured. &lt;a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-881:+Rack-aware+Partition+Assignment+for+Kafka+Consumers" rel="noopener noreferrer"&gt;KIP-881&lt;/a&gt; improves rack-aware consumer assignment. &lt;a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage" rel="noopener noreferrer"&gt;KIP-405&lt;/a&gt; is different: it moves cold log segments to remote storage, which reduces storage cost but doesn't remove broker-mediated Data Transfer cost.&lt;/p&gt;

&lt;p&gt;If the rack-awareness conditions above are met, RF=3 can lower total cross-AZ traffic even though it stores more data. On read-heavy topics, that can make it cheaper overall.&lt;br&gt;
The right question isn't "is RF=3 more expensive?" but "which costs more on this topic: extra replication or repeated cross-AZ reads?"&lt;/p&gt;
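
&lt;p&gt;The trade-off can be sketched with the same transfer-count model. This is a simplified comparison under stated assumptions: without rack-aware fetching, a fraction p_cross of consumer reads crosses an AZ boundary; with RF=3 and rack-aware fetching, every read is served by a local replica. The two-thirds value for p_cross and the function names are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from fractions import Fraction


def transfers_rf2(fan_out, p_cross):
    # Producer write + one replica copy + the share of reads that cross AZs.
    return 1 + 1 + fan_out * p_cross


def transfers_rf3_local_reads():
    # Producer write + two replica copies; reads are assumed to stay in-AZ.
    return 1 + 2


p_cross = Fraction(2, 3)  # assumed share of cross-AZ reads without rack awareness

for fan_out in (1, 2, 5, 8):
    rf2 = float(transfers_rf2(fan_out, p_cross))
    rf3 = float(transfers_rf3_local_reads())
    cheaper = "RF=3" if rf2 > rf3 else "RF=2"
    print(f"fan-out {fan_out}: RF=2 {rf2:.2f} vs RF=3 {rf3:.2f}, cheaper: {cheaper}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Under these assumptions the break-even sits where fan-out × p_cross equals the one extra replica copy.&lt;/p&gt;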

&lt;h3&gt;
  
  
  What To Do With This
&lt;/h3&gt;

&lt;p&gt;The biggest savings usually come from these steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start with the top throughput topics&lt;/li&gt;
&lt;li&gt;Derive fan-out from bytes_in and bytes_out&lt;/li&gt;
&lt;li&gt;Sort topics by throughput × fan-out&lt;/li&gt;
&lt;li&gt;Map pod distribution against broker AZs&lt;/li&gt;
&lt;li&gt;Audit client.rack on all consumers&lt;/li&gt;
&lt;li&gt;Revisit RF on high fan-out topics&lt;/li&gt;
&lt;li&gt;Check for AZ mismatches across clusters&lt;/li&gt;
&lt;/ul&gt;
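
&lt;p&gt;The first three steps can be sketched as a short ranking script. The topic names and byte counts are made up, bytes_in and bytes_out are assumed to include replication traffic as in the Quick Diagnostic, and throughput is taken to mean producer throughput.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def fan_out(bytes_in, bytes_out, rf):
    # Same estimate as the Quick Diagnostic: strip replication, divide by writes.
    producer = bytes_in / rf
    return (bytes_out - producer * (rf - 1)) / producer


# Hypothetical per-topic metrics in GB/hour: (bytes_in, bytes_out, RF)
topics = {
    "orders": (200, 600, 2),
    "clickstream": (900, 1050, 3),
    "audit-log": (60, 260, 2),
}


def score(metrics):
    # Producer throughput multiplied by estimated fan-out.
    bytes_in, bytes_out, rf = metrics
    throughput = bytes_in / rf
    return throughput * fan_out(bytes_in, bytes_out, rf)


ranked = sorted(topics, key=lambda name: score(topics[name]), reverse=True)
for name in ranked:
    print(f"{name}: score {score(topics[name]):.0f}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this made-up data, the topic with the most inbound bytes does not rank first once fan-out is taken into account.&lt;/p&gt;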

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnqs2835ohojc66ipe25t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnqs2835ohojc66ipe25t.png" alt=" " width="800" height="535"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A few things worth keeping in mind as you work through that list:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Services with significant pod presence outside broker AZs are paying a topology tax in the form of Data Transfer cost&lt;/li&gt;
&lt;li&gt;RF=3 with proper rack configuration may be cheaper than RF=2 on read-heavy topics&lt;/li&gt;
&lt;li&gt;The conventional Data Transfer cost ranking assumes single-consumer topics and doesn't generalize to high fan-out ones&lt;/li&gt;
&lt;li&gt;Compression is an important lever - better batching and better codecs reduce bytes on the wire, and that lowers cross-AZ cost directly&lt;/li&gt;
&lt;li&gt;For non-real-time consumers, another option is to remove them from Kafka entirely and serve them from S3 instead - one Kafka consumer writes to S3, and many analytical readers consume from S3 over a VPC gateway endpoint. That avoids Kafka fan-out for workloads that can tolerate seconds to minutes of latency&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Closing
&lt;/h3&gt;

&lt;p&gt;Kafka often looks expensive because the bill is driven by network topology, consumer fan-out, and placement - not just EC2 compute.&lt;/p&gt;

&lt;p&gt;The fix is usually the same:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;measure fan-out&lt;/li&gt;
&lt;li&gt;align placement with broker topology where possible&lt;/li&gt;
&lt;li&gt;improve compression&lt;/li&gt;
&lt;li&gt;reconsider RF or S3 indirection where read traffic dominates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The cost is rarely where people first look. Once you can see the fan-out, the leverage is obvious.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is Part 1 of the Kafka Network Cost series. Part 2: Kafka's Real Compression Problem Is Batch Depth. Part 3: Fix the Codec Before You Touch the Schema. Part 4: The Cheapest Kafka Consumer Is One That Doesn't Read From Kafka. Part 5: The S3 GET Limit Nobody Plans For.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Keywords: Kafka cross-AZ cost, AWS DataTransfer-Regional-Bytes, Kafka network optimization, fan-out cost model, Kafka VPC topology, KIP-392, KIP-881, KIP-405.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kafka</category>
      <category>aws</category>
      <category>infrastructure</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
