Kafka Cheatsheet — cvam.sight

Kafka is a distributed, partitioned, replicated commit log. Producers append records to partitions; consumers read at their own offset; the broker just stores an immutable, ordered log per partition and serves reads from the page cache. Everything — ordering, parallelism, delivery guarantees, throughput — falls out of "partition = the unit of everything". Senior interviews probe the why: replication, exactly-once, rebalancing, and the trade-offs.

1. The mental model

Concept	What it is
Topic	A named stream, split into partitions. Logical; not a unit of parallelism on its own.
Partition	An ordered, immutable, append-only log. The unit of ordering, parallelism, replication, and retention. Ordering is guaranteed only within a partition.
Offset	Monotonic position of a record in a partition. Consumers track committed offsets; brokers don't track per-consumer position.
Broker	A server holding partition replicas, serving produce/fetch.
Replica / Leader / Follower	Each partition has R replicas; one leader (all writes and, by default, all reads), the rest followers (replicate).
ISR	In-Sync Replicas — followers caught up within `replica.lag.time.max.ms`. Only ISR members are eligible to become leader (with unclean election off).
Consumer group	Consumers sharing a `group.id`; partitions are divided among members — each partition to exactly one member.
Consumer lag	log-end-offset − committed-offset. The key health metric.
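As a rough illustration of the lag formula in the table, the sketch below computes per-partition and total lag from two offset maps. The function name and the sample numbers are hypothetical, not taken from any Kafka client API.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Return per-partition lag: log-end-offset minus committed offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption of this sketch: a partition with no committed offset
        # counts as fully unread (real behaviour depends on auto.offset.reset).
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end_offset - committed, 0)
    return lag


# Hypothetical offsets for a three-partition topic.
log_end = {0: 1500, 1: 1480, 2: 9000}
committed = {0: 1500, 1: 1400, 2: 2000}

lag = consumer_lag(log_end, committed)
total_lag = sum(lag.values())
print(lag, total_lag)
```

A single partition carrying most of the lag (partition 2 here) usually points at key skew or a slow consumer rather than a group-wide capacity problem.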

Ordering is per-partition only. There is no global order across a topic. If you need ordering for an entity, route it to one partition via a stable key. More partitions = more parallelism but weaker ordering scope.
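A minimal sketch of key-based routing, assuming a stand-in hash: zlib.crc32 is used here only to keep the example self-contained, whereas Kafka's default partitioner uses murmur2, so real partition numbers will differ. The names partition_for and the order keys are hypothetical.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    # Stand-in hash for illustration; not Kafka's murmur2.
    return zlib.crc32(key) % num_partitions


events = [
    ("order-42", "created"),
    ("order-7", "created"),
    ("order-42", "paid"),
    ("order-42", "shipped"),
]

# Append each event to the (simulated) partition chosen by its key.
partitions = {}
for key, event in events:
    p = partition_for(key.encode(), 6)
    partitions.setdefault(p, []).append((key, event))

# Every event for order-42 sits in one partition, in produce order.
order_42 = [e for recs in partitions.values() for k, e in recs if k == "order-42"]
print(order_42)
```

Because the key always maps to the same partition, the per-partition append order is the per-entity order; events for different keys have no defined relative order.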

2. Storage internals

A partition is a directory of segment files (.log) + indexes (.index offset→position, .timeindex). The active segment is appended; older segments roll at segment.bytes/segment.ms.
Writes are sequential appends to the active segment — why Kafka is fast on spinning disk and SSD alike.
Reads are served from the OS page cache; zero-copy (sendfile) ships bytes from cache straight to the socket — no userspace copy.
Retention deletes whole segments past retention.ms/retention.bytes — not individual records.
Records are stored in batches (compressed as a unit) — compression + batching are the throughput levers.
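The segment layout above can be sketched as a lookup over sorted base offsets. Segment files are named after the first offset they contain, zero-padded to 20 digits; the helper names and the sample base offsets below are hypothetical, and the real broker additionally consults the sparse .index file to find the byte position inside the segment.

```python
from bisect import bisect_right


def segment_for(offset, base_offsets):
    """Return the base offset of the segment that holds `offset`.

    `base_offsets` must be sorted ascending (one entry per segment).
    """
    i = bisect_right(base_offsets, offset) - 1
    if i < 0:
        raise ValueError("offset is older than the first retained segment")
    return base_offsets[i]


def segment_filename(base_offset):
    # Segment files are named by their zero-padded base offset.
    return f"{base_offset:020d}.log"


# Hypothetical partition with four segments; the last one is the active segment.
base_offsets = [0, 1000, 2500, 4000]

print(segment_filename(segment_for(2499, base_offsets)))
print(segment_filename(segment_for(2500, base_offsets)))
```

Retention works on the same granularity: dropping the oldest entries of base_offsets corresponds to deleting whole segment files, never individual records.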

3. Producer internals & config

acks=all                 # leader waits for all ISR before ack (durability)
enable.idempotence=true  # dedupe retries -> exactly-once *per partition* on produce
                         #   (implies acks=all, retries>0, max.in.flight<=5)
linger.ms=10             # wait to batch (throughput vs latency)
batch.size=65536         # 64 KB max bytes per partition batch (value is in bytes)
compression.type=zstd    # lz4/snappy/zstd/gzip — big throughput win
max.in.flight.requests.per.connection=5   # >1 + idempotence keeps order
retries=2147483647 ; delivery.timeout.ms=120000  # Integer.MAX_VALUE retries; bound total time, not retry count

acks: 0 fire-and-forget (lose on failure), 1 leader-only (lose if leader dies before replication), all + min.insync.replicas=2 = no acknowledged-then-lost writes.
Idempotent producer stamps a producer ID + sequence number so a retried batch isn't written twice — exactly-once on the produce path, per partition.
Partitioner: keyed → murmur2 hash(key) % partitions (the same key always lands on the same partition, so per-key order holds); null key → sticky partitioning (batch to one partition then rotate).
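The idempotent-producer idea (producer ID plus sequence number) can be sketched with a toy partition log. This is a simplification under stated assumptions: one sequence number per record rather than per batch, and only the last sequence per producer is remembered. The class and return strings are hypothetical.

```python
class PartitionLog:
    """Toy broker-side log for one partition with PID + sequence dedup."""

    def __init__(self):
        self.records = []
        self.last_seq = {}  # producer id -> last sequence number appended

    def append(self, producer_id, seq, value):
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            return "duplicate"      # retry of something already written
        if seq != last + 1:
            return "out_of_order"   # gap: reject so ordering is preserved
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return "appended"


log = PartitionLog()
results = [
    log.append(7, 0, "a"),  # first write
    log.append(7, 0, "a"),  # retry after a lost ack: not written twice
    log.append(7, 2, "c"),  # arrives before seq 1: rejected
    log.append(7, 1, "b"),
]
print(results, log.records)
```

The dedup state is per producer and per partition, which is why the guarantee is exactly-once on the produce path for a single partition and not across partitions.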

acks=all alone is not "no data loss". Pair acks=all with min.insync.replicas=2 and RF=3. With RF=3/min.isr=2 you survive one broker loss with no acknowledged-write loss; if the ISR shrinks below min.isr, acks=all produce requests are rejected with NotEnoughReplicas (a retriable error, so the producer retries until delivery.timeout.ms expires): durability over availability, by design.
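A small decision sketch of how acks and min.insync.replicas interact, assuming a simplified broker that only looks at the current ISR size. The function name and result strings are hypothetical; note that min.insync.replicas is only enforced for acks=all.

```python
def produce_outcome(acks, isr_size, min_insync_replicas):
    """Simplified view of what happens to a produce request."""
    if acks == "all":
        if isr_size < min_insync_replicas:
            # Broker refuses the write instead of acknowledging a fragile one.
            return "rejected: NotEnoughReplicas"
        return f"acked after {isr_size} in-sync replicas have the write"
    if acks == "1":
        return "acked after the leader write only"
    return "not acked (fire-and-forget)"


# RF=3, min.insync.replicas=2
print(produce_outcome("all", 3, 2))  # healthy
print(produce_outcome("all", 2, 2))  # one broker down: still accepted
print(produce_outcome("all", 1, 2))  # two brokers down: rejected
print(produce_outcome("1", 1, 2))    # acks=1 ignores min.insync.replicas
```

The last call shows why acks=1 can lose acknowledged writes: the write is acknowledged with a single copy even though the topic asks for two in-sync replicas.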

4. Consumer internals & config

group.id=orders-svc
enable.auto.commit=false          # commit manually AFTER processing (at-least-once)
auto.offset.reset=earliest|latest # when no committed offset exists
max.poll.records=500 ; max.poll.interval.ms=300000   # processing budget
session.timeout.ms=45000 ; heartbeat.interval.ms=3000
partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor   # incremental rebalance
isolation.level=read_committed    # skip aborted txn records (EOS reads)

Offset commit: commit after processing for at-least-once; commit before = at-most-once. Offsets live in the internal __consumer_offsets topic.
Group coordinator (a broker) manages membership; a rebalance reassigns partitions when members join/leave or the topic changes.
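Heartbeats are sent from a background thread, so slow processing does not stop them. But if the time between poll() calls exceeds max.poll.interval.ms, the consumer leaves the group and a rebalance follows. Slow processing, not the network, is typically the most common rebalance cause.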
Static membership (group.instance.id) avoids rebalances on rolling restarts.
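The processing budget can be checked with simple arithmetic: the worst case for one poll loop is max.poll.records times the per-record processing time, and that must stay under max.poll.interval.ms. The helper names, the 800 ms per-record figure, and the 50% headroom below are assumptions for illustration.

```python
def poll_budget_ok(max_poll_records, per_record_ms, max_poll_interval_ms):
    """Return (fits, worst_case_ms) for one full batch from poll()."""
    worst_case_ms = max_poll_records * per_record_ms
    return worst_case_ms < max_poll_interval_ms, worst_case_ms


def safe_max_poll_records(per_record_ms, max_poll_interval_ms, headroom=0.5):
    """Largest batch size that uses at most `headroom` of the interval."""
    return max(1, int(max_poll_interval_ms * headroom / per_record_ms))


# Config from this section: max.poll.records=500, max.poll.interval.ms=300000.
fits, worst_case = poll_budget_ok(500, 800, 300_000)
print(fits, worst_case)                      # 500 * 800 ms exceeds the budget
print(safe_max_poll_records(800, 300_000))   # a batch size that leaves headroom
```

When the check fails, the usual options are a smaller max.poll.records, a larger max.poll.interval.ms, or moving slow work off the polling thread.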

5. Rebalance protocols

Protocol	Behaviour
Eager (range/round-robin)	Stop-the-world: every consumer revokes all partitions, then reassigns. Simple, disruptive.
Cooperative sticky	Incremental: only the partitions that must move are revoked; others keep processing. Modern default.
Static membership	Members keep identity across restarts (`group.instance.id` + longer session timeout) — no rebalance on a quick bounce.
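To make the difference between the eager and cooperative rows concrete, the sketch below counts which partitions each member gives up when a third consumer joins. It models only the revocation step, with hypothetical member names and assignments; it is not the assignor algorithm itself.

```python
def revoked_partitions(old, new, protocol):
    """Partitions each existing member must give up during the rebalance."""
    revoked = {}
    for member, owned in old.items():
        if protocol == "eager":
            # Stop-the-world: everything is revoked before reassignment.
            revoked[member] = set(owned)
        else:
            # Cooperative: only partitions that move to another member.
            revoked[member] = set(owned) - set(new.get(member, ()))
    return revoked


# Hypothetical group: c3 joins a two-member group on a six-partition topic.
old = {"c1": {0, 1, 2}, "c2": {3, 4, 5}}
new = {"c1": {0, 1}, "c2": {3, 4}, "c3": {2, 5}}

eager = revoked_partitions(old, new, "eager")
cooperative = revoked_partitions(old, new, "cooperative")
print(sum(len(p) for p in eager.values()))        # all six partitions pause
print(sum(len(p) for p in cooperative.values()))  # only the two that move
```

In this example the four partitions that stay put keep processing throughout the cooperative rebalance.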

6. Replication & leader election

Replication factor (RF) = copies per partition. RF=3 is standard for prod.
ISR shrinks when a follower lags > replica.lag.time.max.ms; expands when it catches up.
Leader election: on leader failure, the controller promotes an ISR member. unclean.leader.election.enable=false (default) means it will not promote an out-of-sync replica — choosing consistency (no data loss) over availability.
Controller: one broker coordinates leadership/metadata. In KRaft mode the controllers form a Raft quorum (no ZooKeeper); ZooKeeper mode was removed in Kafka 4.0.
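The election rule can be sketched as a small function: prefer a live ISR member, and fall back to an out-of-sync replica only when unclean election is enabled. This is a simplified model with hypothetical broker ids and return labels, assuming the replica list order expresses preference.

```python
def elect_leader(replicas, isr, alive, unclean=False):
    """Pick a new leader for one partition; returns (broker_id, kind)."""
    for r in replicas:
        if r in isr and r in alive:
            return r, "clean"
    if unclean:
        # No in-sync replica survives: trade data loss for availability.
        for r in replicas:
            if r in alive:
                return r, "unclean (possible data loss)"
    return None, "offline"


replicas = [1, 2, 3]

# Leader 1 dies, follower 2 is in sync: clean failover.
print(elect_leader(replicas, isr={1, 2}, alive={2, 3}))
# Only the lagging replica 3 survives, unclean election off: partition offline.
print(elect_leader(replicas, isr={1, 2}, alive={3}))
# Same situation with unclean.leader.election.enable=true.
print(elect_leader(replicas, isr={1, 2}, alive={3}, unclean=True))
```

The second case is the consistency-over-availability choice described above: the partition stays offline until an ISR member comes back.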

7. Delivery semantics & EOS

Guarantee	How
At-most-once	Commit offset before processing — crash loses the record.
At-least-once	Commit after processing — crash redelivers. Default target; handlers must be idempotent.
Exactly-once (EOS)	Idempotent producer + transactions (`transactional.id`, `initTransactions`/`commitTransaction`) + consumer `read_committed`. Atomic "consume-process-produce".

EOS is end-to-end only inside Kafka. Transactions give exactly-once for read-from-Kafka → write-to-Kafka (+ offset commit) atomically. A side effect to an external system (DB, email) is not covered: make those writes idempotent or use the transactional-outbox pattern.
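For the external side effects that transactions do not cover, one common approach is an idempotent handler keyed on the record's coordinates. The sketch below is a minimal in-memory version: the class, the topic name, and the payloads are hypothetical, and in a real system the dedup entry and the side effect would need to be written in the same database transaction.

```python
class IdempotentHandler:
    """Applies each record's side effect at most once, even on redelivery."""

    def __init__(self):
        self.processed = set()   # stand-in for a dedup table in the external DB
        self.side_effects = []   # stand-in for the external system

    def handle(self, topic, partition, offset, value):
        record_id = (topic, partition, offset)
        if record_id in self.processed:
            return False         # redelivered record: skip the side effect
        self.side_effects.append(value)
        self.processed.add(record_id)
        return True


handler = IdempotentHandler()
batch = [("orders", 0, 10, "charge card A"), ("orders", 0, 11, "charge card B")]

# First delivery: both records are processed, then the consumer "crashes"
# before committing offsets, so the same batch is delivered again.
first = [handler.handle(*rec) for rec in batch]
second = [handler.handle(*rec) for rec in batch]
print(first, second, handler.side_effects)
```

At-least-once delivery plus a handler like this gives effectively-once side effects without requiring the external system to join a Kafka transaction.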

8. Partitioning & keys

Partition count caps consumer parallelism (active consumers ≤ partitions). Pick for peak throughput + headroom; you can add partitions but it breaks key→partition mapping (and ordering) for existing keys.
Key choice = ordering scope + load distribution. A hot key (e.g. one big customer) creates a skewed/hot partition.
Rule of thumb: target partitions so each handles a manageable MB/s; more partitions = more open files, more rebalance cost, longer leader-election.
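The key-remapping cost of adding partitions can be estimated with a quick simulation. As an assumption, zlib.crc32 stands in for Kafka's murmur2 hash, and the customer keys are made up; with a uniform hash, going from 12 to 16 partitions is expected to move roughly three quarters of keys, since a key stays put only when hash % 48 is below 12.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key.encode()) % num_partitions


keys = [f"customer-{i}" for i in range(10_000)]

# Count keys whose partition changes when the topic grows from 12 to 16.
moved = sum(partition_for(k, 12) != partition_for(k, 16) for k in keys)
print(f"{moved / len(keys):.0%} of keys changed partition")
```

Every moved key has older records in one partition and newer records in another, which is exactly the per-key ordering break described above.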

9. Retention & compaction

Delete (default): drop segments past retention.ms / retention.bytes.
Log compaction (cleanup.policy=compact): keep the latest value per key forever; a tombstone (null value) deletes a key. Powers changelog/state topics, __consumer_offsets, KTables.
The two can be combined (cleanup.policy=compact,delete): compact per key and still drop segments past the retention limits.
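Log compaction can be sketched as "keep the last record per key, drop keys whose last record is a tombstone". This toy version is a simplification: it compacts the whole log at once and removes tombstones immediately, whereas the broker compacts only closed segments and keeps tombstones for a while (delete.retention.ms). The function name and sample log are hypothetical.

```python
def compact(log):
    """Compact a list of (key, value) records; None is a tombstone.

    Returns surviving records as (offset, key, value), offsets unchanged.
    """
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)   # later records overwrite earlier ones
    survivors = [
        (offset, key, value)
        for key, (offset, value) in latest.items()
        if value is not None            # tombstone: the key is deleted
    ]
    return sorted(survivors)


log = [("a", 1), ("b", 2), ("a", 3), ("b", None), ("c", 5)]
compacted = compact(log)
print(compacted)
```

Surviving records keep their original offsets, so a compacted log has gaps in its offset sequence but remains ordered.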

10. Ecosystem

Kafka Connect — declarative source/sink connectors (DB CDC via Debezium, S3, ES) — no code, scales as workers.
Kafka Streams — JVM stream-processing library (KStream/KTable, joins, windows, exactly-once) running in your app.
Schema Registry — Avro/Protobuf/JSON schemas + compatibility (backward/forward/full) so producers/consumers evolve safely.
ksqlDB — SQL over streams.

11. CLI & ops

kafka-topics.sh --bootstrap-server B --create --topic t --partitions 12 --replication-factor 3
kafka-topics.sh --bootstrap-server B --describe --topic t        # leaders, ISR, replicas
kafka-topics.sh --bootstrap-server B --describe --under-replicated-partitions
kafka-console-producer.sh --bootstrap-server B --topic t
kafka-console-consumer.sh --bootstrap-server B --topic t --from-beginning --group g
kafka-consumer-groups.sh --bootstrap-server B --describe --group g    # LAG per partition
kafka-consumer-groups.sh --bootstrap-server B --group g --reset-offsets --to-earliest --topic t --execute
kafka-configs.sh --bootstrap-server B --alter --entity-type topics --entity-name t \
  --add-config retention.ms=604800000
kafka-reassign-partitions.sh ...   # move replicas / rebalance

Key metrics: UnderReplicatedPartitions (must be 0), OfflinePartitionsCount (must be 0), ActiveControllerCount (=1), IsrShrinksPerSec, RequestHandlerAvgIdlePercent, consumer records-lag-max.

12. Senior interview Q&A

How does Kafka guarantee ordering? Only within a partition. Route records that must be ordered to the same partition via a stable key. No cross-partition/topic order.
acks=all vs min.insync.replicas? acks=all = leader waits for all current ISR. min.insync.replicas = the floor of ISR required to accept a write; below it, produce fails. Together (RF3/min.isr2) = survive one broker loss without acknowledged-write loss.
How does exactly-once work? Idempotent producer (PID+seq dedup) + transactions (atomic produce + offset commit) + consumer read_committed. Exactly-once within Kafka; external side effects need idempotency/outbox.
What triggers a rebalance and how to avoid storms? Member join/leave, topic change, or exceeding max.poll.interval.ms (slow processing). Use cooperative-sticky assignor + static membership; size max.poll.records to your processing time.
What is the ISR and why does it matter? In-sync replicas caught up to the leader. Only ISR members can be elected leader (clean election); shrinking ISR risks availability/durability trade-offs.
Unclean leader election? Promoting an out-of-sync replica when no ISR survives; restores availability but loses data. Off by default (consistency first).
Log compaction vs retention? Retention deletes old segments by time/size. Compaction keeps the latest value per key forever (tombstone deletes); used for changelog/state topics.
Why is Kafka fast? Sequential append-only writes, OS page cache reads, zero-copy (sendfile), batching + compression, partition parallelism.
How do you scale consumer throughput? Add consumers up to partition count; beyond that add partitions (mind key remapping); parallelize processing; raise fetch/poll sizes if you can keep up.
Can you add partitions safely? You can increase (not decrease) the count, but it changes hash(key)%n, so existing keys may move partitions, breaking per-key ordering. Plan capacity up front.
KRaft vs ZooKeeper? KRaft replaces ZooKeeper with a Raft-based controller quorum inside Kafka: simpler ops, faster metadata, no separate ZK cluster. ZK is deprecated/removed.
How do you handle a poison message? Catch, route to a dead-letter topic, commit past it. Don't block the partition forever; bound retries.
Kafka vs RabbitMQ? Kafka = durable replayable log, high-throughput streaming, pull, ordered per partition. RabbitMQ = smart broker, flexible routing, per-message ack/requeue, push, lower-throughput task queues.
How to debug rising consumer lag? Check consumer-groups LAG per partition (skew?), processing time vs max.poll.interval, rebalance loops, partition count cap, and broker URP/ISR health.
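The poison-message answer above (bounded retries, dead-letter topic, commit past it) can be sketched with in-memory stand-ins. Everything here is hypothetical: the list plays the role of the dead-letter topic, the returned offset plays the role of the committed offset, and int() is just a handler that fails on bad input.

```python
def process_with_dlq(records, handler, max_attempts=3):
    """Process (offset, value) records; park failures after bounded retries."""
    dead_letters = []   # stand-in for a dead-letter topic
    committed = -1      # stand-in for the last committed offset
    for offset, value in records:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(value)
                break
            except Exception as exc:
                if attempt == max_attempts:
                    # Retries exhausted: park the record and move on.
                    dead_letters.append((offset, value, repr(exc)))
        committed = offset  # commit past the record either way
    return committed, dead_letters


records = [(0, "1"), (1, "oops"), (2, "3")]
committed, dead_letters = process_with_dlq(records, int)
print(committed, dead_letters)
```

The partition keeps moving because the bad record is committed past once it has been parked; the dead-letter entry keeps the offset and error so it can be inspected or replayed later.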

Kafka — The Senior Interview Cheatsheet.