Kafka's Architecture Fundamentals

Overview

Before working with Kafka, it’s important to understand its architecture.

Terminology

Here are the most important concepts in Kafka.

Broker/server

A broker (also called a server) is a machine or container that accepts requests sent to the cluster. A broker also stores data.

Topics

A topic is a virtual container for your messages. The publisher (the one that sends messages, called a producer in Kafka) and the consumer (the one that reads messages) work directly with topics.

Partitions

A topic can have multiple partitions. Each partition stores a subset of the topic’s messages. If a topic has only one partition, that partition stores all of the topic’s messages.
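To make the idea of "a subset of the topic's messages" concrete, here is a minimal sketch of how keyed messages could be spread over partitions. It assumes a simple hash-modulo scheme; `choose_partition` and `distribute` are hypothetical helper names, and `zlib.crc32` is only a stand-in, since Kafka's own partitioner uses a different hash and has other strategies for messages without a key.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index.

    Illustrative only: crc32 is a stand-in hash, not the one Kafka uses.
    """
    return zlib.crc32(key) % num_partitions


def distribute(keys, num_partitions):
    """Group message keys by the partition each one maps to."""
    partitions = {p: [] for p in range(num_partitions)}
    for key in keys:
        partitions[choose_partition(key.encode(), num_partitions)].append(key)
    return partitions


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(10)]
    # With 3 partitions, each partition ends up with a subset of the keys.
    for partition, members in distribute(keys, 3).items():
        print(f"P{partition}: {members}")
    # With 1 partition, that partition holds every message.
    print(distribute(keys, 1))
```

Because the mapping depends only on the key and the partition count, the same key always lands in the same partition in this sketch.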

Replication factor

For high availability, messages can be replicated and spread over multiple brokers. By default, the replication factor is 1. If your topic has one partition and a replication factor of 1, all of the topic’s messages are on a single broker.

Cluster

A cluster is a group of brokers that work together to act as one.

Illustrations

Some diagrams may help with your understanding.

Here, T stands for topic, P stands for partition and R stands for replication.

A cluster with 3 brokers; topic T1 has 2 partitions and a replication factor of 1

A cluster with 3 brokers; topic T1 has 3 partitions and a replication factor of 1

A cluster with 3 brokers; topic T1 has 3 partitions and a replication factor of 2

A cluster with 3 brokers; topic T1 has 3 partitions and a replication factor of 3
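The four layouts captioned above can also be approximated in code. The sketch below assumes a plain round-robin placement, with brokers and partitions numbered from 0. `assign_replicas` and `brokers_view` are hypothetical names, and Kafka's real placement logic is more involved, so treat this as an illustration of the idea rather than the actual algorithm.

```python
def assign_replicas(num_brokers: int, num_partitions: int, replication_factor: int):
    """Return {partition: [broker ids]} using a simple round-robin placement.

    Assumption: replica r of partition p goes to broker (p + r) % num_brokers.
    """
    if replication_factor > num_brokers:
        # Two copies of the same partition on one broker would add no safety.
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        p: [(p + r) % num_brokers for r in range(replication_factor)]
        for p in range(num_partitions)
    }


def brokers_view(layout, num_brokers: int):
    """Invert the layout: list the partition replicas held by each broker."""
    view = {b: [] for b in range(num_brokers)}
    for partition, brokers in layout.items():
        for replica_index, broker in enumerate(brokers):
            view[broker].append(f"T1-P{partition}-R{replica_index + 1}")
    return view


if __name__ == "__main__":
    # The same four configurations as the captions: (partitions, replication factor).
    for partitions, factor in [(2, 1), (3, 1), (3, 2), (3, 3)]:
        layout = assign_replicas(3, partitions, factor)
        print(f"partitions={partitions}, replication factor={factor}")
        for broker, held in brokers_view(layout, 3).items():
            print(f"  broker {broker}: {held}")
```

With 2 partitions and a replication factor of 1, one of the three brokers holds nothing for T1; with 3 partitions and a replication factor of 3, every broker holds a copy of every partition.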

Topics, messages and offsets

The following diagram illustrates the relationship between a topic, its partitions, their messages and the messages’ offsets

Messages don’t reside on the topic itself but on its partitions. Kafka only guarantees message ordering within a partition, not across the whole topic. Consumers can choose to consume messages starting from any particular offset of a partition.
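A small in-memory model may help here. The sketch below treats each partition as an append-only list whose index is the offset. The `Partition` class and its methods are hypothetical and only mimic the behaviour described above; they are not part of any Kafka client library.

```python
class Partition:
    """An append-only log; a message's offset is its position in the log."""

    def __init__(self):
        self._log = []

    def append(self, message: str) -> int:
        """Store a message and return the offset it was assigned."""
        self._log.append(message)
        return len(self._log) - 1

    def read_from(self, offset: int):
        """Return (offset, message) pairs starting at the given offset."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        return [(o, self._log[o]) for o in range(offset, len(self._log))]


if __name__ == "__main__":
    # A topic modelled as two independent partitions.
    topic = [Partition(), Partition()]
    for i, message in enumerate(["m0", "m1", "m2", "m3", "m4"]):
        # Alternate between partitions; offsets restart at 0 in each one.
        offset = topic[i % 2].append(message)
        print(f"{message} -> partition {i % 2}, offset {offset}")

    # A consumer can start reading from any offset of a partition.
    print(topic[0].read_from(1))
```

Each partition keeps its own offsets and its own order, which is why the model can say in what order messages arrived within one partition but not across the two.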

Conclusion

In this post, I’ve shown you the fundamentals of Kafka’s architecture, covering concepts such as brokers, topics and partitions. Knowing these will help you with the next posts.
