Apache Kafka: Creating Kafka topics

Creating Kafka Topics: Examples and Syntax

Introduction

In Apache Kafka, a topic is the fundamental abstraction: a named category, or bucket, in which data is stored. Producers publish data to a topic, while consumers subscribe to the topic and read the data. In this blog, we explain how to create Kafka topics with practical examples. We’ll also look at some advanced configurations you can pass to the create command when creating your topic.

Creating Kafka Topics: The Basics

Let’s look at a simple Kafka topic create command. For this blog, we assume you have added your Kafka bin directory to your environment’s $PATH variable, so the commands can be invoked by name. If you are instead running them from the Kafka installation directory, prefix each command with bin/ (for example, bin/kafka-topics.sh).

The basic topic create command would look something like this:
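As a sketch of the general form on recent Kafka releases, where the angle-bracket values are placeholders to fill in (older ZooKeeper-based releases used --zookeeper <host:port> in place of --bootstrap-server):

```text
kafka-topics.sh --create --topic <topic-name> --bootstrap-server <host:port>
```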

This command will create a topic on the Kafka cluster with the default number of partitions and replication factor configured in the Kafka broker’s properties file (the num.partitions and default.replication.factor settings).

Here’s an example of what the command looks like:
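A minimal sketch, assuming a broker reachable at localhost:9092 and a topic named my-topic (both illustrative). The call is wrapped in a small helper so it can be reused; KAFKA_TOPICS_BIN, BOOTSTRAP_SERVER and create_topic are names introduced here for illustration and are not part of Kafka:

```shell
# Illustrative defaults; override them to match your environment.
KAFKA_TOPICS_BIN="${KAFKA_TOPICS_BIN:-kafka-topics.sh}"
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Create a topic using the broker's default partition count and replication factor.
# $1 = topic name
create_topic() {
  "$KAFKA_TOPICS_BIN" --create --topic "$1" --bootstrap-server "$BOOTSTRAP_SERVER"
}

# Usage: create_topic my-topic
# Runs:  kafka-topics.sh --create --topic my-topic --bootstrap-server localhost:9092
```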

Output:
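On recent Kafka releases, a successful run typically prints a single confirmation line similar to the following. The topic name my-topic is illustrative, and the exact wording may differ between versions:

```text
Created topic my-topic.
```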

To confirm whether the topic was created, use the --list command option:
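A minimal sketch of listing topics and checking for one by name, assuming a broker at localhost:9092. The helper names list_topics and topic_exists, and the variables KAFKA_TOPICS_BIN and BOOTSTRAP_SERVER, are illustrative and not part of Kafka:

```shell
# Illustrative defaults; override them to match your environment.
KAFKA_TOPICS_BIN="${KAFKA_TOPICS_BIN:-kafka-topics.sh}"
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Print every topic name known to the cluster, one per line.
list_topics() {
  "$KAFKA_TOPICS_BIN" --list --bootstrap-server "$BOOTSTRAP_SERVER"
}

# Succeed (exit status 0) only if a topic with exactly this name is listed.
# $1 = topic name
topic_exists() {
  list_topics | grep -Fxq -- "$1"
}

# Usage: topic_exists my-topic && echo "my-topic is present"
```

The -x flag makes grep match whole lines only, so a topic called my-topic-v2 is not mistaken for my-topic.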

Advanced configurations and options for creating Kafka topics

When creating a Kafka topic, you can pass additional arguments to the command to tailor it to your requirements. The replication factor and partition count have dedicated flags; topic-level settings such as retention are passed as key=value pairs with the --config flag, which can be repeated. Here’s a list of the most commonly used options. We’ll explore some examples after this section.

  • Replication Factor (--replication-factor) [integer]: Specify the number of replicas of each partition you’d like for this topic. This cannot exceed the number of brokers available. Replication ensures data durability and fault tolerance. If omitted, the broker’s default.replication.factor applies, which is 1 unless changed.
  • Partition Count (--partitions) [integer]: Specify the number of partitions you want for the topic. While partitions enable parallel processing and scalability, every partition adds resource utilization and cluster metadata overhead (on ZooKeeper in older deployments). If omitted, the broker’s num.partitions applies, which is 1 unless changed.
  • Retention Time (--config retention.ms=<value>) [long integer]: This Kafka topic configuration controls the maximum time, in milliseconds, that a message is retained in the topic. After this time, the message becomes eligible for deletion.
  • Cleanup Policy (--config cleanup.policy=<value>) [“delete” / “compact”]: Defines what happens to old log segments. ‘delete’ discards segments once the retention limits are reached, while ‘compact’ keeps at least the latest message for each key.
  • Minimum In-Sync Replicas (--config min.insync.replicas=<value>) [integer]: Specifies the minimum number of in-sync replicas that must acknowledge a write for a produce request sent with acks=all to be considered successful.
  • Segment Size (--config segment.bytes=<value>) [integer]: Sets the size, in bytes, of each log segment file. Smaller segments roll more often, which may let retention and log compaction act sooner at the cost of more files on disk.
  • Unclean Leader Election (--config unclean.leader.election.enable=<value>) [boolean]: Determines whether a replica that is not in sync with the leader can become the leader during an election. We recommend keeping this value false, because electing an out-of-sync replica can lose data.
  • Maximum Message Bytes (--config max.message.bytes=<value>) [integer]: Defines the maximum size, in bytes, of a message that can be published on the topic.
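Topic-level settings are passed as key=value pairs through the --config flag, and several can be combined in one command. The sketch below creates a compacted topic with illustrative values (3 partitions, 3 replicas, 2 in-sync replicas, 100 MiB segments) and assumes a cluster with at least three brokers. The helper name and the two variables are hypothetical, introduced only for this example:

```shell
# Illustrative defaults; override them to match your environment.
KAFKA_TOPICS_BIN="${KAFKA_TOPICS_BIN:-kafka-topics.sh}"
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Create a compacted topic; each topic-level setting is its own --config key=value pair.
# $1 = topic name
create_compacted_topic() {
  "$KAFKA_TOPICS_BIN" --create --topic "$1" \
    --bootstrap-server "$BOOTSTRAP_SERVER" \
    --partitions 3 \
    --replication-factor 3 \
    --config cleanup.policy=compact \
    --config min.insync.replicas=2 \
    --config segment.bytes=104857600
}

# Usage: create_compacted_topic my-topic
```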

Examples

Here are some examples of creating Kafka topics with advanced configurations.

Specify replication factor and number of partitions:

Let’s create a topic with 10 partitions and 3 replicas for each partition.
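A minimal sketch, assuming a broker at localhost:9092 and a cluster with at least three brokers (a replication factor of 3 is rejected otherwise). The helper name create_topic_with_layout and the two variables are illustrative:

```shell
# Illustrative defaults; override them to match your environment.
KAFKA_TOPICS_BIN="${KAFKA_TOPICS_BIN:-kafka-topics.sh}"
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Create a topic with an explicit partition count and replication factor.
# $1 = topic name, $2 = partition count, $3 = replication factor
create_topic_with_layout() {
  "$KAFKA_TOPICS_BIN" --create --topic "$1" \
    --bootstrap-server "$BOOTSTRAP_SERVER" \
    --partitions "$2" \
    --replication-factor "$3"
}

# Usage: create_topic_with_layout my-topic 10 3
```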

Specify the retention time:

In this example, we create a topic with a 24-hour retention period. Notice that we are passing this value in milliseconds.
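A minimal sketch, assuming a broker at localhost:9092. Retention is a topic-level setting, so it is passed as --config retention.ms=<value>; the small hours_to_ms helper avoids hand-computing the millisecond value. All helper and variable names here are illustrative:

```shell
# Illustrative defaults; override them to match your environment.
KAFKA_TOPICS_BIN="${KAFKA_TOPICS_BIN:-kafka-topics.sh}"
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Convert whole hours to milliseconds using shell integer arithmetic.
hours_to_ms() {
  echo $(( $1 * 60 * 60 * 1000 ))
}

# Create a topic whose messages are retained for the given number of hours.
# $1 = topic name, $2 = retention in hours
create_topic_with_retention() {
  "$KAFKA_TOPICS_BIN" --create --topic "$1" \
    --bootstrap-server "$BOOTSTRAP_SERVER" \
    --config retention.ms="$(hours_to_ms "$2")"
}

# Usage: create_topic_with_retention my-topic 24   # passes retention.ms=86400000
```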

Disable Unclean Leader Election:

While this setting is disabled by default in current Kafka releases, if you need to disable unclean leader election explicitly for a topic, you can set the unclean.leader.election.enable topic configuration to false using the --config flag.
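A minimal sketch, assuming a broker at localhost:9092. The setting is passed as the topic-level configuration unclean.leader.election.enable; the helper name and the two variables are illustrative:

```shell
# Illustrative defaults; override them to match your environment.
KAFKA_TOPICS_BIN="${KAFKA_TOPICS_BIN:-kafka-topics.sh}"
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Create a topic with unclean leader election explicitly disabled.
# $1 = topic name
create_topic_without_unclean_election() {
  "$KAFKA_TOPICS_BIN" --create --topic "$1" \
    --bootstrap-server "$BOOTSTRAP_SERVER" \
    --config unclean.leader.election.enable=false
}

# Usage: create_topic_without_unclean_election my-topic
```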

Maximum Message Bytes:

To control the maximum size of a message that can be published to a Kafka topic, we can set the max.message.bytes topic configuration, passed with the --config flag on the create command.

So, the example below creates a topic that can receive messages up to 10 MB in size.
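A minimal sketch, assuming a broker at localhost:9092 and reading "10 MB" as 10 MiB (10 × 1024 × 1024 = 10485760 bytes). The limit is passed as the topic-level configuration max.message.bytes; the helper and variable names are illustrative:

```shell
# Illustrative defaults; override them to match your environment.
KAFKA_TOPICS_BIN="${KAFKA_TOPICS_BIN:-kafka-topics.sh}"
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Convert mebibytes to bytes (1 MiB = 1024 * 1024 bytes).
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Create a topic that accepts messages up to the given size.
# $1 = topic name, $2 = maximum message size in MiB
create_topic_with_max_message_size() {
  "$KAFKA_TOPICS_BIN" --create --topic "$1" \
    --bootstrap-server "$BOOTSTRAP_SERVER" \
    --config max.message.bytes="$(mib_to_bytes "$2")"
}

# Usage: create_topic_with_max_message_size my-topic 10   # passes max.message.bytes=10485760
```

Raising the topic limit is only one side of the exchange: producers sending messages this large typically also need their max.request.size raised, since its default is about 1 MB.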
