This repository has been archived by the owner on Feb 8, 2024. It is now read-only.

Kafka Server Setup

Justin Woo edited this page Oct 4, 2021 · 66 revisions


Follow the steps below on every node where kafka is to be installed and configured (single-node or multi-node deployment of kafka).

1. Install kafka

The preferred way is to get the kafka rpm from the location below and install it. This installs the kafka binaries and creates the required 'kafka' user and 'kafka' group. Note that this user is a 'nohome' user.

tar -xvf third-party-centos-7.8.2003-1.0.0-0.tar.gz
cd centos-7.8.2003-2.0.0-*/commons/kafka
yum install kafka-2.13_2.7.0-el7.x86_64.rpm

Validate that the 'kafka' user and group have been created. If they are not found, follow the steps below to create them.
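One way to run this check is sketched below. It relies on the standard `id` and `getent` tools; the function name `account_exists` is made up for this example and is not part of the kafka rpm.

```shell
# Hypothetical helper: succeeds only if both the user and the group exist.
account_exists() {
    user=$1
    group=$2
    id -u "$user" >/dev/null 2>&1 && getent group "$group" >/dev/null 2>&1
}

# Example:
# account_exists kafka kafka || echo "kafka user or group is missing"
```

If the check fails, continue with step 1.a.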

1.a Create user kafka

We use the Kafka build downloaded from Seagate's repository. By default, that Kafka is configured to run as user kafka in group kafka, so this user and group must exist.

sudo su 
adduser kafka
usermod -aG wheel kafka
echo "kafka ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers.d/90-cloud-init-users

groupadd --force kafka
usermod --append --groups kafka kafka

Installation Mode:

A. Kafka 1 node Setup

Kafka Configuration

The following has to be configured in /opt/kafka/config/

Configure the below if hostname or FQDN (fully qualified domain name) is used in zookeeper.connect

# The address the socket server listens on. It will get the value returned from
# if not configured.
#     listeners = listener_name://host_name:port
#     listeners = PLAINTEXT://

Configure the below to make the delete (purge) interface work

# The interval at which log segments are checked to see if they can be deleted according 
# to the retention policies 

Configure the below to set the log directory for the kafka broker


In file /opt/kafka/config/ make the changes below to set the data and log directories for zookeeper. Since kafka is a nohome user, use the directories below.


If a directory such as datadir/logdir is already present, clean its contents before starting zookeeper and the kafka broker. Ensure that these directories have proper ownership (kafka:kafka); the chown commands below change the ownership.

If datadir and datalogdir are not present, create them.

mkdir -p /var/log/zookeeper
mkdir -p /var/lib/zookeeper
mkdir -p /var/local/data/kafka

# Make sure that kafka:kafka has access to the dataDir and logDir, including the parent directories
chown -R kafka:kafka /var/lib/zookeeper
chown -R kafka:kafka /var/log/zookeeper
chown -R kafka:kafka /var/local/data/kafka
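To confirm that the ownership change took effect, a small check like the following can help. This is a sketch that assumes GNU `stat` (the `-c` option), as found on CentOS; `owned_by` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if DIR is owned by the expected user:group.
# Relies on GNU stat's -c option.
owned_by() {
    dir=$1
    expected=$2
    [ "$(stat -c '%U:%G' "$dir" 2>/dev/null)" = "$expected" ]
}

# Example:
# for d in /var/lib/zookeeper /var/log/zookeeper /var/local/data/kafka; do
#     owned_by "$d" kafka:kafka || echo "wrong owner on $d"
# done
```

This only checks the directory itself. The kafka user also needs access to the parent directories, which has to be verified separately.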

Use the systemctl command to control the kafka and zookeeper services.

Enable services

systemctl enable kafka-zookeeper
systemctl enable kafka

Start services and check the status

systemctl start kafka-zookeeper
sleep 5 # (kafka service needs zookeeper service to be up and running.)
systemctl status kafka-zookeeper
systemctl start kafka 
systemctl status kafka
# Make sure that you see `Active: active (running)` when checking the status of both services.
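The fixed `sleep 5` assumes zookeeper is up within five seconds. As an alternative, the sketch below polls until a command succeeds or a timeout expires. `wait_until` is a hypothetical helper and the 30 second timeout in the example is an arbitrary choice.

```shell
# Hypothetical helper: run a command once per second until it succeeds
# or TIMEOUT seconds have passed. Returns 0 on success, 1 on timeout.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example (in place of the fixed sleep):
# systemctl start kafka-zookeeper
# wait_until 30 systemctl is-active --quiet kafka-zookeeper && systemctl start kafka
```

Note that `systemctl is-active` only reports that the unit is running, not that zookeeper already accepts client connections, so a short extra delay may still be needed.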

How to stop Services

systemctl stop kafka
systemctl stop kafka-zookeeper

B. Kafka 3 node Cluster Setup

Download the kafka rpm with the command below and install it on all the nodes

curl "" -o kafka.rpm
yum install kafka.rpm

If the above location is not reachable, find the Kafka rpm in this tar image -

Kafka Configuration

Kafka configuration involves setting up the kafka broker and zookeeper configuration files, creating the myid file and setting correct ownership of the datadir.

Kafka broker configuration

The following has to be configured in /opt/kafka/config/ across nodes

Define a unique broker id for each kafka server. 

Define a directory for storing log files


To form a cluster of 3 nodes, add a comma-separated list of node addresses and ports to the zookeeper.connect parameter. If a zookeeper instance fails, the node automatically tries to connect to the next available address.

zookeeper.connect= <node 1 address>:2181,<node 2 address>:2181,<node 3 address>:2181
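To avoid typing errors when building this value by hand, the list can be generated from the node addresses. The following is a minimal sketch; `zk_connect` is a hypothetical helper, and the node names in the example are placeholders.

```shell
# Hypothetical helper: join node addresses into a zookeeper.connect value.
# Usage: zk_connect PORT NODE...
zk_connect() {
    port=$1
    shift
    result=""
    for node in "$@"; do
        # Prepend a comma only when result already holds an entry.
        result="${result:+$result,}${node}:${port}"
    done
    printf '%s\n' "$result"
}

# Example:
# echo "zookeeper.connect=$(zk_connect 2181 node1 node2 node3)"
```

The same value must be used on every kafka server in the cluster.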

Configure the below if hostname or FQDN is used in zookeeper.connect

# The address the socket server listens on. It will get the value returned from
# if not configured.
#     listeners = listener_name://host_name:port
#     listeners = PLAINTEXT://

Configure the below to make the delete (purge) interface work

# The interval at which log segments are checked to see if they can be deleted according 
# to the retention policies 

Note: It is possible to run multiple kafka server instances on a single node. In that case, define a separate configuration file for each instance.

Set a proper replication factor for metadata and transaction state. This is required in a multi-node setup.

transaction.state.log.min.isr=2

Zookeeper configuration

Configure zookeeper in the /opt/kafka/config/ file with the following configuration parameters

# The myid file will be created inside this directory
dataDir=/var/lib/zookeeper
# If not defined, dataDir is used
dataLogDir=/var/log/zookeeper
server.1=<node 1 address>:2888:3888
server.2=<node 2 address>:2888:3888
server.3=<node 3 address>:2888:3888

The details for the configuration parameters can be found at

Repeat the above steps for each node in the cluster.
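Since the server.N lines must be identical on every node, one option is to generate them from a single node list rather than editing each file by hand. The sketch below assumes the ports 2888 and 3888 shown above; `zk_server_lines` is a hypothetical helper name.

```shell
# Hypothetical helper: print one server.N line per node, numbered from 1
# in the order given. Ports 2888 and 3888 match the configuration above.
zk_server_lines() {
    n=1
    for node in "$@"; do
        printf 'server.%d=%s:2888:3888\n' "$n" "$node"
        n=$((n + 1))
    done
}

# Example (addresses are placeholders):
# zk_server_lines node1.example node2.example node3.example
```

The position of a node in the list determines its id, which must match the value in that node's myid file.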

Create myid file

In the dataDir folder on the first node, create a file named myid that contains the node id 1. (The file must contain a single integer value.)
Similarly, on nodes 2 and 3, put their respective ids in the dataDir/myid file.
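The step above can be sketched as a small function that refuses anything other than a positive integer. `write_myid` is a hypothetical helper; the example path is the dataDir used in this guide.

```shell
# Hypothetical helper: write a node id into DATADIR/myid.
# Rejects anything that is not a positive integer without leading zeros.
write_myid() {
    datadir=$1
    id=$2
    case $id in
        ''|*[!0-9]*|0*)
            echo "node id must be a positive integer: '$id'" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$id" > "$datadir/myid"
}

# Example on node 1:
# write_myid /var/lib/zookeeper 1
```

When run as root, the resulting file is owned by root, so the ownership step that follows still applies.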

Set correct ownership of datadir and myid file to kafka:kafka

If a directory such as datadir/logdir is already present, clean its contents before starting zookeeper and the kafka broker. Ensure that these directories have proper ownership (kafka:kafka).

chown -R kafka:kafka <path/to/datadir>

Use the systemctl command to control the kafka and zookeeper services.

Enable services

Enable the services on each node.

systemctl enable kafka-zookeeper
systemctl enable kafka

Start Services

Start the services on each node.

systemctl start kafka-zookeeper
sleep 5 # (kafka service needs zookeeper service to be up and running.)
systemctl start kafka

How to stop Services

Stop the services on each node.

systemctl stop kafka
systemctl stop kafka-zookeeper