Learn How to Start Working with KAFKA and More

Kafka is a distributed publish-subscribe messaging system designed to be fast, scalable, and durable. It was developed by LinkedIn and open-sourced in 2011. As real-time data processing challenges grow more complex, Kafka has become an extremely desirable option for data integration, and it is a great solution for applications that require large-scale message processing.

Components of Kafka are:

  1. Zookeeper
  2. Kafka Cluster – which contains one or more servers called brokers
  3. Producer – which publishes messages to Kafka
  4. Consumer – which consumes messages from Kafka.

[Diagram: Kafka Architecture and Its Fundamental Concepts – DataFlair]

Kafka saves messages on disk and allows subscribers to read them. Communication between producers, the Kafka cluster, and consumers takes place over the TCP protocol. All published messages are retained for a configurable period of time. Each Kafka broker may host multiple topics into which producers publish messages. Each topic is broken into one or more ordered partitions, and partitions are replicated across multiple servers for fault tolerance. Each partition has one leader server and zero or more follower servers, depending on the replication factor of the partition.

When a producer publishes to a Kafka cluster, it first queries which partitions exist for the topic and which brokers are responsible for each partition. It then sends each message to the broker responsible for the chosen partition, typically picking the partition by hashing the message key.

Consumers keep track of what they have consumed (the offset within each partition) and store it in ZooKeeper. In case of consumer failure, a new process can resume from the last saved offset. Each consumer in a group gets assigned a set of partitions to consume from.

Producers can attach a key to each message, and all messages with the same key go to the same partition. When consuming from a topic, it is possible to configure a consumer group with multiple consumers. Each consumer in a consumer group reads messages from a unique subset of partitions in each topic it subscribes to, so each message is delivered to exactly one consumer in the group, and all messages with the same key arrive at the same consumer.
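
As a rough illustration (a simplified sketch of the usual hash-based default, not Kafka's exact internal code), the key-to-partition mapping can be pictured like this:

    public class KeyPartitioningSketch {
        // All messages with the same key map to the same partition index.
        static int partitionFor(String key, int numPartitions) {
            return Math.abs(key.hashCode()) % numPartitions;
        }

        public static void main(String[] args) {
            System.out.println(partitionFor("user-42", 3)); // same key, same partition, every time
            System.out.println(partitionFor("user-42", 3));
        }
    }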

Role of ZooKeeper in Kafka:

ZooKeeper provides access to clients through a tree-like structure of nodes. Kafka uses ZooKeeper to store configuration and share it across the cluster in a distributed fashion, and it keeps information there such as the topics hosted on each broker and the offsets of consumers.
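
As a small illustrative sketch (assuming a local ZooKeeper on port 2181 and the zkclient library that ships with Kafka 0.8.x), you can peek at a couple of the znodes Kafka maintains:

    import java.util.List;
    import org.I0Itec.zkclient.ZkClient;

    public class ZkPeek {
        public static void main(String[] args) {
            // Connect to the same ZooKeeper instance the brokers use
            ZkClient zk = new ZkClient("localhost:2181", 10000);
            List<String> brokerIds = zk.getChildren("/brokers/ids");   // registered broker ids
            List<String> topics = zk.getChildren("/brokers/topics");   // topics known to the cluster
            System.out.println("brokers: " + brokerIds + ", topics: " + topics);
            zk.close();
        }
    }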

Steps to get started with Kafka (for UNIX):

  1. Download Kafka – wget http://mirror.sdunix.com/apache/kafka/0.8.2.0/kafka_2.10-0.8.2.0.tgz
  2. tar -xzf kafka_2.10-0.8.2.0.tgz
  3. cd kafka_2.10-0.8.2.0/
  4. Inside the config folder you will see the server and zookeeper config files
  5. Inside the bin folder you will see bash scripts for starting zookeeper, the server, a producer, and a consumer
  6. Start zookeeper – bin/zookeeper-server-start.sh config/zookeeper.properties
  7. Start server – bin/kafka-server-start.sh config/server.properties
  8. Create a topic – bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor <your_replication_factor> --partitions <no._of_partitions> --topic <your_topic_name>
    This creates a topic with the specified name; it is replicated across brokers according to the replication factor and split into the given number of partitions. The replication factor should not be greater than the number of brokers available.
  9. View topics – bin/kafka-topics.sh --list --zookeeper localhost:2181
  10. Delete a topic – add the line delete.topic.enable=true to the server.properties file,
    then run this command after starting zookeeper:
    bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic <topic_name>
  11. Alter a topic (for example, to add partitions) – bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic <topic_name> --partitions <new_no._of_partitions>
  12. Start a console producer – bin/kafka-console-producer.sh --broker-list localhost:9092 --topic <your_topic_name> and send some messages
  13. Start a console consumer – bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic <your_topic_name> and view the messages

If you want to have more than one server, say four (the default setup comes with a single server), the steps are:

  1. Create a server config file for each additional broker:
    cp config/server.properties config/server-1.properties
    cp config/server.properties config/server-2.properties
    cp config/server.properties config/server-3.properties
  2. Edit each new property file and give it a unique broker id, port, and log directory. For example, vi config/server-1.properties and set the following properties:
    broker.id=1
    port=9093
    log.dir=/tmp/kafka-logs-1
    Repeat for server-2.properties and server-3.properties with their own values.
  3. Start the servers:
    bin/kafka-server-start.sh config/server-1.properties &
    bin/kafka-server-start.sh config/server-2.properties &
    bin/kafka-server-start.sh config/server-3.properties &

Now we have four servers running (server, server-1, server-2, server-3).

PROGRAMMING

The Java program includes a producer class and a consumer class.

Producer Class:

The producer class is used to create messages, specifying the topic name and, optionally, a partition.

The Maven dependency to include is org.apache.kafka:kafka_2.10:0.8.2.0, matching the Kafka version downloaded above.

We need to define properties so that the producer can find the brokers, serialize the messages, and send them to the partitions it wants.
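
A minimal sketch of such a producer, assuming the classic 0.8.x Java producer API (kafka.javaapi.producer.Producer) and a broker on localhost:9092 (the topic and key names are just placeholders):

    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class ProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "localhost:9092");              // where to find the brokers
            props.put("serializer.class", "kafka.serializer.StringEncoder");  // how to serialize messages
            props.put("request.required.acks", "1");                          // wait for the leader's acknowledgement

            Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
            // topic, key, message – the key determines which partition the message lands in
            producer.send(new KeyedMessage<String, String>("my-topic", "user-42", "hello kafka"));
            producer.close();
        }
    }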

When the producer sends data, we can pass an extra item (say, an id) along with it as the message key. Before the data is published to the brokers, it goes through the partitioner class named in the producer properties (partitioner.class), which selects the partition the data will be published to, as sketched below.
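
As a sketch (assuming the 0.8.x kafka.producer.Partitioner interface; the class name IdPartitioner is hypothetical), such a partitioner could look like this:

    import kafka.producer.Partitioner;
    import kafka.utils.VerifiableProperties;

    public class IdPartitioner implements Partitioner {
        // The 0.8.x producer constructs partitioners through this constructor
        public IdPartitioner(VerifiableProperties props) { }

        @Override
        public int partition(Object key, int numPartitions) {
            // Send all messages carrying the same id to the same partition
            return Math.abs(key.toString().hashCode()) % numPartitions;
        }
    }

It would be wired into the producer with props.put("partitioner.class", "IdPartitioner"), using the fully qualified class name.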

Consumer Class:
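
The consumer class subscribes to a topic and reads the messages published to it. A minimal sketch using the 0.8.x high-level consumer API (ZooKeeper-based, like the console consumer above; the group id and topic name are placeholders):

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class ConsumerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181");  // the consumer finds brokers via ZooKeeper
            props.put("group.id", "my-group");                 // consumers sharing a group id split the partitions
            props.put("auto.commit.interval.ms", "1000");      // how often consumed offsets are saved

            ConsumerConnector connector =
                    Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            // Ask for a single stream (thread) for the topic
            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                    connector.createMessageStreams(Collections.singletonMap("my-topic", 1));
            ConsumerIterator<byte[], byte[]> it = streams.get("my-topic").get(0).iterator();

            while (it.hasNext()) {
                System.out.println(new String(it.next().message()));
            }
        }
    }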

Topic Creation:
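
Topics are usually created with the kafka-topics.sh script shown earlier, but they can also be created programmatically. A sketch using the 0.8.x AdminUtils helper (this assumes a local ZooKeeper; the topic name and counts are placeholders):

    import java.util.Properties;
    import kafka.admin.AdminUtils;
    import kafka.utils.ZKStringSerializer$;
    import org.I0Itec.zkclient.ZkClient;

    public class TopicCreator {
        public static void main(String[] args) {
            // The ZkClient must use Kafka's string serializer so AdminUtils can read/write znodes correctly
            ZkClient zkClient = new ZkClient("localhost:2181", 10000, 10000, ZKStringSerializer$.MODULE$);
            // topic name, number of partitions, replication factor, extra per-topic config
            AdminUtils.createTopic(zkClient, "my-topic", 3, 1, new Properties());
            zkClient.close();
        }
    }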

In essence, Kafka is a distributed commit log service that functions much like a publish/subscribe messaging system, but with better throughput, built-in partitioning, replication, and fault tolerance.

Applications:

  1. Stream processing
  2. Messaging
  3. Multiplayer online game
  4. Log aggregation

