A producer publishes (writes) data to topics, which are divided into partitions. The producer automatically determines which partition and which broker each piece of data should be written to, so the user does not need to specify the broker or the partition.
How does the producer write data to the cluster?
A producer uses the following strategies to write data to the cluster:
Apache Kafka uses message keys to let messages be sent in a specific order. A key gives the producer two choices: send data across all partitions (automatically), or send data to one specific partition only. Sending data to a specific partition is possible with message keys. If the producer attaches a key to the data, all data with that key is always sent to the same partition, so its relative order is preserved within that partition. If the producer does not attach a key, the data is distributed in a round-robin manner (newer client versions fill a batch for one partition at a time, but the data is still spread across partitions over time). This process is called load balancing: when the producer writes to a Kafka topic without specifying a key, Kafka sends a small share of the data to each partition.
A message key can be a string, a number, or any other value we choose.
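To make the idea of "a key of any type always maps to the same partition" concrete, here is a minimal sketch. It assumes the key is first turned into bytes and then hashed modulo the number of partitions. The helper names key_to_bytes and partition_for_key are hypothetical, and zlib.crc32 is only a stand-in: Kafka's default partitioner uses a murmur2 hash, so the actual partition numbers will differ.

```python
import zlib


def key_to_bytes(key):
    """Turn a key of any type into bytes (stand-in for a Kafka key serializer)."""
    if isinstance(key, bytes):
        return key
    return str(key).encode("utf-8")


def partition_for_key(key, num_partitions):
    """Hash the key bytes and take the remainder to pick a partition."""
    # Assumption: crc32 replaces the murmur2 hash used by the real client.
    return zlib.crc32(key_to_bytes(key)) % num_partitions


# Strings, integers, and floats all work as keys in this sketch.
for key in ["Prod_id_1", 42, 3.14]:
    first = partition_for_key(key, 3)
    second = partition_for_key(key, 3)
    print(repr(key), "->", first, "(stable)" if first == second else "(changed)")
```

Because the mapping depends only on the key bytes and the partition count, the same key keeps landing in the same partition as long as the number of partitions does not change.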
There are two ways to tell whether the data is sent with or without a key:
Let's see an example
Consider a scenario where a producer writes data to the Kafka cluster without specifying a key. The data is then distributed across all partitions of Topic-T on each broker, i.e., Broker 1, Broker 2, and Broker 3.
Consider another scenario where a producer specifies Prod_id as the key. Data for Prod_id_1 (say) will always be sent to partition 0 on Broker 1, and data for Prod_id_2 will always be sent to partition 1 on Broker 2. Thus, once a key is applied, the data is no longer distributed across every partition (as seen in the previous scenario).
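The two scenarios above can be sketched with a toy producer. This is an illustration only, not the Kafka client: the ToyProducer class and the PARTITION_TO_BROKER layout (one partition of Topic-T per broker) are assumptions made for the example, and zlib.crc32 stands in for the real hash function, so Prod_id_1 will not necessarily land on partition 0 as in the figure.

```python
import itertools
import zlib

# Assumed layout for the example: one partition of Topic-T per broker.
PARTITION_TO_BROKER = {0: "Broker 1", 1: "Broker 2", 2: "Broker 3"}


class ToyProducer:
    """Toy model of partition selection; it does not talk to a real cluster."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # Endless 0, 1, 2, 0, 1, 2, ... sequence for records without a key.
        self._round_robin = itertools.cycle(range(num_partitions))

    def choose_partition(self, key=None):
        if key is None:
            # No key: take the next partition in round-robin order.
            return next(self._round_robin)
        # Key present: the same key always hashes to the same partition.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def send(self, value, key=None):
        """Return (partition, broker) that the record would be written to."""
        partition = self.choose_partition(key)
        return partition, PARTITION_TO_BROKER[partition]


producer = ToyProducer(num_partitions=3)

# Scenario 1: no key, so records are spread over all partitions and brokers.
unkeyed = [producer.send(f"msg-{i}") for i in range(6)]
print(unkeyed)

# Scenario 2: key "Prod_id_1", so every record goes to one partition.
keyed = [producer.send(f"update-{i}", key="Prod_id_1") for i in range(6)]
print(keyed)
```

In the first list the partition numbers cycle through 0, 1, 2; in the second list every entry names the same partition and broker.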
When writing data to the Kafka cluster, the producer can also choose an acknowledgment level. This means the producer can receive confirmation that its writes succeeded, by requesting one of the following acknowledgments:
Let's see an example
Suppose a producer writes data to Broker 1, Broker 2, and Broker 3.
Case 1: The producer sends data to each broker but does not receive any acknowledgment. Therefore, severe data loss is possible, and the correct data may never reach the consumers.
Case 2: The producer sends data to the brokers. Broker 1 holds the leader, so the leader asks Broker 1 whether it has successfully received the data. After receiving the broker's confirmation, the leader sends feedback to the producer with ack=1.
Case 3: The producer sends data to each broker. Now, the leader and its replicas (ISR) ask their respective brokers about the data. Finally, the producer is acknowledged with the feedback.
Note: In the above figure, Broker 1 and Broker 2 have successfully received the data. Thus, both brokers have responded 'Yes' to their respective topics.
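The three cases can be modeled with a small simulation. This is a simplified sketch, not how the brokers are implemented: the Replica class and the send function are hypothetical, every follower is assumed to be an in-sync replica, and followers only copy a record after the leader has stored it. The acks values 0, 1, and "all" mirror the values accepted by the real producer's acks setting.

```python
class Replica:
    """One copy of a partition living on a broker."""

    def __init__(self, broker, healthy=True):
        self.broker = broker
        self.healthy = healthy
        self.log = []

    def append(self, record):
        """Store the record; return True only if the write succeeded."""
        if self.healthy:
            self.log.append(record)
        return self.healthy


def send(record, leader, followers, acks):
    """Return True/False for a confirmed/failed write, or None if no ack was requested."""
    leader_ok = leader.append(record)
    # Assumption: followers replicate only what the leader managed to store.
    followers_ok = [f.append(record) for f in followers] if leader_ok else []
    if acks == 0:
        return None  # Case 1: the producer never learns what happened
    if acks == 1:
        return leader_ok  # Case 2: only the leader confirms
    if acks == "all":
        return leader_ok and all(followers_ok)  # Case 3: leader and replicas confirm
    raise ValueError("acks must be 0, 1, or 'all'")


# Case 1 with a failing leader: the record is lost and nobody notices.
broken_leader = Replica("Broker 1", healthy=False)
print(send("r1", broken_leader, [Replica("Broker 2")], acks=0))  # None

# Cases 2 and 3 with a healthy leader and one failing follower.
leader = Replica("Broker 1")
followers = [Replica("Broker 2"), Replica("Broker 3", healthy=False)]
print(send("r2", leader, followers, acks=1))      # True
print(send("r3", leader, followers, acks="all"))  # False
```

The trade-off shows up in the return values: the weaker the acknowledgment, the less the producer knows about whether the data is safely stored.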