A Kafka cluster can serve many producers that write data and many consumers that read it, so it is necessary to ensure the security of that data.
Need for Kafka Security
Security is needed for several reasons:
- Multiple consumers may read the same data. When data should be shared with only one or two specific consumers, it needs to be protected from all the others.
- Without access control, any client can write data/messages to any topic. An unwanted client could therefore break the user's existing consumers by publishing bad data. This can be disastrous, and preventing it requires authorization.
- An unauthorized user could also delete any topic from the user's cluster.
Components of Kafka Security
Kafka security has three major components:
- Encryption: Data exchanged between Kafka brokers and clients is encrypted in transit, so a third party that intercepts the traffic cannot read or steal it.
- Authentication: This verifies the identity of the applications that connect to the Kafka brokers. Only authenticated applications can publish or consume messages. Each application has its own credentials, such as a user ID and password, for accessing messages in the cluster.
- Authorization: This comes after authentication. Once a client is authenticated, authorization decides whether it may publish or consume messages. Applications can also be denied write access to prevent data pollution.
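The split between authentication and authorization can be illustrated with a small sketch. This is a toy model, not Kafka's authorizer API: the `AclEntry` type, the `is_authorized` helper, and the principal and topic names are all hypothetical, and the sketch assumes that anything not explicitly allowed is denied.

```python
from typing import NamedTuple, Set


class AclEntry(NamedTuple):
    """One allow rule: which principal may perform which operation on which topic."""
    principal: str
    operation: str  # "READ" or "WRITE" in this toy model
    topic: str


def is_authorized(acls: Set[AclEntry], principal: str, operation: str, topic: str) -> bool:
    """Return True only if an explicit allow rule exists (deny by default)."""
    return AclEntry(principal, operation, topic) in acls


# Hypothetical rules: the orders service may read and write, analytics may only read.
acls = {
    AclEntry("User:orders-service", "WRITE", "orders"),
    AclEntry("User:orders-service", "READ", "orders"),
    AclEntry("User:analytics", "READ", "orders"),
}

print(is_authorized(acls, "User:analytics", "READ", "orders"))   # True
print(is_authorized(acls, "User:analytics", "WRITE", "orders"))  # False
```

Here the analytics client is authenticated in both calls, yet only the read is allowed, which is how write access can be withheld to prevent data pollution.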
Apache Kafka supports the following security models:
- PLAINTEXT: Data is sent to Kafka unencrypted, in plain text. PLAINTEXT does not require any authentication or authorization, so the data is not secure. It is suitable only for a "proof of concept" and is not recommended for environments that need high data security.
- SSL: It stands for "Secure Sockets Layer". SSL can be used for both encryption and authentication, and any application that uses it must be configured for it. SSL encryption provides 1-way authentication, in which the client authenticates the server certificate. SSL authentication provides 2-way authentication, in which the broker also authenticates the client certificate. Enabling SSL can reduce performance because of the encryption overhead.
- SASL: It stands for "Simple Authentication and Security Layer". It is a framework for data security and user authentication over the internet. Apache Kafka enables client authentication through SASL. Several SASL mechanisms can be enabled on the broker at the same time, but each client has to choose only one of them.
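As a rough illustration of how a client picks one of these models, the sketch below builds a configuration dictionary. It assumes the property names used by the confluent-kafka Python client (librdkafka), such as `security.protocol`, `ssl.ca.location`, and `sasl.mechanism`. The `build_client_config` helper, the broker address, the file paths, and the credentials are hypothetical placeholders, and the library itself is not imported, so the block runs on its own.

```python
from typing import Dict, Optional

# Values accepted by the "security.protocol" setting.
VALID_PROTOCOLS = {"PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"}


def build_client_config(
    bootstrap_servers: str,
    protocol: str,
    ca_file: Optional[str] = None,
    client_cert: Optional[str] = None,
    client_key: Optional[str] = None,
    sasl_mechanism: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> Dict[str, str]:
    """Build a client config dict for one security model (hypothetical helper)."""
    if protocol not in VALID_PROTOCOLS:
        raise ValueError(f"unknown security protocol: {protocol}")
    conf = {"bootstrap.servers": bootstrap_servers, "security.protocol": protocol}

    if protocol in ("SSL", "SASL_SSL"):
        # 1-way authentication: the client verifies the broker certificate.
        if ca_file is not None:
            conf["ssl.ca.location"] = ca_file
        # 2-way authentication: the client also presents its own certificate.
        if client_cert is not None and client_key is not None:
            conf["ssl.certificate.location"] = client_cert
            conf["ssl.key.location"] = client_key

    if protocol.startswith("SASL"):
        # The broker may enable several mechanisms; the client names exactly one.
        if sasl_mechanism is None:
            raise ValueError("a SASL protocol needs exactly one sasl_mechanism")
        conf["sasl.mechanism"] = sasl_mechanism
        if username is not None and password is not None:
            conf["sasl.username"] = username
            conf["sasl.password"] = password
    return conf


# Placeholder values for illustration only.
one_way = build_client_config("broker.example:9093", "SSL", ca_file="ca.pem")
two_way = build_client_config(
    "broker.example:9093", "SSL",
    ca_file="ca.pem", client_cert="client.pem", client_key="client.key",
)
print(sorted(two_way))
```

With the confluent-kafka package installed, a dictionary like this would typically be passed to its `Producer` or `Consumer` constructor. That step is left out here because it needs a running broker.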
The different SASL mechanisms are:
GSSAPI (Kerberos): This mechanism authenticates clients through Kerberos. If a Kerberos server or Active Directory is already in use, there is no need to install a new server only for Kafka.
PLAIN: It is a simple, traditional approach that authenticates with a username and password. Kafka ships with a default implementation of SASL/PLAIN, which can be extended for production use. SASL/PLAIN should be used only with SSL as the transport layer, so that clear-text passwords are never transmitted over the wire unencrypted.
SCRAM: It stands for "Salted Challenge Response Authentication Mechanism". It is a family of SASL mechanisms that addresses the security concerns of traditional mechanisms that perform username/password authentication, such as PLAIN. The default SCRAM implementation in Kafka stores its credentials in ZooKeeper, which suits Kafka installations where ZooKeeper is on a private network.
OAUTHBEARER: The OAuth 2.0 Authorization Framework allows a third-party application to obtain limited access to an HTTP service. The default OAUTHBEARER implementation in Kafka creates and validates unsecured JSON Web Tokens, so it is suitable only for non-production installations of Apache Kafka.
Delegation Tokens: This is a lightweight authentication mechanism that complements the existing SASL/SSL methods. Delegation tokens work as shared secrets between Kafka brokers and clients. They help processing frameworks distribute the workload to the available workers in a secure environment, without the added cost of distributing other credentials such as Kerberos keytabs or SSL keystores.
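To make the SCRAM entry above more concrete, the following sketch derives the values a server would store for a user, following the key derivation in RFC 5802 with SHA-256. It is a simplified illustration, not Kafka's implementation: the `ScramCredential` type and `derive_scram_credential` helper are hypothetical names, the password is a placeholder, and the SASLprep normalization step of the RFC is skipped.

```python
import hashlib
import hmac
import os
from typing import NamedTuple


class ScramCredential(NamedTuple):
    """What the server keeps per user; the password itself is never stored."""
    salt: bytes
    iterations: int
    stored_key: bytes
    server_key: bytes


def derive_scram_credential(password: str, salt: bytes, iterations: int = 4096) -> ScramCredential:
    # SaltedPassword = Hi(password, salt, i), which is PBKDF2 with HMAC-SHA-256.
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # ClientKey = HMAC(SaltedPassword, "Client Key"); StoredKey = H(ClientKey).
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    # ServerKey = HMAC(SaltedPassword, "Server Key").
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()
    return ScramCredential(salt, iterations, stored_key, server_key)


# Placeholder password with a fresh random salt.
credential = derive_scram_credential("example-password", os.urandom(16))
print(len(credential.stored_key), len(credential.server_key))  # 32 32
```

Because only the salt, iteration count, and derived keys are kept, a leaked credential store does not directly reveal passwords, which is the main improvement over PLAIN.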
Thus, encryption, authentication, and authorization together secure the data in Kafka.