
Key concepts of Stream Processing

A user should know the following concepts about stream processing:

Time

Time is the most essential concept in stream processing and also the most confusing one. Most stream operations rely on time, so agreeing on a common notion of time is a typical concern for stream applications.

Kafka stream processing refers to the following notions of time:

  1. Event time: The time at which the event occurred and the record was originally created. This is usually the time that matters most when processing stream data.
  2. Log append time: The time at which the event arrived at the broker and was stored.
  3. Processing time: The time at which a stream-processing application received the event in order to operate on it. This can be milliseconds, hours, or days after the event occurred. The same event therefore gets a different timestamp depending on exactly when each stream-processing application happened to read it, and the timestamp can even differ between two threads of the same application. Processing time is thus highly unreliable and best avoided.
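
The difference between these notions of time can be sketched in a few lines of plain Python. This is only an illustration, not Kafka API code: the `Record` class, the `minute_bucket` helper, and all timestamp values are hypothetical. It shows one event landing in the same one-minute bucket every time when grouped by event time, but in different buckets when two readers group it by their own processing time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical record carrying two of the three timestamps."""
    key: str
    event_time_ms: int       # set when the event occurred
    log_append_time_ms: int  # set when the broker stored the record


def minute_bucket(timestamp_ms: int) -> int:
    """Return the start of the one-minute bucket containing timestamp_ms."""
    return timestamp_ms - (timestamp_ms % 60_000)


# An event that occurred at 59.5 s but reached the broker at 61 s.
record = Record(key="sensor-1", event_time_ms=59_500, log_append_time_ms=61_000)

# Two hypothetical application instances read the record at different moments.
processing_time_a_ms = 62_000
processing_time_b_ms = 125_000

# Event time gives one stable answer, whoever reads the record and whenever.
event_bucket = minute_bucket(record.event_time_ms)

# Processing time gives an answer that depends on the reader.
bucket_seen_by_a = minute_bucket(processing_time_a_ms)
bucket_seen_by_b = minute_bucket(processing_time_b_ms)
```

Here the two readers disagree about which minute the event belongs to, which is the unreliability described above.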

State

Stream-processing applications maintain different kinds of state:

  1. Internal or local state: State that can be accessed only by a specific instance of the stream-processing application. It is managed with an embedded, in-memory database that runs inside the application. Local state is extremely fast, but it is limited by the amount of memory available.
  2. External state: State that is maintained in an external data store, such as a NoSQL database. Unlike internal state, it offers virtually unlimited size, and it can be accessed from different applications or from multiple instances of the same application. However, it adds latency and complexity, which is why some applications avoid it.
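
As a rough sketch of the trade-off, the two counters below keep a per-key count either in memory private to one instance or in a shared store. The class names are hypothetical, and the plain `dict` passed to `ExternalCounter` merely stands in for an external database client; in a real deployment each access to it would be a network round trip.

```python
from collections import defaultdict


class LocalCounter:
    """Keeps per-key counts in memory, visible only to this instance."""

    def __init__(self):
        self._counts = defaultdict(int)

    def process(self, key: str) -> int:
        self._counts[key] += 1
        return self._counts[key]


class ExternalCounter:
    """Keeps per-key counts in a shared store (a dict standing in for a database)."""

    def __init__(self, store: dict):
        self._store = store

    def process(self, key: str) -> int:
        current = self._store.get(key, 0)  # would be a remote read
        self._store[key] = current + 1     # would be a remote write
        return self._store[key]


# Two instances with local state do not see each other's counts.
local_a, local_b = LocalCounter(), LocalCounter()
local_a.process("user-1")
local_result = local_b.process("user-1")

# Two instances sharing an external store see one combined count.
shared_store = {}
external_a, external_b = ExternalCounter(shared_store), ExternalCounter(shared_store)
external_a.process("user-1")
external_result = external_b.process("user-1")
```

The read-then-write in `ExternalCounter` is not atomic; a real external store would need an atomic increment or a transaction, which is part of the extra complexity mentioned above.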

Stream-Table Duality

A table is a collection of records, each uniquely identified by a primary key. Queries are run against it to check the state of the data at a specific point in time. A table does not contain history unless it is specifically designed to. A stream, on the other hand, contains a history of changes: it is a sequence of events in which each event causes a change. Tables and streams are thus two sides of the same coin. To convert a table into a stream, the user needs to capture the commands that modify the table; the insert, update, and delete commands are captured and stored in the stream. To convert a stream into a table, the user needs to apply all the changes that the stream contains. This conversion is also called materializing the stream. So the process works in both directions: streams can be turned into tables, and tables into streams.
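
Both directions of the duality can be sketched in plain Python. This is a minimal illustration under assumed names, not how Kafka implements it: `materialize` replays a list of `(operation, key, value)` tuples into a dictionary, and the hypothetical `ChangeCapturingTable` records every modifying command it receives as such a tuple.

```python
def materialize(changelog):
    """Stream to table: apply every change in order to rebuild the current state."""
    table = {}
    for operation, key, value in changelog:
        if operation in ("insert", "update"):
            table[key] = value
        elif operation == "delete":
            table.pop(key, None)
    return table


class ChangeCapturingTable:
    """Table to stream: record each modifying command as a change event."""

    def __init__(self):
        self.rows = {}
        self.changelog = []

    def upsert(self, key, value):
        operation = "update" if key in self.rows else "insert"
        self.rows[key] = value
        self.changelog.append((operation, key, value))

    def delete(self, key):
        if key in self.rows:
            del self.rows[key]
            self.changelog.append(("delete", key, None))


table = ChangeCapturingTable()
table.upsert("alice", "Berlin")
table.upsert("bob", "Paris")
table.upsert("alice", "Rome")   # recorded as an update
table.delete("bob")

# Replaying the captured stream reproduces the table's current state.
rebuilt = materialize(table.changelog)
```

The table holds only the latest value per key, while the changelog keeps all four commands, which is the history a table alone does not contain.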

Time Windows

Time windowing means dividing the timeline into bounded intervals. Some operations on streams depend on such intervals and are called windowed operations. For example, a join performed on two streams is windowed. Despite this, people rarely stop to consider which type of window their operations need.
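
One common window type is a fixed-size, non-overlapping (tumbling) window. The sketch below counts events per key and per window using event time. It is plain Python with hypothetical names and sample data, shown only to make the idea concrete, and it ignores real-world concerns such as late-arriving events.

```python
from collections import Counter


def tumbling_window_counts(events, window_size_ms):
    """Count events per (key, window start), assigning windows by event time."""
    counts = Counter()
    for key, event_time_ms in events:
        # Each event belongs to exactly one window of fixed size.
        window_start = event_time_ms - (event_time_ms % window_size_ms)
        counts[(key, window_start)] += 1
    return dict(counts)


# Hypothetical (key, event time in ms) pairs.
events = [
    ("page-a", 1_000),
    ("page-a", 4_500),
    ("page-b", 4_900),
    ("page-a", 5_000),
    ("page-a", 12_300),
]

counts = tumbling_window_counts(events, window_size_ms=5_000)
```

With five-second windows, the event at 5,000 ms falls into the window starting at 5,000 ms rather than the one starting at 0, because each window includes its start and excludes its end.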





