Before learning about Kinesis, you should understand streaming data.
What is streaming data?
Streaming data is data that is generated continuously by thousands of data sources, which typically send their data records simultaneously and in small sizes.
Following are the examples of streaming data:
What is Kinesis?
Kinesis is a platform on AWS to which you send your streaming data. It makes it easy to load and analyze streaming data, and it also lets you build custom applications based on your business needs.
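Core Services of Kinesis
Kinesis has three core services, each described below: Kinesis Streams, Kinesis Firehose, and Kinesis Analytics.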
Architecture of Kinesis Stream
Suppose we have EC2 instances, mobile phones, laptops, and IoT devices that are producing data. These are known as producers because they produce the data. The data is sent to a Kinesis stream and stored in shards. By default, the data is retained in the shards for 24 hours. You can increase the retention period to 7 days, and current versions of the service allow longer periods. Once the data is stored in the shards, EC2 instances known as consumers read the data from the shards and turn it into useful data. Once the consumers have performed their calculations, the useful data is moved on to another AWS service, such as DynamoDB, S3, EMR, or Redshift.
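The producer, shard, and consumer roles described above can be sketched in Python. This is a minimal illustration, not a production setup: it assumes `client` is a `boto3.client("kinesis")` object, and the function names, the JSON payload format, and the use of a device ID as the partition key are all hypothetical choices made for the example.

```python
import json


def send_reading(client, stream_name, device_id, reading):
    """Producer side: write one record to the stream.

    `client` is assumed to be boto3.client("kinesis").
    """
    response = client.put_record(
        StreamName=stream_name,
        Data=json.dumps(reading).encode("utf-8"),
        # Records that share a partition key are routed to the same shard.
        PartitionKey=device_id,
    )
    return response["ShardId"]


def read_shard(client, stream_name, shard_id, limit=100):
    """Consumer side: read one batch, starting at the oldest retained record."""
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]
    batch = client.get_records(ShardIterator=iterator, Limit=limit)
    return [json.loads(record["Data"]) for record in batch["Records"]]


def extend_retention(client, stream_name, hours=168):
    """Raise the retention period from the 24-hour default (168 hours = 7 days)."""
    client.increase_stream_retention_period(
        StreamName=stream_name,
        RetentionPeriodHours=hours,
    )
```

A real consumer would keep polling with the `NextShardIterator` value returned by each `get_records` call and would then write its results to a service such as DynamoDB or S3. That loop is left out here to keep the sketch short.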
Architecture of Kinesis Firehose
Suppose you have EC2 instances, mobile phones, laptops, and IoT devices that are producing data. These are again the producers, and they send their data to Kinesis Firehose. With Kinesis Firehose, you do not have to manage resources such as shards or streams, and you do not have to adjust shards manually to keep up with the data. It's completely automated. You do not even have to worry about consumers. The data can be analyzed using a Lambda function, although this analysis step is optional. Once the data has been analyzed, it is sent directly to S3.
One important thing about Kinesis Firehose is that it has no automatic retention window, whereas a Kinesis stream has an automatic retention window with a default of 24 hours, which can be extended, for example to 7 days. Kinesis Firehose does not work like this. It essentially either analyzes the data or sends it directly to S3 or another location.
The other location can be Redshift. In that case, the data is first written to S3 and then copied into Redshift.
If the destination is an Elasticsearch cluster, the data is sent to the cluster directly.
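The Firehose flow above (producer, optional Lambda step, delivery) can be sketched as follows. This is an illustration under stated assumptions: `client` is assumed to be a `boto3.client("firehose")` object, the function name `send_to_firehose`, the JSON payload, and the `processed` flag are hypothetical, and the handler follows the record format Firehose uses for Lambda data transformation (base64-encoded `data`, with `recordId` and `result` echoed back).

```python
import base64
import json


def send_to_firehose(client, delivery_stream_name, reading):
    """Producer side: hand one record to a Firehose delivery stream.

    `client` is assumed to be boto3.client("firehose").
    """
    # Newline-delimit records so they stay separable once concatenated in S3.
    payload = (json.dumps(reading) + "\n").encode("utf-8")
    response = client.put_record(
        DeliveryStreamName=delivery_stream_name,
        Record={"Data": payload},
    )
    return response["RecordId"]


def lambda_handler(event, context):
    """Optional Lambda step: Firehose passes in a batch of base64-encoded records."""
    output = []
    for record in event["records"]:
        reading = json.loads(base64.b64decode(record["data"]))
        reading["processed"] = True  # hypothetical enrichment, for illustration only
        data = (json.dumps(reading) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],  # must match the incoming record
            "result": "Ok",                  # tells Firehose to deliver this record
            "data": base64.b64encode(data).decode("utf-8"),
        })
    return {"records": output}
```

Note that there is no consumer code and no shard handling here: the destination (S3, Redshift via S3, or Elasticsearch) is configured on the delivery stream itself rather than in the application.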
Kinesis Analytics is a service of Kinesis in which streaming data is processed and analyzed using standard SQL.
Architecture of Kinesis Analytics
Here we have Kinesis Firehose and Kinesis Streams. Kinesis Analytics allows you to run SQL queries on the data that exists within Kinesis Firehose. You can then store the query results in S3, Redshift, or an Elasticsearch cluster. Essentially, the data is analyzed inside Kinesis using a SQL-type query language.
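To give a feel for what such a query looks like, the sketch below holds an example query as a Python string and pairs it with a plain-Python function that computes the same result on a list of records. The column names `device_id` and `temperature`, the threshold of 30, and the function name are hypothetical. The stream names follow the default in-application naming commonly seen in Kinesis Analytics SQL examples, which is an assumption about how your application is set up. The Python function is only an illustration of the query's logic, not part of any Kinesis API.

```python
# Example SQL for a Kinesis Analytics application. It is kept as a string
# here and is not executed by this script.
FILTER_HOT_READINGS_SQL = """
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
    "device_id"   VARCHAR(16),
    "temperature" DOUBLE
);

CREATE OR REPLACE PUMP "STREAM_PUMP" AS
    INSERT INTO "DESTINATION_SQL_STREAM"
    SELECT STREAM "device_id", "temperature"
    FROM "SOURCE_SQL_STREAM_001"
    WHERE "temperature" > 30;
"""


def filter_hot_readings(records, threshold=30.0):
    """Plain-Python equivalent of the SELECT ... WHERE above, for illustration."""
    return [
        {"device_id": r["device_id"], "temperature": r["temperature"]}
        for r in records
        if r["temperature"] > threshold
    ]
```

In the real service the query runs continuously over incoming records, and the rows written to the destination stream are what get delivered to S3, Redshift, or Elasticsearch.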
Differences between Kinesis Streams and Kinesis Firehose