Kinesis Essentials:
- Kinesis is a real-time data processing service that continuously captures and stores large amounts of data, which can then power real-time streaming dashboards.
- Using the AWS-provided SDKs, you can create real-time dashboards, integrate dynamic pricing strategies, and export data from Kinesis to other AWS services, including:
- EMR (analytics)
- S3 (storage)
- Redshift (big data)
- Lambda (event-driven actions)
Kinesis Components:
- Stream
- Producers (data creators)
- Consumers (data consumers)
- Shards (processing power)
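To make the relationship between these components concrete, here is a minimal in-memory sketch. It is not the AWS API: `ToyStream`, `Shard`, and `put_record` are hypothetical names used only for illustration. It relies on the documented behavior that Kinesis hashes each record's partition key with MD5 into a 128-bit value and routes the record to the shard whose hash key range contains that value. Splitting the hash space evenly across shards is a simplifying assumption.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Shard:
    """One shard: owns an inclusive slice of the 128-bit hash key space."""
    start_hash: int
    end_hash: int
    records: list = field(default_factory=list)


class ToyStream:
    """In-memory stand-in for a stream (hypothetical, not the AWS SDK)."""

    HASH_SPACE = 2 ** 128

    def __init__(self, shard_count: int):
        if shard_count < 1:
            raise ValueError("a stream needs at least one shard")
        step = self.HASH_SPACE // shard_count
        self.shards = []
        for i in range(shard_count):
            start = i * step
            # The last shard absorbs any remainder so the ranges cover the whole space.
            end = self.HASH_SPACE - 1 if i == shard_count - 1 else start + step - 1
            self.shards.append(Shard(start, end))

    def shard_index_for(self, partition_key: str) -> int:
        """Map a partition key to a shard index via its MD5 hash."""
        digest = hashlib.md5(partition_key.encode("utf-8")).digest()
        hash_key = int.from_bytes(digest, "big")
        for index, shard in enumerate(self.shards):
            if shard.start_hash <= hash_key <= shard.end_hash:
                return index
        raise ValueError("hash key outside every shard range")

    def put_record(self, partition_key: str, data: bytes) -> int:
        """What a producer does: write one record and learn which shard took it."""
        index = self.shard_index_for(partition_key)
        self.shards[index].records.append((partition_key, data))
        return index


stream = ToyStream(shard_count=4)
chosen = stream.put_record("player-1", b'{"action": "jump"}')
# The same partition key always lands on the same shard.
same_again = stream.shard_index_for("player-1")
```

Because the mapping depends only on the partition key, all records from one producer key stay in order on a single shard, while different keys spread the load across shards.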
Kinesis Benefits:
Real-time processing:
- Continuously collect data and build applications that analyze it as it is generated.
Parallel processing:
- Multiple Kinesis applications can process the same incoming data stream concurrently.
Durable:
- Kinesis synchronously replicates the streaming data across three data centers (Availability Zones) within a single AWS region and preserves the data for 24 hours by default (the retention period is configurable).
Scales:
- Can stream from as little as a few megabytes to several terabytes per hour.
When to use Kinesis:
Gaming:
- Collect gaming data such as player actions and feed the data into the gaming platform. For example, a reactive environment based on the player's real-time actions.
Real-time analytics:
- Collect IoT sensor data from many sources at high frequency and process it with Kinesis to gain insights as the data arrives in your environment.
Application alerts:
- Build a Kinesis application that monitors incoming application logs in real time and triggers events based on the data.
Log/Event data collection:
- Log data from any number of devices and use a Kinesis application to continuously process the incoming data, power real-time dashboards, and store the data in S3 when processing is complete.
Mobile data capture:
- Mobile applications can push data to Kinesis from a very large number of devices, which makes the data available as soon as it is produced.
Kinesis Workflow, Producers, and Consumers:
Kinesis Producers:
- Producers are devices that collect data for Kinesis processing.
- You build producers to continuously input data into a Kinesis stream.
- Producers can include (but are not limited to):
- IoT sensors
- Mobile devices (cell phones)
- You can have thousands of different producers and scale based on need.
- The more data you want to process, the more "shards" you add to your Kinesis stream.
- Each "shard" can process 2 MB of read data per second, and 1 MB of write data per second.
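The per-shard limits above translate directly into a sizing rule: a stream needs enough shards to cover whichever of its write or read throughput is the tighter constraint. The following is a rough sketch of that arithmetic using only the two figures from these notes (1 MB/s write, 2 MB/s read). The function names are hypothetical, decimal units (1 TB = 1,000,000 MB) are an assumption, and other service limits such as per-shard record counts are ignored.

```python
import math

WRITE_MB_PER_SHARD = 1.0  # per-second write limit quoted in the notes
READ_MB_PER_SHARD = 2.0   # per-second read limit quoted in the notes


def shards_needed(write_mb_per_sec: float, read_mb_per_sec: float) -> int:
    """Estimate the shard count from sustained write and read throughput."""
    if write_mb_per_sec < 0 or read_mb_per_sec < 0:
        raise ValueError("throughput must be non-negative")
    for_writes = math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD)
    for_reads = math.ceil(read_mb_per_sec / READ_MB_PER_SHARD)
    # A stream always has at least one shard.
    return max(1, for_writes, for_reads)


def tb_per_hour_to_mb_per_sec(tb_per_hour: float) -> float:
    """Convert TB/hour to MB/s, assuming decimal units (1 TB = 1,000,000 MB)."""
    return tb_per_hour * 1_000_000 / 3600


# Writes dominate: 5 MB/s in needs 5 shards, 4 MB/s out needs only 2.
write_bound = shards_needed(write_mb_per_sec=5, read_mb_per_sec=4)

# Reads dominate: 12 MB/s out needs 6 shards, 3 MB/s in needs only 3.
read_bound = shards_needed(write_mb_per_sec=3, read_mb_per_sec=12)

# One terabyte per hour is roughly 277.8 MB/s of sustained writes.
one_tb_per_hour = shards_needed(tb_per_hour_to_mb_per_sec(1), 0)
```

Treat the result as a lower bound: real traffic is bursty and partition keys rarely spread load perfectly evenly, so some headroom above the estimate is usually sensible.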
Kinesis Consumers:
- Consumers consume the stream's data.
- This is done concurrently (multiple consumers can consume the same data at the same time).
- Consumers include (but are not limited to):
- Real-time dashboards
- S3
- Redshift (data warehouse)
- EMR
- Any application (including one you create) can consume the stream's data.
- Kinesis keeps 24 hours of streaming data stored by default, but can be configured to store up to 7 days (current versions of the service allow retention of up to 365 days).
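Two consumer-side ideas from this list, concurrent reads of the same data and a time-limited retention window, can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the Kinesis API: `ToyRetainedStream`, `StoredRecord`, and `read_from` are hypothetical names, time is measured in plain hours, and each consumer simply tracks its own list position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredRecord:
    arrival_hour: float  # hours since an arbitrary starting point
    data: str


class ToyRetainedStream:
    """Append-only record log with a retention window (hypothetical model)."""

    def __init__(self, retention_hours: float = 24.0):
        self.retention_hours = retention_hours
        self._records = []  # a record's list index is its position

    def put(self, arrival_hour: float, data: str) -> None:
        self._records.append(StoredRecord(arrival_hour, data))

    def read_from(self, position: int, now_hour: float):
        """Return unexpired data at or after `position`, plus the next position."""
        cutoff = now_hour - self.retention_hours
        visible = [
            record.data
            for record in self._records[position:]
            if record.arrival_hour >= cutoff
        ]
        return visible, len(self._records)


stream = ToyRetainedStream(retention_hours=24)
stream.put(0, "a")
stream.put(10, "b")

# Two consumers read independently; reading does not remove data for the other.
dashboard_data, dashboard_pos = stream.read_from(0, now_hour=12)
archiver_data, archiver_pos = stream.read_from(0, now_hour=12)

stream.put(30, "c")

# The dashboard resumes from its own position and sees only the new record.
dashboard_new, dashboard_pos = stream.read_from(dashboard_pos, now_hour=30)

# A consumer starting from the beginning at hour 30 no longer sees "a",
# because it is older than the 24-hour retention window.
late_consumer_data, _ = stream.read_from(0, now_hour=30)
```

With a longer retention setting (for example `retention_hours=168` for 7 days), the late consumer in this model would still see every record.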