Sunday 1/30/22 Cloud Studies Update: AWS SQS, Kinesis, Cognito

Adrian Cantrill’s SAA-C02 study course, 90 minutes. Serverless and Application Services section: SQS, the Kinesis product family, and Cognito

Simple Queue Service

– Public, fully managed, highly available queues: Standard or FIFO

– Accessible from anywhere with connectivity to the public SQS endpoints, including VPCs

– FIFO queues guarantee an order

– Standard queues are ‘best effort’: messages could be received out of order

– Messages up to 256KB in size – for larger sizes, store the data somewhere and link to it inside the message

– The basic architecture is: some clients send to the queue, other clients poll the queue

– polling is the process of looking for messages

– After messages are polled and received, they aren’t deleted from the queue; they’re hidden for a period of time; this is known as a ‘visibility timeout’

– the visibility timeout is the amount of time that a client can take to process a message in some way

– this helps ensure fault tolerance: if the message is not explicitly deleted before the visibility timeout expires, it reappears in the queue and can be received again by the same or a different client (for example, after a client failure)

– Dead-letter queues can be used for problem messages

– Auto Scaling groups can scale, and Lambda functions can be invoked, based on queue length

– Delivery: Standard is at-least-once; FIFO is exactly-once

– FIFO performance: up to 3,000 messages per second with batching, or up to 300 per second without; scaling is limited

– standard queues can scale almost infinitely

– billed based on ‘requests’

– 1 request = 1-10 messages, up to 256 KB total

– two types of polling: short (immediate) vs long (WaitTimeSeconds: up to 20 seconds)

– Long polling is the preferred method: because the billing is based on requests, short polling can become very expensive very quickly

– Encryption at rest (KMS) & in-transit

– Access to a queue is controlled via identity policies or queue policies

– both control access to a queue from the same account

– only queue policies allow access to a queue from external accounts

– a queue policy is a resource policy, similar to the type used on S3 buckets or SNS topics
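
The receive, process, delete cycle described above can be sketched in Python as follows. This is a minimal illustration, not the course's code: `poll_once` and `handler` are hypothetical names, and the `sqs` argument is assumed to behave like a boto3 SQS client (`boto3.client("sqs")`), whose `receive_message` and `delete_message` calls are used here.

```python
from typing import Any, Callable


def poll_once(sqs: Any, queue_url: str, handler: Callable[[str], None]) -> int:
    """Long-poll the queue once and delete each message only after it is handled.

    `sqs` is assumed to behave like boto3.client("sqs").
    Returns the number of messages that were processed and deleted.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # one request can return up to 10 messages
        WaitTimeSeconds=20,      # long polling: wait up to 20 s for messages
    )
    processed = 0
    # An empty poll returns no "Messages" key at all.
    for message in response.get("Messages", []):
        try:
            handler(message["Body"])
        except Exception:
            # No delete: the message becomes visible again once the
            # visibility timeout expires and can be retried by any consumer.
            continue
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
        processed += 1
    return processed
```

Passing the client in as a parameter keeps the sketch runnable without AWS credentials; in real use the caller would create the client and loop over `poll_once`.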

Kinesis Data Streams

– Kinesis is a scalable streaming service

– Kinesis ingests lots of data from lots of applications

– producers send data into a kinesis stream

– Streams can scale from low to near infinite data rates

– public service & highly available by design: no need to worry about replication or availability; everything is presented as a service

– streams store a rolling 24-hour window of data; storage is included

– Kinesis supports multiple producers pushing data into the stream

– Kinesis supports multiple consumers reading data from the stream

– Consumers can access the data from anywhere in the 24-hour window

– Consumers can access the data at different levels of granularity

– The Kinesis stream scales via a ‘shard architecture’

– The stream starts with a single shard

– As the data flow increases, additional shards are added to the stream

– Each shard allows for 1 MB ingestion and 2 MB consumption per second

– More shards equals higher cost and better performance

– The data window length also increases cost

– The default 24-hour window can be increased up to a maximum of 365 days

– Data is stored via Kinesis data records

– Records are stored across shards; scaling is linear
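
As a rough worked example of the shard arithmetic above, the sketch below estimates how many shards a stream needs from the per-shard figures in these notes (1 MB/s in, 2 MB/s out). The function name and the sample numbers are made up for illustration, and other per-shard limits are deliberately ignored.

```python
import math

INGEST_MB_PER_SHARD = 1.0   # per-second write capacity per shard (from the notes)
CONSUME_MB_PER_SHARD = 2.0  # per-second read capacity per shard (from the notes)


def shards_needed(ingest_mb_per_s: float, consume_mb_per_s: float) -> int:
    """Return the smallest shard count that covers both ingestion and consumption."""
    by_ingest = math.ceil(ingest_mb_per_s / INGEST_MB_PER_SHARD)
    by_consume = math.ceil(consume_mb_per_s / CONSUME_MB_PER_SHARD)
    # A stream always has at least one shard.
    return max(1, by_ingest, by_consume)


# 4.5 MB/s in needs 5 shards; 6 MB/s out needs only 3, so ingestion decides.
print(shards_needed(4.5, 6.0))  # → 5
```

Because capacity scales linearly with shards, whichever side (ingestion or consumption) needs more shards sets the size, and therefore the cost, of the stream.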


– Ingestion of data at scale or large throughput = KINESIS

– Worker pool decoupling or asynchronous communication = SQS

– SQS = 1 production group, 1 consumption group

– SQS = Decoupling and Asynchronous communications

– SQS = No persistence of messages, no window

– Kinesis = huge scale ingestion

– Kinesis = multiple consumers, rolling window

– Kinesis = Data ingestion, Analytics, Monitoring, App Clicks

Amazon Kinesis Data Firehose

– Designed to cope with large amounts of streaming data ingestion, consumption and management within AWS

– Kinesis Data Streams does not offer any way to persist data beyond its rolling window; it’s only designed for ingestion and consumption

– Once records in Kinesis age past the end of the rolling window, they’re gone forever

– Data Firehose = fully managed service to load data for data lakes, data stores, and analytics services

– Data Firehose = lets data be persisted beyond the rolling window of Kinesis data streams

– Automatic scaling; Fully serverless; resilient

– Near real time delivery (~60 seconds)

– Supports transformation of data on the fly using Lambda; this can add latency depending on the complexity of data

– Firehose = pay as you go, billing based on data volume

– Firehose delivers data to pre-defined valid endpoints:


– Splunk

– Redshift

– Elasticsearch

– S3

– For Redshift, Firehose delivers data to an intermediate S3 bucket and then issues a Redshift COPY command, which loads the data from the bucket into Redshift

– For the rest, data is transmitted directly from Firehose

– Firehose can directly accept data from producers or a Kinesis data stream
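
The Lambda transformation step mentioned above can be sketched as below. Firehose passes the function a batch of base64-encoded records and expects each one back with the same `recordId` and a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The assumption that records are JSON objects, and the `processed` field that gets added, are hypothetical choices made only for this example.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform a batch of Firehose records and return them in the expected shape."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Hand the record back untouched and flag it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        payload["processed"] = True  # hypothetical enrichment for illustration
        line = json.dumps(payload) + "\n"  # newline-delimited JSON for the destination
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(line.encode("utf-8")).decode("ascii"),
        })
    return {"records": output}
```

The more work done per record here, the more latency the transformation adds before delivery.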

Kinesis Data Analytics

– This is a real-time data processing product

– Kinesis Data Streams: allows large scale ingestion of data into AWS and the consumption of that data by other compute resources known as consumers

– Kinesis Data Firehose provides delivery services; it accepts data in, and then delivers it to supported destinations in near real time; can also use Lambda to perform data transformations as the data passes through

– Kinesis Data Analytics provides real time processing of data as the data flows through using SQL

– Data comes in on one side, queries are run on the data, and the results are output to destinations on the other side

– Kinesis Data Analytics ingests from Kinesis Data Streams or Firehose or static reference data from S3

– Supported Destinations:

– Firehose (S3, Redshift, Elasticsearch, Splunk); near real time

– AWS Lambda; real time

– Kinesis Data Streams; real time

– Sources and destinations are external; they exist outside Kinesis Data Analytics, and sources are not modified in any way

Scenarios for using Kinesis Data Analytics

– Streaming data needing real-time SQL processing

– time-series analytics; real-time dashboards; real-time metrics
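
To give a feel for the time-series and real-time-metrics scenarios above, here is a small Python analogy of a windowed aggregation: counting events per key in fixed, non-overlapping time windows. The service itself expresses this kind of query in SQL; the function name, the event shape `(timestamp, key)`, and the 60-second window are assumptions made for this sketch.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_counts(
    events: Iterable[Tuple[float, str]],
    window_seconds: int = 60,
) -> Dict[int, Counter]:
    """Count events per key within fixed, non-overlapping time windows.

    Each event is (timestamp_in_seconds, key). The result maps the start
    time of each window to a Counter of keys seen in that window.
    """
    windows: Dict[int, Counter] = defaultdict(Counter)
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        windows[window_start][key] += 1
    return dict(windows)
```

For example, click events at 0 s and 59.9 s fall into the window starting at 0, while an event at 60 s opens the next window.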

Amazon Cognito

– One of the core identity products available in AWS

– Amazon Cognito provides Authentication, Authorization, and user management for web/mobile apps

– Amazon Cognito consists of user pools and identity pools

User Pools

– User Pools: Sign in and get a JSON Web Token (JWT)

– The JWT can be used for authentication with applications, and certain AWS products such as API Gateway can accept it directly

– Most AWS services cannot use JWTs; actual AWS credentials are needed

– User Pools do not grant access to AWS services; they control sign-in and deliver a JWT (user directory management and profiles, sign-up & sign-in (customisable web UI), MFA and other security features)

– User Pools also allow social sign-in using identities provided by Facebook, Google, Amazon, and Apple, as well as sign-in through SAML identity providers

– User Pools provide a joined-up user management experience

– User Pools cannot be directly used to access most AWS resources
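
To show what the JWT from a User Pool carries, the sketch below decodes the claims section of a token using only the standard library. It does not verify the signature, so it is for inspection only; a real application must validate the token against the pool's published signing keys before trusting any claim. The function name is hypothetical.

```python
import base64
import json
from typing import Any, Dict


def decode_jwt_claims(token: str) -> Dict[str, Any]:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWTs use base64url with the padding stripped; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

The decoded claims identify the user (for example the `sub` claim), which is why an application or API Gateway can use the token for sign-in, but they are not AWS credentials.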

Identity Pools

– Identity Pools: Allow you to offer access to temporary AWS credentials

– Unauthenticated identities: guest users

– Federated identities: swap Google, Facebook, Twitter, SAML 2.0, or User Pool identities for short-term AWS credentials to access AWS resources

– User Pools and Identity Pools can work together (a User Pool identity can be exchanged for temporary AWS credentials)

– Identity Pools assume an IAM role on behalf of the identity
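
The exchange of a User Pool token for temporary AWS credentials can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, and the `identity` argument is assumed to behave like a boto3 Cognito Identity client (`boto3.client("cognito-identity")`), whose `get_id` and `get_credentials_for_identity` calls are used. For a User Pool, the provider name takes the form `cognito-idp.<region>.amazonaws.com/<user-pool-id>`.

```python
from typing import Any, Dict


def credentials_for_token(
    identity: Any,
    identity_pool_id: str,
    provider_name: str,
    id_token: str,
) -> Dict[str, Any]:
    """Swap an identity provider token for temporary AWS credentials.

    `identity` is assumed to behave like boto3.client("cognito-identity").
    """
    # The Logins map ties the provider name to the token it issued.
    logins = {provider_name: id_token}

    # Step 1: resolve (or create) the identity in the identity pool.
    identity_id = identity.get_id(
        IdentityPoolId=identity_pool_id,
        Logins=logins,
    )["IdentityId"]

    # Step 2: the pool assumes its configured IAM role on behalf of that
    # identity and returns short-term credentials.
    response = identity.get_credentials_for_identity(
        IdentityId=identity_id,
        Logins=logins,
    )
    return response["Credentials"]
```

The returned dictionary holds an access key ID, secret key, session token, and expiration, which is what other AWS services accept in place of the JWT.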

Published by pauldparadis

Working towards cloud networking security as a profession.
