Adrian Cantrill’s SAA-C02 study course, 35 minutes: RDS section review
RDS section Review:
Database Refresher and Models part 1:
– Structured Query Language
– Structure in & between tables of data – rigid Schema
– Relationships between tables
– NoSQL: Not one single thing… different models
– generally a much more relaxed Schema
– Relationships handled differently
ACID vs. BASE:
– ACID and BASE are DB transactional models
– CAP Theorem: Consistency, Availability, Partition Tolerance (resilience) – choose 2
– ACID : Consistency
– BASE : Availability
– ACID: Atomic Consistent Isolated Durable (RDS)
– Atomic: Either all components of a transaction succeed, or none do
– Consistent: Transactions move the database from one valid state to another: nothing in between is allowed
– Isolated: If multiple transactions occur at once, they don’t interfere with each other. Each executes as if it’s the only one
– Durable: Once committed, transactions are durable, stored on non-volatile memory, resilient to power outages or crashes
– BASE: Basically Available Soft State Eventually Consistent (DynamoDB)
– Basically Available: Read and Write operations are available ‘as much as possible’ but without any consistency guarantees
– Soft State: The database doesn’t enforce consistency, this is offloaded onto the application/user
– Eventually Consistent: If we wait long enough, reads from the system will be consistent
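The atomicity point above can be illustrated with a minimal sketch using Python's built-in sqlite3 module (not RDS itself). The `accounts` table, the `transfer` helper and the balances are hypothetical and chosen only for illustration. The second transfer fails partway through, and the credit that had already been applied is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Credit dst, then debit src, as one transaction: both apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            # This debit violates the CHECK constraint if src lacks the funds.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

ok = transfer(conn, "alice", "bob", 30)       # valid transfer
after_ok = balances(conn)
failed = transfer(conn, "alice", "bob", 500)  # debit fails, credit is undone
after_failed = balances(conn)
print(ok, after_ok, failed, after_failed)
```

The database only ever moves between valid states: after the failed transfer, the balances are exactly what they were before it started.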
Databases on EC2
Why you might want to use EC2 for database instances:
– Access to the DB instance OS
– Advanced DB option tuning… (DB root)
– Vendor demands
– DB or DB version AWS doesn’t provide
– Specific OS/DB combination AWS doesn’t provide
– Architecture AWS doesn’t provide (replication/resilience)
– Decision makers who ‘just want it’
Why you shouldn’t really
– Admin Overhead – managing EC2 and DBHost
– Backup / DR Management
– EC2 is single AZ
– Features: Some of AWS DB products are amazing
– EC2 is on or off – no serverless, no easy scaling
– replication: skills, setup time, monitoring & effectiveness
– performance: AWS invests time into optimisation & features
Relational Database Service (RDS)
– Database-as-a-service (DbaaS)
– DatabaseServer-as-a-service
– Managed Database Instance (1+ Databases)
– Multiple engines MySQL, MariaDB, PostgreSQL, Oracle, Microsoft SQL Server
– Amazon Aurora
RDS High Availability (Multi AZ)
– No Free-Tier: Extra cost for standby replica
– Standby can’t be directly used
– 60-120 seconds failover
– Same region only (other AZ’s in the VPC)
– Backups taken from Standby (removes performance impact)
– Failover triggers: AZ outage, primary failure, manual failover, instance type change and software patching
RDS Automatic Backup, RDS Snapshots and Restore
– Restores create a new RDS instance with a new endpoint address
– Snapshots: restore to a single point in time, the snapshot creation time
– Automated backups: restore to any point in time, at roughly 5-minute granularity
– Backup is restored and transaction logs are ‘replayed’ to bring DB to desired point in time
– Restores aren’t fast: Think about RTO
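The restore-and-replay idea above can be sketched as follows. This is a toy model under stated assumptions, not how RDS is implemented: the `snapshots` and `tx_log` structures and the `restore_to_point_in_time` function are hypothetical, and each log entry is reduced to a simple key/value write.

```python
from datetime import datetime

def restore_to_point_in_time(snapshots, tx_log, target):
    """Pick the latest snapshot at or before `target`, then replay later log entries."""
    candidates = [s for s in snapshots if s["taken_at"] <= target]
    if not candidates:
        raise ValueError("no snapshot at or before the requested time")
    base = max(candidates, key=lambda s: s["taken_at"])
    state = dict(base["data"])  # the restore builds a new copy; the snapshot is untouched
    for entry in sorted(tx_log, key=lambda e: e["at"]):
        # Replay only the changes made after the snapshot and up to the target time.
        if base["taken_at"] < entry["at"] <= target:
            state[entry["key"]] = entry["value"]
    return state

def t(hh, mm):
    return datetime(2021, 1, 1, hh, mm)

snapshots = [{"taken_at": t(0, 0), "data": {"a": 1}}]
tx_log = [
    {"at": t(0, 5), "key": "a", "value": 2},
    {"at": t(0, 10), "key": "b", "value": 3},
    {"at": t(0, 20), "key": "a", "value": 9},
]

restored = restore_to_point_in_time(snapshots, tx_log, t(0, 12))
print(restored)  # {'a': 2, 'b': 3}
```

The more log there is to replay after the chosen snapshot, the longer the restore takes, which is why RTO needs thought.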
RDS Read-Replicas
(read) Performance Improvements
– 5x direct read-replicas per DB instance
– Each providing an additional instance of read performance
– Read-Replicas can have read-replicas – but lag starts to be a problem
– Global performance improvements
Availability improvements
– Snapshots & Backups improve RPO
– RTO’s are a problem
– RR’s offer near-zero RPO
– RR’s can be promoted quickly – low RTO
– Failure only – watch for data corruption
– Read only – until promoted
– Global availability improvements: global resilience
RDS Data Security
– SSL/TLS (in transit) is available for RDS, can be mandatory
– RDS supports EBS volume encryption – KMS
– Handled by HOST/EBS
– AWS or Customer Managed CMK generates data keys
– Data keys are used for encryption operations
– Storage, Logs, Snapshots & replicas are encrypted
– Encryption can’t be removed
– RDS MSSQL and RDS Oracle Support TDE (Transparent Data Encryption)
– Encryption handled within the DB Engine
– RDS Oracle supports integration with CloudHSM
– Much stronger key controls (even from AWS)
Aurora Architecture
Aurora Key Differences
– Aurora architecture is very different from RDS
– Uses a ‘cluster’
– A single primary instance plus 0 or more replicas
– no local storage: uses cluster volume
– faster provisioning & improved availability & performance
Aurora Storage Architecture
– All SSD Based – high IOPS, low latency
– Storage is billed based on what’s used
– high water mark: billed for the most used
– storage which is freed up can be re-used
– replicas can be added and removed without requiring storage provisioning
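The high water mark billing described above can be sketched like this. The function name and the usage figures are hypothetical. The sketch models only the idea that billed storage follows the peak usage seen so far, even after data is deleted.

```python
def billed_storage_gb(usage_history_gb):
    """Return the billed storage for each period under a high-water-mark model."""
    billed = []
    peak = 0
    for used in usage_history_gb:
        peak = max(peak, used)  # billing never drops below the highest usage so far
        billed.append(peak)
    return billed

# Usage grows to 40 GB, drops to 25 GB, then grows again past the old peak.
print(billed_storage_gb([10, 40, 25, 30, 50]))  # [10, 40, 40, 40, 50]
```

Freed space is still re-usable, so growth back up to the old peak (25 to 30 GB here) does not raise the bill.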
Cost
– No free-tier option
– Aurora doesn’t support Micro instances
– Beyond RDS single-AZ (micro), Aurora offers better value
– compute: hourly rate, billed per second, 10 minute minimum
– storage: GB-Month consumed, IO cost per request
– Backup storage up to 100% of the DB size is included
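As a worked example of the compute charge above (an hourly rate, billed per second, with a 10 minute minimum), here is a minimal sketch. The `compute_charge` function and the rate of 0.36 per hour are hypothetical, not a real AWS price.

```python
MINIMUM_BILLED_SECONDS = 10 * 60  # the 10 minute minimum from the notes

def compute_charge(seconds_running, hourly_rate):
    """Charge for one run of an instance: per-second billing with a minimum duration."""
    billable_seconds = max(seconds_running, MINIMUM_BILLED_SECONDS)
    return billable_seconds * hourly_rate / 3600

HYPOTHETICAL_RATE = 0.36  # currency units per hour, for illustration only

short_run = compute_charge(120, HYPOTHETICAL_RATE)   # 2 minutes, billed as 10 minutes
long_run = compute_charge(7200, HYPOTHETICAL_RATE)   # 2 hours, billed as used
print(round(short_run, 4), round(long_run, 4))
```

A 2 minute run costs the same as a 10 minute run, so very short-lived instances are proportionally more expensive.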
Aurora Restore, Clone & Backtrack
– Backups in Aurora work in the same way as RDS
– Restores create a new cluster
– Backtrack can be used to rewind the database in place to a previous point in time
– Fast clones make a new database much faster than copying all the data: copy-on-write
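The copy-on-write idea behind fast clones can be sketched with a toy in-memory volume. The `CowVolume` class is hypothetical and is not Aurora's storage implementation. It only shows that a clone shares existing pages with its source and stores only the pages written after the clone was made.

```python
class CowVolume:
    """Toy copy-on-write volume: shared read-only layers plus private writes."""

    def __init__(self, shared_layers=()):
        self._shared = tuple(shared_layers)  # layers shared with related volumes
        self._own = {}                       # pages written by this volume only

    def write(self, page_no, data):
        self._own[page_no] = data

    def read(self, page_no):
        if page_no in self._own:
            return self._own[page_no]
        for layer in reversed(self._shared):  # newest shared layer first
            if page_no in layer:
                return layer[page_no]
        raise KeyError(page_no)

    def clone(self):
        # Freeze the pages written so far into a shared layer; no page data is copied.
        self._shared = self._shared + (self._own,)
        self._own = {}
        return CowVolume(self._shared)

    def private_pages(self):
        return len(self._own)

source = CowVolume()
source.write(0, "a")
source.write(1, "b")

copy = source.clone()   # instant: the clone starts with zero private pages
copy.write(1, "B")      # only this changed page is stored by the clone
source.write(0, "A2")   # later writes to the source do not leak into the clone

print(copy.read(0), copy.read(1), source.read(0), source.read(1))
print(copy.private_pages())
```

Storage consumed by the clone grows only with the amount of data that diverges from the source.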
Aurora Serverless Concepts
– Scalable ACU: Aurora Capacity Units
– Aurora Serverless cluster has a min & Max ACU
– Cluster adjusts based on load
– Can go to 0 and be paused
– Consumption billing per-second basis
– Same resilience as Aurora (6 copies across AZ’s)
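The min/max ACU behaviour above can be sketched as a simple clamp. The function names and the scaling rule are assumptions made for illustration, since the real scaling logic is internal to Aurora Serverless. The sketch only shows that capacity stays within the configured range, can drop to 0 when paused, and is consumed per second.

```python
def capacity_for_load(demand_acu, min_acu, max_acu, idle=False, allow_pause=True):
    """Capacity the cluster would run at, given demand and the configured ACU range."""
    if idle and allow_pause:
        return 0  # paused: no compute capacity in use
    return max(min_acu, min(demand_acu, max_acu))

def acu_seconds(capacity_per_second):
    """Total consumption when capacity is sampled once per second."""
    return sum(capacity_per_second)

# Hypothetical cluster configured with min 2 and max 16 ACUs.
low = capacity_for_load(1, 2, 16)               # demand below the minimum
mid = capacity_for_load(8, 2, 16)               # demand inside the range
high = capacity_for_load(40, 2, 16)             # demand above the maximum
paused = capacity_for_load(0, 2, 16, idle=True)
print(low, mid, high, paused)                   # 2 8 16 0
print(acu_seconds([low, mid, high, paused]))    # 26
```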
Aurora Serverless – Use Cases
– Infrequently used applications
– New applications
– Variable workloads
– Unpredictable workloads
– Development and test databases
– Multi-tenant applications
Aurora Global Database
– Cross-Region DR and BC
– Global Read Scaling: low latency performance improvements
– ~1s or less replication between regions
– No impact on DB performance
– Secondary regions can have 16 replicas
– Can be promoted to R/W
– Currently MAX 5 secondary regions
Aurora Multi-Master
– Default Aurora mode is Single-Master
– One R/W and 0+ read only replicas
– Cluster endpoint is used for writes; the reader endpoint is used for load-balanced reads
– failover takes time: replica promoted to R/W
– In Multi-Master mode all instances are R/W
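The single-master endpoint behaviour above can be sketched with a toy router. The `SingleMasterCluster` class and the instance names are hypothetical, and round-robin is an assumed balancing policy used only for illustration. Writes always go to the one R/W instance, reads are spread across replicas, and failover promotes a replica.

```python
class SingleMasterCluster:
    """Toy model of one R/W instance plus zero or more read-only replicas."""

    def __init__(self, writer, replicas):
        self.writer = writer
        self.replicas = list(replicas)
        self._next = 0

    def cluster_endpoint(self):
        return self.writer  # all writes go to the single R/W instance

    def reader_endpoint(self):
        # With no replicas, reads fall back to the writer in this sketch.
        targets = self.replicas or [self.writer]
        target = targets[self._next % len(targets)]
        self._next += 1
        return target

    def fail_over(self):
        if not self.replicas:
            raise RuntimeError("no replica available to promote")
        self.writer = self.replicas.pop(0)  # the promoted replica becomes R/W
        return self.writer

cluster = SingleMasterCluster("instance-1", ["instance-2", "instance-3"])
reads = [cluster.reader_endpoint() for _ in range(3)]
promoted = cluster.fail_over()
print(reads, promoted, cluster.cluster_endpoint())
```

In multi-master mode there is no promotion step, because every instance already accepts writes.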
Database Migration Service
– A managed database migration service
– Runs using a replication instance
– Source and destination endpoints at source and target databases
– One endpoint must be on AWS
Schema Conversion Tool (SCT)
– SCT is used when converting one database engine to another
– including DB → S3 (Migrations using DMS)
– SCT is not used when migrating between DB’s of the same type
– On-premises MySQL → RDS MySQL
– Works with OLTP DB Types (MySQL, MSSQL, Oracle)
– Works with OLAP (Teradata, Oracle, Vertica, Greenplum)
– e.g. On-Premises MSSQL → RDS MySQL
– e.g. On-Premises Oracle → Aurora
DMS & Snowball
– Larger migrations might be multi-TB in size
– moving data over networks takes time and consumes capacity
– DMS can utilize snowball…
1: Use SCT to extract data locally and move to a snowball device
2: Ship the device back to AWS. AWS loads the data into an S3 bucket
3: DMS migrates from S3 into the target store
4: Change Data Capture (CDC) can capture changes, and via S3 intermediary they are also written to the target database
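To see why multi-TB migrations over a network take time, here is a rough back-of-the-envelope sketch. The `transfer_days` function, the 50 TB size, the 1 Gbps link and the 80% usable share of the link are all assumptions chosen for illustration.

```python
def transfer_days(size_tb, link_mbps, utilisation=0.8):
    """Approximate days needed to move `size_tb` (decimal TB) over a network link."""
    bits_to_move = size_tb * 1e12 * 8               # terabytes to bits
    usable_bits_per_second = link_mbps * 1e6 * utilisation
    seconds = bits_to_move / usable_bits_per_second
    return seconds / 86400                          # seconds per day

days = transfer_days(50, 1000)  # 50 TB over a 1 Gbps link at 80% utilisation
print(round(days, 2))           # about 5.79 days
```

The link is also saturated for that whole period, which is the capacity cost a Snowball device avoids.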