This chapter covers essential database concepts and Amazon RDS concepts. Every application relies on a database to store data and records for its users. A database engine allows applications to access, manage, and search large volumes of data records. Database systems and engines can be grouped into two broad categories:
Relational Database Management Systems (RDBMS) and NoSQL (or non-relational) databases.
Relational databases provide a common interface that lets users read from and write to the database using commands or queries written in SQL.
A relational database can be categorized as either an Online Transaction Processing (OLTP) or Online Analytical Processing (OLAP) database system.
Amazon Relational Database Service (Amazon RDS) significantly simplifies the setup and maintenance of OLTP and OLAP databases. Amazon RDS provides support for six popular relational database engines:
MySQL,
MariaDB,
PostgreSQL,
Oracle,
Microsoft SQL Server,
and Amazon Aurora.
Let’s look at data warehouses (DWH) and NoSQL databases.
Many companies split their relational databases into two different databases: one database for OLTP transactions, and the other database as their data warehouse for OLAP. OLTP transactions occur frequently and are relatively simple. OLAP transactions occur much less frequently but are much more complex.
Amazon RDS is mainly used for OLTP, but it can also be used for OLAP. Amazon Redshift is a high-performance data warehouse designed specifically for OLAP use cases. Sometimes Amazon RDS and Amazon Redshift are used together in the same application: recent transactions are periodically extracted from RDS and loaded into a reporting database.
Many applications use HBase, MongoDB, Cassandra, CouchDB, Riak, and Amazon DynamoDB to store large volumes of data with high transaction rates.
We can run any type of NoSQL database on AWS using Amazon EC2, or we can choose Amazon DynamoDB to deal with the heavy lifting involved with building a distributed cluster spanning multiple data centers.
What Is ElastiCache?
ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud. The service improves the performance of web applications by allowing you to retrieve information from fast, managed, in-memory caches, instead of relying entirely on slower disk-based databases. ElastiCache supports two open-source in-memory caching engines: Memcached and Redis.
Amazon Relational Database Service (Amazon RDS)
Q: What is Amazon RDS?
Amazon RDS is a managed service that makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity, while managing time-consuming database administration tasks.
It supports Amazon Aurora, MySQL, MariaDB, Oracle, SQL Server, and PostgreSQL database engines. We can run RDS on premises using Amazon RDS on Outposts and Amazon RDS on VMware.
It gives us access to the capabilities of a MySQL, MariaDB, Oracle, SQL Server, or PostgreSQL database. This means that the code, applications, and tools we already use today with our existing databases should work with Amazon RDS.
RDS has two key features:
Multi-AZ – For Disaster Recovery
Read Replicas – For Performance
Data warehousing databases use a different type of architecture, both at the database level and at the infrastructure layer. Amazon’s data warehouse solution is called Redshift.
Let’s understand the RDS concepts through a lab.
Step 1: Log in to the AWS Console and navigate to RDS:
Step 2: Click on Create database.
Step 3: Select MySQL and move on to the next field:
Step 4: Select Free tier under Templates and enter cloudvikas as the database name.
Step 5: Next, enter the credential details:
Step 6: Fill in the remaining fields as shown below:
After this, fill in the details of the Connectivity section:
After this, navigate to the VPC security group:
Fill in the details as shown in the screenshot below:
Next, fill in the remaining details as shown below:
Now click on Create database, and the database is created.
Database cloudvikas is created.
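The console steps above can also be approximated from the AWS CLI. The following is only a minimal sketch, not the exact configuration used in the lab: the instance class db.t3.micro, the 20 GiB of storage, the master user name admin, and the AWS_CMD dry-run switch are assumptions made for illustration. By default the function only prints the command it would run; set AWS_CMD=aws to really call AWS.

```shell
#!/bin/sh
# AWS_CMD is a hypothetical switch used only in this sketch.
# Unset: the command is printed (dry run). AWS_CMD=aws: the command is executed.
# The expansion is left unquoted on purpose so "echo aws" splits into two words.

create_mysql_instance() {
  db_id="$1"        # instance identifier, e.g. cloudvikas
  db_password="$2"  # master password, supplied by the caller
  # Instance class, storage size, and user name below are assumed values.
  ${AWS_CMD:-echo aws} rds create-db-instance \
    --db-instance-identifier "$db_id" \
    --engine mysql \
    --db-instance-class db.t3.micro \
    --allocated-storage 20 \
    --master-username admin \
    --master-user-password "$db_password"
}

# Dry run: prints the command for the lab database.
create_mysql_instance cloudvikas 'ChangeMe12345'
```

In a real run, the password would normally come from a variable or a secrets store rather than being typed on the command line.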
This completes the RDS lab, in which we have learned how to create a database.
Now please delete the database; otherwise it may start to incur charges after a few days.
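The cleanup can also be done from the AWS CLI. This is a hedged sketch that assumes the instance identifier cloudvikas from the lab; AWS_CMD is a hypothetical dry-run switch, so the command is only printed unless AWS_CMD=aws is set. Note that --skip-final-snapshot means no final snapshot is kept.

```shell
#!/bin/sh
# AWS_CMD is a hypothetical dry-run switch for this sketch:
# unset prints the command, AWS_CMD=aws executes it.

delete_instance() {
  # --skip-final-snapshot: delete without taking a last snapshot.
  ${AWS_CMD:-echo aws} rds delete-db-instance \
    --db-instance-identifier "$1" \
    --skip-final-snapshot
}

# Dry run for the lab database.
delete_instance cloudvikas
```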
Next, we will learn about RDS backups, Multi-AZ, and Read Replicas.
Backup and Recovery
Amazon RDS provides an operational model for backup and recovery procedures. There are two different types of backups for RDS: automated backups and manual DB snapshots.
By using a combination of both techniques, we can design a backup recovery model to protect application data.
An automated backup is an Amazon RDS feature that continuously tracks changes and backs up our database. Automated backups are enabled by default. The retention period can be set between 1 and 35 days, and within that period we can recover our database to any point in time.
Automated backups take a full daily snapshot and also store transaction logs throughout the day. When we do a recovery, AWS first chooses the most recent daily backup and then applies the transaction logs up to the requested point in time.
The backup data is stored in S3, and we get free backup storage equal to the size of the database. We should be aware that when we delete a DB Instance, all automated backup snapshots are deleted and cannot be recovered. Manual snapshots, however, are not deleted.
Automated backups occur daily during a configurable 30-minute window called the backup window. Automated backups are kept for a configurable number of days, called the backup retention period.
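The backup window, the retention period, and a point-in-time recovery can be sketched with the AWS CLI as follows. The 7-day retention, the 03:00-03:30 UTC window, the target name cloudvikas-restored, and the restore timestamp are illustrative assumptions, and AWS_CMD is a hypothetical dry-run switch (the commands are only printed unless AWS_CMD=aws is set).

```shell
#!/bin/sh
# AWS_CMD is a hypothetical dry-run switch for this sketch:
# unset prints the command, AWS_CMD=aws executes it.

# Set the retention period (days) and the daily backup window (UTC, hh:mm-hh:mm).
set_backup_policy() {
  ${AWS_CMD:-echo aws} rds modify-db-instance \
    --db-instance-identifier "$1" \
    --backup-retention-period "$2" \
    --preferred-backup-window "$3" \
    --apply-immediately
}

# Restore into a NEW instance at a chosen time inside the retention period.
restore_to_time() {
  ${AWS_CMD:-echo aws} rds restore-db-instance-to-point-in-time \
    --source-db-instance-identifier "$1" \
    --target-db-instance-identifier "$2" \
    --restore-time "$3"
}

# Assumed example values.
set_backup_policy cloudvikas 7 03:00-03:30
restore_to_time cloudvikas cloudvikas-restored 2024-01-15T10:30:00Z
```

The restore always targets a new instance identifier; the source instance is left untouched.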
Manual DB Snapshots
We can perform manual DB snapshots at any time. A DB snapshot is initiated by us and can be created as frequently as we want. We can then restore the DB Instance to the state captured in the DB snapshot. DB snapshots can also be created with the Amazon RDS console.
Amazon RDS lets us recover our database quickly, whether we use automated backups or manual DB snapshots. A new DB Instance is created when we restore. As soon as the restore is complete, we should associate any custom DB parameter groups or security groups used by the instance from which we restored.
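The snapshot, restore, and re-association steps described above can be sketched with the AWS CLI. The snapshot name, the restored instance name, the parameter group name my-params, and the security group ID are hypothetical placeholders, and AWS_CMD is a hypothetical dry-run switch (commands are only printed unless AWS_CMD=aws is set).

```shell
#!/bin/sh
# AWS_CMD is a hypothetical dry-run switch for this sketch:
# unset prints the command, AWS_CMD=aws executes it.

# Take a manual snapshot: $1 = source instance, $2 = snapshot name.
create_snapshot() {
  ${AWS_CMD:-echo aws} rds create-db-snapshot \
    --db-instance-identifier "$1" \
    --db-snapshot-identifier "$2"
}

# Restore a snapshot into a NEW instance: $1 = snapshot name, $2 = new instance.
restore_snapshot() {
  ${AWS_CMD:-echo aws} rds restore-db-instance-from-db-snapshot \
    --db-instance-identifier "$2" \
    --db-snapshot-identifier "$1"
}

# Re-attach the custom parameter group and security group after the restore.
reattach_groups() {
  ${AWS_CMD:-echo aws} rds modify-db-instance \
    --db-instance-identifier "$1" \
    --db-parameter-group-name "$2" \
    --vpc-security-group-ids "$3" \
    --apply-immediately
}

# Placeholder names and IDs, for illustration only.
create_snapshot cloudvikas cloudvikas-snap-1
restore_snapshot cloudvikas-snap-1 cloudvikas-restored
reattach_groups cloudvikas-restored my-params sg-0123456789abcdef0
```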
High Availability with Multi-AZ
One of the most powerful features of Amazon RDS is Multi-AZ deployments, which allows you to create a database cluster across multiple Availability Zones. Multi-AZ allows you to place a secondary copy of your database in another Availability Zone for disaster recovery purposes. When you create a Multi-AZ DB Instance, a primary instance is created in one Availability Zone and a secondary instance is created in another Availability Zone.
Amazon RDS automatically performs a failover in the event of any of the following:
Loss of availability in primary Availability Zone
Loss of network connectivity to primary database
Compute unit failure on primary database
Storage failure on primary database
Now let’s put this into practice with a lab session.
Step 1: Navigate to Databases and click on the Actions link. We can see multiple actions that can be performed on the created database.
Step 2: Click on Modify and turn on Multi-AZ deployment.
Next, click on Continue. Here we can see a Potential performance impact message; select Apply immediately. Now click on the Modify DB instance button.
Step 3: Now navigate to the database and click on the Configuration tab:
We can see that the Multi-AZ option is set to Yes.
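The same change can be sketched from the AWS CLI, together with a forced failover to observe the standby taking over. This assumes the lab instance cloudvikas; AWS_CMD is a hypothetical dry-run switch, so the commands are only printed unless AWS_CMD=aws is set.

```shell
#!/bin/sh
# AWS_CMD is a hypothetical dry-run switch for this sketch:
# unset prints the command, AWS_CMD=aws executes it.

# Turn on Multi-AZ for an existing instance (CLI counterpart of Step 2).
enable_multi_az() {
  ${AWS_CMD:-echo aws} rds modify-db-instance \
    --db-instance-identifier "$1" \
    --multi-az \
    --apply-immediately
}

# Reboot with failover so the secondary becomes the primary.
# Only meaningful once the instance is a Multi-AZ deployment.
force_failover() {
  ${AWS_CMD:-echo aws} rds reboot-db-instance \
    --db-instance-identifier "$1" \
    --force-failover
}

enable_multi_az cloudvikas
force_failover cloudvikas
```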
We learned the following points about RDS:
RDS runs on virtual machines.
However, you cannot log in to these operating systems.
Patching of the RDS operating system and DB is Amazon’s responsibility.
RDS is NOT serverless.
Aurora Serverless IS serverless.
Read Replicas:
Can be Multi-AZ.
Used to increase performance.
Must have backups turned on.
Can be in different regions.
Can be Aurora or MySQL.
Can be promoted to master; this will break the Read Replica.
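Creating and promoting a Read Replica can be sketched with the AWS CLI. The replica name cloudvikas-replica is a hypothetical placeholder, and AWS_CMD is a hypothetical dry-run switch (commands are only printed unless AWS_CMD=aws is set). The source instance needs automated backups turned on, as noted in the list above.

```shell
#!/bin/sh
# AWS_CMD is a hypothetical dry-run switch for this sketch:
# unset prints the command, AWS_CMD=aws executes it.

# Create a replica: $1 = source instance, $2 = new replica identifier.
create_read_replica() {
  ${AWS_CMD:-echo aws} rds create-db-instance-read-replica \
    --db-instance-identifier "$2" \
    --source-db-instance-identifier "$1"
}

# Promote the replica to a standalone instance; replication stops.
promote_replica() {
  ${AWS_CMD:-echo aws} rds promote-read-replica \
    --db-instance-identifier "$1"
}

# Placeholder replica name, for illustration only.
create_read_replica cloudvikas cloudvikas-replica
promote_replica cloudvikas-replica
```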
How will you create an RDS Subnet Group through the AWS CLI?
create-db-subnet-group
--db-subnet-group-name <value>
--db-subnet-group-description <value>
--subnet-ids <value>
[--tags <value>]
[--cli-input-json <value>]
[--generate-cli-skeleton <value>]
Example:
aws rds create-db-subnet-group \
  --db-subnet-group-name cloudvikas \
  --db-subnet-group-description "cloudvikas subnet group" \
  --subnet-ids $Subnet1ID $Subnet2ID
Explain a few points about AWS Availability Zones.
- In AWS, each region has many availability zones
(usually 3, min is 2, max is 6). Example:
- Each availability zone (AZ) is one or more discrete data centers with redundant power, networking, and connectivity
- They’re separate from each other
- They’re isolated from disasters.
- They’re connected with high-bandwidth, ultra-low-latency networking.
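To see which Availability Zones a region offers, a sketch along these lines may help. The region us-east-1 is an assumed example, and AWS_CMD is a hypothetical dry-run switch (the command is only printed unless AWS_CMD=aws is set).

```shell
#!/bin/sh
# AWS_CMD is a hypothetical dry-run switch for this sketch:
# unset prints the command, AWS_CMD=aws executes it.

# List the zone names of one region as plain text.
list_zones() {
  ${AWS_CMD:-echo aws} ec2 describe-availability-zones \
    --region "$1" \
    --query 'AvailabilityZones[].ZoneName' \
    --output text
}

# Assumed example region.
list_zones us-east-1
```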
How to create an RDS Parameter Group using the AWS CLI?
aws rds create-db-cluster-parameter-group \
  --db-cluster-parameter-group-name cloudvikas \
  --db-parameter-group-family aurora-postgresql10 \
  --description "cloudvikas DB Cluster parameter group"
How to create a VPC security group for the database?
DBSecurityGroupId=$(aws ec2 create-security-group \
  --group-name AWScloudvikas \
  --description "Aurora Serverless vikas Security Group" \
  --vpc-id $VPCId \
  --output text --query GroupId)
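A newly created security group has no inbound rules, so the database is not reachable until one is added. The following sketch opens the PostgreSQL port 5432 to an address range; the group ID and the CIDR 10.0.0.0/16 are hypothetical placeholders, and AWS_CMD is a hypothetical dry-run switch (the command is only printed unless AWS_CMD=aws is set).

```shell
#!/bin/sh
# AWS_CMD is a hypothetical dry-run switch for this sketch:
# unset prints the command, AWS_CMD=aws executes it.

# Allow inbound PostgreSQL traffic: $1 = security group ID, $2 = source CIDR.
allow_postgres_from() {
  ${AWS_CMD:-echo aws} ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port 5432 \
    --cidr "$2"
}

# Placeholder group ID and VPC address range, for illustration only.
allow_postgres_from sg-0123456789abcdef0 10.0.0.0/16
```

In practice the first argument would be the group ID captured when the security group was created.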
How to create a database cluster using the AWS CLI?
aws rds create-db-cluster \
  --db-cluster-identifier cloudvikasdb \
  --engine aurora-postgresql \
  --engine-mode serverless \
  --engine-version 10.16 \
  --db-cluster-parameter-group-name cloudvikas \
  --master-username user \
  --master-user-password $MasterPassword \
  --db-subnet-group-name cloudvikas \
  --vpc-security-group-ids $DBSecurityGroupId
How to delete the RDS database cluster?
aws rds delete-db-cluster \
  --db-cluster-identifier cloudvikasdb \
  --skip-final-snapshot
How to delete the RDS Subnet Group?
aws rds delete-db-subnet-group \
  --db-subnet-group-name cloudvikas
How to delete the security group for the database?
aws ec2 delete-security-group \
  --group-id $DBSecurityGroupId