DynamoDB

What is DynamoDB?

DynamoDB is a non-relational database for applications that need performance at any scale.

  • NoSQL managed database service
  • Supports both key-value and document data model
  • Fast, consistent performance
    • Consistent responsiveness
    • Single-digit millisecond latency
  • Unlimited throughput and storage
  • Automatic scaling up or down
  • Handles trillions of requests per day
  • ACID transaction support
  • On-demand backups and point-in-time recovery
  • Encryption at rest
  • Data is replicated across multiple Availability Zones
  • Service-level agreement (SLA) of up to 99.999% availability

What are non-relational databases?

Non-relational databases are commonly called NoSQL databases.
They are usually grouped into four categories:

  • Key-value stores
  • Graph stores
  • Column stores
  • Document stores

List the data types supported by DynamoDB.

DynamoDB supports four scalar data types, and they are:

  • Number
  • String
  • Binary
  • Boolean

DynamoDB supports collection data types such as:

  • Number Set
  • String Set
  • Binary Set
  • Heterogeneous List
  • Heterogeneous Map

DynamoDB also supports Null values.
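
In the low-level API, each attribute value is tagged with a type descriptor (S, N, B, BOOL, NULL, L, M, SS, NS, BS). The following is a minimal sketch of that mapping, not an SDK implementation; the helper name `to_attribute_value` is hypothetical, and real SDKs ship their own serializers.

```python
import base64
from decimal import Decimal

NUMBER_TYPES = (int, float, Decimal)

def to_attribute_value(value):
    """Convert a Python value into a DynamoDB-style typed attribute value (hypothetical helper)."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):            # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, NUMBER_TYPES):
        return {"N": str(value)}           # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": base64.b64encode(value).decode("ascii")}
    if isinstance(value, (set, frozenset)):
        if not value:
            raise ValueError("sets cannot be empty")
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, NUMBER_TYPES) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": [base64.b64encode(v).decode("ascii") for v in sorted(value)]}
        raise TypeError("set members must all be strings, numbers, or bytes")
    if isinstance(value, list):            # lists may mix types
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):            # maps may mix types
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")

print(to_attribute_value({"Day": "Monday", "UnreadEmails": 42}))
```

Sets are sorted here only to make the output deterministic; DynamoDB itself does not preserve set order.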

List the APIs provided by Amazon DynamoDB.
  • CreateTable
  • UpdateTable
  • DeleteTable
  • DescribeTable
  • ListTables
  • PutItem
  • BatchWriteItem
  • UpdateItem
  • DeleteItem
  • GetItem
  • BatchGetItem
  • Query
  • Scan

What are global secondary indexes?

A global secondary index is an index whose partition key, or partition key and sort key, can be different from those on the table.

List the types of secondary indexes supported by Amazon DynamoDB.
  • Global secondary index – An index with a partition key, or a partition key and sort key, that can be different from those on the table. It is considered “global” because queries on the index can span all the items in a table, across all the partitions.
  • Local secondary index – An index that has the same partition key as the table but a different sort key. It is considered “local” because every partition of the index is scoped to a table partition that has the same partition key.

How many global secondary indexes can you create per table?

The default quota is 20 global secondary indexes per table (the quota was 5 before it was raised). Local secondary indexes are limited to 5 per table.

Where Does DynamoDB Fit In?

  • Amazon Relational Database Service (RDS): Support for Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and SQL Server
  • Amazon DynamoDB: Key-value and document database
  • Amazon ElastiCache: Managed, Redis- or Memcached-compatible in-memory data store
  • Amazon Neptune: Graph database for applications that work with highly connected data sets
  • Amazon Redshift: Petabyte-scale data warehouse service
  • Amazon QLDB: Ledger database providing a cryptographically verifiable transaction log
  • Amazon DocumentDB: MongoDB-compatible database service

Explain Partitions and Data Distribution.

DynamoDB stores data in partitions. A partition is an allocation of storage for a table, backed by solid-state drives (SSDs) and automatically replicated across multiple Availability Zones within an AWS Region.

To get the most out of DynamoDB throughput, create tables where the partition key has a large number of distinct values. Applications should request values fairly uniformly and as randomly as possible.
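
The effect of key cardinality can be illustrated with a small simulation. This is only a sketch: DynamoDB's internal hash function is not public, so SHA-256 is used here as a stand-in, and the function names and the partition count of 4 are assumptions made for the example.

```python
import hashlib
from collections import Counter

def partition_for(key: str, partitions: int) -> int:
    """Map a partition key to a partition index using a stand-in hash (SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions

def distribution(keys, partitions=4):
    """Count how many requests land on each partition."""
    return Counter(partition_for(key, partitions) for key in keys)

# 1,000 requests spread over 1,000 distinct keys
many_keys = distribution(f"user-{i}" for i in range(1000))
# 1,000 requests that all hit the same key (a "hot key")
one_key = distribution("user-1" for _ in range(1000))

print(sorted(many_keys.items()))
print(sorted(one_key.items()))
```

With many distinct keys the requests spread over all partitions, while a single hot key sends every request to one partition regardless of how many partitions exist.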

Table: Collection of data. DynamoDB tables must have a name, a primary key, and (in provisioned mode) the required read and write throughput values. Table size is unlimited.

Partition Key: A simple primary key, composed of one attribute known as the partition key. This is also called the hash attribute.

Partition and Sort Key: Also known as a composite primary key, this type of key comprises two attributes. The first attribute is the partition key, and the second attribute is the sort key, also called the range attribute.

Explain DynamoDB Performance?

On-Demand Capacity: The database scales according to demand.

Good for:

  • New tables with unknown workloads
  • Applications with unpredictable traffic
  • Users who prefer to pay as they go

Provisioned Capacity

  • Allows us to have consistent and predictable performance
  • Specify expected read and write throughput requirements
  • Read Capacity Units (RCU)
  • Write Capacity Units (WCU)
  • Price is determined by provisioned capacity
  • Cheaper per request than On-Demand mode
  • Good option for:
    • Applications with predictable traffic
    • Applications whose traffic is consistent or ramps gradually
    • Workloads whose capacity requirements can be forecasted, helping to control costs

Both capacity modes have a default limit of 40,000 RCUs and 40,000 WCUs per table.

You can switch between modes only once per 24 hours.

Explain DynamoDB Items?

Item: A table may contain multiple items. An item is a unique group of attributes. Items are similar to rows or records in a traditional relational database. Items are limited to 400 KB.

Attribute: Fundamental data element. Similar to fields or columns in an RDBMS.

Explain Data Types.

Data Types

Scalar: Exactly one value: number, string, binary, boolean, or null. Applications must encode binary values in base64-encoded format before sending them to DynamoDB.

Document: Complex structure with nested attributes (e.g., JSON): list and map.

Document Types

List: Ordered collection of values

FavoriteThings: [“Cookies”, “Coffee”, 3.14159]

Map: Unordered collection of name-value pairs (similar to JSON)

{
    Day: "Monday",
    UnreadEmails: 42,
    ItemsOnMyDesk: [
        "Coffee Cup",
        "Telephone",
        {
            Pens: { Quantity : 3},
            Pencils: { Quantity : 2},
            Erasers: { Quantity : 1}
        }
    ]
}

Set: Multiple scalar values of the same type: string set, number set, binary set.

[“Black”, “Green”, “Red”]

[42.2, -19, 7.5, 3.14]

[“U3Vubnk=”, “UmFpbnk=”, “U25vd3k=”]
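
The binary set above holds base64-encoded values. As a small illustration using only Python's standard library, the three strings decode to the byte strings "Sunny", "Rainy", and "Snowy":

```python
import base64

weather = [b"Sunny", b"Rainy", b"Snowy"]

# Encode raw bytes to base64 text before sending them as binary values
encoded = [base64.b64encode(value).decode("ascii") for value in weather]
print(encoded)  # ['U3Vubnk=', 'UmFpbnk=', 'U25vd3k=']

# Decode the base64 text back to the original bytes after reading
decoded = [base64.b64decode(text) for text in encoded]
print(decoded)  # [b'Sunny', b'Rainy', b'Snowy']
```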

Explain DynamoDB Table.

Creating a Table

  • Table names must be unique per AWS account and Region.
    • Between 3 and 255 characters long
    • UTF-8 encoded
    • Case-sensitive
    • May contain a-z, A-Z, 0-9, _ (underscore), - (dash), and . (dot)
  • Primary key must consist of a partition key or a partition key and sort key.
    • Only string, binary, and number data types are allowed for partition or sort keys
  • Provisioned capacity mode is the default (free tier).
    • For provisioned capacity mode, read/write throughput settings are required
  • Secondary indexes: a local secondary index can be defined here.
    • Must be created at the time of table creation
    • Same partition key as the table, but a different sort key
  • Provisioned capacity is set at the table level.
    • Adjust at any time or enable auto scaling to modify them automatically
    • On-demand mode has a default upper limit of 40,000 RCU/WCU, unlike auto scaling, which can be capped manually
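
The table naming rules listed above (3 to 255 characters; letters, digits, underscore, dash, and dot) can be checked locally before calling the API. A minimal sketch, where the function name `is_valid_table_name` is an assumption made for this example:

```python
import re

# 3 to 255 characters drawn from a-z, A-Z, 0-9, underscore, dash, and dot
TABLE_NAME_RE = re.compile(r"[A-Za-z0-9_.-]{3,255}")

def is_valid_table_name(name: str) -> bool:
    """Return True if the name satisfies the length and character rules."""
    return TABLE_NAME_RE.fullmatch(name) is not None

for candidate in ["Music", "ab", "my table", "orders.v2_eu-west"]:
    print(candidate, is_valid_table_name(candidate))
```

This only checks the syntax of a name; uniqueness per account and Region can only be verified by the service.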

Create DynamoDB table

DynamoDB is a schemaless database that only requires a table name and primary key. The table’s primary key is made up of one or two attributes that uniquely identify items, partition the data, and sort data within each partition.

Explain DynamoDB Console Menu Items.

DynamoDB Console Menu Items

  • Dashboard
  • Tables
    • Storage size and item count are not updated in real time.
    • Items: Manage items and perform queries and scans.
    • Metrics: Monitor CloudWatch metrics.
    • Alarms: Manage CloudWatch alarms.
    • Capacity: Modify a table’s provisioned capacity.
      • Free tier allows 25 RCU, 25 WCU, and 25 GB for 12 months
      • Cloud Sandbox within the Cloud Playground
    • Indexes: Manage global secondary indexes.
    • Global Tables: Multi-Region, multi-master replicas
    • Backups: On-demand backups and point-in-time recovery
    • Triggers: Manage triggers to connect DynamoDB streams to Lambda functions.
    • Access control: Set up fine-grained access control with web identity federation.
    • Tags: Apply tags to your resources to help organize and identify them.
  • Backups
  • Reserved capacity
  • Preferences
  • DynamoDB Accelerator (DAX)

How can you use the AWS CLI with DynamoDB?

Installing the AWS CLI

  • Preinstalled on Amazon Linux and Amazon Linux 2
  • Cloud Sandbox within the Cloud Playground

Obtaining IAM Credentials

  • Option 1 : Create IAM access keys in your own AWS account.
  • Option 2: Use Cloud Sandbox credentials.
  • Note the access key ID and secret access key.

Configuring the AWS CLI

  • aws configure
  • aws sts get-caller-identity
  • aws dynamodb help

Using DynamoDB with the AWS CLI

  • aws dynamodb create-table
  • aws dynamodb describe-table
  • aws dynamodb put-item
  • aws dynamodb scan
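
As a rough illustration of what a full `aws dynamodb create-table` invocation looks like, the sketch below assembles the command line in Python without running it. The table and attribute names (Music, Artist, SongTitle) are placeholders, the helper `create_table_command` is hypothetical, and for simplicity both keys are given the same attribute type.

```python
import shlex

def create_table_command(table, partition_key, sort_key=None, key_type="S"):
    """Build (but do not run) an `aws dynamodb create-table` command line."""
    attributes = [f"AttributeName={partition_key},AttributeType={key_type}"]
    schema = [f"AttributeName={partition_key},KeyType=HASH"]
    if sort_key:
        attributes.append(f"AttributeName={sort_key},AttributeType={key_type}")
        schema.append(f"AttributeName={sort_key},KeyType=RANGE")
    args = [
        "aws", "dynamodb", "create-table",
        "--table-name", table,
        "--attribute-definitions", *attributes,
        "--key-schema", *schema,
        "--billing-mode", "PAY_PER_REQUEST",   # on-demand mode, so no throughput values
    ]
    return shlex.join(args)

print(create_table_command("Music", "Artist", "SongTitle"))
```

For provisioned mode, the `--billing-mode` argument would be replaced by `--provisioned-throughput` with read and write capacity values.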

Object Persistence Interface

  • Do not directly perform data plane operations
  • Map complex data types to items in a DynamoDB table
  • Create objects that represent tables and indexes
  • Define the relationships between objects in your program and the tables that store those objects
  • Call simple object methods, such as save, load, or delete
  • Available in the AWS SDKs for Java and .NET

How can we use CloudWatch with DynamoDB?

CloudWatch monitors your AWS resources in real time, providing visibility into resource utilization, application performance, and operational health.

  • Track metrics (data points over time)
  • Create dashboards
  • Create alarms
  • Create rules for events
  • View logs

DynamoDB Metrics

  • ConsumedReadCapacityUnits
  • ConsumedWriteCapacityUnits
  • ProvisionedReadCapacityUnits
  • ProvisionedWriteCapacityUnits
  • ReadThrottleEvents
  • SuccessfulRequestLatency
  • SystemErrors
  • ThrottledRequests
  • UserErrors
  • WriteThrottleEvents

Alarms can be created on metrics, taking an action if the alarm is triggered.

Alarms have three states:

  • INSUFFICIENT_DATA: Not enough data to judge the state; alarms often start in this state.
  • ALARM: The alarm threshold has been breached (e.g., > 90% CPU).
  • OK: The threshold has not been breached.

Alarms have a number of key components:

  • Metric: The data points over time being measured
  • Threshold: The value that indicates a problem when exceeded (static or anomaly-based)
  • Period: How long the threshold must be breached before an alarm is generated
  • Action: What to do when an alarm triggers
    • SNS
    • Auto Scaling
    • EC2
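
The interplay of metric, threshold, and period can be sketched as a small state function. This is a simplified model rather than CloudWatch's actual evaluation logic (which also supports "M out of N" data points and configurable handling of missing data); the function name and the default of three evaluation periods are assumptions.

```python
def alarm_state(datapoints, threshold, evaluation_periods=3):
    """Return the alarm state for the most recent data points (simplified model)."""
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"          # not enough data to judge yet
    if all(value > threshold for value in recent):
        return "ALARM"                      # threshold breached for the whole period
    return "OK"

# Consumed write capacity as a percentage of provisioned capacity (made-up values)
print(alarm_state([10], threshold=80))
print(alarm_state([50, 90, 95, 99], threshold=80))
print(alarm_state([90, 95, 70], threshold=80))
```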

Explain Below terminology.

Provisioned Throughput

Maximum amount of capacity an application can consume from a table or index. Requests that exceed it are throttled and fail with a ProvisionedThroughputExceededException.

Eventually vs. Strongly Consistent Read

Eventually consistent reads might include stale data.

Strongly consistent reads always return the most recent data, but they may have higher latency and can be unavailable during a network delay or outage.

Read Capacity Units (RCUs)

One RCU represents one strongly consistent read request per second, or two eventually consistent read requests, for an item up to 4 KB in size.

Filtered query or scan results consume full read capacity.

For an 8 KB item size:

  • 2 RCUs for one strongly consistent read
  • 1 RCU for an eventually consistent read
  • 4 RCUs for a transactional read

Write vs. Transactional Write

Writes are eventually consistent within one second or less.

One WCU represents one write per second for an item up to 1 KB in size. Transactional write requests require 2 WCUs for items up to 1 KB.

Provisioned Throughput Calculations

For example, for a 3 KB item:

  • Standard write: 3 WCUs
  • Transactional write: 6 WCUs
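
The capacity rules above reduce to rounding the item size up to the next 4 KB (reads) or 1 KB (writes) block and applying a multiplier. A minimal sketch of that arithmetic, with hypothetical function names:

```python
import math

def read_capacity_units(item_kb, mode="strong"):
    """RCUs consumed by one read per second of an item of the given size."""
    blocks = math.ceil(item_kb / 4)          # reads are measured in 4 KB blocks
    factor = {"eventual": 0.5, "strong": 1, "transactional": 2}[mode]
    return blocks * factor

def write_capacity_units(item_kb, transactional=False):
    """WCUs consumed by one write per second of an item of the given size."""
    blocks = math.ceil(item_kb)              # writes are measured in 1 KB blocks
    return blocks * (2 if transactional else 1)

print(read_capacity_units(8, "strong"))         # 2
print(read_capacity_units(8, "eventual"))       # 1.0
print(read_capacity_units(8, "transactional"))  # 4
print(write_capacity_units(3))                  # 3
print(write_capacity_units(3, transactional=True))  # 6
```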

*********************************************************************

Explain Scan
  • Returns all items and attributes for a given table
  • Filtering results do not reduce RCU consumption; they simply discard data
  • Eventually consistent by default, but the ConsistentRead parameter can enable strongly consistent scans
  • Limit the number of items returned
  • A single scan request returns results that fit within 1 MB
  • Pagination can be used to retrieve more than 1 MB
  • Parallel scans can be used to improve performance
  • Prefer query over scan when possible; occasional real-world use is okay
  • If you are repeatedly using scans to filter on the same non-PK/SK attribute, consider creating a secondary index
Explain Query
  • Find items based on primary key values
  • Queries are limited to the partition key (PK), PK plus sort key (SK), or secondary indexes
  • Requires PK attribute
  • Returns all items with that PK value
  • Optional SK attribute and comparison operator to refine results
  • Filtering results do not reduce RCU consumption; they simply discard data
  • Eventually consistent by default, but the ConsistentRead parameter can enable strongly consistent queries
  • A query reads only the one partition that holds the given PK value
  • Limit the number of items returned
  • A single query returns results that fit within 1 MB
  • Pagination can be used to retrieve more than 1 MB
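
Pagination for both Query and Scan works the same way: when a response contains `LastEvaluatedKey`, that value is passed as `ExclusiveStartKey` on the next request. The sketch below shows the loop against any callable that follows this response shape; `paginate` and `fake_scan` are names invented for the example, and the fake function stands in for a real SDK call.

```python
def paginate(fetch_page, **params):
    """Yield every item from a paginated Query or Scan style call."""
    start_key = None
    while True:
        if start_key is not None:
            params["ExclusiveStartKey"] = start_key
        page = fetch_page(**params)
        yield from page.get("Items", [])
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:               # no more pages
            break

def fake_scan(TableName, ExclusiveStartKey=None, page_size=2):
    """Stand-in for a real scan call: serves five items, two per page."""
    data = [{"id": i} for i in range(5)]
    start = 0 if ExclusiveStartKey is None else ExclusiveStartKey["id"] + 1
    page = data[start:start + page_size]
    response = {"Items": page}
    if start + page_size < len(data):
        response["LastEvaluatedKey"] = page[-1]
    return response

items = list(paginate(fake_scan, TableName="Demo"))
print(items)
```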

Explain BatchGetItem.

  • Returns attributes for multiple items from multiple tables
  • Request using primary key
  • Returns up to 16 MB of data, up to 100 items
  • Get unprocessed items exceeding limits via UnprocessedKeys
  • Eventually consistent by default, but the ConsistentRead parameter can enable strongly consistent reads
  • Retrieves items in parallel to minimize latency

Explain BatchWriteItem.

  • Puts or deletes multiple items in multiple tables
  • Writes up to 16 MB of data, up to 25 put or delete requests
  • Get unprocessed items exceeding limits via UnprocessedItems
  • Conditions are not supported for performance reasons
  • Threading may be used to write items in parallel
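
A common pattern with these limits is to split the work into chunks of 25 requests and resend whatever comes back as unprocessed. The sketch below shows that loop against a hypothetical `send_batch` callable that returns the unprocessed requests; `fake_send` is a stand-in for a real client call, and the backoff values are arbitrary.

```python
import time

def chunks(requests, size=25):
    """Split a list into slices of at most `size` requests."""
    for start in range(0, len(requests), size):
        yield requests[start:start + size]

def batch_write_all(send_batch, requests, max_attempts=5, base_delay=0.05):
    """Send all requests in batches, retrying unprocessed ones with backoff."""
    for batch in chunks(requests):
        pending = batch
        for attempt in range(max_attempts):
            pending = send_batch(pending)    # returns the unprocessed requests
            if not pending:
                break
            time.sleep(base_delay * 2 ** attempt)
        else:
            raise RuntimeError(f"{len(pending)} requests still unprocessed")

written = []

def fake_send(batch, accept=10):
    """Stand-in sender that only accepts 10 requests per call."""
    written.extend(batch[:accept])
    return batch[accept:]

batch_write_all(fake_send, list(range(60)), base_delay=0)
print(len(written))
```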

Explain Provisioned Capacity

  • Minimum capacity required
  • Able to set a budget (maximum capacity)
  • Subject to throttling
  • Auto scaling available
  • Risk of underprovisioning — monitor your metrics
  • Lower price per API call
  • $0.00065 per WCU-hour (us-east-1)
  • $0.00013 per RCU-hour (us-east-1)
  • $0.25 per GB-month (first 25 GB is free)

Explain On-Demand Capacity

  • No minimum capacity: pay more per request than provisioned capacity
  • Idle tables not charged for read/write, but only for storage and backups
  • No capacity planning required — just make API calls
  • Eliminates the tradeoffs of over- or under-provisioning
  • Use on-demand for new product launches
  • Switch to provisioned once a steady state is reached
  • $1.25 per million write request units (us-east-1)
  • $0.25 per million read request units (us-east-1)

**************************************************************

Explain Point-in-Time Recovery (PITR)

Helps protect your DynamoDB tables from accidental writes or deletes. You can restore your data to any point in time in the last 35 days.

  • DynamoDB maintains incremental backups of your data.
  • Point-in-time recovery is not enabled by default.
  • The latest restorable timestamp is typically five minutes in the past.

After restoring a table, you must manually set up the following on the restored table:

  • Auto scaling policies
  • AWS Identity and Access Management (IAM) policies
  • Amazon CloudWatch metrics and alarms
  • Tags
  • Stream settings
  • Time to Live (TTL) settings
  • Point-in-time recovery settings

**********************************************************

What are partitions?
  • They are the underlying storage and processing nodes of DynamoDB
  • Initially, one table equates to one partition
  • Initially, all the data for that table is stored by that one partition
  • We don’t directly control the number of partitions
  • A partition can store 10 GB
  • A partition can handle 3,000 RCU and 1,000 WCU
  • So there is a relationship between capacity, performance, and the number of partitions. This is a key concept.
  • Design tables and applications to avoid I/O “hot spots” or “hot keys”
  • When more than 10 GB, 3,000 RCU, or 1,000 WCU is required, a new partition is added and the data is spread between them over time.
  • The number of partitions increases automatically
  • While data is split across partitions automatically, the number of partitions does not decrease when load or performance requirements drop
  • Allocated WCU and RCU are split between partitions
  • Each partition key is:
    • Limited to 10 GB of data
    • Limited to 3,000 RCU and 1,000 WCU
  • Key concepts
    • Be aware of the underlying storage infrastructure: partitions
    • Be aware of what influences the number of partitions
      • Capacity
      • Performance (WCU / RCU)
    • Be aware that they increase, but they don’t decrease

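
Using the per-partition limits given above (10 GB, 3,000 RCU, 1,000 WCU), the number of partitions can be roughly estimated. This is a back-of-the-envelope approximation, not a documented guarantee of how the service behaves; the function names are hypothetical.

```python
import math

def estimate_partitions(size_gb, rcu, wcu):
    """Rough partition count from table size and provisioned throughput."""
    by_size = math.ceil(size_gb / 10)                   # 10 GB per partition
    by_throughput = math.ceil(rcu / 3000 + wcu / 1000)  # 3,000 RCU / 1,000 WCU per partition
    return max(by_size, by_throughput, 1)

def per_partition_capacity(size_gb, rcu, wcu):
    """Throughput each partition receives when capacity is split evenly."""
    partitions = estimate_partitions(size_gb, rcu, wcu)
    return rcu / partitions, wcu / partitions

print(estimate_partitions(8, 7500, 3000))      # 6
print(per_partition_capacity(8, 7500, 3000))   # (1250.0, 500.0)
```

The second function shows why hot keys matter: a table provisioned with 7,500 RCU may offer only about 1,250 RCU to any single partition.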
Explain Indexes in DynamoDB?

DynamoDB offers two main data retrieval operations, Scan and Query. Without indexes, efficient queries are limited to the table’s primary key.
Indexes provide secondary representations of the data in a table and allow efficient queries on those representations.
Indexes come in two forms: global secondary and local secondary.

Explain Local Secondary Indexes(LSI)

• An LSI contains the table’s partition key and sort key, the new (index) sort key, and optional projected attributes
• Any data written to the table is also copied to its LSIs, which support both eventually and strongly consistent reads
• An LSI shares RCU and WCU with the table
• An LSI is a sparse index: the index only contains an item if the index sort key attribute is present in the table item (row)

• Storage and performance considerations with LSIs
• By default, non-key attributes are not stored in an LSI
• If you query an attribute that is not projected, you are charged for the cost of fetching the entire item from the main table
• Take care when planning your LSIs and item projections; it is important
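
The sparse-index behaviour can be illustrated with plain Python data structures. This is an in-memory sketch of the concept only, not how DynamoDB stores indexes; the attribute names (CustomerId, OrderId, ShippedAt) and the function name are made up for the example.

```python
def build_sparse_index(items, partition_key, index_sort_key):
    """Group items by partition key, keeping only items that have the index sort key."""
    index = {}
    for item in items:
        if index_sort_key not in item:
            continue                         # item is absent from the sparse index
        index.setdefault(item[partition_key], []).append(item)
    for entries in index.values():
        entries.sort(key=lambda entry: entry[index_sort_key])
    return index

orders = [
    {"CustomerId": "c1", "OrderId": "o1", "ShippedAt": "2024-02-01"},
    {"CustomerId": "c1", "OrderId": "o2"},                       # not shipped yet
    {"CustomerId": "c1", "OrderId": "o3", "ShippedAt": "2024-01-15"},
    {"CustomerId": "c2", "OrderId": "o4"},                       # not shipped yet
]

shipped = build_sparse_index(orders, "CustomerId", "ShippedAt")
print([order["OrderId"] for order in shipped["c1"]])  # ['o3', 'o1']
```

Only orders with a `ShippedAt` attribute appear in the index, so a query on it naturally returns shipped orders in date order.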

Explain Global Secondary Indexes

• A GSI shares many of the same concepts as a local secondary index, but with a GSI we can have an alternative partition key and sort key
• Options for attribute projection:
• KEYS_ONLY – New partition and sort keys, plus the table’s partition key and, if applicable, sort key
• INCLUDE – Specify custom projection values
• ALL – Projects all attributes
• Unlike LSIs, which share capacity with the table, RCU and WCU are defined on the GSI itself, in the same way as on the table
• Changes are written to the GSI asynchronously
• GSIs support only eventually consistent reads

What is a DynamoDB stream ?

• When a stream is enabled on a table, it records changes to the table and stores them for 24 hours
• A stream can be enabled on a table from the console or API
• It can only be read or processed via the streams endpoint and API requests, for example:
• streams.dynamodb.us-west-2.amazonaws.com

• AWS guarantees that each change to a DynamoDB table appears in the stream once and only once
• All changes to the table appear in the stream in near real time

Example use cases:

• A Lambda function triggered when items are added to a DynamoDB stream, performing analytics on the data
• A Lambda function triggered when a new user signs up on your web app and data is entered into a users table
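
A Lambda function attached to a stream receives batches of records, each with an `eventName` of INSERT, MODIFY, or REMOVE and a `dynamodb` section holding the item images. The sketch below handles the sign-up use case; the `Email` attribute and the return value are assumptions made for illustration, and `NewImage` is only present when the stream is configured to include new images.

```python
def handler(event, context=None):
    """Collect the e-mail addresses of newly inserted users from a stream batch."""
    new_users = []
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue                          # ignore MODIFY and REMOVE events
        image = record.get("dynamodb", {}).get("NewImage", {})
        email = image.get("Email", {}).get("S")   # attribute values carry type tags
        if email:
            new_users.append(email)
    return {"new_users": new_users}

sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"UserId": {"S": "u1"}, "Email": {"S": "a@example.com"}}}},
        {"eventName": "MODIFY",
         "dynamodb": {"NewImage": {"UserId": {"S": "u1"}, "Email": {"S": "b@example.com"}}}},
        {"eventName": "REMOVE",
         "dynamodb": {"Keys": {"UserId": {"S": "u2"}}}},
    ]
}

print(handler(sample_event))
```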