AWS Interview Question-3

If a table is frequently used in join queries, which distribution style would you choose for it in Redshift?

Answer : KEY. A distribution key is a column used to determine the database partition in which a particular row of data is stored. A distribution key is defined on a table using the CREATE TABLE statement; the columns of the unique or primary key are typically used as the distribution key.

Which method can be used to disable automated snapshots in Redshift?

Answer : Set the automated snapshot retention period to 0

What is the default retention period for a Kinesis stream?

Answer : 24 hours (1 day)

What is DynamoDB?

DynamoDB is a non-relational database for applications that need performance at any scale.

  • NoSQL managed database service
  • Supports both key-value and document data model
  • It’s really fast
    • Consistent responsiveness
    • Single-digit millisecond latency
  • Unlimited throughput and storage
  • Automatic scaling up or down
  • Handles trillions of requests per day
  • ACID transaction support
  • On-demand backups and point-in-time recovery
  • Encryption at rest
  • Data is replicated across multiple Availability Zones
  • Service-level agreement (SLA) of up to 99.999%
What are the non-relational Databases?

Non-relational databases are NoSQL databases.
These databases are categorized into four groups:

  • Key-value stores
  • Graph stores
  • Column stores
  • Document stores
List the Data Types supported by DynamoDB?

DynamoDB supports four scalar data types, and they are:

  • Number
  • String
  • Binary
  • Boolean

DynamoDB supports collection data types such as:

  • Number Set
  • String Set
  • Binary Set
  • Heterogeneous List
  • Heterogeneous Map

DynamoDB also supports Null values.
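As a quick sketch of how these types appear in practice, a boto3 put_item call might look like this (the table name 'Demo' and all attribute names are hypothetical):

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Demo')  # hypothetical table name

table.put_item(Item={
    'pk': 'user#1',                    # String
    'age': 30,                         # Number
    'photo': b'\x89PNG',               # Binary
    'active': True,                    # Boolean
    'nickname': None,                  # Null
    'tags': {'red', 'blue'},           # String Set
    'scores': {1, 2, 3},               # Number Set
    'history': [1, 'two', False],      # Heterogeneous List
    'address': {'city': 'Pune', 'zip': '411001'}  # Heterogeneous Map
})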

List the APIs provided by Amazon DynamoDB?
  • CreateTable
  • UpdateTable
  • DeleteTable
  • DescribeTable
  • ListTables
  • PutItem
  • BatchWriteItem
  • UpdateItem
  • DeleteItem
  • GetItem
  • BatchGetItem
  • Query
  • Scan
What are global secondary indexes?

An index with a partition key or a partition-and-sort key that is different from those on the base table is called a global secondary index.

List the types of secondary indexes supported by Amazon DynamoDB?
  • Global secondary index – An index with a partition key or a partition-and-sort key that is different from those on the table. It is considered “global” because queries on the index can span all the items in the table, across all partitions.
  • Local secondary index – An index that has the same partition key as the table but a different sort key. It is considered “local” because every partition of the index is scoped to a table partition that has the same partition key.
How many global secondary indexes can you create per table?

By default, you can create 20 global secondary indexes per table (this limit was previously 5).

Where Does DynamoDB Fit In?

Amazon Relational Database Service (RDS)

Support for Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and SQL Server

Amazon DynamoDB

Key-value and document database

Amazon ElastiCache

Managed, Redis- or Memcached-compatible in-memory data store

Amazon Neptune

Graph database for applications that work with highly connected data sets

Amazon Redshift

Petabyte-scale data warehouse service

Amazon QLDB

Ledger database providing a cryptographically verifiable transaction log

Amazon DocumentDB

MongoDB-compatible database service

Explain Partitions and Data Distribution.

DynamoDB stores data in partitions. A partition is an allocation of storage for a table, backed by solid-state drives (SSDs) and automatically replicated across multiple Availability Zones within an AWS Region.

To get the most out of DynamoDB throughput, create tables where the partition key has a large number of distinct values. Applications should request values fairly uniformly and as randomly as possible.

Table: A collection of data. DynamoDB tables must have a name, a primary key, and (in provisioned mode) the required read and write throughput values. Unlimited size.

Partition Key: A simple primary key, composed of one attribute known as the partition key. This is also called the hash attribute.

Partition and Sort Key: Also known as a composite primary key, this type of key comprises two attributes. The first attribute is the partition key, and the second attribute is the sort key, also called the range attribute.
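A minimal boto3 sketch of creating a table with a composite primary key (the table name 'Orders', attribute names, and throughput values are illustrative assumptions):

import boto3

client = boto3.client('dynamodb')

client.create_table(
    TableName='Orders',  # hypothetical
    AttributeDefinitions=[
        {'AttributeName': 'CustomerId', 'AttributeType': 'S'},
        {'AttributeName': 'OrderDate', 'AttributeType': 'S'},
    ],
    KeySchema=[
        {'AttributeName': 'CustomerId', 'KeyType': 'HASH'},   # partition key
        {'AttributeName': 'OrderDate', 'KeyType': 'RANGE'},   # sort key
    ],
    ProvisionedThroughput={'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}
)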

Explain DynamoDB Performance?

On-Demand Capacity: The database scales according to demand.

Good for:

  • New tables with unknown workloads
  • Applications with unpredictable traffic
  • When you prefer to pay as you go

Provisioned Capacity

  • Allows us to have consistent and predictable performance
  • Specify expected read and write throughput requirements
  • Read Capacity Units (RCU)
  • Write Capacity Units (WCU)
  • Price is determined by provisioned capacity
  • Cheaper per request than On-Demand mode
  • Good option for:
    • Applications with predictable traffic
    • Applications whose traffic is consistent or ramps gradually
    • Workloads whose capacity requirements can be forecasted, helping to control costs

Both capacity modes have a limit of 40,000 RCUs and 40,000 WCUs.

You can switch between modes only once per 24 hours.
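Switching between the two modes is a single UpdateTable call; a sketch, assuming a table named 'Orders':

import boto3

client = boto3.client('dynamodb')

# Switch to on-demand mode (allowed once per 24 hours)
client.update_table(TableName='Orders', BillingMode='PAY_PER_REQUEST')

# Switch back to provisioned mode, supplying throughput values
client.update_table(
    TableName='Orders',
    BillingMode='PROVISIONED',
    ProvisionedThroughput={'ReadCapacityUnits': 100, 'WriteCapacityUnits': 50}
)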

DynamoDB can be monitored with Amazon CloudWatch, which lets you:

  • Track metrics (data points over time)
  • Create dashboards
  • Create alarms
  • Create rules for events
  • View logs

DynamoDB Metrics

  • ConsumedReadCapacityUnits
  • ConsumedWriteCapacityUnits
  • ProvisionedReadCapacityUnits
  • ProvisionedWriteCapacityUnits
  • ReadThrottleEvents
  • SuccessfulRequestLatency
  • SystemErrors
  • ThrottledRequests
  • UserErrors
  • WriteThrottleEvents

Alarms can be created on metrics, taking an action if the alarm is triggered.

Alarms have three states:

  • INSUFFICIENT: Not enough data to judge the state — alarms often start in this state.
  • ALARM: The alarm threshold has been breached (e.g., > 90% CPU).
  • OK: The threshold has not been breached.

Alarms have a number of key components:

  • Metric: The data points over time being measured
  • Threshold: Exceeding this is bad (static or anomaly)
  • Period: How long the threshold should be bad before an alarm is generated
  • Action: What to do when an alarm triggers
  • SNS
  • Auto Scaling
  • EC2
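As a sketch, an alarm on one of the DynamoDB metrics listed earlier could be created with boto3 like this (the alarm name, table name, threshold, and SNS topic ARN are assumptions):

import boto3

cloudwatch = boto3.client('cloudwatch')

cloudwatch.put_metric_alarm(
    AlarmName='orders-read-throttles',           # hypothetical name
    Namespace='AWS/DynamoDB',
    MetricName='ReadThrottleEvents',
    Dimensions=[{'Name': 'TableName', 'Value': 'Orders'}],
    Statistic='Sum',
    Period=300,                                  # evaluate in 5-minute periods
    EvaluationPeriods=1,
    Threshold=10,                                # assumed threshold
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:ops-alerts']  # hypothetical SNS topic
)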

Explain the terminology below.

Provisioned Throughput

The maximum amount of capacity an application can consume from a table or index. Requests that exceed it are throttled with a ProvisionedThroughputExceededException.
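A sketch of catching that exception with boto3 (table and key values are illustrative):

import boto3
from botocore.exceptions import ClientError

table = boto3.resource('dynamodb').Table('Orders')  # hypothetical

try:
    table.get_item(Key={'CustomerId': 'c1', 'OrderDate': '2020-01-01'})
except ClientError as e:
    if e.response['Error']['Code'] == 'ProvisionedThroughputExceededException':
        print('Throttled: request exceeded provisioned throughput')  # back off and retry
    else:
        raise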

Eventually vs. Strongly Consistent Read

Eventually consistent reads might include stale data.

Strongly consistent reads are always up to date but are subject to network delays.

Read Capacity Units (RCUs)

One RCU represents one strongly consistent read request per second, or two eventually consistent read requests, for an item up to 4 KB in size.

Filtered query or scan results consume full read capacity.

For an 8 KB item size:

  • 2 RCUs for one strongly consistent read
  • 1 RCU for an eventually consistent read
  • 4 RCUs for a transactional read

Write vs. Transactional Write

Standard writes are eventually consistent, typically within one second or less.

One WCU represents one write per second for an item up to 1 KB in size. Transactional write requests require 2 WCUs for items up to 1 KB.

For a 3 KB item:

  • Standard write: 3 WCUs
  • Transactional write: 6 WCUs
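These sizing rules can be verified with a little arithmetic; a sketch in plain Python (not an AWS API):

import math

def rcus(item_kb, reads_per_sec, mode='strong'):
    units = math.ceil(item_kb / 4)                    # reads are billed in 4 KB units
    if mode == 'eventual':
        return math.ceil(units * reads_per_sec / 2)   # eventually consistent costs half
    if mode == 'transactional':
        return units * reads_per_sec * 2              # transactional costs double
    return units * reads_per_sec                      # strongly consistent

def wcus(item_kb, writes_per_sec, transactional=False):
    units = math.ceil(item_kb)                        # writes are billed in 1 KB units
    return units * writes_per_sec * (2 if transactional else 1)

print(rcus(8, 1))                   # 2 RCUs for one strongly consistent 8 KB read
print(rcus(8, 1, 'eventual'))       # 1 RCU
print(rcus(8, 1, 'transactional'))  # 4 RCUs
print(wcus(3, 1))                   # 3 WCUs for a standard 3 KB write
print(wcus(3, 1, True))             # 6 WCUs for a transactional 3 KB write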

Explain Scan
  • Returns all items and attributes for a given table
  • Filtering results does not reduce RCU consumption; discarded data still consumes capacity
  • Eventually consistent by default, but the ConsistentRead parameter can enable strongly consistent scans
  • Limit the number of items returned
  • A single scan returns a result set that fits within 1 MB
  • Pagination can be used to retrieve more than 1 MB (see the sketch below)
  • Parallel scans can be used to improve performance
  • Prefer query over scan when possible; occasional real-world use is okay
  • If you are repeatedly using scans to filter on the same non-PK/SK attribute, consider creating a secondary index
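A paginated scan sketch in boto3 (the table name is an assumption):

import boto3

table = boto3.resource('dynamodb').Table('Orders')  # hypothetical

items = []
resp = table.scan()                                 # first page (up to 1 MB)
items.extend(resp['Items'])
while 'LastEvaluatedKey' in resp:                   # keep paginating past 1 MB
    resp = table.scan(ExclusiveStartKey=resp['LastEvaluatedKey'])
    items.extend(resp['Items'])
print(len(items))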
Explain Query
  • Find items based on primary key values
  • Query is limited to the PK, PK+SK, or secondary indexes
  • Requires PK attribute
  • Returns all items with that PK value
  • Optional SK attribute and comparison operator to refine results
  • Filtering results does not reduce RCU consumption; discarded data still consumes capacity
  • Eventually consistent by default, but the ConsistentRead parameter can enable strongly consistent queries
  • Querying a partition only scans that one partition
  • Limit the number of items returned
  • A single query returns a result set that fits within 1 MB
  • Pagination can be used to retrieve more than 1 MB (see the sketch below)
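A minimal query sketch in boto3 (table and key names are illustrative):

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource('dynamodb').Table('Orders')  # hypothetical

resp = table.query(
    KeyConditionExpression=Key('CustomerId').eq('c1') &
                           Key('OrderDate').begins_with('2020-'),  # optional SK refinement
    ConsistentRead=True,   # opt in to strongly consistent reads
    Limit=25               # cap the number of items returned
)
for item in resp['Items']:
    print(item)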
Explain BatchGetItem.
  • Returns attributes for multiple items from multiple tables
  • Request using primary key
  • Returns up to 16 MB of data, up to 100 items
  • Get unprocessed items exceeding limits via UnprocessedKeys
  • Eventually consistent by default, but the ConsistentRead parameter can enable strongly consistent reads
  • Retrieves items in parallel to minimize latency
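A BatchGetItem sketch in boto3 (the table name and keys are assumptions):

import boto3

dynamodb = boto3.resource('dynamodb')

resp = dynamodb.batch_get_item(RequestItems={
    'Orders': {  # hypothetical table
        'Keys': [
            {'CustomerId': 'c1', 'OrderDate': '2020-01-01'},
            {'CustomerId': 'c2', 'OrderDate': '2020-01-02'},
        ]
    }
})
print(resp['Responses']['Orders'])
if resp['UnprocessedKeys']:        # anything over the 16 MB / 100-item limits
    print('Retry:', resp['UnprocessedKeys'])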
Explain BatchWriteItem
  • Puts or deletes multiple items in multiple tables
  • Writes up to 16 MB of data, up to 25 put or delete requests
  • Get unprocessed items exceeding limits via UnprocessedItems
  • Conditions are not supported for performance reasons
  • Threading may be used to write items in parallel
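A sketch using the table's batch_writer helper, which groups writes into BatchWriteItem calls and retries unprocessed items automatically (the table name and items are assumptions):

import boto3

table = boto3.resource('dynamodb').Table('Orders')  # hypothetical

with table.batch_writer() as batch:
    for i in range(100):
        batch.put_item(Item={'CustomerId': 'c{}'.format(i), 'OrderDate': '2020-01-01'})
    batch.delete_item(Key={'CustomerId': 'c0', 'OrderDate': '2020-01-01'})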
Explain Provisioned Capacity
  • Minimum capacity required
  • Able to set a budget (maximum capacity)
  • Subject to throttling
  • Auto scaling available
  • Risk of underprovisioning — monitor your metrics
  • Lower price per API call
  • $0.00065 per WCU-hour (us-east-1)
  • $0.00013 per RCU-hour (us-east-1)
  • $0.25 per GB-month (first 25 GB is free)
Explain On-Demand Capacity
  • No minimum capacity: pay more per request than provisioned capacity
  • Idle tables not charged for read/write, but only for storage and backups
  • No capacity planning required — just make API calls
  • Eliminates the tradeoffs of over- or under-provisioning
  • Use on-demand for new product launches
  • Switch to provisioned once a steady state is reached
  • $1.25 per million WCUs (us-east-1)
  • $0.25 per million RCUs (us-east-1)
Explain Point-in-Time Recovery (PITR)

Helps protect your DynamoDB tables from accidental writes or deletes. You can restore your data to any point in time in the last 35 days.

  • DynamoDB maintains incremental backups of your data.
  • Point-in-time recovery is not enabled by default.
  • The latest restorable timestamp is typically five minutes in the past.

After restoring a table, you must manually set up the following on the restored table:

  • Auto scaling policies
  • AWS Identity and Access Management (IAM) policies
  • Amazon CloudWatch metrics and alarms
  • Tags
  • Stream settings
  • Time to Live (TTL) settings
  • Point-in-time recovery settings
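Since PITR is not enabled by default, here is a sketch of turning it on with boto3 (the table name is assumed):

import boto3

client = boto3.client('dynamodb')

client.update_continuous_backups(
    TableName='Orders',  # hypothetical
    PointInTimeRecoverySpecification={'PointInTimeRecoveryEnabled': True}
)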

What are partitions?
  • They are the underlying storage and processing nodes of DynamoDB
  • Initially, one table equates to one partition
  • Initially, all the data for that table is stored by that one partition
  • We don’t directly control the number of partitions
  • A partition can store 10 GB
  • A partition can handle 3000 RCU and 1000 WCU
  • So there is a capacity and performance relationship to the number of partitions. THIS IS A KEY CONCEPT
  • Design tables and applications to avoid I/O “hot spots”/“hot keys”
  • When more than 10 GB, 3,000 RCUs, or 1,000 WCUs is required, a new partition is added and the data is spread between partitions over time
  • Partitions will automatically increase
  • While there is an automatic split of data across partitions, there is no automatic decrease when load/performance reduces
  • Allocated WCU and RCU is split between partitions
  • Each partition key is:
    • Limited to 10 GB of data
    • Limited to 3,000 RCUs and 1,000 WCUs
  • Key concepts:
    • Be aware of the underlying storage infrastructure – partitions
    • Be aware of what influences the number of partitions: capacity and performance (WCU/RCU)
    • Be aware that partitions increase, but they don’t decrease
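The capacity and performance relationship can be expressed as a rough heuristic; a sketch (this is a community approximation of DynamoDB's internal behavior, not an official formula):

import math

def estimate_partitions(size_gb, rcu, wcu):
    by_capacity = math.ceil(size_gb / 10)                 # 10 GB per partition
    by_throughput = math.ceil(rcu / 3000 + wcu / 1000)    # 3000 RCU / 1000 WCU per partition
    return max(by_capacity, by_throughput, 1)

print(estimate_partitions(25, 6000, 2000))  # max(3, 4) = 4 partitions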
Explain Indexes in DynamoDB?

Without indexes, DynamoDB offers two main data retrieval operations: Scan and Query.
Indexes allow secondary representations of the data in a table and enable efficient queries on those representations.
Indexes come in two forms – Global Secondary and Local Secondary.

Explain Local Secondary Indexes (LSI)

• LSIs contain the table's partition key, a new sort key, and optional projected values
• Any data written to the table is automatically copied to its LSIs
• An LSI shares RCUs and WCUs with the table
• An LSI is a sparse index: the index only contains an item if the index sort key attribute is present in the table item (row)

• Storage and performance considerations with LSIs:
• Non-key values are not stored in an LSI by default
• If you query an attribute that is NOT projected, you are charged for the entire item cost of pulling it from the main table
• Take care when planning your LSIs and item projections – it's important

Q: What is serverless computing?

Serverless computing is a cloud-computing execution model in which the cloud provider runs the servers and dynamically manages the allocation of machine resources. It allows us to build and run applications and services without thinking about servers. With serverless computing, our application still runs on servers, but all the server management is done by AWS. AWS Lambda is the core of serverless computing on AWS: we can run our code without provisioning servers.

Which service can be used as a business analytics service to build visualizations?

Answer : Amazon QuickSight

What is the default concurrency level (number of queries that can run per queue) in Redshift?

Answer : 5

What are the examples of Columnar databases?

Answer : Amazon Redshift and Apache HBase

Which tool can be used for transferring data between Amazon S3, Hadoop, HDFS, and RDBMS databases?

Answer : Sqoop

Which command can be used to see the impact of a query on a Redshift table?

Answer : EXPLAIN

We are writing data to a Kinesis stream using the default stream settings. Every fourth day we send the data from the stream to S3. When we analyze the data in S3, only the fourth day's data is present. What is the reason for this?

Answer : Data records are accessible for a default of 24 hours from the time they are added to a stream. Since the default stream settings were used here, the earlier days' records had already expired, leaving only the fourth day's data.

Q: What is AWS Lambda?

AWS Lambda is a compute service where we can upload our code and create a Lambda function. AWS Lambda takes care of provisioning and managing the servers that we use to run the code. We don’t have to worry about operating systems, patching, scaling, etc.

Basically, we can run code without provisioning or managing servers with help of AWS Lambda. We must pay only for the compute time. There is no charge when our code is not running. With Lambda, we can run code for virtually any type of application or backend service – all with zero administration. Just upload your code and Lambda takes care of everything.

We can use Lambda in the following ways:

  • As an event-driven compute service where AWS Lambda runs our code in response to events.
  • These events could be changes to data in an Amazon S3 bucket or DynamoDB table.
  • As a compute service to run your code in response to HTTP requests using Amazon API Gateway or API calls made using AWS SDKs.
Q: When should I use AWS Lambda versus Amazon EC2?

Amazon EC2 offers flexibility, with a wide range of instance types and the option to customize the operating system, network and security settings, and the entire software stack, allowing you to easily move existing applications to the cloud.

With Amazon EC2 you are responsible for provisioning capacity, monitoring fleet health and performance, and designing for fault tolerance and scalability. 

AWS Lambda, by contrast, makes it easy to execute code in response to events, such as changes to Amazon S3 buckets, updates to an Amazon DynamoDB table, or custom events generated by your applications or devices.

With Lambda we do not have to provision our instances; Lambda performs all the operational and administrative activities on our behalf, including capacity provisioning, monitoring fleet health, applying security patches to the underlying compute resources, deploying your code, running a web service front end, and monitoring and logging your code.

What Languages Does Lambda Support?

Node.js

Java

Python

Go

PowerShell

Why Is Lambda good?

NO SERVERS!

Continuous Scaling

As we have seen, AWS Lambda is a serverless computing platform that allows us to create small functions.

After creating a function, we can configure it in the AWS console. Once it is configured, we can execute code without the need to provision servers and pay only for the resources used during execution. As many organizations move toward serverless architectures, AWS Lambda will play a big role in the future.

To understand how to write a Lambda function, we should understand what goes into one.

A Lambda function has a few requirements. The first requirement we need to satisfy is to provide a handler. The handler is the entry point for the Lambda. A Lambda function accepts JSON-formatted input and will usually return the same.

The second requirement is that we need to specify the runtime environment for the Lambda. The runtime will usually correlate directly with the language that we have selected to write function.

The final requirement is a trigger. We can configure a Lambda function to run in response to an event, such as a new file uploaded to S3, or a similar AWS event. You can also configure the Lambda to respond to requests to AWS API Gateway, or to run on a timer triggered by Amazon CloudWatch.

Use Case: We will understand the Lambda concept through a simple use case.

Let’s pass two numbers into the function, and have it return the sum, product, difference, and quotient of the two numbers.

Input

{
   "Number1": 20,
   "Number2": 10
}

Response:

{
   "Number1": 20,
   "Number2": 10,
   "Sum": 30,
   "Product": 200,
   "Difference": 10,
   "Quotient": 2.0
}
Writing Your Lambda function with Python

Step 1: Log in to your AWS Account, and navigate to the Lambda console.

Step 2: Click on Create function.

Step 3: We’ll be creating a Lambda from scratch, so select the Author from scratch option.

Enter a name for your Lambda function, select a Python runtime, and define a role for your Lambda to use.

Step 4: Click on the Create function button. Next we can see the function screen.

Click on the Select a test event drop-down and choose Configure test events.

Click on the Test button.

Step 5: Amazon provides a collection of test templates. Select Create new test event and provide the Event name, i.e. validatetwonumber.

Now we will modify the Hello World template with data of our own.

Click on Create to create the new test event.

Step 6: We can modify the code as per our need. I have changed it as below:

import json

def lambda_handler(event, context):
    number1 = event['Number1']
    number2 = event['Number2']
    sum = number1 + number2
    product = number1 * number2
    difference = abs(number1 - number2)
    quotient = number1 / number2
    return {
        "Number1": number1,
        "Number2": number2,
        "Sum": sum,
        "Product": product,
        "Difference": difference,
        "Quotient": quotient
    }

I have put the same code into the Lambda function and saved it.

Now click on the Test button to execute.


After execution, we can see the results in the console.

This is an example through which we can understand a Lambda function and its uses.

Important Points:

• Lambda scales out (not up) automatically

• Lambda functions are independent; 1 event = 1 function

• Lambda is serverless

• Lambda functions can trigger other Lambda functions, so 1 event can trigger x functions if functions trigger other functions

Boto3 and Lambda functions using Python

As per the Boto3 documentation, Boto is the Amazon Web Services (AWS) SDK for Python. It enables Python developers to create, configure, and manage AWS services, such as EC2 and S3. Boto provides an easy-to-use, object-oriented API, as well as low-level access to AWS services.

First, create an IAM role with programmatic access:

How to create an EC2 instance using boto3?
import boto3

client = boto3.client('ec2')

# Launch one t2.micro instance from the given AMI
resp = client.run_instances(ImageId='ami-4652ca39',
                            InstanceType='t2.micro',
                            MinCount=1,
                            MaxCount=1)

# Print the IDs of the launched instances
for instance in resp['Instances']:
    print(instance['InstanceId'])
How to start an EC2 instance using boto3?
import boto3
client = boto3.client('ec2')
client.start_instances(InstanceIds=['i-dfghjkghjvbn6788'])
How to stop an EC2 instance using boto3?
import boto3
client = boto3.client('ec2')
client.stop_instances(InstanceIds=['i-dfghjkghjvbn6788'])

How to terminate an EC2 instance?

import boto3
client = boto3.client('ec2')

resp = client.terminate_instances(InstanceIds=['i-0158ab7a03bb6a954'])

for instance in resp['TerminatingInstances']:
    print("The instance with id {} Terminated".format(instance['InstanceId']))

How will you describe EC2 instances?
import boto3

client = boto3.client('ec2')

# Describe all EC2 instances (no filters)
resp = client.describe_instances()

for reservation in resp['Reservations']:
    for instance in reservation['Instances']:
        print("InstanceId is {} ".format(instance['InstanceId']))
How will you use filter while describing EC2 instance?
import boto3

client = boto3.client('ec2')

resp = client.describe_instances(Filters=[{
    'Name': 'tag:Env',
    'Values': ['Prod']
}])

for reservation in resp['Reservations']:
    for instance in reservation['Instances']:
        print("InstanceId is {} ".format(instance['InstanceId']))

It will return the EC2 instances that match the filter conditions.

How will you stop all running instances using Boto3?
import boto3

ec2 = boto3.resource('ec2')

# Filter the running instances, then stop them
instances = ec2.instances.filter(Filters=[{
    'Name': 'instance-state-name',
    'Values': ['running']
}])
instances.stop()
How will you find out instance type and Instance ids for all EC2 instances?
import boto3

ec2 = boto3.resource('ec2')
for instance in ec2.instances.all():
    print('Instance id is {} and Instance type is {}'.format(instance.instance_id, instance.instance_type))
How will you find out instance type and Instance ids for all EC2 instances available in any specific zone?
import boto3

ec2 = boto3.resource('ec2')
for instance in ec2.instances.filter(Filters=[
    {
        'Name': 'availability-zone',
        'Values': ['us-east-1d']
    }
]):
    print('Instance id is {} and Instance type is {}'.format(instance.instance_id, instance.instance_type))
How will you find out those instances which have a tag named backup?
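Answer (a sketch, assuming the instances carry a tag whose key is 'backup'):

import boto3

client = boto3.client('ec2')

# 'tag-key' matches instances that have a tag with the given key
resp = client.describe_instances(Filters=[{
    'Name': 'tag-key',
    'Values': ['backup']
}])

for reservation in resp['Reservations']:
    for instance in reservation['Instances']:
        print("InstanceId is {} ".format(instance['InstanceId']))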
How will you delete EBS snapshots which are older than 15 days?
from datetime import datetime, timedelta, timezone

import boto3
ec2 = boto3.resource('ec2')

# List(ec2.Snapshot)
snapshots1 = ec2.snapshots.filter(OwnerIds=['self'])

for snapshot in snapshots1:
    start_time = snapshot.start_time
    delete_time = datetime.now(tz=timezone.utc) - timedelta(days=15)
    if delete_time > start_time:
        snapshot.delete()
        print('Snapshot with Id = {} is deleted '.format(snapshot.snapshot_id))
How will you migrate AMIs to different regions using Boto3?
import boto3

##########################
## Part-1 Create Images ##
##########################

source_region = 'ap-south-1'  # region where the source instance and images live
ec2 = boto3.resource('ec2', region_name=source_region)


instances = ec2.instances.filter(InstanceIds=['i-0067eeaab6c81c'])

image_ids = []

for instance in instances:
    image = instance.create_image(Name='Demo Boto - '+instance.id, Description='Demo Boto'+instance.id)
    image_ids.append(image.id)

print("Images to be copied {} ".format(image_ids))


#############################################
## Part-2 Wait For Images to be available  ##
#############################################
# Get waiter for image_available

client = boto3.client('ec2', region_name=source_region)
waiter = client.get_waiter('image_available')

# Wait for Images to be ready
waiter.wait(Filters=[{
    'Name': 'image-id',
    'Values': image_ids
}])

##########################################
## Part-3 Copy Images to other regions  ##
##########################################

# Copy Images to the region, us-east-1

destination_region = 'us-east-1'
client = boto3.client('ec2', region_name=destination_region)
for image_id in image_ids:
    client.copy_image(Name='Boto3 Copy'+image_id, SourceImageId=image_id, SourceRegion='ap-south-1')



Question: What must be configured with a Lambda function to enable Lambda@Edge?

Ans – A CloudFront trigger

Question: Which types of CloudFront CDNs can be deployed?

Ans – Web

RTMP

Bootstrap scripts automate EC2 deployment and configuration; with bootstrap scripting, such tasks become easier. Let's walk through a practical example to understand their use.

Example: Suppose a user wants to create one EC2 instance, and when the EC2 IP address (HTML URL) is opened in a browser, a "Hello World" message should be displayed.

Step 1: Create an EC2 instance and navigate to the Configure Instance Details tab.

Select an IAM role with admin access.

Step 2: Navigate to Advanced Details and enter the commands to execute. Each command will be executed in order; to display something on the page, add an echo command, as below.

sudo su

yum update -y

yum install httpd -y

service httpd start

chkconfig httpd on

echo "<html><h1>Hello World</h1></html>" > /var/www/html/index.html

The echo line writes the page to the Apache document root, so the website URL (IP address) will display "Hello World".

Step 3: Complete the remaining steps and launch the EC2 instance. Once the IP address is available, open the URL in a browser.

It will show:

Hello world.

***********************************************************************

Important Points:

If you want to get information about your EC2 instance, then type below curl command:

curl http://169.254.169.254/latest/meta-data/
curl http://169.254.169.254/latest/user-data/

*************************************************************************************

What are Placement Groups in AWS?

This specifies a placement group in which to launch EC2 instances. The strategy of the placement group determines how the instances are organized within the group.

What are different types of Placement Groups?

Cluster placement group

Spread placement group

Partition placement group

What is cluster placement group?

A cluster placement group is a grouping of instances within a single Availability Zone. Cluster placement groups are recommended for applications that need low network latency, high network throughput, or both.

Only certain instance types can be launched into a cluster placement group.

What is Spread placement group?

A spread placement group is a group of instances that are each placed on distinct hardware. Spread placement groups are recommended for applications that have a small number of critical instances that should be kept separate from each other. Think of it as INDIVIDUAL instances.

 What is partition placement group?

A partition placement group places groups of instances in different partitions, where instances in one partition do not share the same hardware with instances in another partition.

When using partition placement groups, Amazon EC2 divides each group into logical segments called partitions. Amazon EC2 ensures that each partition within a placement group has its own set of racks. Each rack has its own network and power source. No two partitions within a placement group share the same racks, allowing you to isolate the impact of hardware failure within your application.

Think of it as MULTIPLE instances.
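A sketch of creating a partition placement group with boto3 (the group name and partition count are assumptions):

import boto3

client = boto3.client('ec2')

client.create_placement_group(
    GroupName='demo-partition-group',  # hypothetical name
    Strategy='partition',              # or 'cluster' / 'spread'
    PartitionCount=3                   # partitions for this group (max 7 per AZ)
)

Instances are then launched into the group by passing Placement={'GroupName': 'demo-partition-group'} to run_instances.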


Explain AWS Snowball.
  • AWS Snowball is a solution from Amazon Web Services designed to move large amounts of data into the AWS cloud.
  • It uses offline data migration.
  • We create a job in the AWS Management Console.
  • Snowball is a secure, tamper-proof storage appliance used to house the data, which is then physically shipped back to an AWS data center to move that data into the cloud.
  • We install the Snowball client on an on-premises workstation, which is used to copy or transfer the data from our on-premises environment to the Snowball appliance plugged into the local area network.
  • You could also have multiple workstations with the Snowball client installed if you have multiple Snowballs and want to increase throughput.
  • A single Snowball appliance can handle between 50 and 80 terabytes of data, so depending on your local network environment, you could potentially fill one up within a single day.
  • Depending on your Internet connectivity, transferring that amount of data into the AWS cloud over the Internet could take months or even years. If you have less than 10 petabytes of on-premises data, you should use one or more AWS Snowball appliances.
  • If you have more than 10 petabytes, look at AWS Snowmobile, which is literally a shipping container full of storage appliances, intended for larger organizations with enormous data centers.
Snowball Jobs

AWS Snowball is used when you need to move large amounts of data from on-premises into the AWS cloud.

Essentially, Amazon Web Services will ship you a secured rugged storage device that you populate and then send back to the data center and they copy it into S3.

Question: When should AWS Snowball be used?

Ans- PB or TB of on-premises data that needs to be placed in AWS

Question: Which statements regarding AWS Snowball jobs are correct?

Ans – KMS can be used to protect data at rest

A shipping address must be specified

What is DNS?
  • The Domain Name Service or DNS, is a name resolution service on a TCP/IP network.
  • It operates at the application layer, defining how applications running on different systems pass messages to each other.
  • DNS stands for Domain Name System.
  • DNS provides a mapping between the name of a host(on the network) and its address.
  • DNS is required for the functioning of the internet.
  • DNS is a service that translates the domain name into IP addresses.
  • It applies to both IP version 4 and IP version 6.
  • A DNS zone is related to a DNS domain name. Example: cloudvikas.com.
  • The DNS zone refers to the configuration of the
    • DNS records, or
    • DNS domain name and
    • The DNS server that has control over those records within that zone.

What is TTL or The time to live in DNS?

DNS TTL (time to live) is a setting that tells the DNS resolver how long to cache a query before requesting a new one. The record is stored in the cache of the recursive or local resolver for the TTL period before the resolver reaches back out to collect new, updated details.

  • Example: If a client queries its configured DNS server to resolve a name to an IP address then a server does a successful name resolution result, it will cache it for a period of time. That period of time is called the TTL or the time to live.
  • We recommend a TTL of 24 hours (86,400 seconds). However, if you are planning to make DNS changes, you can lower the TTL to 5 minutes (300 seconds) at least 24 hours in advance of making the changes. 

Suppose you need to allow inbound DNS client queries to a VPC subnet. Which port should you allow in the Network ACL rule?

Ans- 53

Question: Which type of DNS record routing rule allows sending a percentage of traffic to a specific host?

Ans – Weighted

Question: You are registering a new DNS domain through Route 53. What must you supply when registering the domain?

Ans – Contact details

Question: Which records exist automatically in a new hosted DNS zone?

Ans – NS

SOA

Question: Which of the following statements is correct? Choose two.

Security group rules have a priority number

Security groups are associated with EC2 instances

Network ACL rules have a priority number

Network ACLs are associated with subnets

Ans – Network ACL rules have a priority number

Network ACLs are associated with subnets

Question: You are using the AWS management console to create a new Network ACL. What must the ACL be associated with?

Ans – VPC

Question: You have created a network ACL. You now need to create ACL rules using the CLI. Which command should you use?

Ans – aws ec2 create-network-acl-entry

Question: Which PowerShell statement is used to create a Network ACL?

Ans – New-EC2NetworkAcl -VpcId

Question: Which AWS objects can Elastic IPs be associated with?

Ans – Instance

Network interface

Question: You are using the AWS management console to create a new Security Group. What must the security group be associated with?

Ans – VPC

Question: Which CLI command is used to list AWS Security Groups?

Ans – aws ec2 describe-security-groups

Question: we need to allow port 3389 traffic to pass into an EC2 instance. Which PowerShell cmdlet should we use to modify the security group associated with the instance?

Ans – Grant-EC2SecurityGroupIngress

Question: Which term best describes the role of an AWS Internet Gateway?

Ans – Pass-through

Question: You have created an Internet Gateway in VPC1, yet EC2 instances in VPC1 subnets cannot reach the Internet. What should you do?

Ans – Add a route from the subnets

Question: Which term best describes the role of an AWS NAT Gateway?

Ans – Proxy

Question: Which two items must a new NAT gateway be associated with?

Ans – Elastic IP

Subnet

Question: What service does CloudFormation offer to AWS customers?

Ans – Infrastructure as code

Question: What is a fast content delivery networking (CDN) service offered by Amazon Web Services?

Ans – CloudFront

Question: What is a human and machine-readable data interchange method commonly used with IAM managed policies, S3 bucket policies, and CloudFormation infrastructure as code?

Ans – JSON