Hadoop Cluster


Question: Select the purpose of using ssh-copy-id when preparing Ubuntu for a Hadoop install.

key generation

passwordless login

Ans:- passwordless login
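
The passwordless-login setup can be sketched as follows; the `hadoop` user and `worker1` hostname are placeholders for illustration:

```shell
# Generate an RSA key pair with no passphrase on the master node.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# ssh-copy-id appends the public key to the worker's
# ~/.ssh/authorized_keys, enabling passwordless login.
ssh-copy-id hadoop@worker1

# Verify: this should run without prompting for a password.
ssh hadoop@worker1 hostname
```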

Question: Select one of the impacts of running hadoop namenode -format that we should be aware of.

Hadoop settings will be lost

Any existing data in HDFS will be lost

Ans:- Any existing data in HDFS will be lost
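
A sketch of the format step; note that newer Hadoop releases prefer the `hdfs` entry point over the older `hadoop namenode -format` spelling:

```shell
# Formats the NameNode's metadata store. WARNING: this discards the
# block mappings, so any existing data in HDFS is lost. Normally run
# only once, on a fresh cluster, before start-dfs.sh.
hdfs namenode -format
```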

Question: On which node must the workers configuration file be updated in Hadoop?

all nodes

new nodes only

Ans:- all nodes
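
The workers file lists one worker hostname per line and must agree on every node; a minimal sketch, with hypothetical hostnames:

```shell
# Write the workers file on the master ($HADOOP_HOME must be set).
cat > "$HADOOP_HOME/etc/hadoop/workers" <<'EOF'
worker1
worker2
worker3
EOF

# Distribute the same file to all nodes; scp is one simple way.
for host in worker1 worker2 worker3; do
  scp "$HADOOP_HOME/etc/hadoop/workers" "$host:$HADOOP_HOME/etc/hadoop/workers"
done
```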

Question: In deployments larger than three racks, which configuration is common for Hadoop nodes?

Worker nodes only

One master and the remaining are worker nodes

Ans:- Worker nodes only

Question: When adding a new node to an existing cluster, what command line utility can we run to verify that the new node is running Hadoop?

ssh

hadoop -uptime

Ans:- hadoop -uptime

Question: Which hadoop command do we run to copy a local file to a running HDFS?

hadoop -copyFromLocal

hadoop -copyToLocal

Ans:- hadoop -copyFromLocal
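
The full spelling of the command is `hadoop fs -copyFromLocal` (the quiz options abbreviate it); the `/input` path is an assumption for illustration:

```shell
# Create a target directory in HDFS, then copy the local file up.
hadoop fs -mkdir -p /input
hadoop fs -copyFromLocal ulysses.txt /input/

# Confirm the file is now in HDFS.
hadoop fs -ls /input
```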

Question: Which hadoop command do we run to copy a file from a hadoop HDFS to the local node?

hadoop -copyToLocal

hadoop -copyFromLocal

hadoop -cat

Ans:- hadoop -copyToLocal
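
Again, the full form is `hadoop fs -copyToLocal`, with the HDFS path first and the local destination second; paths here are placeholders:

```shell
# Copy a file out of HDFS onto the local filesystem.
hadoop fs -copyToLocal /input/ulysses.txt ./ulysses.txt

# By contrast, "hadoop fs -cat" streams the file to stdout
# without copying it locally.
hadoop fs -cat /input/ulysses.txt | head
```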

Hadoop on Amazon EMR

Question: Select the benefits of using Hadoop on Amazon EMR.

Dynamic sizing

Latest Apache Hadoop releases

Pay for what you use

Access to common Hadoop tools

Ans:-

Dynamic sizing

Pay for what you use

Access to common Hadoop tools

Question: Select the type of authentication used to connect to nodes in an EMR cluster.

Guest account access

Key-based authentication

Ans:- Key-based authentication
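
Connecting uses the EC2 key pair selected at cluster creation; the `.pem` filename and master public DNS name below are placeholders, while `hadoop` is the default login user on EMR nodes:

```shell
# The private key must not be world-readable or ssh refuses it.
chmod 400 MyKeyPair.pem

# Key-based login to the EMR master node.
ssh -i MyKeyPair.pem hadoop@ec2-xx-xx-xx-xx.compute-1.amazonaws.com
```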

Question: Select the two main elements comprising EMRFS.

Amazon S3

Amazon Redshift

HDFS

Ans:-

Amazon S3

HDFS

Question: Select the Amazon service which stores the Hadoop log files.

CloudFront

S3

Ans:- S3

Question: Select the EMR data typically stored in S3.

HDFS

Output

Logs

Input

NameNode metadata

Ans:-

Output

Logs

Input

Question: When running scripts in the Amazon EMR web console, from which tab is the MapReduce job launched?

Jobs

Steps

Ans:- Steps

Question: What is the result of the following AWS CLI command?

The file ulysses.txt is uploaded to the input folder

The contents of the input folder are overwritten by ulysses.txt

Ans:- The file ulysses.txt is uploaded to the input folder
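
A command with this effect might look like the following (the bucket name is a placeholder); `aws s3 cp` adds the object alongside existing keys rather than replacing the folder's other contents:

```shell
# Upload the local file into the bucket's input/ prefix.
aws s3 cp ulysses.txt s3://my-emr-bucket/input/

# List the prefix to confirm the upload.
aws s3 ls s3://my-emr-bucket/input/
```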

Question: Upon a successful execution of ‘aws emr add-steps’, what is returned by the command?

0

A new step id

Ans:- A new step id
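
A hedged sketch of submitting a streaming step; the cluster id, bucket, and script names are placeholders (`aggregate` is a built-in streaming reducer):

```shell
# Add a streaming MapReduce step to a running cluster.
aws emr add-steps \
  --cluster-id j-XXXXXXXXXXXXX \
  --steps 'Type=STREAMING,Name=WordCount,ActionOnFailure=CONTINUE,Args=[-files,s3://my-emr-bucket/wordcount.py,-mapper,wordcount.py,-reducer,aggregate,-input,s3://my-emr-bucket/input,-output,s3://my-emr-bucket/output]'

# On success the command prints JSON containing the new step id,
# of the form {"StepIds": ["s-..."]}.
```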

Question: Once an EMR cluster has completed, how is it finally terminated?

It must be terminated manually

It will terminate automatically once all of its HDFS data is backed up

Ans:- It must be terminated manually
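
Manual termination from the CLI can be sketched as follows (the cluster id is a placeholder):

```shell
# Find the cluster id of any active clusters.
aws emr list-clusters --active

# Terminate the cluster by id.
aws emr terminate-clusters --cluster-ids j-XXXXXXXXXXXXX

# If termination protection is enabled, disable it first.
aws emr modify-cluster-attributes \
  --cluster-id j-XXXXXXXXXXXXX --no-termination-protection
```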