
Cloudera Certified Administrator for Apache Hadoop (CCAH) Questions and Answers

Cloudera Certified Administrator for Apache Hadoop (CCAH)

Last Update: May 19, 2024
Total Questions: 60

We are offering free CCA-500 Cloudera exam questions. Simply sign up with your details, practice the free CCA-500 exam questions, and then move on to the complete pool of Cloudera Certified Administrator for Apache Hadoop (CCAH) test questions for further preparation.

CCA-500 PDF: $35 (regular price $99.99)
CCA-500 Testing Engine: $42 (regular price $119.99)
CCA-500 PDF + Testing Engine: $56 (regular price $159.99)
Question 1

Which two are features of Hadoop’s rack topology? (Choose two)

Options:

A. Configuration of rack awareness is accomplished using a configuration file. You cannot use a rack topology script.
B. Hadoop gives preference to intra-rack data transfer in order to conserve bandwidth.
C. Rack location is considered in the HDFS block placement policy.
D. HDFS is rack aware, but the MapReduce daemons are not.
E. Even for small clusters on a single rack, configuring rack awareness will improve performance.

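For context on how rack awareness is set up in practice: in Hadoop 2 / CDH 5 it is typically enabled by pointing core-site.xml at a rack topology script. The property name below is the standard Apache Hadoop one; the script path, IP ranges, and rack names are illustrative only.

    <!-- core-site.xml -->
    <property>
      <name>net.topology.script.file.name</name>
      <value>/etc/hadoop/conf/topology.sh</value>
    </property>

    #!/bin/bash
    # topology.sh: receives one or more DataNode IPs/hostnames as arguments
    # and must print one rack path per argument
    for node in "$@"; do
      case "$node" in
        10.1.1.*) echo "/rack1" ;;
        10.1.2.*) echo "/rack2" ;;
        *)        echo "/default-rack" ;;
      esac
    done
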
Question 2

You use the hadoop fs -put command to add a file “sales.txt” to HDFS. This file is small enough that it fits into a single block, which is replicated to three nodes in your cluster (with a replication factor of 3). One of the nodes holding this file (a single block) fails. How will the cluster handle the replication of the file in this situation?

Options:

A. The file will remain under-replicated until the administrator brings that node back online.
B. The cluster will re-replicate the file the next time the system administrator reboots the NameNode daemon (as long as the file’s replication factor doesn’t fall below).
C. The file will be immediately re-replicated, and all other HDFS operations on the cluster will halt until the cluster’s replication values are restored.
D. The file will be re-replicated automatically after the NameNode determines it is under-replicated, based on the block reports it receives from the DataNodes.

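As background, you can observe replication state from the command line. A quick sketch, assuming the file was uploaded to an illustrative path /user/train/sales.txt:

    # copy the file into HDFS
    hadoop fs -put sales.txt /user/train/sales.txt

    # print the replication factor recorded for the file (%r)
    hdfs dfs -stat %r /user/train/sales.txt

    # list the blocks and the DataNodes holding each replica; after a DataNode
    # failure, the NameNode schedules new replicas once it notices the block
    # is under-replicated
    hdfs fsck /user/train/sales.txt -files -blocks -locations
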
Question 3

You are working on a project where you need to chain together MapReduce and Pig jobs. You also need the ability to use forks, decision points, and path joins. Which ecosystem project should you use to perform these actions?

Options:

A. Oozie
B. ZooKeeper
C. HBase
D. Sqoop
E. HUE

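For context, this is the kind of control flow Oozie expresses. Below is a minimal workflow skeleton, assuming made-up action names and a hypothetical Pig script called etl.pig; decision nodes can be added in the same way:

    <workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.4">
      <start to="fork-jobs"/>

      <!-- run the MapReduce step and the Pig step in parallel -->
      <fork name="fork-jobs">
        <path start="mr-step"/>
        <path start="pig-step"/>
      </fork>

      <action name="mr-step">
        <map-reduce>
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <configuration>
            <property><name>mapreduce.job.queuename</name><value>default</value></property>
          </configuration>
        </map-reduce>
        <ok to="join-jobs"/>
        <error to="fail"/>
      </action>

      <action name="pig-step">
        <pig>
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <script>etl.pig</script>
        </pig>
        <ok to="join-jobs"/>
        <error to="fail"/>
      </action>

      <!-- wait for both paths before finishing -->
      <join name="join-jobs" to="end"/>

      <kill name="fail"><message>Workflow failed</message></kill>
      <end name="end"/>
    </workflow-app>
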
Question 4

A slave node in your cluster has four hard drives installed (4 x 2TB). The DataNode is configured to store HDFS blocks on all disks. You set the value of the dfs.datanode.du.reserved parameter to 100 GB. How does this alter HDFS block storage?

Options:

A. 25 GB on each hard drive may not be used to store HDFS blocks.
B. 100 GB on each hard drive may not be used to store HDFS blocks.
C. All hard drives may be used to store HDFS blocks as long as at least 100 GB in total is available on the node.
D. A maximum of 100 GB on each hard drive may be used to store HDFS blocks.

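For reference, dfs.datanode.du.reserved is set in hdfs-site.xml and is applied per DataNode volume, with the value given in bytes. A minimal sketch:

    <!-- hdfs-site.xml -->
    <property>
      <name>dfs.datanode.du.reserved</name>
      <!-- 100 GB (in bytes) reserved for non-HDFS use on each volume -->
      <value>107374182400</value>
    </property>
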
Question 5

Which command does Hadoop offer to discover missing or corrupt HDFS data?

Options:

A. hdfs fs -du
B. hdfs fsck
C. dskchk
D. The map-only checksum
E. Hadoop does not provide any tools to discover missing or corrupt data; there is no need, because three replicas are kept for each data block.

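As a point of reference, hdfs fsck is the usual way to check block health. A quick sketch of typical usage (the directory path is illustrative):

    # scan the whole namespace and summarize missing, corrupt, and under-replicated blocks
    hdfs fsck /

    # scan one directory and list the files that contain corrupt blocks
    hdfs fsck /user/train -list-corruptfileblocks
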
Question 6

Which three are reasons why you should run the HDFS balancer periodically? (Choose three)

Options:

A. To ensure that there is capacity in HDFS for additional data.
B. To ensure that all blocks in the cluster are 128 MB in size.
C. To help HDFS deliver consistent performance under heavy loads.
D. To ensure that there is consistent disk utilization across the DataNodes.
E. To improve data locality for MapReduce.

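For context, the balancer is typically run from the command line (or as a scheduled service in Cloudera Manager). A minimal sketch; the threshold value is only an example:

    # move blocks until each DataNode's utilization is within 10 percentage points
    # of the cluster-wide average utilization
    hdfs balancer -threshold 10
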
Question 7

You’re upgrading a Hadoop cluster from HDFS and MapReduce version 1 (MRv1) to one running HDFS and MapReduce version 2 (MRv2) on YARN. You want to set and enforce a block size of 128 MB for all new files written to the cluster after the upgrade. What should you do?

Options:

A. You cannot enforce this, since client code can always override this value.
B. Set dfs.block.size to 128M on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final.
C. Set dfs.block.size to 128M on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode.
D. Set dfs.block.size to 134217728 on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final.
E. Set dfs.block.size to 134217728 on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode.

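For reference, a property can be locked against being overridden by later configuration (for example, per-job client settings) by marking it final in hdfs-site.xml. A minimal sketch; in Hadoop 2 the preferred property name is dfs.blocksize, with dfs.block.size kept as a deprecated alias:

    <!-- hdfs-site.xml -->
    <property>
      <name>dfs.blocksize</name>
      <value>134217728</value>   <!-- 128 MB expressed in bytes -->
      <final>true</final>        <!-- later configuration resources cannot override this value -->
    </property>
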
Question 8

On a cluster running CDH 5.0 or above, you use the hadoop fs -put command to write a 300 MB file into a previously empty directory using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of this file, what would another user see when they look in the directory?

Options:

A. The directory will appear to be empty until the entire file write is completed on the cluster.
B. They will see the file with a ._COPYING_ extension on its name. If they view the file, they will see the contents of the file up to the last completed block (as each 64 MB block is written, that block becomes available).
C. They will see the file with a ._COPYING_ extension on its name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster.
D. They will see the file with its original name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster.

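As background, hadoop fs -put writes to a temporary name and renames the file when the copy completes. A quick sketch of how you might observe this, with illustrative file and directory names:

    # terminal 1: start copying a large local file into HDFS
    hadoop fs -put big_file.dat /data/

    # terminal 2: list the target directory while the copy is still in flight
    hadoop fs -ls /data/
    # expect an entry like /data/big_file.dat._COPYING_ until -put finishes,
    # at which point it is renamed to /data/big_file.dat
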
Question 9

You observe that the number of spilled records from Map tasks far exceeds the number of map output records. Your child heap size is 1 GB and your io.sort.mb value is set to 1000 MB. How would you tune your io.sort.mb value to achieve the maximum memory-to-disk I/O ratio?

Options:

A. For a 1 GB child heap size, an io.sort.mb of 128 MB will always maximize memory-to-disk I/O.
B. Increase the io.sort.mb to 1 GB.
C. Decrease the io.sort.mb value to 0.
D. Tune the io.sort.mb value until you observe that the number of spilled records equals (or is as close as possible to) the number of map output records.

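For context, the map-side sort buffer is a per-task setting; in MRv2 the preferred name is mapreduce.task.io.sort.mb (io.sort.mb is the older MRv1 name). A minimal sketch of adjusting it and then comparing the relevant job counters; the buffer size and job ID are placeholders:

    <!-- mapred-site.xml, or set per job -->
    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>256</value>   <!-- sort buffer in MB; must fit comfortably inside the map task heap -->
    </property>

    # after a run, compare these two counters; ideally Spilled Records is close
    # to Map output records
    mapred job -counter <job_id> org.apache.hadoop.mapreduce.TaskCounter SPILLED_RECORDS
    mapred job -counter <job_id> org.apache.hadoop.mapreduce.TaskCounter MAP_OUTPUT_RECORDS
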