Passing the Cloudera CCA-505 exam without help is nearly impossible in the short term. Come to Testking and find the most up-to-date, accurate, and guaranteed Cloudera CCA-505 practice questions. You will get surprising results from our most up-to-date Cloudera Certified Administrator for Apache Hadoop (CCAH) CDH5 Upgrade Exam practice guides.

2021 Sep CCA-505 exam questions

Q11. You have installed a cluster running HDFS and MapReduce version 2 (MRv2) on YARN. You have no dfs.hosts entries in your hdfs-site.xml configuration file. You configure a new worker node by setting fs.default.name in its configuration files to point to the NameNode on your cluster, and you start the DataNode daemon on that worker node.

What do you have to do on the cluster to allow the worker node to join, and start storing HDFS blocks?

A. Nothing; the worker node will automatically join the cluster when the DataNode daemon is started.

B. Without creating a dfs.hosts file or making any entries, run the command hadoop dfsadmin -refreshHadoop on the NameNode

C. Create a dfs.hosts file on the NameNode, add the worker node’s name to it, then issue the command hadoop dfsadmin -refreshNodes on the NameNode

D. Restart the NameNode

Answer: A

Explanation: Because no dfs.hosts include file is configured, the NameNode does not restrict which nodes may register; the new worker joins the cluster as soon as its DataNode daemon starts and heartbeats to the NameNode.
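
If you did want to restrict which DataNodes may join (the mechanism option C alludes to), a minimal sketch looks like this, assuming a hypothetical include-file path in hdfs-site.xml on the NameNode:

<name>dfs.hosts</name>

<value>/etc/hadoop/conf/dfs.hosts</value>

# Add the worker to the include file, then tell the NameNode to re-read it
echo "worker05.example.com" >> /etc/hadoop/conf/dfs.hosts
hadoop dfsadmin -refreshNodes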


Q12. Which is the default scheduler in YARN?

A. Fair Scheduler

B. FIFO Scheduler

C. Capacity Scheduler

D. YARN doesn’t configure a default scheduler. You must first assign an appropriate scheduler class in yarn-site.xml

Answer: C

Explanation: Reference: http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
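
The scheduler is selected via the yarn.resourcemanager.scheduler.class property in yarn-site.xml; as a sketch, the value below is the Apache Hadoop 2.x default (the Capacity Scheduler):

<name>yarn.resourcemanager.scheduler.class</name>

<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>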


Q13. You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as the network fabric. Which workloads benefit the most from a faster network fabric?

A. When your workload generates a large amount of output data, significantly larger than the amount of intermediate data

B. When your workload generates a large amount of intermediate data, on the order of the input data itself

C. When your workload consumes a large amount of input data, relative to the entire capacity of HDFS

D. When your workload consists of processor-intensive tasks

Answer: B

Explanation: Intermediate (map output) data must be shuffled across the network from mappers to reducers, so workloads that generate intermediate data on the order of their input benefit most from a faster network fabric.


Q14. Given:

[Exhibit: output of a yarn application listing, including application_1374638600275_0109 with state KILLED]

You want to clean up this list by removing jobs where the state is KILLED. Which command do you enter?

A. yarn application -kill application_1374638600275_0109

B. yarn rmadmin -refreshQueues

C. yarn application -refreshJobHistory

D. yarn rmadmin -kill application_1374638600275_0109

Answer: A

Explanation: Reference: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1-latest/bk_using-apache-hadoop/content/common_mrv2_commands.html
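
As a sketch, the related Hadoop 2.x commands (the application ID is the one from the exhibit):

# List applications currently in the KILLED state
yarn application -list -appStates KILLED

# Kill an application by ID
yarn application -kill application_1374638600275_0109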


Q15. You use the hadoop fs -put command to add a file “sales.txt” to HDFS. This file is small enough that it fits into a single block, which is replicated to three nodes in your cluster (with a replication factor of 3). One of the nodes holding this file (a single block) fails. How will the cluster handle the replication of this file in this situation?

A. The cluster will re-replicate the file the next time the system administrator reboots the NameNode daemon (as long as the file’s replication factor doesn’t fall below two)

B. This file will be immediately re-replicated and all other HDFS operations on the cluster will halt until the cluster’s replication values are restored

C. The file will remain under-replicated until the administrator brings that node back online

D. The file will be re-replicated automatically after the NameNode determines it is under replicated based on the block reports it receives from the DataNodes

Answer: D

Explanation: The NameNode learns of the lost replica from DataNode heartbeats and block reports, marks the block as under-replicated, and schedules re-replication automatically; other HDFS operations continue normally.
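
A quick sketch of how to verify the file’s replication state (the HDFS path is hypothetical):

# Put the file into HDFS
hadoop fs -put sales.txt /user/hadoop/sales.txt

# Show per-block replication and replica locations
hdfs fsck /user/hadoop/sales.txt -files -blocks -locations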


CCA-505 study guide

Updated CCA-505 exam prep:

Q16. You observe that the number of spilled records from map tasks far exceeds the number of map output records. Your child heap size is 1 GB and your io.sort.mb value is set to 100 MB. How would you tune your io.sort.mb value to achieve the maximum memory-to-disk I/O ratio?

A. Decrease the io.sort.mb value to 0

B. Increase the io.sort.mb to 1GB

C. For a 1 GB child heap size, an io.sort.mb of 128 MB will always maximize memory-to-disk I/O

D. Tune the io.sort.mb value until you observe that the number of spilled records equals (or is as close as possible to) the number of map output records

Answer: D
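
A sketch of the corresponding setting: io.sort.mb is the MRv1 name, and under MRv2 the same knob is mapreduce.task.io.sort.mb in mapred-site.xml (the value below is illustrative, not a recommendation):

<name>mapreduce.task.io.sort.mb</name>

<value>256</value>

Tune until the “Spilled Records” job counter is as close as possible to “Map output records”, which indicates a single spill per task.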


Q17. You have a 20 node Hadoop cluster, with 18 slave nodes and 2 master nodes running HDFS High Availability (HA). You want to minimize the chance of data loss in your cluster. What should you do?

A. Add another master node to increase the number of nodes running the JournalNode, which increases the number of machines available to HA to create a quorum

B. Configure the cluster’s disk drives with an appropriate fault tolerant RAID level

C. Run the ResourceManager on a different master from the NameNode in order to load-share HDFS metadata processing

D. Run a Secondary NameNode on a different master from the NameNode in order to provide automatic recovery from a NameNode failure

E. Set an HDFS replication factor that provides data redundancy, protecting against failure

Answer: E

Explanation: Block replication is what protects the data itself against node and disk failure; the ResourceManager plays no role in HDFS metadata or data durability.
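
A sketch of the replication setting in hdfs-site.xml, along with raising replication for existing data (the directory path is hypothetical):

<name>dfs.replication</name>

<value>3</value>

# Raise replication to 3 for an existing directory and wait until it completes
hdfs dfs -setrep -w 3 /important/data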


Q18. Which three basic configuration parameters must you set to migrate your cluster from MapReduce v1 (MRv1) to MapReduce v2 (MRv2)?

A. Configure the NodeManager hostname and enable services on YARN by setting the following property in yarn-site.xml:

<name>yarn.nodemanager.hostname</name>

<value>your_nodeManager_hostname</value>

B. Configure the number of map tasks per job on YARN by setting the following property in mapred-site.xml:

<name>mapreduce.job.maps</name>

<value>2</value>

C. Configure MapReduce as a framework running on YARN by setting the following property in mapred-site.xml:

<name>mapreduce.framework.name</name>

<value>yarn</value>

D. Configure the ResourceManager hostname and enable node services on YARN by setting the following property in yarn-site.xml:

<name>yarn.resourcemanager.hostname</name>

<value>your_resourceManager_hostname</value>

E. Configure a default scheduler to run on YARN by setting the following property in mapred-site.xml:

<name>mapreduce.jobtracker.taskScheduler</name>

<value>org.apache.hadoop.mapred.JobQueueTaskScheduler</value>

F. Configure the NodeManager to enable MapReduce services on YARN by adding following property in yarn-site.xml:

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

Answer: C,D,F

Explanation: An MRv2 migration requires telling MapReduce to run on YARN (mapreduce.framework.name = yarn), pointing nodes at the ResourceManager (yarn.resourcemanager.hostname), and enabling the shuffle auxiliary service on the NodeManagers (yarn.nodemanager.aux-services = mapreduce_shuffle).
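
Putting the three required properties together, as a sketch (the ResourceManager hostname is a placeholder):

mapred-site.xml:

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

yarn-site.xml:

<property>
<name>yarn.resourcemanager.hostname</name>
<value>rm.example.com</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>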


Q19. You want to understand more about how users browse your public website. For example, you want to know which pages they visit prior to placing an order. You have a server farm of 200 web servers hosting your website. Which is the most efficient process to gather these web server logs into your Hadoop cluster for analysis?

A. Sample the web server logs from the web servers and copy them into HDFS using curl

B. Ingest the server web logs into HDFS using Flume

C. Import all user clicks from your OLTP databases into Hadoop using Sqoop

D. Write a MapReduce job with the web servers as mappers and the Hadoop cluster nodes as reducers

E. Channel these clickstreams into Hadoop using Hadoop Streaming

Answer: B

Explanation: Flume is designed for exactly this pattern: continuously collecting log data from a large number of sources (such as a web server farm) and delivering it reliably into HDFS.
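
A minimal Flume agent sketch for this pattern; the agent name, log path, and NameNode address are hypothetical:

# weblog.conf - tail a web server access log into HDFS
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/httpd/access_log

a1.channels.c1.type = memory

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode.example.com:8020/weblogs
a1.sinks.k1.hdfs.fileType = DataStream

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Start it on each web server with: flume-ng agent --conf conf --conf-file weblog.conf --name a1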


Q20. Assuming a cluster running HDFS, MapReduce version 2 (MRv2) on YARN with all settings at their default, what do you need to do when adding a new slave node to a cluster?

A. Nothing, other than ensuring that DNS (or /etc/hosts files on all machines) contains an entry for the new node.

B. Restart the NameNode and ResourceManager daemons and resubmit any running jobs

C. Increase the value of dfs.number.of.nodes in hdfs-site.xml

D. Add a new entry to /etc/nodes on the NameNode host.

E. Restart the NameNode daemon.

Answer: A

Explanation: With all settings at their defaults there is no dfs.hosts include file, so the new node joins as soon as its DataNode and NodeManager daemons start and can reach the masters; no daemon restarts are required.
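
For completeness, a sketch of bringing the new node online (run on the new slave, assuming a standard Hadoop 2.x installation with the cluster configuration files already distributed to it):

# Start the HDFS worker daemon
hadoop-daemon.sh start datanode

# Start the YARN worker daemon
yarn-daemon.sh start nodemanager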