Hadoop Questions and Answers Part-8

1. A ________ serves as the master and there is only one NameNode per cluster
a) Data Node
b) NameNode
c) Data block
d) Replication

Answer: b
Explanation: All HDFS metadata, including information about DataNodes, the files stored on HDFS, and replication, is stored and maintained on the NameNode.

2. Point out the correct statement.
a) DataNode is the slave/worker node and holds the user data in the form of Data Blocks
b) Each incoming file is broken into 32 MB by default
c) Data blocks are replicated across different nodes in the cluster to ensure a low degree of fault tolerance
d) None of the mentioned

Answer: a
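Explanation: DataNodes are the worker nodes that store the actual data blocks, and a Hadoop cluster can have any number of them. Option (b) is wrong because the default block size is 64 MB in Hadoop 1.x and 128 MB in Hadoop 2.x and later, and option (c) is wrong because replication ensures a high degree of fault tolerance.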
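
Option (b) concerns how an incoming file is divided into blocks. The arithmetic can be sketched as follows; this is an illustration only, assuming a 128 MB block size (the Hadoop 2.x default, configurable through dfs.blocksize). The function name split_into_blocks is hypothetical and not part of any Hadoop API.

```python
MB = 1024 * 1024

def split_into_blocks(file_size: int, block_size: int = 128 * MB) -> list[int]:
    """Return the size in bytes of each block a file of file_size bytes would occupy."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only holds the bytes that are left over.
        sizes.append(remainder)
    return sizes

blocks = split_into_blocks(300 * MB)
print(len(blocks), [b // MB for b in blocks])  # 3 [128, 128, 44]
```

Note that the final block is smaller than the configured block size; a 300 MB file does not consume three full 128 MB blocks of storage.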

3. HDFS works in a __________ fashion.
a) master-worker
b) master-slave
c) worker/slave
d) all of the mentioned

Answer: a
Explanation: The NameNode serves as the master and each DataNode serves as a worker/slave.

4. _______ NameNode is used when the Primary NameNode goes down
a) Rack
b) Data
c) Secondary
d) None of the mentioned

Answer: c
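Explanation: Secondary is the intended answer, although the name is misleading. The Secondary NameNode periodically merges the namespace image (fsimage) with the edit log to create checkpoints. It is not a hot standby, but its latest checkpoint can help recover the namespace if the primary NameNode fails.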
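
The Secondary NameNode's main job is checkpointing: merging the namespace image with the edit log. A toy model of that merge is sketched below. The data layout (a dict of path to replication factor, and a list of edit tuples) and the function name checkpoint are assumptions made for illustration; they are not Hadoop's actual on-disk formats or APIs.

```python
def checkpoint(fsimage: dict[str, int], edit_log: list[tuple]) -> dict[str, int]:
    """Apply logged namespace edits to a copy of the image and return the new image."""
    new_image = dict(fsimage)  # leave the old image untouched
    for op, path, *args in edit_log:
        if op == "create":
            new_image[path] = args[0]  # args[0] is the replication factor
        elif op == "delete":
            new_image.pop(path, None)
        elif op == "set_replication":
            if path in new_image:
                new_image[path] = args[0]
        else:
            raise ValueError(f"unknown edit operation: {op}")
    return new_image

image = {"/logs/a.txt": 3}
edits = [
    ("create", "/logs/b.txt", 3),
    ("set_replication", "/logs/a.txt", 2),
    ("delete", "/logs/b.txt"),
]
print(checkpoint(image, edits))  # {'/logs/a.txt': 2}
```

After a checkpoint the edit log can start empty again, which keeps NameNode restart time from growing with the number of past edits.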

5. Point out the wrong statement.
a) Replication Factor can be configured at a cluster level (Default is set to 3) and also at a file level
b) Block Report from each DataNode contains a list of all the blocks that are stored on that DataNode
c) User data is stored on the local file system of DataNodes
d) DataNode is aware of the files to which the blocks stored on it belong

Answer: d
Explanation: The NameNode, not the DataNode, is aware of the files to which the blocks stored on each DataNode belong. A DataNode only knows about the blocks it holds.
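
The division of knowledge between the NameNode and the DataNodes can be illustrated with a toy model. The dictionaries and the locate function below are hypothetical names chosen for this sketch, not Hadoop data structures: the file-to-block mapping lives only on the NameNode side, while each block report carries block IDs and no file names.

```python
# NameNode-side metadata: file path -> ordered list of block IDs.
namespace = {
    "/data/a.csv": ["blk_1", "blk_2"],
    "/data/b.csv": ["blk_3"],
}

# Block reports: DataNode -> set of block IDs it stores (no file names).
block_reports = {
    "dn1": {"blk_1", "blk_3"},
    "dn2": {"blk_1", "blk_2"},
    "dn3": {"blk_2", "blk_3"},
}

def locate(path: str) -> dict[str, list[str]]:
    """Return, for each block of a file, the DataNodes that reported holding it."""
    return {
        blk: sorted(dn for dn, blks in block_reports.items() if blk in blks)
        for blk in namespace[path]
    }

print(locate("/data/a.csv"))  # {'blk_1': ['dn1', 'dn2'], 'blk_2': ['dn2', 'dn3']}
```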

6. Which of the following scenarios may not be a good fit for HDFS?
a) HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
b) HDFS is suitable for storing data related to applications requiring low latency data access
c) HDFS is suitable for storing data related to applications requiring high latency data access
d) None of the mentioned

Answer: a
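Explanation: HDFS follows a write-once, read-many model and allows only one writer per file at a time, so it is a poor fit for multiple simultaneous writes to the same file. It is well suited to storing archive data, since it runs on low-cost commodity hardware while ensuring a high degree of fault tolerance.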

7. The need for data replication can arise in various scenarios like ____________
a) Replication Factor is changed
b) DataNode goes down
c) Data Blocks get corrupted
d) All of the mentioned

Answer: d
Explanation: Data is replicated across different DataNodes to ensure a high degree of fault-tolerance.
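
Two of the scenarios listed above (a DataNode going down and a change of replication factor) can be sketched as a check for under-replicated blocks. This is a simplified illustration, not the NameNode's actual algorithm; the function name under_replicated and the data layout are assumptions. A corrupted replica could be modeled the same way by removing its DataNode from the block's location set.

```python
def under_replicated(
    block_locations: dict[str, set[str]],
    live_nodes: set[str],
    replication: int,
) -> dict[str, int]:
    """Return {block_id: number of missing replicas}, counting only live DataNodes."""
    missing = {}
    for blk, nodes in block_locations.items():
        live_replicas = len(nodes & live_nodes)
        if live_replicas < replication:
            missing[blk] = replication - live_replicas
    return missing

locations = {
    "blk_1": {"dn1", "dn2", "dn3"},
    "blk_2": {"dn2", "dn3", "dn4"},
}

# All DataNodes alive, replication factor 3: nothing to do.
print(under_replicated(locations, {"dn1", "dn2", "dn3", "dn4"}, 3))  # {}
# dn4 goes down: blk_2 loses one replica.
print(under_replicated(locations, {"dn1", "dn2", "dn3"}, 3))  # {'blk_2': 1}
# Replication factor raised to 4: every block needs one more copy.
print(under_replicated(locations, {"dn1", "dn2", "dn3", "dn4"}, 4))  # {'blk_1': 1, 'blk_2': 1}
```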

8. _______ is the slave/worker node and holds the user data in the form of Data Blocks.
a) DataNode
b) NameNode
c) Data block
d) Replication

Answer: a
Explanation: A DataNode stores data in the Hadoop file system (HDFS). A functional file system has more than one DataNode, with data replicated across them.

9. HDFS is implemented in _____________ programming language.
a) C++
b) Java
c) Scala
d) None of the mentioned

Answer: b
Explanation: HDFS is implemented in Java and any computer which can run Java can host a NameNode/DataNode on it.

10. For YARN, the ___________ Manager UI provides host and port information.
a) Data Node
b) NameNode
c) Resource
d) Replication

Answer: c
Explanation: The ResourceManager is the master service in YARN, and its web UI shows host and port information for the nodes it manages.