Hadoop Questions and Answers Part-11

1. Hadoop comes with a set of ________ for data I/O.
a) methods
b) commands
c) classes
d) none of the mentioned

Answer: d
Explanation: Hadoop I/O consists of primitives for serialization and deserialization, so none of the listed options fills the blank.
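As a rough illustration of what a serialization primitive does, the sketch below round-trips a key and a value through a byte stream. It is a toy model in Python, not Hadoop's API: the helper names (write_int, read_text, roundtrip) are hypothetical, and the fixed 4-byte length prefix for text is a simplification chosen for the example.

```python
import io
import struct

def write_int(stream, value):
    # 4-byte big-endian signed integer
    stream.write(struct.pack(">i", value))

def read_int(stream):
    return struct.unpack(">i", stream.read(4))[0]

def write_text(stream, text):
    data = text.encode("utf-8")
    write_int(stream, len(data))  # assumption: fixed-width length prefix, kept simple for the sketch
    stream.write(data)

def read_text(stream):
    length = read_int(stream)
    return stream.read(length).decode("utf-8")

def roundtrip(key, value):
    # Serialize a (text key, int value) pair, then deserialize it again.
    out = io.BytesIO()
    write_text(out, key)
    write_int(out, value)
    raw = out.getvalue()
    inp = io.BytesIO(raw)
    return raw, (read_text(inp), read_int(inp))
```

Calling roundtrip("hadoop", 11) returns the raw bytes together with the decoded pair, which should equal the pair that was written.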

2. Point out the correct statement.
a) The sequence file also can contain a “secondary” key-value list that can be used as file Metadata
b) SequenceFile formats share a header that contains some information which allows the reader to recognize its format
c) The header contains the key and value class names, which allow the reader to instantiate those classes, via reflection, for reading
d) All of the mentioned

Answer: d
Explanation: All three statements are true. In contrast with other persistent key-value data structures like B-Trees, a SequenceFile does not let you seek to a specified key to edit, add, or remove it.

3. Apache Hadoop ___________ provides a persistent data structure for binary key-value pairs.
a) GetFile
b) SequenceFile
c) PutFile
d) All of the mentioned

Answer: b
Explanation: SequenceFile is append-only.

4. How many formats of SequenceFile are present in Hadoop I/O?
a) 2
b) 3
c) 4
d) 5

Answer: b
Explanation: SequenceFile has 3 available formats: an “Uncompressed” format, a “Record Compressed” format and a “Block-Compressed” format.

5. Point out the wrong statement.
a) The data file contains all the key, value records but key N + 1 must be greater than or equal to the key N
b) Sequence file is a kind of Hadoop file-based data structure
c) Map file type is splittable as it contains a sync point after several records
d) None of the mentioned

Answer: c
Explanation: Sync points are what make a sequence file splittable. A map file is also a Hadoop file-based data structure; it differs from a sequence file in that its records must be stored in key order.

6. Which of the following formats is the most compression-aggressive?
a) Partition Compressed
b) Record Compressed
c) Block-Compressed
d) Uncompressed

Answer: c
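Explanation: Block compression compresses the keys and values of many records together, so it typically achieves a better ratio than record compression, which compresses each record's value on its own. Separately, the SequenceFile metadata key-value list can be just a Text/Text pair, and it is written to the file when the SequenceFile writer is initialized.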
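The gap between record and block compression can be sketched with a small size comparison. This is a toy model in Python, not Hadoop code: zlib stands in for whatever codec is configured, the function names are hypothetical, and headers, length fields, and sync markers are ignored.

```python
import zlib

def uncompressed_size(records):
    return sum(len(k) + len(v) for k, v in records)

def record_compressed_size(records):
    # Record compression: each value is compressed on its own, keys stay as they are.
    return sum(len(k) + len(zlib.compress(v)) for k, v in records)

def block_compressed_size(records):
    # Block compression: keys of many records are compressed together,
    # and so are the values, so repetition across records can be exploited.
    keys = b"".join(k for k, _ in records)
    values = b"".join(v for _, v in records)
    return len(zlib.compress(keys)) + len(zlib.compress(values))

# Sample data (made up for the example): many small, similar records.
records = [(b"key-%05d" % i, b"user=alice;action=login;status=ok") for i in range(200)]
```

Comparing block_compressed_size(records) with record_compressed_size(records) on data like this shows the block variant coming out smaller, because short values compressed one at a time give the compressor very little to work with.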

7. The __________ is a directory that contains two SequenceFiles.
a) ReduceFile
b) MapperFile
c) MapFile
d) None of the mentioned

Answer: c
Explanation: The two SequenceFiles are the data file (“/data”) and the index file (“/index”).

8. The ______ file is populated with the key and a LongWritable that contains the starting byte position of the record.
a) Array
b) Index
c) Immutable
d) All of the mentioned

Answer: b
Explanation: The index doesn’t contain all the keys, just a fraction of them.
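How a sorted data file and a sparse index work together can be sketched as follows. This is a toy in-memory model in Python, not the Hadoop MapFile API: the class name ToyMapFile, the index_interval parameter, and the use of list positions in place of byte offsets are all assumptions made for the example.

```python
import bisect

class ToyMapFile:
    def __init__(self, index_interval=3):
        self.index_interval = index_interval
        self.data = []    # stands in for the data file: (key, value) records in key order
        self.index = []   # stands in for the index file: (key, position) for every Nth record
        self.last_key = None

    def append(self, key, value):
        # Key N + 1 must be greater than or equal to key N.
        if self.last_key is not None and key < self.last_key:
            raise ValueError("keys must be appended in non-decreasing order")
        position = len(self.data)  # a list slot here; a starting byte position in a real file
        if position % self.index_interval == 0:
            self.index.append((key, position))  # only a fraction of the keys is indexed
        self.data.append((key, value))
        self.last_key = key

    def get(self, key):
        # Find the last indexed key <= target, then scan the data forward from there.
        index_keys = [k for k, _ in self.index]
        slot = bisect.bisect_right(index_keys, key) - 1
        if slot < 0:
            return None
        _, position = self.index[slot]
        for k, v in self.data[position:]:
            if k == key:
                return v
            if k > key:
                break
        return None
```

A lookup therefore touches the small index first and reads only a short run of the data, which is the reason for keeping the index sparse.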

9. The _________ has just the value field, append(value), and the key is a LongWritable that contains the record number, count + 1.
a) SetFile
b) ArrayFile
c) BloomMapFile
d) None of the mentioned

Answer: b
Explanation: The SetFile, instead of append(key, value), has just the key field, append(key), and the value is always the NullWritable instance.
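The relationship between the two variants can be sketched as thin wrappers over a key-value file. This is a toy Python model with hypothetical class names, not Hadoop's classes: None stands in for the NullWritable instance, and starting the record counter at 0 is an assumption of the sketch.

```python
class ToyKeyValueFile:
    """Minimal stand-in for a sorted key-value file."""
    def __init__(self):
        self.records = []

    def append(self, key, value):
        if self.records and key < self.records[-1][0]:
            raise ValueError("keys must be appended in non-decreasing order")
        self.records.append((key, value))

class ToyArrayFile(ToyKeyValueFile):
    """Caller supplies only the value; the key is the running record number."""
    def __init__(self):
        super().__init__()
        self.count = 0  # assumption: numbering starts at 0 in this sketch

    def append(self, value):
        super().append(self.count, value)
        self.count += 1

class ToySetFile(ToyKeyValueFile):
    """Caller supplies only the key; the value is always a null placeholder."""
    def append(self, key):
        super().append(key, None)  # None plays the role of NullWritable

    def contains(self, key):
        return any(k == key for k, _ in self.records)
```

Appending "a" and then "b" to a ToyArrayFile stores them under consecutive record numbers, while a ToySetFile keeps only the keys.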

10. The ____________ data file is based on the Avro serialization framework, which was primarily created for Hadoop.
a) Oozie
b) Avro
c) cTakes
d) Lucene

Answer: b
Explanation: Avro is a splittable data format with a metadata section at the beginning, followed by a sequence of Avro-serialized objects.