Hadoop Questions and Answers Part-14

1. Apache _______ is a serialization framework that produces data in a compact binary format.
a) Oozie
b) Impala
c) Kafka
d) Avro

Answer: d
Explanation: Apache Avro serializes data in a compact binary format and doesn’t require proxy objects or code generation.
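To make "compact binary format" concrete, here is a minimal sketch of how a small record could be laid out, assuming the encoding rules from the Avro specification: ints and longs as zig-zag variable-length integers, strings as a length followed by UTF-8 bytes, and record fields written in schema order with no field names. The helper names and the two-field record (a model string and a count) are hypothetical, and this is not the Avro library itself.

```python
def zigzag_varint(n: int) -> bytes:
    """Encode a signed 64-bit integer as a zig-zag variable-length integer."""
    # Zig-zag maps small magnitudes (positive or negative) to small unsigned values.
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    # Emit 7 bits per byte; the high bit marks "more bytes follow".
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its byte length (zig-zag varint) plus UTF-8 bytes."""
    data = s.encode("utf-8")
    return zigzag_varint(len(data)) + data


def encode_model_count(model: str, count: int) -> bytes:
    """Hypothetical record: fields are concatenated in schema order, without names."""
    return encode_string(model) + zigzag_varint(count)


encoded = encode_model_count("A1", 3)
```

Under these assumptions the whole record takes four bytes, because the field names live in the schema rather than in each serialized record.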

2. Point out the correct statement.
a) Apache Avro is a framework that allows you to serialize data in a format that has a schema built in
b) The serialized data is in a compact binary format that doesn’t require proxy objects or code generation
c) Including schemas with the Avro messages allows any application to deserialize the data
d) All of the mentioned

Answer: d
Explanation: Instead of using generated proxy libraries and strong typing, Avro relies heavily on the schemas that are sent along with the serialized data.

3. Avro schemas describe the format of the message and are defined using ______________
a) JSON
b) XML
c) JS
d) All of the mentioned

Answer: a
Explanation: Avro schemas are written in JSON, and the schema content is typically stored in a file.
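As an illustration, a record schema is ordinary JSON, so any JSON parser can read it. The sketch below assumes a hypothetical record named Automobile with two fields; the record name, namespace, and field names are invented for the example.

```python
import json

# Hypothetical schema; in practice this text would live in its own schema file.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Automobile",
  "namespace": "example.avro",
  "fields": [
    {"name": "modelName", "type": "string"},
    {"name": "year", "type": "int"}
  ]
}
"""

schema = json.loads(SCHEMA_JSON)

# Map each field name to its declared type.
field_types = {field["name"]: field["type"] for field in schema["fields"]}
```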

4. The ____________ decodes each record while the DataFileReader iterates through the file and returns objects using the next() method.
a) DatReader
b) DatumReader
c) DatumRead
d) None of the mentioned

Answer: b
Explanation: The DataFileReader iterates over the file and uses the DatumReader to decode each record it returns.
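The division of labor between a file reader and a datum reader can be sketched with a plain-Python analogue. This is not Avro's API or its container file format: the class RecordFileReader, the helper write_records, and the 4-byte length prefix are all assumptions made for the illustration.

```python
import io
import struct


class RecordFileReader:
    """Iterates over length-prefixed records; decoding is delegated to datum_reader."""

    def __init__(self, stream, datum_reader):
        self._stream = stream
        self._datum_reader = datum_reader  # callable: bytes -> object

    def __iter__(self):
        return self

    def __next__(self):
        header = self._stream.read(4)
        if len(header) < 4:
            raise StopIteration  # end of file
        (size,) = struct.unpack(">I", header)
        return self._datum_reader(self._stream.read(size))


def write_records(records):
    """Write each string as a 4-byte big-endian length followed by UTF-8 bytes."""
    out = io.BytesIO()
    for record in records:
        payload = record.encode("utf-8")
        out.write(struct.pack(">I", len(payload)))
        out.write(payload)
    return out.getvalue()


data = write_records(["sedan", "coupe"])
reader = RecordFileReader(io.BytesIO(data), lambda raw: raw.decode("utf-8"))
first = next(reader)       # one object per call to next()
remaining = list(reader)   # the iterator continues where it left off
```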

5. Point out the wrong statement.
a) Java code is used to deserialize the contents of the file into objects
b) Avro allows you to use complex data structures within Hadoop MapReduce jobs
c) The m2e plugin automatically downloads the newly added JAR files and their dependencies
d) None of the mentioned

Answer: d
Explanation: All three statements are correct. When deserializing with Java code, a unit test is useful because you can make assertions to verify that the values of the deserialized object are the same as the original values.

6. The ____________ class extends and implements several Hadoop-supplied interfaces.
a) AvroReducer
b) Mapper
c) AvroMapper
d) None of the mentioned

Answer: c
Explanation: AvroMapper provides the ability to collect, or map, data.

7. ____________ class accepts the values that the ModelCountMapper object has collected.
a) AvroReducer
b) Mapper
c) AvroMapper
d) None of the mentioned

Answer: a
Explanation: AvroReducer loops through the collected values and summarizes them.

8. The ________ method in the ModelCountReducer class “reduces” the values the mapper collects into a derived value.
a) count
b) add
c) reduce
d) All of the mentioned

Answer: c
Explanation: In some cases, the derived value can be a simple sum of the values.
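The "simple sum" case can be sketched in a few lines. This is a pure-Python stand-in for the map and reduce steps, not the Hadoop or Avro MapReduce API; the function names and the modelName field are hypothetical.

```python
from collections import defaultdict


def map_record(record):
    """Map step: emit a (key, 1) pair for each record."""
    return record["modelName"], 1


def reduce_counts(key, values):
    """Reduce step: loop through the collected values and sum them."""
    total = 0
    for value in values:
        total += value
    return key, total


def run_job(records):
    """Group mapped pairs by key, then reduce each group to a derived value."""
    grouped = defaultdict(list)
    for record in records:
        key, value = map_record(record)
        grouped[key].append(value)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


counts = run_job([
    {"modelName": "A"},
    {"modelName": "B"},
    {"modelName": "A"},
])
```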

9. Which of the following works well with Avro?
a) Lucene
b) Kafka
c) MapReduce
d) None of the mentioned

Answer: c
Explanation: You can use Avro and MapReduce together to process many items serialized with Avro’s small binary format.

10. __________ tools are used to generate proxy objects in Java so that you can work with the objects easily.
a) Lucene
b) Kafka
c) MapReduce
d) Avro

Answer: d
Explanation: Avro tools can generate Java proxy classes from a schema. Avro serialization also includes the schema with the data, in JSON format, which allows you to have different versions of objects.
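How schemas allow different versions of an object to coexist can be sketched as follows. The sketch assumes the record resolution rules in the Avro specification: a reader field missing from the written data takes its default, a written field the reader does not know is ignored, and a missing field without a default is an error. The function resolve and the schema below are hypothetical, and records are modeled as plain dictionaries.

```python
# Hypothetical version 2 of a schema: "year" was added with a default value.
READER_SCHEMA_V2 = {
    "type": "record",
    "name": "Automobile",
    "fields": [
        {"name": "modelName", "type": "string"},
        {"name": "year", "type": "int", "default": 0},
    ],
}


def resolve(record, reader_schema):
    """Fit a decoded record (a dict) to the fields the reader's schema expects."""
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]  # field added after the data was written
        else:
            raise ValueError(f"no value or default for field {name!r}")
    # Fields present in the record but absent from the reader schema are dropped.
    return resolved


old_record = {"modelName": "A"}  # written before "year" existed
upgraded = resolve(old_record, READER_SCHEMA_V2)
```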