Hadoop Questions and Answers Part-30

1. The __________ abstract class has three main methods for loading data; for most use cases, extending it is sufficient.
a) Load
b) LoadFunc
c) FuncLoad
d) None of the mentioned

Answer: b
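Explanation: LoadFunc is the abstract class that most custom loaders extend. LoadFunc and StoreFunc implementations should use the Hadoop 0.20 API based classes (InputFormat/OutputFormat and related classes).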

2. Point out the correct statement.
a) LoadMeta has methods to convert byte arrays to specific types
b) The Pig load/store API is aligned with Hadoop InputFormat class only
c) LoadPushDown has methods to push operations from Pig runtime into loader implementations
d) All of the mentioned

Answer: c
Explanation: Currently only the pushProjection() method is called by Pig to communicate to the loader the exact fields that are required in the Pig script.

3. Which of the following has methods to deal with metadata?
a) LoadPushDown
b) LoadMetadata
c) LoadCaster
d) All of the mentioned

Answer: b
Explanation: Most loader implementations don’t need to implement LoadMetadata unless they interact with some metadata system.

4. The ____________ method is called by Pig in both the front end and the back end to pass a unique signature to the Loader.
a) relativeToAbsolutePath()
b) setUdfContextSignature()
c) getCacheFiles()
d) getShipFiles()

Answer: b
Explanation: The signature can be used to store into the UDFContext any information which the Loader needs to store between various method invocations in the front end and back end.
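
The idea of keying stored state by the signature can be sketched as follows. This is a minimal single-process illustration, not Pig code: ContextSketch is a hypothetical stand-in for UDFContext, SignatureLoaderSketch and its rememberColumns()/recallColumns() helpers are invented for the example, and a static map replaces the real mechanism that carries the properties from the front end to the back end.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for UDFContext: one Properties object per signature.
class ContextSketch {
    private static final Map<String, Properties> STORE = new HashMap<>();

    static Properties propertiesFor(String signature) {
        return STORE.computeIfAbsent(signature, k -> new Properties());
    }
}

// Hypothetical loader that keeps its state under the signature it was given.
class SignatureLoaderSketch {
    private String signature;

    // Plays the role of setUdfContextSignature(): remember the unique key.
    void setUdfContextSignature(String signature) {
        this.signature = signature;
    }

    // "Front end" side: store information for later invocations.
    void rememberColumns(String columns) {
        ContextSketch.propertiesFor(signature).setProperty("columns", columns);
    }

    // "Back end" side: a different instance with the same signature reads it back.
    String recallColumns() {
        return ContextSketch.propertiesFor(signature).getProperty("columns");
    }
}
```

Two loader instances given the same signature see the same stored value, while an instance with a different signature sees none, which is why the signature has to be unique per load statement.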

5. Point out the wrong statement.
a) The load/store UDFs control how data goes into Pig and comes out of Pig.
b) LoadCaster has methods to convert byte arrays to specific types.
c) The meaning of getNext() has changed and is called by Pig runtime to get the last tuple in the data
d) None of the mentioned

Answer: c
Explanation: The meaning of getNext() has not changed; the Pig runtime calls it to get the next tuple in the data.

6. ___________ returns a list of HDFS files to ship to the distributed cache.
a) relativeToAbsolutePath()
b) setUdfContextSignature()
c) getCacheFiles()
d) getShipFiles()

Answer: c
Explanation: getCacheFiles() returns a list of HDFS files to ship to the distributed cache, whereas getShipFiles() returns a list of local files to ship.

7. The loader should use the ______ method to communicate the load information to the underlying InputFormat.
a) relativeToAbsolutePath()
b) setUdfContextSignature()
c) getCacheFiles()
d) setLocation()

Answer: d
Explanation: Pig calls setLocation() to communicate the load location to the loader, and the loader should pass the same information on to the underlying InputFormat.

8. Through the ____________ method, the RecordReader associated with the InputFormat provided by the LoadFunc is passed to the LoadFunc.
a) getNext()
b) relativeToAbsolutePath()
c) prepareToRead()
d) All of the mentioned

Answer: c
Explanation: The RecordReader can then be used by the implementation in getNext() to return a tuple representing a record of data back to Pig.
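
The hand-off between prepareToRead() and getNext() can be sketched roughly as below. This is a simplified model that runs without Pig or Hadoop: LineReaderSketch is a hypothetical stand-in for a RecordReader, TabLoaderSketch and readAll() are invented for the example, a tuple is modelled as a List<String>, and the tab delimiter is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a RecordReader: yields one line per record.
class LineReaderSketch {
    private final Iterator<String> lines;
    private String current;

    LineReaderSketch(List<String> input) {
        this.lines = input.iterator();
    }

    // Advances to the next record; returns false when the input is exhausted.
    boolean nextKeyValue() {
        if (!lines.hasNext()) {
            return false;
        }
        current = lines.next();
        return true;
    }

    String getCurrentValue() {
        return current;
    }
}

// Hypothetical loader: keeps the reader it is given and serves one tuple per call.
class TabLoaderSketch {
    private LineReaderSketch reader;

    // Plays the role of prepareToRead(): the runtime passes the reader in.
    void prepareToRead(LineReaderSketch reader) {
        this.reader = reader;
    }

    // Plays the role of getNext(): next tuple, or null when there is no more data.
    List<String> getNext() {
        if (!reader.nextKeyValue()) {
            return null;
        }
        return Arrays.asList(reader.getCurrentValue().split("\t", -1));
    }

    // Drives the loader the way a runtime would: prepare once, then pull tuples.
    static List<List<String>> readAll(List<String> input) {
        TabLoaderSketch loader = new TabLoaderSketch();
        loader.prepareToRead(new LineReaderSketch(input));
        List<List<String>> tuples = new ArrayList<>();
        List<String> tuple;
        while ((tuple = loader.getNext()) != null) {
            tuples.add(tuple);
        }
        return tuples;
    }
}
```

The loader never opens the input itself; it only consumes the reader it was handed, which mirrors the division of work between the InputFormat and the LoadFunc.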

9. The __________ method tells the LoadFunc which fields are required in the Pig script.
a) pushProjection()
b) relativeToAbsolutePath()
c) prepareToRead()
d) None of the mentioned

Answer: a
Explanation: Pig will use the column index requiredField.index to communicate with the LoadFunc about the fields required by the Pig script.
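
A rough sketch of what a loader can do with the pushed-down column indexes is shown below. It is a simplification that runs without Pig: ProjectingLoaderSketch and toTuple() are hypothetical, the comma delimiter is an assumption, and the indexes are passed as plain ints rather than through Pig's RequiredFieldList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical loader that returns only the columns the script asked for.
class ProjectingLoaderSketch {
    // null means no projection was pushed, so every column is returned.
    private int[] requiredIndexes;

    // Plays the role of pushProjection(): remember the required column indexes.
    void pushProjection(int... indexes) {
        this.requiredIndexes = indexes.clone();
    }

    // Builds a tuple from one comma-separated line, keeping only required columns.
    List<String> toTuple(String line) {
        String[] columns = line.split(",", -1);
        if (requiredIndexes == null) {
            return Arrays.asList(columns);
        }
        List<String> tuple = new ArrayList<>();
        for (int index : requiredIndexes) {
            // A missing column becomes null instead of raising an error.
            tuple.add(index < columns.length ? columns[index] : null);
        }
        return tuple;
    }
}
```

Dropping unneeded columns inside the loader means they are never materialized as tuple fields, which is the point of pushing the projection down.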

10. A loader implementation should implement __________ if casts (implicit or explicit) from DataByteArray fields to other types need to be supported.
a) LoadPushDown
b) LoadMetadata
c) LoadCaster
d) All of the mentioned

Answer: c
Explanation: LoadCaster has methods to convert byte arrays to specific types.
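
As an illustration of the kind of conversion involved, the sketch below turns UTF-8 encoded bytes into an Integer or a String. It runs without Pig: Utf8CasterSketch is a hypothetical class whose method names are modelled on LoadCaster's bytesToInteger() and bytesToCharArray(), and returning null for unparsable input is an assumption of this sketch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical caster for fields stored as UTF-8 text.
class Utf8CasterSketch {

    // Converts the raw bytes of a field to an Integer; null if it cannot be parsed.
    static Integer bytesToInteger(byte[] bytes) {
        if (bytes == null) {
            return null;
        }
        try {
            return Integer.valueOf(new String(bytes, StandardCharsets.UTF_8).trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Converts the raw bytes of a field to a String (a chararray in Pig terms).
    static String bytesToCharArray(byte[] bytes) {
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Only the loader knows how its bytes were encoded, which is why the cast from a byte array to a concrete type is delegated to it.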