When is the earliest point at which the reduce method of a given Reducer can be called?
A. As soon as at least one mapper has finished processing its input split.
B. As soon as a mapper has emitted at least one record.
C. Not until all mappers have finished processing all records.
D. It depends on the InputFormat used for the job.
You are developing a combiner that takes as input Text keys, IntWritable values, and emits Text keys, IntWritable values. Which interface should your class implement?
A. Combiner
B. Mapper
C. Reducer
D. Reducer
E. Combiner
Indentify the utility that allows you to create and run MapReduce jobs with any executable or script as the mapper and/or the reducer?
A. Oozie
B. Sqoop
C. Flume
D. Hadoop Streaming
E. mapred
You have just executed a MapReduce job. Where is intermediate data written to after being emitted from the Mapper's map method?
A. Intermediate data in streamed across the network from Mapper to the Reduce and is never written to disk.
B. Into in-memory buffers on the TaskTracker node running the Mapper that spill over and are written into HDFS.
C. Into in-memory buffers that spill over to the local file system of the TaskTracker node running the Mapper.
D. Into in-memory buffers that spill over to the local file system (outside HDFS) of the TaskTracker node running the Reducer
E. Into in-memory buffers on the TaskTracker node running the Reducer that spill over and are written into HDFS.
Which best describes how TextInputFormat processes input files and line breaks?
A. Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader of the split that contains the beginning of the broken line.
B. Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReaders of both splits containing the broken line.
C. The input file is split exactly at the line breaks, so each RecordReader will read a series of complete lines.
D. Input file splits may cross line breaks. A line that crosses file splits is ignored.
E. Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader of the split that contains the end of the broken line.
For each intermediate key, each reducer task can emit:
A. As many final key-value pairs as desired. There are no restrictions on the types of those key- value pairs (i.e., they can be heterogeneous).
B. As many final key-value pairs as desired, but they must have the same type as the intermediate key-value pairs.
C. As many final key-value pairs as desired, as long as all the keys have the same type and all the values have the same type.
D. One final key-value pair per value associated with the key; no restrictions on the type.
E. One final key-value pair per key; no restrictions on the type.
You have written a Mapper which invokes the following five calls to the OutputColletor.collect method:
output.collect (new Text ("Apple"), new Text ("Red") ) ;
output.collect (new Text ("Banana"), new Text ("Yellow") ) ; output.collect (new Text ("Apple"), new Text
("Yellow") ) ; output.collect (new Text ("Cherry"), new Text ("Red") ) ;
output.collect (new Text ("Apple"), new Text ("Green") ) ;
How many times will the Reducer's reduce method be invoked?
A. 6
B. 3
C. 1
D. 0
E. 5
Your client application submits a MapReduce job to your Hadoop cluster. Identify the Hadoop daemon on which the Hadoop framework will look for an available slot schedule a MapReduce operation.
A. TaskTracker
B. NameNode
C. DataNode
D. JobTracker
E. Secondary NameNode
You need to move a file titled "weblogs" into HDFS. When you try to copy the file, you can't. You know you have ample space on your DataNodes. Which action should you take to relieve this situation and store more files in HDFS?
A. Increase the block size on all current files in HDFS.
B. Increase the block size on your remaining files.
C. Decrease the block size on your remaining files.
D. Increase the amount of memory for the NameNode.
E. Increase the number of disks (or size) for the NameNode.
F. Decrease the block size on all current files in HDFS.
In a large MapReduce job with m mappers and n reducers, how many distinct copy operations will there be in the sort/shuffle phase?
A. mXn (i.e., m multiplied by n)
B. n
C. m
D. m+n (i.e., m plus n)
E. mn (i.e., m to the power of n)