
MapReduce API

In this section, we focus on MapReduce APIs. Here, we learn about the classes and methods used in MapReduce programming.

MapReduce Mapper Class

In MapReduce, the role of the Mapper class is to map the input key-value pairs to a set of intermediate key-value pairs. It transforms the input records into intermediate records.

These intermediate records are then grouped by output key and passed to the Reducer, which produces the final output.

Methods of Mapper Class

void cleanup(Context context) This method is called once at the end of the task.
void map(KEYIN key, VALUEIN value, Context context) This method is called once for each key-value pair in the input split.
void run(Context context) This method can be overridden to control the execution of the Mapper.
void setup(Context context) This method is called once at the beginning of the task.
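
To make the calling order of these methods concrete, here is a minimal sketch in plain Java. It does not use the Hadoop library: MiniMapper, TokenMapper and the output list are hypothetical stand-ins for Mapper, a user-defined mapper and Context.write(). The run() method follows the default flow described above, that is, setup() once, map() once per key-value pair, and cleanup() once at the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MapperLifecycleSketch {

    // Hypothetical stand-in for Hadoop's Mapper, reduced to its four lifecycle methods.
    abstract static class MiniMapper<KI, VI> {
        // Collects emitted "key<TAB>value" records; stands in for Context.write(...).
        protected final List<String> output = new ArrayList<>();

        protected void setup() {}                       // once, before the first record
        protected abstract void map(KI key, VI value);  // once per input key-value pair
        protected void cleanup() {}                     // once, after the last record

        // Default control flow: setup, then map for every record, then cleanup.
        public List<String> run(List<KI> keys, List<VI> values) {
            setup();
            try {
                for (int i = 0; i < keys.size(); i++) {
                    map(keys.get(i), values.get(i));
                }
            } finally {
                cleanup();
            }
            return output;
        }
    }

    // Emits (word, 1) for every word of every input line, like a word-count mapper.
    static class TokenMapper extends MiniMapper<Long, String> {
        int setupCalls = 0, mapCalls = 0, cleanupCalls = 0;

        @Override protected void setup() { setupCalls++; }

        @Override protected void map(Long offset, String line) {
            mapCalls++;
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    output.add(word + "\t1");
                }
            }
        }

        @Override protected void cleanup() { cleanupCalls++; }
    }

    public static void main(String[] args) {
        TokenMapper mapper = new TokenMapper();
        // Keys are byte offsets of the lines, values are the lines themselves.
        List<String> records = mapper.run(
                Arrays.asList(0L, 13L),
                Arrays.asList("hello hadoop", "hello map"));
        for (String record : records) {
            System.out.println(record);
        }
        // prints setup=1 map=2 cleanup=1
        System.out.println("setup=" + mapper.setupCalls
                + " map=" + mapper.mapCalls
                + " cleanup=" + mapper.cleanupCalls);
    }
}
```

In a real job, the input pairs come from the input split and the output goes through the Context object, so this sketch only models the order of the calls.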

MapReduce Reducer Class

In MapReduce, the role of the Reducer class is to reduce the set of intermediate values that share a key to a smaller set of values. Its implementations can access the Configuration for the job via the JobContext.getConfiguration() method.

Methods of Reducer Class

void cleanup(Context context) This method is called once at the end of the task.
void reduce(KEYIN key, Iterable<VALUEIN> values, Context context) This method is called once for each key.
void run(Context context) This method can be overridden to control the execution of the Reducer.
void setup(Context context) This method is called once at the beginning of the task.
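
The following plain-Java sketch illustrates what "once for each key" means. It does not use the Hadoop library: MiniReducer, SumReducer and groupByKey() are hypothetical names, where groupByKey() stands in for the framework step that sorts and groups the intermediate records by key. After grouping, reduce() receives each key together with all of its values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReducerLifecycleSketch {

    // Hypothetical stand-in for Hadoop's Reducer, reduced to its four lifecycle methods.
    abstract static class MiniReducer<K, V, R> {
        // Collects the final (key, result) pairs; stands in for Context.write(...).
        protected final Map<K, R> output = new TreeMap<>();

        protected void setup() {}                                   // once, at the start
        protected abstract void reduce(K key, Iterable<V> values);  // once per key
        protected void cleanup() {}                                 // once, at the end

        // Default control flow: setup, then reduce for every key, then cleanup.
        public Map<K, R> run(Map<K, List<V>> grouped) {
            setup();
            try {
                for (Map.Entry<K, List<V>> entry : grouped.entrySet()) {
                    reduce(entry.getKey(), entry.getValue());
                }
            } finally {
                cleanup();
            }
            return output;
        }
    }

    // Sums all values of a key, like a word-count reducer.
    static class SumReducer extends MiniReducer<String, Integer, Integer> {
        int reduceCalls = 0;

        @Override protected void reduce(String key, Iterable<Integer> values) {
            reduceCalls++;
            int sum = 0;
            for (int value : values) {
                sum += value;
            }
            output.put(key, sum);
        }
    }

    // Stand-in for the sort-and-group step between the map and reduce phases.
    static Map<String, List<Integer>> groupByKey(List<String> keys, List<Integer> values) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (int i = 0; i < keys.size(); i++) {
            grouped.computeIfAbsent(keys.get(i), k -> new ArrayList<>()).add(values.get(i));
        }
        return grouped;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped = groupByKey(
                Arrays.asList("hello", "hadoop", "hello", "map"),
                Arrays.asList(1, 1, 1, 1));
        SumReducer reducer = new SumReducer();
        // prints {hadoop=1, hello=2, map=1}
        System.out.println(reducer.run(grouped));
    }
}
```

Four intermediate records with three distinct keys lead to three reduce() calls, one per key.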

MapReduce Job Class

The Job class is used to configure a job and submit it. It also controls the job's execution and lets you query its state. The set methods work only until the job is submitted; after that, they throw an IllegalStateException.
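
The rule about set methods can be pictured with a small plain-Java model. This is only a sketch under assumptions: MiniJob, its State enum and ensureDefine() are hypothetical names and not the Hadoop implementation. The idea is that every setter first checks whether the job is still being defined, and submit() moves the job out of that state.

```java
class JobStateSketch {

    // Hypothetical stand-in for Job: setters are allowed only before submit().
    static class MiniJob {
        enum State { DEFINE, RUNNING }

        private State state = State.DEFINE;
        private String jobName = "";
        private int numReduceTasks = 1;

        // Guard shared by all set methods and by submit().
        private void ensureDefine() {
            if (state != State.DEFINE) {
                throw new IllegalStateException(
                        "Job in state " + state + " instead of " + State.DEFINE);
            }
        }

        void setJobName(String name) {
            ensureDefine();
            this.jobName = name;
        }

        void setNumReduceTasks(int tasks) {
            ensureDefine();
            this.numReduceTasks = tasks;
        }

        String getJobName() { return jobName; }

        int getNumReduceTasks() { return numReduceTasks; }

        // After this call the configuration is frozen.
        void submit() {
            ensureDefine();
            state = State.RUNNING;
        }
    }

    // Returns true if a setter is rejected once the job has been submitted.
    static boolean setterFailsAfterSubmit() {
        MiniJob job = new MiniJob();
        job.setJobName("word count");
        job.setNumReduceTasks(2);
        job.submit();
        try {
            job.setJobName("renamed");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // prints setter rejected after submit: true
        System.out.println("setter rejected after submit: " + setterFailsAfterSubmit());
    }
}
```

In practice this means that all configuration calls, such as setting the job name or the number of reduce tasks, belong before the job is submitted.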

Methods of Job Class

Methods Description
Counters getCounters() This method is used to get the counters for the job.
long getFinishTime() This method is used to get the finish time for the job.
static Job getInstance() This method is used to create a new Job with no particular cluster.
static Job getInstance(Configuration conf) This method is used to create a new Job with no particular cluster, using the provided configuration.
static Job getInstance(Configuration conf, String jobName) This method is used to create a new Job with no particular cluster, using the provided configuration and job name.
String getJobFile() This method is used to get the path of the submitted job configuration.
String getJobName() This method is used to get the user-specified job name.
JobPriority getPriority() This method is used to get the scheduling priority of the job.
void setJarByClass(Class<?> cls) This method is used to set the job JAR by finding the JAR that contains the given class. The class is passed as a class literal (the class name followed by .class).
void setJobName(String name) This method is used to set the user-specified job name.
void setMapOutputKeyClass(Class<?> theClass) This method is used to set the key class for the map output data.
void setMapOutputValueClass(Class<?> theClass) This method is used to set the value class for the map output data.
void setMapperClass(Class<? extends Mapper> cls) This method is used to set the Mapper for the job.
void setNumReduceTasks(int tasks) This method is used to set the number of reduce tasks for the job.
void setReducerClass(Class<? extends Reducer> cls) This method is used to set the Reducer for the job.