[toc]
When writing driver programs for the Mapper and Reducer of a MapReduce job, much of the code is repetitive boilerplate. It can therefore be extracted into a utility class that later MapReduce programs can reuse.
The code is as follows:
```java
package com.uplooking.bigdata.common.utils;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class MapReduceJobUtil {

    public static Job buildJob(Configuration conf,
                               Class<?> jobClazz,
                               String inputpath,
                               Class<? extends InputFormat> inputFormat,
                               Class<? extends Mapper> mapperClass,
                               Class<?> mapKeyClass,
                               Class<?> mapValueClass,
                               Path outputpath,
                               Class<? extends OutputFormat> outputFormat,
                               Class<? extends Reducer> reducerClass,
                               Class<?> outkeyClass,
                               Class<?> outvalueClass) throws IOException {
        String jobName = jobClazz.getSimpleName();
        Job job = Job.getInstance(conf, jobName);
        // set the jar the job runs from, located via the driver class
        job.setJarByClass(jobClazz);

        // configure the job's input
        FileInputFormat.setInputPaths(job, inputpath);
        // the InputFormat determines how input files are parsed into records
        job.setInputFormatClass(inputFormat);

        // configure the mapper
        job.setMapperClass(mapperClass);
        job.setMapOutputKeyClass(mapKeyClass);
        job.setMapOutputValueClass(mapValueClass);

        // configure the job's output; delete the output directory first
        // if it already exists, to avoid a FileAlreadyExistsException
        outputpath.getFileSystem(conf).delete(outputpath, true);
        FileOutputFormat.setOutputPath(job, outputpath);
        job.setOutputFormatClass(outputFormat);

        // configure the reducer only when one is provided; map-only jobs skip this
        if (null != reducerClass) {
            job.setReducerClass(reducerClass);
            job.setOutputKeyClass(outkeyClass);
            job.setOutputValueClass(outvalueClass);
        }
        return job;
    }
}
```
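As a sketch of how a later MapReduce program might reuse this utility, the hypothetical word-count driver below calls `buildJob`. `WordCountMapper`, `WordCountReducer`, and the command-line paths are illustrative placeholders, not part of the utility itself:

```java
// Hypothetical driver showing one way to call MapReduceJobUtil.buildJob.
// WordCountMapper and WordCountReducer are assumed placeholder classes.
package com.uplooking.bigdata.example;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

import com.uplooking.bigdata.common.utils.MapReduceJobUtil;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = MapReduceJobUtil.buildJob(
                conf,
                WordCountDriver.class,          // jobClazz: names the job, locates the jar
                args[0],                        // input path from the command line
                TextInputFormat.class,          // parse input as line-oriented text
                WordCountMapper.class,          // hypothetical mapper
                Text.class,                     // map output key type
                IntWritable.class,              // map output value type
                new Path(args[1]),              // output path (deleted first if present)
                TextOutputFormat.class,
                WordCountReducer.class,         // hypothetical reducer; null for map-only
                Text.class,                     // final output key type
                IntWritable.class);             // final output value type
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Passing `null` for `reducerClass` (and the two output types) would produce a map-only job, since the utility only sets the reducer when one is supplied.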