Hadoop编程之MapReduce操作Mysql数据库

cocoIT

浏览: 48912 次
性别:
来自: 福建

最近访客更多访客>>

HincZhang

jack1007

画个逗号给明天qu

xiaoyulong1988

博主相关

博客

微博

相册

留言

关于我

文章分类

全部博客 (101)

社区版块

存档分类

1，首先建立mysql的数据库，表：如下图

下面是MapReduce编程了，需要导入mysql数据库驱动jar包；

首先需要自定义Recoder,这里是CourceRecord

public class CourceRecord implements Writable, DBWritable {
int id;
String name;
int count;
int no;

//从in流反序列化出内容
@Override
public void readFields(DataInput in) throws IOException {
this.id = in.readInt();
this.name = Text.readString(in);
this.count = in.readInt();
this.no = in.readInt();
}
//将内容序列化到out流
@Override
public void write(DataOutput out) throws IOException {
out.writeInt(this.id);
Text.writeString(out, this.name);
out.writeInt(this.count);
out.writeInt(this.no);
}
@Override
public void readFields(ResultSet result) throws SQLException {
this.id = result.getInt(1);
this.name = result.getString(2);
this.count = result.getInt(3);
this.no = result.getInt(4);
}
@Override
public void write(PreparedStatement stmt) throws SQLException {
stmt.setInt(1, this.id);
stmt.setString(2, this.name);
stmt.setInt(3, this.count);
stmt.setInt(4, this.no);
}
@Override
public String toString() {
return new String(this.name + " " + this.count + " " + this.no);
}
}

------------------------------------------------map类-----------------------------------

public class DBMapper extends MapReduceBase implements
Mapper<LongWritable, CourceRecord, LongWritable, Text> {
public void map(LongWritable key, CourceRecord value,
OutputCollector<LongWritable, Text> collector, Reporter reporter)
throws IOException {
collector.collect(new LongWritable(value.id),
new Text(value.toString()));
}
}

--------------------------------------driver驱动类DB------------------

public class DB {
public static void main(String[] args) throws IOException {
String[] argc={"jdbc:mysql://localhost/test","root", "111111"};

try {
JobConf conf = new JobConf(DB.class);
Class.forName("com.mysql.jdbc.Driver");
DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",argc[0], argc[1], argc[2]);
String[] fields = { "id", "name", "count", "no" };
DBInputFormat.setInput(conf, CourceRecord.class, "tb", null,"id", fields);

conf.setInputFormat(DBInputFormat.class);
conf.setOutputKeyClass(LongWritable.class);
conf.setOutputValueClass(Text.class);

Path path = new Path("DBOUTPUT");
FileOutputFormat.setOutputPath(conf, path);

conf.setMapperClass(DBMapper.class);
conf.setReducerClass(IdentityReducer.class);
//如果文件存在则删除
FileSystem hdfs = path.getFileSystem(conf);
if (hdfs.exists(path)) {
hdfs.delete(path, true);
}

JobClient.runJob(conf);
}
catch(ClassNotFoundException e) {
System.err.println("mysql.jdbc.Driver not found");
}
}
------------------------------------------------------

运行：在DBOUTPUT目录下生成一个part-00000的文件，其内容为：

17 hello 28 17
18 hadoop 28 18
19 java 28 19
20 hive 16 20
21 hbase 18 21
22 pig 18 22

实例是跑起来了，主要是DB的一个配置，相关设置函数：

DBConfiguration.configureDB (JobConf job, String driverClass, String Url, String userName, String passwd)

DBInputFormat.setInput(JobConf job, Class<?extends DBWritable> inputClass, String tableName, String conditions,String orderBy, String fieldNames)

DBOutputFormat.setOut(JobConf job,String tablename,String fieldNames);

eg:DBOutputFormat.setOut(job,"tb",28,"mahout",30,401)

分享到：

Hadoop,MapReduce 操作 Mysql | java 图片旋转

2014-04-25 16:08
浏览 491
评论(0)
查看更多

发表评论

您还没有登录,请您登录后再发表评论

最近访客更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论