
Sqoop Notes and Troubleshooting Reference

Published: 2020-06-09 16:34:13 | Source: Web | Author: 紅塔山lvs | Category: Big Data

Contents:

Production background
  - Problems importing from MySQL into Hive
  - Problems exporting from Hive into MySQL

Sqoop drawbacks

Sqoop advantages

Sqoop principles (import and export)

Reproducing the various Sqoop errors:
1. MySQL column too short
2. Converting Hive NULL fields
3. Wrong field delimiter
4. MySQL host not on the cluster network
5. MySQL service stopped
6. MySQL's utf8 stores at most 3 bytes per character; some Unicode characters become 4 bytes in UTF-8, which requires utf8mb4
7. Sqoop debug output
8. Editing the generated Java class and repackaging

Sqoop command-line reference



Production background:

Importing from MySQL into Hive, the following problems came up:

       1) The source MySQL host and the cluster machines were on different network segments, so the import command failed with a network connection error.

       2) Certain characters caused the import into Hive to abort with errors.

    2.1 The JDBC connector used by Sqoop was too old (replace it with a newer version).

Exporting from Hive into MySQL, the following problems came up:

 1) Certain characters caused the insert into MySQL to abort with errors.

   1.1 Possibly a limitation of MySQL's own encoding: some characters are unsupported, e.g. utf8 vs. utf8mb4.

   1.2 The JDBC connector used by Sqoop was too old (replace it with a newer version).

 

Sqoop drawbacks:

 1. Operation is command-line based: error-prone and insecure.

 2. Data transfer and data format are tightly coupled, so a connector cannot support every data format.

 3. The username and password are exposed on the command line.

 4. Installing Sqoop requires root privileges.

Sqoop advantages:

1. Efficient, controllable use of resources: task parallelism and timeouts are configurable.

2. Data type mapping and conversion happen automatically, and users can also customize them.

3. Supports many mainstream databases: MySQL, Oracle, SQL Server, DB2, and more.

Sqoop principles:

Sqoop import principle:

    When running a Sqoop import, the --split-by parameter must be specified. Sqoop partitions the table on the values of the split-by column and assigns each partition to a different map task; each map task then reads rows from the database one at a time and writes them to HDFS. The split method depends on the column type. For a simple int column, Sqoop takes the maximum and minimum values of the split-by column and divides that range by the requested number of mappers. For example, if SELECT MAX(split_by), MIN(split_by) FROM table returns 1000 and 1, and --num-mappers is 2, the range is split into (1, 500) and (501, 1000), producing two SQL statements executed by two map tasks: SELECT ... FROM table WHERE split_by >= 1 AND split_by <= 500 and SELECT ... FROM table WHERE split_by >= 501 AND split_by <= 1000. Each map task then imports the rows returned by its own query.
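A minimal import command illustrating this split logic (a sketch: the table name t, split column id, and target directory are hypothetical):

sqoop import --connect jdbc:mysql://192.168.119.129:3306/student --username li72 --password 123 --table t --split-by id --num-mappers 2 --target-dir /tmp/t_import

With --num-mappers 2, Sqoop issues the two ranged SELECT statements described above, one per map task.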

Sqoop export principle: based on the MySQL table name, Sqoop generates a Java class named after the table; the class extends SqoopRecord, and the export runs as a map-only MapReduce job with a custom set of output fields. An example export invocation:

   

sqoop export --connect jdbc:mysql://$url:3306/$3?characterEncoding=utf8 --username $username --password $password --table $1 --export-dir $2 --input-fields-terminated-by '|' --null-non-string '0' --null-string '0';
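Assuming this line lives in a wrapper script (hypothetical name sqoop_export.sh, with $1 = MySQL table, $2 = HDFS export directory, $3 = database name, and $url/$username/$password set inside the script), an invocation might look like:

./sqoop_export.sh dm_trlog /home/bigdata/hive/data/db1.db/trlog student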

Reproducing the various Sqoop errors:


MySQL table:

create  table dm_trlog (

plat     varchar(20),

user_id  varchar(20),

click_time   varchar(20),

click_url    varchar(200)

)

Hive table:

CREATE TABLE TRLOG

(

PLATFORM string,

USER_ID int,

CLICK_TIME string,

CLICK_URL string

)

row format delimited

fields terminated by '\t';

sqoop list-databases --connect jdbc:mysql://192.168.119.129:3306/ --username li72 --password 123

1. MySQL column too short

 

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

 

Warning: $HADOOP_HOME is deprecated.

14/11/06 01:42:32 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 01:42:32 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 01:42:32 INFO tool.CodeGenTool: Beginning code generation

14/11/06 01:42:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:42:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:42:34 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/bigdata/hadoop

Note: /tmp/sqoop-root/compile/d5e37c20a9231b3253c97fc27d16d8a9/dm_trlog.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

14/11/06 01:42:38 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/d5e37c20a9231b3253c97fc27d16d8a9/dm_trlog.jar

14/11/06 01:42:38 INFO mapreduce.ExportJobBase: Beginning export of dm_trlog

14/11/06 01:42:43 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:42:43 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:42:43 INFO util.NativeCodeLoader: Loaded the native-hadoop library

14/11/06 01:42:43 WARN snappy.LoadSnappy: Snappy native library not loaded

14/11/06 01:42:44 INFO mapred.JobClient: Running job: job_201411060114_0001

14/11/06 01:42:45 INFO mapred.JobClient:  map 0% reduce 0%

14/11/06 01:43:15 INFO mapred.JobClient: Task Id : attempt_201411060114_0001_m_000000_0, Status : FAILED

java.io.IOException: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'click_time' at row 1

        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:192)

        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'click_time' at row 1

        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4118)

        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4052)

        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2503)

        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2664)

        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2794)

        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)

        at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1379)

        at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233)

The MySQL column was too short; recreate the table with a wider click_time:

drop table  dm_trlog;

create  table dm_trlog (

plat     varchar(20),

user_id  varchar(20),

click_time   varchar(200),

click_url    varchar(200)

)
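Alternatively, the column can be widened in place with standard MySQL DDL, avoiding the drop-and-reload:

ALTER TABLE dm_trlog MODIFY click_time VARCHAR(200);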

 

2. Converting Hive NULL fields

Hive represents NULL as \N, separates fields with \001 by default, and terminates rows with \n. The export delimiters must match the Hive table exactly; empty values can be mapped to a replacement such as 0, since some MySQL numeric columns cannot accept \N.

The fix is to add two parameters: --input-null-string '\\N' --input-null-non-string '\\N' (the extra '\' is an escape). The command below, which lacks them, fails with the NumberFormatException shown:

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

14/10/23 04:53:47 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/915d24128af59c9e517580e0f07411d4/dm_pc_play_kpi.jar 
14/10/23 04:53:47 INFO mapreduce.ExportJobBase: Beginning export of dm_pc_play_kpi 
14/10/23 04:53:48 INFO input.FileInputFormat: Total input paths to process : 1 
14/10/23 04:53:48 INFO input.FileInputFormat: Total input paths to process : 1 
14/10/23 04:53:48 WARN snappy.LoadSnappy: Snappy native library is available 
14/10/23 04:53:48 INFO util.NativeCodeLoader: Loaded the native-hadoop library 
14/10/23 04:53:48 INFO snappy.LoadSnappy: Snappy native library loaded 
14/10/23 04:53:48 INFO mapred.JobClient: Running job: job_201408301703_84117 
14/10/23 04:53:49 INFO mapred.JobClient: map 0% reduce 0% 
14/10/23 04:55:45 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_0, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.lang.NumberFormatException: For input string: "N" 
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241) 
at java.lang.Float.valueOf(Float.java:417) 
at dm_pc_play_kpi.__loadFromFields(dm_pc_play_kpi.java:335) 
at dm_pc_play_kpi.parse(dm_pc_play_kpi.java:282) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) 
... 10 more 

14/10/23 04:55:53 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_1, Status : FAILED 
14/10/23 04:55:58 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_2, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.lang.NumberFormatException: For input string: "N" 
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241) 
at java.lang.Float.valueOf(Float.java:417) 
at dm_pc_play_kpi.__loadFromFields(dm_pc_play_kpi.java:335) 
at dm_pc_play_kpi.parse(dm_pc_play_kpi.java:282) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) 
... 10 more 
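With the null-handling parameters added, the export would look like this (a sketch assembled from the parameters described in this section):

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --input-null-string '\\N' --input-null-non-string '\\N';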

 

3. Wrong field delimiter

The Hive table's delimiter is '\t', but the export declares '|'. To demonstrate, the delimiter was deliberately set to '|', which fails as shown:

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '|' --null-non-string '0' --null-string '0';


14/11/06 01:50:19 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 01:50:19 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 01:50:19 INFO tool.CodeGenTool: Beginning code generation

14/11/06 01:50:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:50:21 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:50:21 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/bigdata/hadoop

Note: /tmp/sqoop-root/compile/e474b3f8292f91dd4134b302ae35df19/dm_trlog.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

14/11/06 01:50:25 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/e474b3f8292f91dd4134b302ae35df19/dm_trlog.jar

14/11/06 01:50:25 INFO mapreduce.ExportJobBase: Beginning export of dm_trlog

14/11/06 01:50:45 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:50:45 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:50:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library

14/11/06 01:50:45 WARN snappy.LoadSnappy: Snappy native library not loaded

14/11/06 01:50:51 INFO mapred.JobClient: Running job: job_201411060114_0003

14/11/06 01:50:52 INFO mapred.JobClient:  map 0% reduce 0%

14/11/06 01:51:20 INFO mapred.JobClient: Task Id : attempt_201411060114_0003_m_000000_0, Status : FAILED

java.io.IOException: Can't export data, please check task tracker logs

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)

        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: java.util.NoSuchElementException

        at java.util.AbstractList$Itr.next(AbstractList.java:350)

        at dm_trlog.__loadFromFields(dm_trlog.java:252)

        at dm_trlog.parse(dm_trlog.java:201)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)

        ... 10 more

 

14/11/06 01:51:21 INFO mapred.JobClient: Task Id : attempt_201411060114_0003_m_000001_0, Status : FAILED

java.io.IOException: Can't export data, please check task tracker logs

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)

        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: java.util.NoSuchElementException

        at java.util.AbstractList$Itr.next(AbstractList.java:350)

        at dm_trlog.__loadFromFields(dm_trlog.java:252)

        at dm_trlog.parse(dm_trlog.java:201)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)

        ... 10 more
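Setting --input-fields-terminated-by back to '\t' to match the Hive DDL resolves this. The java.util.NoSuchElementException is characteristic of a delimiter mismatch: the whole row is parsed as a single field, and the generated __loadFromFields method runs out of tokens while reading the remaining columns.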

 

4. MySQL host not on the cluster network

 

sqoop export --connect jdbc:mysql://192.168.119.1:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

 Warning: $HADOOP_HOME is deprecated.

14/11/06 02:04:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 02:04:30 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 02:04:30 INFO tool.CodeGenTool: Beginning code generation

14/11/06 02:07:40 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:605)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:628)

        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:235)

        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:219)

        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:283)

        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1255)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1072)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection timed out

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        ... 32 more
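Before rerunning the job, basic reachability can be checked from a cluster node with standard tools, for example:

nc -vz 192.168.119.1 3306

A timeout here confirms the MySQL host is unreachable from the cluster's network segment; the fix is routing/firewall configuration or moving the source database into a reachable segment.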

 

5. MySQL service stopped

14/11/06 04:55:25 DEBUG tool.BaseSqoopTool: Enabled debug logging.

14/11/06 04:55:25 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory

14/11/06 04:55:25 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:

14/11/06 04:55:25 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.MySQLManager@8a0d5d

14/11/06 04:55:25 INFO tool.CodeGenTool: Beginning code generation

14/11/06 04:55:26 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.

14/11/06 04:55:27 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:605)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:628)

        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:235)

        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:219)

        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:283)

        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1255)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1072)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection refused

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        ... 32 more

14/11/06 04:55:27 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.

14/11/06 04:55:27 ERROR manager.CatalogQueryManager: Failed to list columns from query: SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = (SELECT SCHEMA())   AND TABLE_NAME = 'dm_trlog' 

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.CatalogQueryManager.getColumnNames(CatalogQueryManager.java:147)

        at org.apache.sqoop.orm.ClassWriter.getColumnNames(ClassWriter.java:1222)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1074)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection refused

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        ... 28 more

14/11/06 04:55:27 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 

6. MySQL's utf8 stores at most 3 bytes per character; some Unicode characters become 4 bytes when encoded as UTF-8, so MySQL must support utf8mb4

Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail. 
Please set $HCAT_HOME to the root of your HCatalog installation. 
14/11/09 07:00:33 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
14/11/09 07:00:33 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
14/11/09 07:00:33 INFO tool.CodeGenTool: Beginning code generation 
14/11/09 07:00:33 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_go_snger` AS t LIMIT 1 
14/11/09 07:00:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_go_snger` AS t LIMIT 1 
14/11/09 07:00:34 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop/hadoop-1.1.2 
Note: /tmp/sqoop-hadoop/compile/211b9679d4ac771d3ff710cbbd1c7277/dm_go_snger.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
14/11/09 07:00:35 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/211b9679d4ac771d3ff710cbbd1c7277/dm_go_snger.jar 
14/11/09 07:00:35 INFO mapreduce.ExportJobBase: Beginning export of dm_go_snger 
14/11/09 07:00:36 INFO input.FileInputFormat: Total input paths to process : 1 
14/11/09 07:00:36 INFO input.FileInputFormat: Total input paths to process : 1 
14/11/09 07:00:36 WARN snappy.LoadSnappy: Snappy native library is available 
14/11/09 07:00:36 INFO util.NativeCodeLoader: Loaded the native-hadoop library 
14/11/09 07:00:36 INFO snappy.LoadSnappy: Snappy native library loaded 
14/11/09 07:00:37 INFO mapred.JobClient: Running job: job_201408301703_121362 
14/11/09 07:00:38 INFO mapred.JobClient: map 0% reduce 0% 
14/11/09 07:00:52 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_0, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: '\xF3\x90\x8C\x92\xEF\xBF...' for column 'singer' at row 1 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220) 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46) 
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639) 
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84) 
... 10 more 
Caused by: java.sql.SQLException: Incorrect string value: '\xF3\x90\x8C\x92\xEF\xBF...' for column 'singer' at row 1 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495) 
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959) 
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693) 
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102) 
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364) 
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233) 

14/11/09 07:00:59 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_1, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: '\xF3\x90\x8C\x92\xEF\xBF...' for column 'singer' at row 1 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220) 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46) 
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639) 
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84) 
... 10 more 
Caused by: java.sql.SQLException: Incorrect string value: '\xF3\x90\x8C\x92\xEF\xBF...' for column 'singer' at row 1 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495) 
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959) 
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693) 
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102) 
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364) 
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233) 

14/11/09 07:01:07 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_2, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: '\xF3\x90\x8C\x92\xEF\xBF...' for column 'singer' at row 1 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220) 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46) 
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639) 
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84) 
... 10 more 
Caused by: java.sql.SQLException: Incorrect string value: '\xF3\x90\x8C\x92\xEF\xBF...' for column 'singer' at row 1 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495) 
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959) 
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693) 
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102) 
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364) 
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233) 

14/11/09 07:01:20 INFO mapred.JobClient: Job complete: job_201408301703_121362 
14/11/09 07:01:20 INFO mapred.JobClient: Counters: 8 
14/11/09 07:01:20 INFO mapred.JobClient: Job Counters 
14/11/09 07:01:20 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=31500 
14/11/09 07:01:20 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0 
14/11/09 07:01:20 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0 
14/11/09 07:01:20 INFO mapred.JobClient: Rack-local map tasks=3 
14/11/09 07:01:20 INFO mapred.JobClient: Launched map tasks=4 
14/11/09 07:01:20 INFO mapred.JobClient: Data-local map tasks=1 
14/11/09 07:01:20 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0 
14/11/09 07:01:20 INFO mapred.JobClient: Failed map tasks=1 
14/11/09 07:01:20 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 44.4052 seconds (0 bytes/sec) 
14/11/09 07:01:20 INFO mapreduce.ExportJobBase: Exported 0 records. 
14/11/09 07:01:20 ERROR tool.ExportTool: Error during export: Export job failed! 
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. 
Logging initialized using configuration in jar:file:/home/hadoop/hadoop/hive-0.10.0.20140629/lib/hive-common-0.10.0.jar!/hive-log4j.properties 
Hive history file=/tmp/hadoop/hive_job_log_hadoop_201411090701_1423933997.txt 
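A common fix (a sketch: the table name follows the error above, and the collation choice is an assumption) is to convert the target table to utf8mb4 so 4-byte characters can be stored:

ALTER TABLE dm_go_snger CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

The MySQL server and the JDBC connector version must both support utf8mb4 for the insert to succeed.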

7. Sqoop debug output

 Add the --verbose flag:

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --verbose --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

8. Editing the generated Java class and repackaging

Every time Sqoop exports to MySQL it generates a .java file named after the MySQL table, compiles it into a JAR, and submits it to Hadoop MapReduce to parse the rows of the Hive table. Based on the reported error you can find the offending line, edit the file, recompile, and repackage; the --jar-file and --class-name options then tell Sqoop to run a class from your own JAR to parse each row of the Hive table. A complete example: 
./bin/sqoop export --connect "jdbc:mysql://192.168.119.129:3306/student?useUnicode=true&characterEncoding=utf-8" \
--username li72 --password 123 --table dm_trlog \
--export-dir /hive/warehouse/trlog --input-fields-terminated-by '\t' \
--input-null-string '\\N' --input-null-non-string '\\N' \
--class-name com.li72.trlog \
--jar-file /tmp/sqoopTempjar/trlog.jar 
Above, --jar-file gives the path to the JAR and --class-name names the class inside it. This way, parsing exceptions can be fixed directly in the generated class. 
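To regenerate the class for editing without running a full job, Sqoop's codegen tool can be used (a sketch; the output paths are hypothetical):

sqoop codegen --connect jdbc:mysql://192.168.119.129:3306/student --username li72 --password 123 --table dm_trlog --outdir /tmp/sqoop_codegen --bindir /tmp/sqoop_codegen

--outdir and --bindir are the code generation arguments listed in the reference below.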

Sqoop command-line reference

Common arguments:
   --connect <jdbc-uri>                          Specify JDBC connect string
   --connection-manager <class-name>             Specify connection manager class name
   --connection-param-file <properties-file>     Specify connection parameters file
   --driver <class-name>                         Manually specify JDBC driver class to use
   --hadoop-home <hdir>                          Override $HADOOP_MAPRED_HOME_ARG
   --hadoop-mapred-home <dir>                    Override $HADOOP_MAPRED_HOME_ARG
   --help                                        Print usage instructions
-P                                               Read password from console
   --password <password>                         Set authentication password
   --password-file <password-file>               Set authentication password file path
   --relaxed-isolation                           Use read-uncommitted isolation for imports
   --skip-dist-cache                             Skip copying jars to distributed cache
   --username <username>                         Set authentication username
   --verbose                                     Print more information while working

Import control arguments:
   --append                                         Imports data in append mode
   --as-avrodatafile                                Imports data to Avro data files
   --as-sequencefile                                Imports data to SequenceFiles
   --as-textfile                                    Imports data as plain text (default)
   --boundary-query <statement>                     Set boundary query for retrieving max and min value of the primary key
   --columns <col,col,col...>                       Columns to import from table
   --compression-codec <codec>                      Compression codec to use for import
   --delete-target-dir                              Imports data in delete mode
   --direct                                         Use direct import fast path
   --direct-split-size <n>                          Split the input stream every 'n' bytes when importing in direct mode
-e,--query <statement>                              Import results of SQL 'statement'
   --fetch-size <n>                                 Set number 'n' of rows to fetch from the database when more rows are needed
   --inline-lob-limit <n>                           Set the maximum size for an inline LOB
-m,--num-mappers <n>                                Use 'n' map tasks to import in parallel
   --mapreduce-job-name <name>                      Set name for generated mapreduce job
   --split-by <column-name>                         Column of the table used to split work units
   --table <table-name>                             Table to read
   --target-dir <dir>                               HDFS plain table destination
   --validate                                       Validate the copy using the configured validator
   --validation-failurehandler <validation-failurehandler>   Fully qualified class name for ValidationFailureHandler
   --validation-threshold <validation-threshold>             Fully qualified class name for ValidationThreshold
   --validator <validator>                          Fully qualified class name for the Validator
   --warehouse-dir <dir>                            HDFS parent for table destination
   --where <where clause>                           WHERE clause to use during import
-z,--compress                                       Enable compression

Incremental import arguments:
   --check-column <column>         Source column to check for incremental change
   --incremental <import-type>     Define an incremental import of type 'append' or 'lastmodified'
   --last-value <value>            Last imported value in the incremental check column

Output line formatting arguments:
   --enclosed-by <char>               Sets a required field enclosing character
   --escaped-by <char>                Sets the escape character
   --fields-terminated-by <char>      Sets the field separator character
   --lines-terminated-by <char>       Sets the end-of-line character
   --mysql-delimiters                 Uses MySQL's default delimiter set: fields: , lines: \n escaped-by: \ optionally-enclosed-by: '
   --optionally-enclosed-by <char>    Sets a field enclosing character

Input parsing arguments:
   --input-enclosed-by <char>                Sets a required field encloser
   --input-escaped-by <char>                 Sets the input escape character
   --input-fields-terminated-by <char>       Sets the input field separator
   --input-lines-terminated-by <char>        Sets the input end-of-line char
   --input-optionally-enclosed-by <char>     Sets a field enclosing character

Hive arguments:
   --create-hive-table                          Fail if the target hive table exists
   --hive-database <database-name>              Sets the database name to use when importing to hive
   --hive-delims-replacement <arg>              Replace Hive record \0x01 and row delimiters (\n\r) from imported string fields with user-defined string
   --hive-drop-import-delims                    Drop Hive record \0x01 and row delimiters (\n\r) from imported string fields
   --hive-home <dir>                            Override $HIVE_HOME
   --hive-import                                Import tables into Hive (uses Hive's default delimiters if none are set)
   --hive-overwrite                             Overwrite existing data in the Hive table
   --hive-partition-key <partition-key>         Sets the partition key to use when importing to hive
   --hive-partition-value <partition-value>     Sets the partition value to use when importing to hive
   --hive-table <table-name>                    Sets the table name to use when importing to hive
   --map-column-hive <arg>                      Override mapping for specific column to hive types

HBase arguments:
   --column-family <family>     Sets the target column family for the import
   --hbase-bulkload             Enables HBase bulk loading
   --hbase-create-table         If specified, create missing HBase tables
   --hbase-row-key <col>        Specifies which input column to use as the row key
   --hbase-table <table>        Import to <table> in HBase

HCatalog arguments:
   --hcatalog-database <arg>                    HCatalog database name
   --hcatalog-home <hdir>                       Override $HCAT_HOME
   --hcatalog-table <arg>                       HCatalog table name
   --hive-home <dir>                            Override $HIVE_HOME
   --hive-partition-key <partition-key>         Sets the partition key to use when importing to hive
   --hive-partition-value <partition-value>     Sets the partition value to use when importing to hive
   --map-column-hive <arg>                      Override mapping for specific column to hive types

HCatalog import specific options:
   --create-hcatalog-table            Create HCatalog before import
   --hcatalog-storage-stanza <arg>    HCatalog storage stanza for table creation

Accumulo arguments:
   --accumulo-batch-size <size>           Batch size in bytes
   --accumulo-column-family <family>      Sets the target column family for the import
   --accumulo-create-table                If specified, create missing Accumulo tables
   --accumulo-instance <instance>         Accumulo instance name
   --accumulo-max-latency <latency>       Max write latency in milliseconds
   --accumulo-password <password>         Accumulo password
   --accumulo-row-key <col>               Specifies which input column to use as the row key
   --accumulo-table <table>               Import to <table> in Accumulo
   --accumulo-user <user>                 Accumulo user name
   --accumulo-visibility <vis>            Visibility token to be applied to all rows imported
   --accumulo-zookeepers <zookeepers>     Comma-separated list of zookeepers (host:port)

Code generation arguments:
   --bindir <dir>                          Output directory for compiled objects
   --class-name <name>                     Sets the generated class name; overrides --package-name. When combined with --jar-file, sets the input class.
   --input-null-non-string <null-str>      Input null non-string representation
   --input-null-string <null-str>          Input null string representation
   --jar-file <file>                       Disable code generation; use specified jar
   --map-column-java <arg>                 Override mapping for specific columns to java types
   --null-non-string <null-str>            Null non-string representation
   --null-string <null-str>                Null string representation
   --outdir <dir>                          Output directory for generated code
   --package-name <name>                   Put auto-generated classes in this package

