您好,登錄后才能下訂單哦!
生產(chǎn)背景:
在從mysql導(dǎo)入到hive中,遇到如下問題:
1) 源mysql和集群機(jī)器不在同一個(gè)網(wǎng)段中,導(dǎo)致執(zhí)行導(dǎo)入命令,網(wǎng)絡(luò)連接失敗。
2) 某些字符導(dǎo)入到hive中,出現(xiàn)報(bào)錯(cuò)終止。
2.1 sqoop使用的JDBC-connector 版本太低(更換版本)。
從hive導(dǎo)出到mysql中,遇到如下問題:
1)某些字符插入mysql,出現(xiàn)報(bào)錯(cuò)終止。
1.1 可能mysql本身編碼的限制,某些字符不支持,比如uft8和utf8mb4
1.2 sqoop使用的JDBC-connector 版本太低(更換版本)。
sqoop 缺點(diǎn):
1 基于命令行的操作方式,易出錯(cuò),且不安全。
2 數(shù)據(jù)傳輸和數(shù)據(jù)格式是緊耦合的,這使得connector無法支持所有的數(shù)據(jù)格式
3 用戶名和密碼暴漏出來
4 sqoop安裝需要root權(quán)限
Sqoop優(yōu)點(diǎn):
1 高效可控的利用資源,任務(wù)并行度,超時(shí)時(shí)間。
2 數(shù)據(jù)類型映射與轉(zhuǎn)化,可自動(dòng)進(jìn)行,用戶也可自定義 .
3 支持多種主流數(shù)據(jù)庫,MySQL,Oracle,SQL Server,DB2等等 。
Sqoop原理:
Sqoop的inport原理:
Sqoop在import時(shí),需要制定split-by參數(shù)。Sqoop根據(jù)不同的split-by參數(shù)值來進(jìn)行切分,然后將切分出來的區(qū)域分配到不同map中。每個(gè)map中再處理數(shù)據(jù)庫中獲取的一行一行的值,寫入到HDFS中。同時(shí)split-by根據(jù)不同的參數(shù)類型有不同的切分方法,如比較簡單的int型,Sqoop會(huì)取最大和最小split-by字段值,然后根據(jù)傳入的num-mappers來確定劃分幾個(gè)區(qū)域。 比如select max(split_by),min(split-by) from得到的max(split-by)和min(split-by)分別為1000和1,而num-mappers為2的話,則會(huì)分成兩個(gè)區(qū)域(1,500)和(501-100),同時(shí)也會(huì)分成2個(gè)sql給2個(gè)map去進(jìn)行導(dǎo)入操作,分別為select XXX from table where split-by>=1 and split-by<500和select XXX from table where split-by>=501 and split-by<=1000。最后每個(gè)map各自獲取各自SQL中的數(shù)據(jù)進(jìn)行導(dǎo)入工作。
Sqoop的export原理:根據(jù)mysql表名稱,生成一個(gè)以表名稱命名的Java類,該類繼承了sqoopRecord的,是一個(gè)只有Map的MR,且自定義了輸出字段。
sqoop export --connect jdbc:mysql://$url:3306/$3?characterEncoding=utf8 --username $username --password $password --table $1 --export-dir $2 --input-fields-terminated-by '|' --null-non-string '0' --null-string '0';
驗(yàn)證sqoop的各種報(bào)錯(cuò):
mysql表
create table dm_trlog (
plat varchar(20),
user_id varchar(20),
click_time varchar(20),
click_url varchar(200)
)
hive 表
CREATE TABLE TRLOG
(
PLATFORM string,
USER_ID int,
CLICK_TIME string,
CLICK_URL string
)
row format delimited
fields terminated by '\t';
sqoop list-databases –connect jdbc:mysql://192.168.119.129:3306/ –username li72 –password 123
1 mysql字段太短
sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';
Warning: $HADOOP_HOME is deprecated.
14/11/06 01:42:32 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/11/06 01:42:32 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/11/06 01:42:32 INFO tool.CodeGenTool: Beginning code generation
14/11/06 01:42:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1
14/11/06 01:42:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1
14/11/06 01:42:34 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/bigdata/hadoop
Note: /tmp/sqoop-root/compile/d5e37c20a9231b3253c97fc27d16d8a9/dm_trlog.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/11/06 01:42:38 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/d5e37c20a9231b3253c97fc27d16d8a9/dm_trlog.jar
14/11/06 01:42:38 INFO mapreduce.ExportJobBase: Beginning export of dm_trlog
14/11/06 01:42:43 INFO input.FileInputFormat: Total input paths to process : 1
14/11/06 01:42:43 INFO input.FileInputFormat: Total input paths to process : 1
14/11/06 01:42:43 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/11/06 01:42:43 WARN snappy.LoadSnappy: Snappy native library not loaded
14/11/06 01:42:44 INFO mapred.JobClient: Running job: job_201411060114_0001
14/11/06 01:42:45 INFO mapred.JobClient: map 0% reduce 0%
14/11/06 01:43:15 INFO mapred.JobClient: Task Id : attempt_201411060114_0001_m_000000_0, Status : FAILED
java.io.IOException: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'click_time' at row 1
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:192)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'click_time' at row 1
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4118)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4052)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2503)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2664)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2794)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1379)
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233)
mysql字段太短了
drop table dm_trlog;
create table dm_trlog (
plat varchar(20),
user_id varchar(20),
click_time varchar(200),
click_url varchar(200)
)
2 hive的空字段轉(zhuǎn)換
由于Hive的NULL用\N來表示,字段用\001來分割,換行用\n來換行,導(dǎo)出分隔符一定要和hive表保持一致,如果為空可以指定轉(zhuǎn)換為0,有些mysql數(shù)字字段不能插入\N
加上兩個(gè)參數(shù):--input-null-string '\\N' --input-null-non-string '\\N',多加一個(gè)'\',是為轉(zhuǎn)義
sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';
14/10/23 04:53:47 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/915d24128af59c9e517580e0f07411d4/dm_pc_play_kpi.jar
14/10/23 04:53:47 INFO mapreduce.ExportJobBase: Beginning export of dm_pc_play_kpi
14/10/23 04:53:48 INFO input.FileInputFormat: Total input paths to process : 1
14/10/23 04:53:48 INFO input.FileInputFormat: Total input paths to process : 1
14/10/23 04:53:48 WARN snappy.LoadSnappy: Snappy native library is available
14/10/23 04:53:48 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/10/23 04:53:48 INFO snappy.LoadSnappy: Snappy native library loaded
14/10/23 04:53:48 INFO mapred.JobClient: Running job: job_201408301703_84117
14/10/23 04:53:49 INFO mapred.JobClient: map 0% reduce 0%
14/10/23 04:55:45 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_0, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.NumberFormatException: For input string: "N"
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241)
at java.lang.Float.valueOf(Float.java:417)
at dm_pc_play_kpi.__loadFromFields(dm_pc_play_kpi.java:335)
at dm_pc_play_kpi.parse(dm_pc_play_kpi.java:282)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
14/10/23 04:55:53 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_1, Status : FAILED
14/10/23 04:55:58 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_2, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.NumberFormatException: For input string: "N"
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241)
at java.lang.Float.valueOf(Float.java:417)
at dm_pc_play_kpi.__loadFromFields(dm_pc_play_kpi.java:335)
at dm_pc_play_kpi.parse(dm_pc_play_kpi.java:282)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
3 分隔符錯(cuò)誤
Hive中的分隔符是’\t’ 但是導(dǎo)出寫成’|’
sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '|' --null-non-string '0' --null-string '0';
為了測試,特意把分隔符改成 "|"
14/11/06 01:50:19 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/11/06 01:50:19 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/11/06 01:50:19 INFO tool.CodeGenTool: Beginning code generation
14/11/06 01:50:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1
14/11/06 01:50:21 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1
14/11/06 01:50:21 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/bigdata/hadoop
Note: /tmp/sqoop-root/compile/e474b3f8292f91dd4134b302ae35df19/dm_trlog.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/11/06 01:50:25 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/e474b3f8292f91dd4134b302ae35df19/dm_trlog.jar
14/11/06 01:50:25 INFO mapreduce.ExportJobBase: Beginning export of dm_trlog
14/11/06 01:50:45 INFO input.FileInputFormat: Total input paths to process : 1
14/11/06 01:50:45 INFO input.FileInputFormat: Total input paths to process : 1
14/11/06 01:50:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/11/06 01:50:45 WARN snappy.LoadSnappy: Snappy native library not loaded
14/11/06 01:50:51 INFO mapred.JobClient: Running job: job_201411060114_0003
14/11/06 01:50:52 INFO mapred.JobClient: map 0% reduce 0%
14/11/06 01:51:20 INFO mapred.JobClient: Task Id : attempt_201411060114_0003_m_000000_0, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.util.NoSuchElementException
at java.util.AbstractList$Itr.next(AbstractList.java:350)
at dm_trlog.__loadFromFields(dm_trlog.java:252)
at dm_trlog.parse(dm_trlog.java:201)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
14/11/06 01:51:21 INFO mapred.JobClient: Task Id : attempt_201411060114_0003_m_000001_0, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.util.NoSuchElementException
at java.util.AbstractList$Itr.next(AbstractList.java:350)
at dm_trlog.__loadFromFields(dm_trlog.java:252)
at dm_trlog.parse(dm_trlog.java:201)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
4 mysql的網(wǎng)絡(luò)不在集群網(wǎng)絡(luò)中
sqoop export --connect jdbc:mysql://192.168.119.1:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';
Warning: $HADOOP_HOME is deprecated.
14/11/06 02:04:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/11/06 02:04:30 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/11/06 02:04:30 INFO tool.CodeGenTool: Beginning code generation
14/11/06 02:07:40 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:605)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:628)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:235)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:219)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:283)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1255)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1072)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)
at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)
at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
Caused by: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at java.net.Socket.connect(Socket.java:478)
at java.net.Socket.<init>(Socket.java:375)
at java.net.Socket.<init>(Socket.java:218)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)
... 32 more
5 mysql停止服務(wù)
14/11/06 04:55:25 DEBUG tool.BaseSqoopTool: Enabled debug logging.
14/11/06 04:55:25 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
14/11/06 04:55:25 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:
14/11/06 04:55:25 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.MySQLManager@8a0d5d
14/11/06 04:55:25 INFO tool.CodeGenTool: Beginning code generation
14/11/06 04:55:26 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
14/11/06 04:55:27 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:605)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:628)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:235)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:219)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:283)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1255)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1072)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)
at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)
at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at java.net.Socket.connect(Socket.java:478)
at java.net.Socket.<init>(Socket.java:375)
at java.net.Socket.<init>(Socket.java:218)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)
... 32 more
14/11/06 04:55:27 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
14/11/06 04:55:27 ERROR manager.CatalogQueryManager: Failed to list columns from query: SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = (SELECT SCHEMA()) AND TABLE_NAME = 'dm_trlog'
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.CatalogQueryManager.getColumnNames(CatalogQueryManager.java:147)
at org.apache.sqoop.orm.ClassWriter.getColumnNames(ClassWriter.java:1222)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1074)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)
at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)
at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at java.net.Socket.connect(Socket.java:478)
at java.net.Socket.<init>(Socket.java:375)
at java.net.Socket.<init>(Socket.java:218)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)
... 28 more
14/11/06 04:55:27 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
6 mysql utf8編碼只是3個(gè)字節(jié),可能是因?yàn)槟承﹗nicode字符轉(zhuǎn)成utf8之后變成了4個(gè)字節(jié),需要mysql支持utf8mb4
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/11/09 07:00:33 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/11/09 07:00:33 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/11/09 07:00:33 INFO tool.CodeGenTool: Beginning code generation
14/11/09 07:00:33 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_go_snger` AS t LIMIT 1
14/11/09 07:00:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_go_snger` AS t LIMIT 1
14/11/09 07:00:34 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop/hadoop-1.1.2
Note: /tmp/sqoop-hadoop/compile/211b9679d4ac771d3ff710cbbd1c7277/dm_go_snger.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/11/09 07:00:35 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/211b9679d4ac771d3ff710cbbd1c7277/dm_go_snger.jar
14/11/09 07:00:35 INFO mapreduce.ExportJobBase: Beginning export of dm_go_snger
14/11/09 07:00:36 INFO input.FileInputFormat: Total input paths to process : 1
14/11/09 07:00:36 INFO input.FileInputFormat: Total input paths to process : 1
14/11/09 07:00:36 WARN snappy.LoadSnappy: Snappy native library is available
14/11/09 07:00:36 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/11/09 07:00:36 INFO snappy.LoadSnappy: Snappy native library loaded
14/11/09 07:00:37 INFO mapred.JobClient: Running job: job_201408301703_121362
14/11/09 07:00:38 INFO mapred.JobClient: map 0% reduce 0%
14/11/09 07:00:52 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_0, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
... 10 more
Caused by: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364)
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233)
14/11/09 07:00:59 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_1, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
... 10 more
Caused by: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364)
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233)
14/11/09 07:01:07 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_2, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
... 10 more
Caused by: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364)
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233)
14/11/09 07:01:20 INFO mapred.JobClient: Job complete: job_201408301703_121362
14/11/09 07:01:20 INFO mapred.JobClient: Counters: 8
14/11/09 07:01:20 INFO mapred.JobClient: Job Counters
14/11/09 07:01:20 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=31500
14/11/09 07:01:20 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/11/09 07:01:20 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/11/09 07:01:20 INFO mapred.JobClient: Rack-local map tasks=3
14/11/09 07:01:20 INFO mapred.JobClient: Launched map tasks=4
14/11/09 07:01:20 INFO mapred.JobClient: Data-local map tasks=1
14/11/09 07:01:20 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
14/11/09 07:01:20 INFO mapred.JobClient: Failed map tasks=1
14/11/09 07:01:20 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 44.4052 seconds (0 bytes/sec)
14/11/09 07:01:20 INFO mapreduce.ExportJobBase: Exported 0 records.
14/11/09 07:01:20 ERROR tool.ExportTool: Error during export: Export job failed!
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Logging initialized using configuration in jar:file:/home/hadoop/hadoop/hive-0.10.0.20140629/lib/hive-common-0.10.0.jar!/hive-log4j.properties
Hive history file=/tmp/hadoop/hive_job_log_hadoop_201411090701_1423933997.txt
7 Sqoop調(diào)式信息
增加關(guān)鍵字--verbose
sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --verbose --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';
8 修改生成的Java類,重新打包。
每次通過sqoop導(dǎo)入MySql的時(shí),都會(huì)生成一個(gè)以MySql表命名的.java文件,然后打成JAR包,給sqoop提交給hadoop 的MR來解析Hive表中的數(shù)據(jù)。那可以根據(jù)報(bào)的錯(cuò)誤,找到對(duì)應(yīng)的行,改寫該文件,編譯,重新打包,sqoop可以通過 -jar-file ,--class-name 組合讓我們指定運(yùn)行自己的jar包中的某個(gè)class。來解析該hive表中的每行數(shù)據(jù)。腳本如下:一個(gè)完整的例子如下:
./bin/sqoop export --connect "jdbc:mysql://192.168.119.129:3306/student?useUnicode=true&characterEncoding=utf-8"
--username li72 --password 123 --table dm_trlog
--export-dir /hive/warehouse/trlog --input-fields-terminated-by '\t'
--input-null-string '\\N' --input-null-non-string '\\N'
--class-name com.li72.trlog
--jar-file /tmp/sqoopTempjar/trlog.jar
上面--jar-file 參數(shù)指定jar包的路徑。--class-name 指定jar包中的class。
這樣就可以解決所有解析異常了。
Sqoop命令行說明
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 | Common arguments: --connect < jdbc-uri > Specify JDBC connect string --connection-manager < class-name > Specify connection manager class name --connection-param-file < properties-file > Specify connection parameters file --driver < class-name > Manually specify JDBC driver class to use --hadoop-home < hdir > Override $HADOOP_MAPRED_HOME_ARG --hadoop-mapred-home < dir > Override $HADOOP_MAPRED_HOME_ARG --help Print usage instructions -P Read password from console --password < password > Set authentication password --password-file < password-file > Set authentication password file path --relaxed-isolation Use read-uncommitted isolation for imports --skip-dist-cache Skip copying jars to distributed cache --username < username > Set authentication username --verbose Print more information while working Import control arguments: --append Imports data in append mode --as-avrodatafile Imports data to Avro data files --as-sequencefile Imports data to SequenceFile s --as-textfile Imports data as plain text (default) --boundary-query < statement > Set boundary query for retrieving max and min value of the primary key --columns < col ,col,col...> Columns to import from table --compression-codec < codec > Compression codec to use for import --delete-target-dir Imports data in delete mode --direct Use direct import fast path --direct-split-size < n > Split the input stream every 'n' bytes when importing in direct mode -e,--query < statement > Import results of SQL 'statement' --fetch-size < n > Set number 'n' of rows to fetch from the database when more rows are needed --inline-lob-limit < n > Set the maximum size for an inline LOB -m,--num-mappers < n > Use 'n' map tasks to import in parallel --mapreduce-job-name < name > Set name for generated mapreduce job --split-by < column-name > Column of the table used to split work units --table < table-name > Table to read --target-dir < dir > HDFS plain table destination --validate Validate the copy using the configured validator --validation-failurehandler < validation-failurehandler > Fully qualified class name for ValidationFa ilureHandler --validation-threshold < validation-threshold > Fully qualified class name for ValidationTh reshold --validator < validator > Fully qualified class name for the Validator --warehouse-dir < dir > HDFS parent for table destination --where < where clause> WHERE clause to use during import -z,--compress Enable compression Incremental import arguments: --check-column < column > Source column to check for incremental change --incremental < import-type > Define an incremental import of type 'append' or 'lastmodified' --last-value < value > Last imported value in the incremental check column Output line formatting arguments: --enclosed-by < char > Sets a required field enclosing character --escaped-by < char > Sets the escape character --fields-terminated-by < char > Sets the field separator character --lines-terminated-by < char > Sets the end-of-line character --mysql-delimiters Uses MySQL's default delimiter set: fields: , lines: \n escaped-by: \ optionally-enclosed-by: ' --optionally-enclosed-by < char > Sets a field enclosing character Input parsing arguments: --input-enclosed-by < char > Sets a required field encloser --input-escaped-by < char > Sets the input escape character --input-fields-terminated-by < char > Sets the input field separator --input-lines-terminated-by < char > Sets the input end-of-line char --input-optionally-enclosed-by < char > Sets a field enclosing character Hive arguments: --create-hive-table Fail if the target hive table exists --hive-database < database-name > Sets the database name to use when importing to hive --hive-delims-replacement < arg > Replace Hive record \0x01 and row delimiters (\n\r) from imported string fields with user-defined string --hive-drop-import-delims Drop Hive record \0x01 and row delimiters (\n\r) from imported string fields --hive-home < dir > Override $HIVE_HOME --hive-import Import tables into Hive (Uses Hive's default delimiters if none are set.) --hive-overwrite Overwrite existing data in the Hive table --hive-partition-key < partition-key > Sets the partition key to use when importing to hive --hive-partition-value < partition-value > Sets the partition value to use when importing to hive --hive-table < table-name > Sets the table name to use when importing to hive --map-column-hive < arg > Override mapping for specific column to hive types. HBase arguments: --column-family < family > Sets the target column family for the import --hbase-bulkload Enables HBase bulk loading --hbase-create-table If specified, create missing HBase tables --hbase-row-key < col > Specifies which input column to use as the row key --hbase-table < table > Import to < table > in HBase HCatalog arguments: --hcatalog-database < arg > HCatalog database name --hcatalog-home < hdir > Override $HCAT_HOME --hcatalog-table < arg > HCatalog table name --hive-home < dir > Override $HIVE_HOME --hive-partition-key < partition-key > Sets the partition key to use when importing to hive --hive-partition-value < partition-value > Sets the partition value to use when importing to hive --map-column-hive < arg > Override mapping for specific column to hive types. HCatalog import specific options: --create-hcatalog-table Create HCatalog before import --hcatalog-storage-stanza < arg > HCatalog storage stanza for table creation Accumulo arguments: --accumulo-batch-size < size > Batch size in bytes --accumulo-column-family < family > Sets the target column family for the import --accumulo-create-table If specified, create missing Accumulo tables --accumulo-instance < instance > Accumulo instance name. --accumulo-max-latency < latency > Max write latency in milliseconds --accumulo-password < password > Accumulo password. --accumulo-row-key < col > Specifies which input column to use as the row key --accumulo-table < table > Import to < table > in Accumulo --accumulo-user < user > Accumulo user name. --accumulo-visibility < vis > Visibility token to be applied to all rows imported --accumulo-zookeepers < zookeepers > Comma-separated list of zookeepers (host:port) Code generation arguments: --bindir < dir > Output directory for compiled objects --class-name < name > Sets the generated class name. This overrides --package-name. When combined with --jar-file, sets the input class. --input-null-non-string < null-str > Input null non-string representation --input-null-string < null-str > Input null string representation --jar-file < file > Disable code generation; use specified jar --map-column-java < arg > Override mapping for specific columns to java types --null-non-string < null-str > Null non-string representation --null-string < null-str > Null string representation --outdir < dir > Output directory for generated code --package-name < name > Put auto-generated classes in this package |
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場,如果涉及侵權(quán)請(qǐng)聯(lián)系站長郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。