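Hive is the foundational data warehouse component of the big data technology stack, and the baseline against which similar data warehouse tools are compared. Basic data operations can be scripted through hive-client. To develop an application, you need to connect through Hive's JDBC driver. This article builds on the examples in the Hive wiki and explains in detail how to connect to a Hive database over JDBC. The original Hive wiki pages are: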
https://cwiki.apache.org/confluence/display/Hive/HiveClient
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC
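First, Hive must be started as a service. We use the HDP platform, and HDP 2.2 starts Hive in HiveServer2 mode by default. HiveServer2 is a more advanced service mode than HiveServer: it provides concurrency control, security mechanisms, and other advanced features that HiveServer cannot. The client code differs slightly depending on which mode the server was started in; see the code for details.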
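To make the difference between the two server modes concrete, here is a minimal sketch that pairs each mode with its driver class and JDBC URL prefix. The driver class names and URL schemes are the ones used in this article's example code; the class name HiveEndpoint and its factory methods are hypothetical helpers introduced only for illustration.

```java
/** Driver class and JDBC URL for one Hive server mode (hypothetical helper). */
final class HiveEndpoint {
    final String driverClass;
    final String url;

    private HiveEndpoint(String driverClass, String url) {
        this.driverClass = driverClass;
        this.url = url;
    }

    /** Original HiveServer: old driver class and a jdbc:hive:// URL. */
    static HiveEndpoint forHiveServer1(String host, int port, String database) {
        return new HiveEndpoint("org.apache.hadoop.hive.jdbc.HiveDriver",
                "jdbc:hive://" + host + ":" + port + "/" + database);
    }

    /** HiveServer2: new driver class and a jdbc:hive2:// URL. */
    static HiveEndpoint forHiveServer2(String host, int port, String database) {
        return new HiveEndpoint("org.apache.hive.jdbc.HiveDriver",
                "jdbc:hive2://" + host + ":" + port + "/" + database);
    }

    public static void main(String[] args) {
        // "hostip" is a placeholder, as in the article's example code
        HiveEndpoint endpoint = forHiveServer2("hostip", 10000, "default");
        System.out.println(endpoint.driverClass + " -> " + endpoint.url);
    }
}
```

Keeping the driver class and the URL scheme together in one place makes it harder to mix a HiveServer driver with a HiveServer2 URL, or the reverse.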
Once the service is running, write the code in Eclipse. The code is as follows:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJdbcClient {
    /* Driver for the original HiveServer */
    // private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";
    /* Driver for HiveServer2 */
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }

        /* JDBC URL format for the original HiveServer */
        // Connection con = DriverManager.getConnection("jdbc:hive://hostip:10000/default", "", "");
        /* JDBC URL format for HiveServer2 */
        try (Connection con = DriverManager.getConnection("jdbc:hive2://hostip:10000/default", "hive", "hive");
             Statement stmt = con.createStatement()) {

            // Parameter-setting test
            // boolean resHivePropertyTest = stmt.execute("SET tez.runtime.io.sort.mb = 128");
            boolean resHivePropertyTest = stmt.execute("set hive.execution.engine=tez");
            System.out.println(resHivePropertyTest);

            String tableName = "testHiveDriverTable";
            // These statements return no result set, so call execute() rather than executeQuery()
            stmt.execute("drop table if exists " + tableName);
            stmt.execute("create table " + tableName + " (key int, value string)");

            // show tables
            String sql = "show tables '" + tableName + "'";
            System.out.println("Running: " + sql);
            ResultSet res = stmt.executeQuery(sql);
            if (res.next()) {
                System.out.println(res.getString(1));
            }

            // describe table
            sql = "describe " + tableName;
            System.out.println("Running: " + sql);
            res = stmt.executeQuery(sql);
            while (res.next()) {
                System.out.println(res.getString(1) + "\t" + res.getString(2));
            }

            // load data into table
            // NOTE: filepath has to be local to the hive server
            // NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
            String filepath = "/tmp/a.txt";
            sql = "load data local inpath '" + filepath + "' into table " + tableName;
            System.out.println("Running: " + sql);
            stmt.execute(sql);

            // select * query
            sql = "select * from " + tableName;
            System.out.println("Running: " + sql);
            res = stmt.executeQuery(sql);
            while (res.next()) {
                System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
            }

            // regular hive query
            sql = "select count(1) from " + tableName;
            System.out.println("Running: " + sql);
            res = stmt.executeQuery(sql);
            while (res.next()) {
                System.out.println(res.getString(1));
            }
        } // the connection and statement are closed here
    }
}
The jars described below can be added to the Eclipse build path, or placed on the classpath at launch.
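For the JDBC driver you can use hive-jdbc.jar, in which case the other jars it depends on must also be included, or you can use the jdbc-standalone jar, which makes the other jars unnecessary. In either case the hadoop-common jar must be included.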
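Before connecting, it can help to confirm that the required classes are actually loadable. The sketch below is one way to do that, assuming the two class names that matter are the HiveServer2 driver from the example code and org.apache.hadoop.conf.Configuration from hadoop-common. The class ClasspathCheck and its method missing are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Reports which of the given classes cannot be found on the classpath (hypothetical helper). */
final class ClasspathCheck {
    static List<String> missing(String... classNames) {
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run its static initializers
                Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missing(
                "org.apache.hive.jdbc.HiveDriver",        // from hive-jdbc or the standalone jar
                "org.apache.hadoop.conf.Configuration");  // from hadoop-common
        if (missing.isEmpty()) {
            System.out.println("All required classes were found.");
        } else {
            System.out.println("Missing from classpath: " + missing);
        }
    }
}
```

Running this with the same classpath as the client shows a missing jar up front, instead of as a NoClassDefFoundError in the middle of getConnection.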
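Run the program and wait for it to finish correctly. If an exception occurs, resolve it according to the message. Solutions for a few exceptions whose messages are unclear follow: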
1. If the classpath or build path does not contain hadoop-common-0.23.9.jar, the following error occurs:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
    at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:393)
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:187)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:247)
    at HiveJdbcClient.main(HiveJdbcClient.java:28)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
2. The Hive JDBC connection to the server hangs:
This can happen when the HiveServer version of the JDBC driver is used to connect to HiveServer2. After the driver connects and requests data as its protocol requires, HiveServer2 returns nothing, so the JDBC driver blocks and never returns.
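The real fix is to match the driver to the server mode, but while diagnosing a hang it may be useful to bound how long the client waits. The following is a minimal sketch, assuming the connect call is run on a helper thread and abandoned after a timeout. It uses only java.util.concurrent; the class ConnectWithTimeout and the method callWithTimeout are hypothetical names, and the sketch does not rely on any timeout support in the Hive driver itself.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/**
 * Runs a blocking call on a helper thread and gives up after a timeout (hypothetical helper).
 *
 * Possible use with the article's connection settings:
 *   Connection con = ConnectWithTimeout.callWithTimeout(
 *       () -> DriverManager.getConnection("jdbc:hive2://hostip:10000/default", "hive", "hive"),
 *       30, TimeUnit.SECONDS);
 */
final class ConnectWithTimeout {
    static <T> T callWithTimeout(Callable<T> task, long timeout, TimeUnit unit) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "jdbc-connect");
            thread.setDaemon(true); // a stuck call must not keep the JVM alive
            return thread;
        });
        try {
            Future<T> future = executor.submit(task);
            try {
                return future.get(timeout, unit);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the helper thread, then report the timeout
                throw e;
            } catch (ExecutionException e) {
                // Surface the original failure (for example an SQLException) when possible
                Throwable cause = e.getCause();
                if (cause instanceof Exception) {
                    throw (Exception) cause;
                }
                throw e;
            }
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A TimeoutException from this wrapper is a hint to check the driver class and URL scheme against the server mode. The blocked call may not respond to the interrupt, which is why the helper thread is a daemon.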
3. TezTask fails with return code 1:
Exception in thread "main" java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
    at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392)
    at HiveJdbcClient.main(HiveJdbcClient.java:40)
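Here return code 1 indicated that user authentication failed: a user name and password must be specified when connecting. It may be possible to configure the server so that statements run without user authentication. In a default HDP installation the user name and password are hive and hive.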
4. TezTask fails with return code 2:
TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.IllegalArgumentException: tez.runtime.io.sort.mb 256 should be larger than 0 and should be less than the available task memory (MB):133
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.getInitialMemoryRequirement(ExternalSorter.java:291)
    at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.initialize(OrderedPartitionedKVOutput.java:95)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.call(LogicalIOProcessorRuntimeTask.java:430)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.call(LogicalIOProcessorRuntimeTask.java:409)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1441168955561_1508_2_00 [Map 1] killed/failed due to:null]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1441168955561_1508_2_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1441168955561_1508_2_01 [Reducer 2] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
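Here return code 2 indicates a parameter error, which generally means that a configured value is unsuitable. The stack trace above shows that tez.runtime.io.sort.mb is set to 256, which is larger than the available task memory (133 MB). To fix it, either change the configuration file or set the parameter to a suitable value before running the query.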
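Setting the value before the query can be sketched as follows. The check mirrors the rule quoted in the error message (larger than 0 and less than the available task memory), and the SET statement is the one that appears commented out in the example code. The class TezSortMemory and the method setSortMbStatement are hypothetical names; the available task memory is an assumption the caller has to supply, for example the 133 MB reported in the stack trace.

```java
/** Builds a SET statement for tez.runtime.io.sort.mb after a sanity check (hypothetical helper). */
final class TezSortMemory {
    static String setSortMbStatement(int sortMb, int availableTaskMemoryMb) {
        // Same rule as in the Tez error message: 0 < sortMb < available task memory
        if (sortMb <= 0 || sortMb >= availableTaskMemoryMb) {
            throw new IllegalArgumentException("tez.runtime.io.sort.mb " + sortMb
                    + " must be larger than 0 and less than the available task memory (MB): "
                    + availableTaskMemoryMb);
        }
        return "SET tez.runtime.io.sort.mb=" + sortMb;
    }

    public static void main(String[] args) {
        // 128 is the value from the commented-out line in the example; 133 comes from the stack trace
        System.out.println(setSortMbStatement(128, 133));
    }
}
```

In the client, the returned string would be passed to stmt.execute(...) on the same Statement before the query is run, since the setting applies to the current session.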
With the setup and parameter corrections above, the application can connect to the Hive database over JDBC correctly.
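You can also manage the Hive database with the squirrel-sql GUI client. Configure its driver the same way as in the code, using the same jars, driver class, and URL. Once the connection test succeeds and the alias is created, you can connect to Hive and manage and operate the database conveniently.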