溫馨提示×

溫馨提示×

您好,登錄后才能下訂單哦!

密碼登錄×
登錄注冊×
其他方式登錄
點擊 登錄注冊 即表示同意《億速云用戶服務(wù)條款》

Hive命令操作的示例分析

發(fā)布時間:2021-11-08 14:57:48 來源:億速云 閱讀:125 作者:小新 欄目:云計算

這篇文章主要為大家展示了“Hive命令操作的示例分析”,內(nèi)容簡而易懂,條理清晰,希望能夠幫助大家解決疑惑,下面讓小編帶領(lǐng)大家一起研究并學習一下“Hive命令操作的示例分析”這篇文章吧。

1、準備文本文件,啟動hadoop[root@hadoop0 ~]# cat /opt/test.txt
JieJie
MengMeng
NingNing
JingJing
FengJie
[root@hadoop0 ~]# start-all.sh
Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /opt/hadoop/libexec/../logs/hadoop-root-namenode-hadoop0.out
localhost: starting datanode, logging to /opt/hadoop/libexec/../logs/hadoop-root-datanode-hadoop0.out
localhost: starting secondarynamenode, logging to /opt/hadoop/libexec/../logs/hadoop-root-secondarynamenode-hadoop0.out
starting jobtracker, logging to /opt/hadoop/libexec/../logs/hadoop-root-jobtracker-hadoop0.out
localhost: starting tasktracker, logging to /opt/hadoop/libexec/../logs/hadoop-root-tasktracker-hadoop0.out
2、進入命令行[root@hadoop0 ~]# hive
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Logging initialized using configuration in jar:file:/opt/hive/lib/hive-common-0.9.0.jar!/hive-log4j.properties
Hive history file=/tmp/root/hive_job_log_root_201509252001_1674268419.txt
3、查詢昨天的表hive> select * from stu;
OK
JieJie 26       NULL
MM 24   NULL
Time taken: 17.05 seconds
4、顯示數(shù)據(jù)庫hive> show databases; 
OK
default
Time taken: 0.237 seconds
5、創(chuàng)建數(shù)據(jù)庫hive> create database test; 
OK
Time taken: 0.259 seconds
hive> show databases;       
OK
default
test
6、使用數(shù)據(jù)庫Time taken: 0.119 seconds
hive> use test;
OK
Time taken: 0.03 seconds
7、創(chuàng)建表textfile 默認格式,數(shù)據(jù)不做壓縮,磁盤開銷大,數(shù)據(jù)解析開銷大。
可結(jié)合Gzip、Bzip2使用(系統(tǒng)自動檢查,執(zhí)行查詢時自動解壓),但使用這種方式,hive不會對數(shù)據(jù)進行切分,從而無法對數(shù)據(jù)進行并行操作。
SequenceFile是Hadoop API提供的一種二進制文件支持,其具有使用方便、可分割、可壓縮的特點。
SequenceFile支持三種壓縮選擇:NONE, RECORD, BLOCK。 Record壓縮率低,一般建議使用BLOCK壓縮
rcfile是一種行列存儲相結(jié)合的存儲方式。首先,其將數(shù)據(jù)按行分塊,保證同一個record在一個塊上,避免讀一個記錄需要讀取多個block。其次,塊數(shù)據(jù)列式存儲,有利于數(shù)據(jù)壓縮和快速的列存取。
hive>  create table test1(str STRING)  STORED AS TEXTFILE; 
OK
Time taken: 0.598 seconds
--加載數(shù)據(jù)
hive> LOAD DATA LOCAL INPATH '/opt/test.txt' INTO TABLE test1; 
Copying data from file:/opt/test.txt
Copying file: file:/opt/test.txt
Loading data to table test.test1
OK
Time taken: 1.657 seconds
hive> select * from test1;
OK
JieJie
MengMeng
NingNing
JingJing
FengJie
Time taken: 0.388 seconds
hive> select count(*) from test1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201509252000_0001, Tracking URL = http://hadoop0:50030/jobdetails.jsp?jobid=job_201509252000_0001
Kill Command = /opt/hadoop/libexec/../bin/hadoop job  -Dmapred.job.tracker=hadoop0:9001 -kill job_201509252000_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2015-09-25 20:09:55,796 Stage-1 map = 0%,  reduce = 0%
2015-09-25 20:10:19,806 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.67 sec
2015-09-25 20:10:53,218 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
2015-09-25 20:10:54,223 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
MapReduce Total cumulative CPU time: 6 seconds 950 msec
Ended Job = job_201509252000_0001
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 6.95 sec   HDFS Read: 258 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 950 msec
OK
5
Time taken: 77.515 seconds


create table test1(str STRING)  STORED AS TEXTFILE; 
create table test2(str STRING) ;
hive> create table test3(str STRING)  STORED AS SEQUENCEFILE;
OK
Time taken: 0.112 seconds
 
hive> create table test4(str STRING)  STORED AS RCFILE; 
OK
Time taken: 0.502 seconds
8、把舊表數(shù)據(jù)導(dǎo)入新表INSERT OVERWRITE TABLE test4 SELECT * FROM test1;
9、設(shè)置hive參數(shù)hive> SET hive.exec.compress.output=true; 
hive> SET io.seqfile.compression.type=BLOCK;
10、查看hive參數(shù) hive> SET ; 

以上是“Hive命令操作的示例分析”這篇文章的所有內(nèi)容,感謝各位的閱讀!相信大家都有了一定的了解,希望分享的內(nèi)容對大家有所幫助,如果還想學習更多知識,歡迎關(guān)注億速云行業(yè)資訊頻道!

向AI問一下細節(jié)

免責聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點不代表本網(wǎng)站立場,如果涉及侵權(quán)請聯(lián)系站長郵箱:is@yisu.com進行舉報,并提供相關(guān)證據(jù),一經(jīng)查實,將立刻刪除涉嫌侵權(quán)內(nèi)容。

AI