How Hive Implements DML Data Operations, Partitioned Tables, and Bucketed Tables

Published: 2021-12-16 14:12:28 | Source: 億速云 | Views: 141 | Author: 小新 | Column: Big Data

This article walks through how Hive implements DML data operations, partitioned tables, and bucketed tables. It covers the ways to import and export data first, then how to create and work with partitioned and bucketed tables.

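1. DML Data Operations

1.1 Data Import

1. Import with load data
	load data [local] inpath 'path/to/data' [overwrite]
		-- [local]: without this keyword the path refers to HDFS; with it, the path refers to the local file system
		-- [overwrite]: with this keyword a second import replaces the data from the first import; without it the new data is appended
		
	into table table_name [partition (partcol1=val1,…)];
		-- [partition (partcol1=val1,…)]: the target partition (covered later)
		
Tip: set hive.exec.mode.local.auto=true; lets Hive run MR jobs in local mode (only when certain conditions are met; otherwise the job still runs on the cluster).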
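
As a concrete instance of the syntax above, the following sketch loads a tab-delimited file into a table named student. The table definition and both file paths are assumptions made for illustration; adjust them to your environment.

```sql
-- Hypothetical table for illustration, with two tab-separated columns
create table if not exists student(
  id int,
  name string
)
row format delimited fields terminated by '\t';

-- Load from the local file system: the file is copied and its rows are appended
load data local inpath '/opt/module/hive/datas/student.txt' into table student;

-- Load from HDFS with overwrite: the file is moved into the table directory
-- and replaces the rows that were there before
load data inpath '/user/hive/student.txt' overwrite into table student;
```

Note that a load from HDFS moves the source file, so it no longer exists at its original path afterwards, whereas a local load leaves the source file in place.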


-----------------------------------------------------------
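2. Insert data into a table with a query (Insert)

	2.1 Insert new rows directly into a table
		insert into student values(1,'aa');

	2.2 Insert the result of a query into a table (note: the query result must match the target table in both the number and the types of its columns)
		insert overwrite table table_name select_statement;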
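
The difference between insert into and insert overwrite can be sketched as follows. The table student_backup is a hypothetical name, assumed to have the same two columns as student.

```sql
-- Hypothetical target table with the same column count and types as student
create table if not exists student_backup(id int, name string);

-- insert into appends the query result to whatever the table already holds
insert into table student_backup
select id, name from student;

-- insert overwrite replaces the existing contents with the query result
insert overwrite table student_backup
select id, name from student where id < 1000;
```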


--------------------------------------------------------------
3. Create a table and load data from a query (As Select)
	create table if not exists table_name
	as select_statement;
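
A minimal sketch of this form, assuming a source table student(id int, name string) exists; student_copy is a hypothetical name.

```sql
-- The new table takes its column names and types from the select list
create table if not exists student_copy
as select id, name from student;

-- Inspect the columns Hive derived for the new table
desc student_copy;
```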
	
	
	
----------------------------------------------------------------
4. Specify the data path with Location when creating the table
	create table if not exists student3(
	id int,
	name string
	)
	row format delimited fields terminated by '\t'
	location '/input';
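
To illustrate how a table created this way picks up its data, the sketch below places a file under /input and queries it. It assumes the statements are run from the Hive CLI and that the local file exists; the file name is hypothetical.

```sql
-- Run after the create table statement above.
-- The local file is assumed to hold tab-separated id and name values
dfs -put /opt/module/hive/datas/student.txt /input;

-- No load step is needed: the table reads whatever files sit under /input
select id, name from student3;
```

Because student3 is declared as a managed table, dropping it also deletes the /input directory. Declaring it with create external table instead would leave the files in place.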


--------------------------------------------------------------------
5. Import data (only data produced by export can be imported)
	Note: the target table must not already exist (or must be empty); otherwise an error is raised
	import table db_name.table_name from 'hdfs_export_path';

1.2 Data Export

1. Export with insert
	insert overwrite [local] directory 'path'
	row format delimited fields terminated by '\t' -- field delimiter
	select_statement;
	-- local: with this keyword the data is exported to the local file system; without it, to HDFS
	-- overwrite replaces whatever is already in the target directory, so choose the path carefully

    Example:
	insert overwrite local directory '/opt/module/hive/datas2'
	row format delimited fields terminated by '\t'
	select * from db4.student3;

	insert overwrite directory '/output'
	row format delimited fields terminated by '\t'
	select * from db4.student3;


-------------------------------------------------------------------
2. Export to the local file system with Hadoop commands

	hadoop fs -get 'path/to/table/data'  'local_path'
	hdfs dfs -get 'path/to/table/data'  'local_path'
	From the Hive client: dfs -get 'path/to/table/data'  'local_path';


--------------------------------------------------------------------
3. Export with a Hive shell command
	bin/hive -e 'select * from table_name;' > local_path


--------------------------------------------------------------------
4. Export to HDFS with Export

	export table db_name.table_name to 'hdfs_path';
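
Export pairs with the import statement from section 1.1, so a round trip can be sketched as below. The export directory and the name student3_imported are assumptions chosen for illustration.

```sql
-- Export writes the table metadata together with its data files
-- under the given HDFS directory
export table db4.student3 to '/user/hive/warehouse/export/student3';

-- Import from the same directory into a table that does not exist yet
import table db4.student3_imported from '/user/hive/warehouse/export/student3';

-- The imported table should now return the same rows as the original
select * from db4.student3_imported;
```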


--------------------------------------------------------------------
5. Export with Sqoop
	Covered later.

2. Partitioned Tables and Bucketed Tables

2.1 Partitioned Tables

I. Create a partitioned table
	create table table_name(
		deptno int, dname string, loc string
	)
	partitioned by (column_name column_type) -- the partition column
	row format delimited fields terminated by '\t';

   Example:
	create table dept_partition(
	deptno int, dname string, loc string
	)
	partitioned by (day string)
	row format delimited fields terminated by '\t';


---------------------------------------------------------------------------------
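II. Operations on partitioned tables:

	1. Add partitions (multiple partitions are separated by spaces)
	alter table table_name add partition(partition_column='value') partition(partition_column='value') .......
	
	2. List partitions
	show partitions table_name;
	
	3. Drop partitions (multiple partitions are separated by commas)
	alter table table_name drop partition(partition_column='value'),partition(partition_column='value').......
	
	4. Load data into a partition
	load data [local] inpath 'path' [overwrite] into table table_name partition(partition_column='value');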
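
Applied to the dept_partition table created earlier, the four operations might look like this. The partition values and the log file path are assumptions used only for illustration.

```sql
-- Add two partitions in one statement (space between the partition specs)
alter table dept_partition add partition(day='20200404') partition(day='20200405');

-- List the partitions that the metastore knows about
show partitions dept_partition;

-- Drop two partitions in one statement (comma between the partition specs)
alter table dept_partition drop partition(day='20200404'), partition(day='20200405');

-- Load a local file into a single partition
load data local inpath '/opt/module/hive/datas/dept_20200401.log'
into table dept_partition partition(day='20200401');

-- Filtering on the partition column lets Hive read only that partition
select deptno, dname, loc from dept_partition where day='20200401';
```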


---------------------------------------------------------------------------------------
III. Create a two-level partitioned table
	create table table_name(
	deptno int, dname string, loc string
	)
	partitioned by (column_name1 column_type, column_name2 column_type,......)
	row format delimited fields terminated by '\t';

   Example:
	create table dept_partition2(
	deptno int, dname string, loc string
	)
	partitioned by (day string, hour string)
	row format delimited fields terminated by '\t';


   Load data into a two-level partitioned table (if the partition does not exist when the data is loaded, it is created automatically):
	load data local inpath '/opt/module/hive/datas/dept_20200401.log' into table
	dept_partition2 partition(day='20200401', hour='12');

	load data local inpath '/opt/module/hive/datas/dept_20200402.log' into table
	dept_partition2 partition(day='20200401', hour='13');
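
Once both loads above have run, the result can be checked with the statements below. This is a sketch that assumes no other partitions have been added to dept_partition2.

```sql
-- Each two-level partition is listed in the form day=.../hour=...,
-- which is also the nested directory layout under the table directory
show partitions dept_partition2;

-- Filter on both partition columns to read a single partition
select deptno, dname, loc
from dept_partition2
where day='20200401' and hour='12';
```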


---------------------------------------------------------------
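IV. Ways to associate data with partitions

	When files are placed directly into a partition directory on HDFS, the metastore has no record of that partition, so queries return no rows until one of the following is done.

	1. Option 1: upload the data, then run the repair command
		msck repair table table_name;

	2. Option 2: upload the data, then add the partition
		alter table table_name add partition(column_name='value');

	3. Option 3: create the directory, then load the data into the partition (the partition is created automatically)
		load data local inpath '/opt/module/hive/datas/dept_20200402.log' into table
		dept_partition2 partition(day='20200401', hour='13');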
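
The first two options can be sketched end to end as follows. The warehouse path is an assumption (a table in the default database under the default warehouse directory), as is the hour='14' partition value; the real table location can be checked with desc formatted. The dfs commands are assumed to run from the Hive CLI.

```sql
-- Create the partition directory by hand and upload a file into it
dfs -mkdir -p /user/hive/warehouse/dept_partition2/day=20200401/hour=14;
dfs -put /opt/module/hive/datas/dept_20200401.log /user/hive/warehouse/dept_partition2/day=20200401/hour=14;

-- The metastore does not know this partition yet, so no rows come back
select * from dept_partition2 where day='20200401' and hour='14';

-- Option 1: scan the table directory and register any missing partitions
msck repair table dept_partition2;

-- Option 2, as an alternative to option 1: register the partition explicitly
-- alter table dept_partition2 add partition(day='20200401', hour='14')

-- After either option the uploaded rows are visible
select * from dept_partition2 where day='20200401' and hour='14';
```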

2.2 Bucketed Tables

I. Create a bucketed table:
	create table table_name(id int, name string)
	clustered by(id) -- id: the bucketing column; rows are assigned to buckets based on this column
	into number_of_buckets buckets
	row format delimited fields terminated by '\t';

   Example:
	create table stu_buck(id int, name string)
	clustered by(id) 
	into 4 buckets
	row format delimited fields terminated by '\t';

   Notes:
	 1. In newer versions of Hive, loading data into a bucketed table runs an MR job,
		so the data being loaded is best placed on HDFS.

	 2. The number of buckets should equal the number of ReduceTasks.

	 3. Bucketing rule: the hashCode of the bucketing column's value % the number of buckets determines which bucket a row goes into.
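
The bucketing rule can be made visible with Hive's built-in hash() and pmod() functions. The sketch below assumes a non-bucketed staging table student(id int, name string) already holds the rows; that table is hypothetical here.

```sql
-- Populate the bucketed table through a query so Hive distributes the rows
insert into table stu_buck
select id, name from student;

-- Show the bucket number that the rule (hash of id modulo 4) gives for each row
select id, name, pmod(hash(id), 4) as bucket_no
from stu_buck;

-- Read roughly one bucket's worth of rows by sampling on the bucketing column
select * from stu_buck tablesample(bucket 1 out of 4 on id);
```

Treat bucket_no as an illustration of the rule rather than a guaranteed file index: newer Hive versions may use a different internal hash function when laying out the bucket files.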

That concludes "How Hive Implements DML Data Operations, Partitioned Tables, and Bucketed Tables". Thank you for reading; hopefully the material above is helpful.
