Example Analysis of the hive-site File in Hive Operations

Published: 2021-12-10 10:37:54 | Source: 億速云 | Author: 小新 | Column: Cloud Computing

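This article walks through the hive-site settings that come up most often in day-to-day Hive operations: the precedence of the configuration sources, the set command, and a list of commonly tuned properties.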

0. Precedence of Hive settings (highest to lowest):

1. The Hive set command.

2. The -hiveconf command-line option.

3. hive-site.xml

4. hive-default.xml

5. hadoop-site.xml (or core-site.xml, hdfs-site.xml, mapred-site.xml)

6. hadoop-default.xml (or core-default.xml, hdfs-default.xml, mapred-default.xml).

Logs: Hive writes its log to /tmp/$USER/hive.log. When a job fails, the Hadoop MapReduce task logs are also worth checking; in this environment they are found under /tmp/nslab.

Command: hive -hiveconf hive.root.logger=DEBUG,console prints debug output to the console.

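Using the set command

1. Show the value of a single setting:

set hive.enforce.bucketing;

2. Entering set on its own lists the settings in effect for the session; set -v additionally prints the Hadoop properties.

3. Assign a new value to a property with the following form: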

set hive.enforce.bucketing=true;
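The three uses of set can be combined into a quick check of the precedence rule: a value assigned with set overrides whatever hive-site.xml provides, but only for the current session. The following is a minimal sketch; hive.exec.parallel is used purely as an example property, and the value printed first depends on the local configuration.

```sql
-- Show the value currently in effect (from hive-site.xml or the built-in default).
set hive.exec.parallel;

-- Override it for this session only; hive-site.xml itself is not modified.
set hive.exec.parallel=true;

-- Show it again: the session-level value now takes precedence.
set hive.exec.parallel;

-- Print Hive and Hadoop properties together.
set -v;
```

Closing the session discards the override, so values that should apply to every session belong in hive-site.xml.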

1. Dynamic partitions:

hive.exec.dynamic.partition

Whether dynamic partitioning is enabled.

Default: false (true from Hive 0.9.0 onward)

hive.exec.dynamic.partition.mode

The mode used once dynamic partitioning is enabled. Two values are available: strict requires at least one static partition column, nonstrict has no such requirement.

Default: strict

hive.exec.max.dynamic.partitions

The maximum total number of dynamic partitions that may be created.

Default: 1000

hive.exec.max.dynamic.partitions.pernode

The maximum number of dynamic partitions that a single mapper or reducer node may create.

Default: 100

hive.exec.default.partition.name

The partition name used when the value of a dynamic partition column is an empty string or NULL.

Default: __HIVE_DEFAULT_PARTITION__
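The difference between strict and nonstrict is easiest to see on an insert statement. The sketch below assumes two hypothetical tables, staging_logs and logs, that are not part of the original article; the partition limits are simply set to the default values listed above.

```sql
-- Hypothetical tables, used only for illustration.
CREATE TABLE IF NOT EXISTS staging_logs (user_id BIGINT, url STRING, dt STRING, country STRING);
CREATE TABLE IF NOT EXISTS logs (user_id BIGINT, url STRING)
PARTITIONED BY (dt STRING, country STRING);

set hive.exec.dynamic.partition=true;
set hive.exec.max.dynamic.partitions=1000;
set hive.exec.max.dynamic.partitions.pernode=100;

-- strict: at least one partition column (dt) must be static;
-- country is dynamic and is taken from the last column of the SELECT.
set hive.exec.dynamic.partition.mode=strict;
INSERT OVERWRITE TABLE logs PARTITION (dt='2021-12-10', country)
SELECT user_id, url, country
FROM staging_logs
WHERE dt = '2021-12-10';

-- nonstrict: every partition column may be dynamic;
-- the trailing SELECT columns must follow the order of the PARTITION clause.
set hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE logs PARTITION (dt, country)
SELECT user_id, url, dt, country
FROM staging_logs;
```

Rows whose country is NULL or an empty string end up in the partition named by hive.exec.default.partition.name.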

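2. Printing column names and enabling row-to-column display (not yet tested)

-- Print column names in the query output.
set hive.cli.print.header=true;

-- Enable row-to-column display; column-name printing must be enabled first.
set hive.cli.print.row.to.vertical=true;

-- Number of columns shown per line.
set hive.cli.print.row.to.vertical.num=1;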

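3. Checking the Hive version:

set hive.hwi.war.file;

On installations that ship the Hive Web Interface, the returned value is the path of the HWI war file, and that file name typically carries the Hive version number.

4. Checking the character encoding of the Hive command line:

hive.cli.encoding

The default character encoding of the Hive command line.

Default: 'UTF8'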

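5. Running queries as a Hive fetch task:

set hive.fetch.task.conversion=more;

Simple statements without aggregation, such as SELECT <col> FROM <table> LIMIT n, do not need a MapReduce job; the data is read directly by a fetch task. (If the data volume is too large, the query may also return no result.)

This is comparable to vi on Linux, which works on the text directly.

It is also somewhat like Shark's columnar storage, where the values are kept in the same array, which is why the query returns quickly.

hive.fetch.task.conversion

Controls which queries Hive converts into a fetch task instead of a MapReduce job. With minimal, only SELECT *, filters on partition columns, and LIMIT qualify; with more, any plain SELECT, filter, and LIMIT qualifies.

Default: minimal (more from Hive 0.14.0 onward)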
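A short sketch of how the two conversion levels treat different queries. The table page_views is hypothetical and exists only for this illustration; whether a particular query is actually converted can be confirmed with EXPLAIN on the local installation.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE IF NOT EXISTS page_views (user_id BIGINT, url STRING, country STRING);

-- minimal: only SELECT * with LIMIT (plus partition-column filters) skips MapReduce.
set hive.fetch.task.conversion=minimal;
SELECT * FROM page_views LIMIT 10;

-- more: column projections and ordinary filters also qualify.
set hive.fetch.task.conversion=more;
SELECT user_id, url FROM page_views WHERE country = 'CN' LIMIT 10;

-- Aggregations are not plain fetches and still launch a job under either level.
SELECT country, count(*) AS cnt FROM page_views GROUP BY country;
```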

6. MapJoin

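Older Hive versions require adding the hint /*+ MAPJOIN(tablelist) */ by hand after the SELECT keyword of the query or subquery, so that the optimizer converts the join into a MapJoin. Newer versions only need:

set hive.auto.convert.join=true;

Hive then decides by itself which table is the small one to load into memory for the join.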
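Both styles are shown side by side in the sketch below. The tables orders and users are hypothetical, with users standing in for the small table.

```sql
-- Hypothetical tables, used only for illustration; users is assumed to be small.
CREATE TABLE IF NOT EXISTS orders (order_id BIGINT, user_id BIGINT, amount DOUBLE);
CREATE TABLE IF NOT EXISTS users (user_id BIGINT, name STRING);

-- Older style: name the small table (by its alias) in the hint.
SELECT /*+ MAPJOIN(u) */ o.order_id, u.name
FROM orders o
JOIN users u ON o.user_id = u.user_id;

-- Newer style: no hint, Hive converts the join when one side is small enough.
set hive.auto.convert.join=true;
SELECT o.order_id, u.name
FROM orders o
JOIN users u ON o.user_id = u.user_id;
```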

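7. Strict Mode:

With hive.mapred.mode=strict, the following queries are not allowed:

Queries on a partitioned table that do not specify a partition

ORDER BY statements without a LIMIT

Cartesian products: a JOIN without an ON clause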
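The three restrictions can be illustrated with one rejected and one accepted query each. The tables events and users are hypothetical; the rejected statements are left commented out so that the block runs as written.

```sql
-- Hypothetical tables, used only for illustration.
CREATE TABLE IF NOT EXISTS events (user_id BIGINT, url STRING) PARTITIONED BY (dt STRING);
CREATE TABLE IF NOT EXISTS users (user_id BIGINT, name STRING);

set hive.mapred.mode=strict;

-- Rejected: no partition filter on a partitioned table.
-- SELECT user_id FROM events;
SELECT user_id FROM events WHERE dt = '2021-12-10';

-- Rejected: ORDER BY without LIMIT.
-- SELECT user_id FROM events WHERE dt = '2021-12-10' ORDER BY user_id;
SELECT user_id FROM events WHERE dt = '2021-12-10' ORDER BY user_id LIMIT 100;

-- Rejected: JOIN without ON (Cartesian product).
-- SELECT e.url, u.name FROM events e JOIN users u WHERE e.dt = '2021-12-10';
SELECT e.url, u.name
FROM events e
JOIN users u ON e.user_id = u.user_id
WHERE e.dt = '2021-12-10';
```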

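8. Running jobs in parallel:

These parameters control whether independent jobs generated from the same SQL statement may run at the same time, and how many of them may run at once.

hive.exec.parallel=true (default: false)

hive.exec.parallel.thread.number=8 (maximum number of jobs run in parallel; 8 is also the default)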
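Parallel execution only helps when a statement compiles into jobs that do not depend on each other. A minimal sketch, assuming two hypothetical tables orders and users: the two aggregations inside the UNION ALL are independent and are therefore candidates for running concurrently.

```sql
-- Hypothetical tables, used only for illustration.
CREATE TABLE IF NOT EXISTS orders (order_id BIGINT, user_id BIGINT, amount DOUBLE);
CREATE TABLE IF NOT EXISTS users (user_id BIGINT, name STRING);

set hive.exec.parallel=true;
set hive.exec.parallel.thread.number=8;

-- Each branch is its own aggregation job; neither needs the other's output.
SELECT t.src, t.cnt
FROM (
  SELECT 'orders' AS src, count(*) AS cnt FROM orders
  UNION ALL
  SELECT 'users' AS src, count(*) AS cnt FROM users
) t;
```

Running jobs in parallel uses more cluster resources at the same time, so the benefit depends on how much spare capacity the cluster has.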

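9. Load balancing

hive.groupby.skewindata=true balances the load when the data is skewed. With this option set to true, the generated query plan contains two MapReduce jobs. In the first job, the map output is distributed randomly across the reducers; each reducer performs a partial aggregation and emits its result, so rows with the same GROUP BY key may be sent to different reducers, which is what balances the load. The second job then distributes the pre-aggregated results to the reducers by GROUP BY key (this step guarantees that identical GROUP BY keys reach the same reducer) and completes the final aggregation.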
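A sketch of how the setting is applied to a skewed aggregation. The table page_views is hypothetical, and the assumption is that a few country values account for most of the rows; EXPLAIN prints the plan so the additional job can be checked on the local installation.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE IF NOT EXISTS page_views (user_id BIGINT, url STRING, country STRING);

set hive.groupby.skewindata=true;

-- Inspect the plan: the GROUP BY should now be split across two MapReduce stages.
EXPLAIN
SELECT country, count(*) AS cnt
FROM page_views
GROUP BY country;

-- Run the aggregation itself.
SELECT country, count(*) AS cnt
FROM page_views
GROUP BY country;
```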

10. hive.exec.rowoffset: whether to provide the row-offset virtual column (default: false).
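Once the property is enabled, the row offset can be selected alongside Hive's other virtual columns. A minimal sketch against a hypothetical table page_views:

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE IF NOT EXISTS page_views (user_id BIGINT, url STRING, country STRING);

set hive.exec.rowoffset=true;

-- INPUT__FILE__NAME and BLOCK__OFFSET__INSIDE__FILE are always available;
-- ROW__OFFSET__INSIDE__BLOCK is only available when hive.exec.rowoffset is true.
SELECT INPUT__FILE__NAME,
       BLOCK__OFFSET__INSIDE__FILE,
       ROW__OFFSET__INSIDE__BLOCK,
       url
FROM page_views
LIMIT 10;
```

These columns are mainly useful for tracing a suspicious row back to the file and position it came from.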

11. With set hive.error.on.empty.partition=true; Hive raises an exception when a dynamic partition insert produces an empty result.

set hive.error.on.empty.partition = true;
set hive.exec.dynamic.partition.mode=nonstrict;

Reference: http://my.oschina.net/repine/blog/541380

12. hive.merge.mapredfiles: merging small files

Session-level settings for merging the files produced by reducers (67108864 bytes = 64 MB):

set hive.merge.smallfiles.avgsize=67108864;
set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;

Reference: http://www.linuxidc.com/Linux/2015-06/118391.htm

1. First define in hive-site.xml what counts as a small file (536870912 bytes = 512 MB in this example).

<property>
<name>hive.merge.smallfiles.avgsize</name>
<value>536870912</value>
<description>When the average output file size of a job is less than this number, Hive will start an additional map-reduce job to merge the output files into bigger files. This is only done for map-only jobs if hive.merge.mapfiles is true, and for map-reduce jobs if hive.merge.mapredfiles is true.</description>
</property>

2. Merge small files in the output of map-only jobs.

<property>
<name>hive.merge.mapfiles</name>
<value>true</value>
<description>Merge small files at the end of a map-only job</description>
</property>

3. Merge small files in the output of jobs that include a reduce phase.

<property>
<name>hive.merge.mapredfiles</name>
<value>true</value>
<description>Merge small files at the end of a map-reduce job</description>
</property>
