Tip: to view the high-resolution images, open this page on a mobile phone and tap an image to enlarge it.
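1. Purpose of this document
Following the previous chapter, which covered installing Anaconda on a CDH cluster and setting up a private Python package repository, this chapter explains how to use the Python Impyla client to connect to HiveServer2 and the Impala Daemon on a CDH cluster and run SQL against them.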
Contents:
1. Installing the dependency packages
2. Writing the code
3. Testing the code
Test environment:
1. CM and CDH version 5.11.2
2. RedHat 7.2
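Prerequisites:
1. The CDH cluster is running normally
2. Anaconda is installed and its environment variables are configured
3. pip can install Python packages normally
4. Python 2.6+ or 3.3+
5. Non-secure cluster environment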
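The Python version prerequisite can be checked before installing anything. The following is a minimal sketch; the function name is_supported_python is hypothetical, and the version bounds are simply the ones listed in the prerequisites above.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True for Python 2.6+ on the 2.x line or 3.3+ on the 3.x line."""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor >= 6
    if major == 3:
        return minor >= 3
    # Other major versions are not covered by the prerequisites.
    return False


# Report whether the interpreter running this script meets the prerequisite.
print(is_supported_python())
```

Passing a tuple such as (2, 7) instead of the default lets the same check be applied to a version read from another host.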
2. Installing Impyla's dependencies
Python packages that Impyla depends on
1. First, install the Python packages that Impyla depends on
[root@ip-172-31-22-86 ~]# pip install bitarray
[root@ip-172-31-22-86 ~]# pip install thrift==0.9.3
[root@ip-172-31-22-86 ~]# pip install six
[root@ip-172-31-22-86 ~]# pip install thrift_sasl
[root@ip-172-31-22-86 ~]# pip install sasl
Note: thrift must be version 0.9.3. pip installs 0.10.0 by default, so if that version is present, uninstall it and reinstall 0.9.3. The uninstall command is pip uninstall thrift
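2. Install the Impyla package
By default pip installs impyla 0.14.0 (the captured output below shows that default install); uninstall it and install version 0.13.8 instead.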
[root@ip-172-31-22-86 ec2-user]# pip install impyla==0.13.8
Collecting impyla
Downloading impyla-0.14.0.tar.gz (151kB)
100% |████████████████████████████████| 153kB 1.0MB/s
Requirement already satisfied: six in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: bitarray in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: thrift in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Building wheels for collected packages: impyla
Running setup.py bdist_wheel for impyla ... done
Stored in directory: /root/.cache/pip/wheels/96/fa/d8/40e676f3cead7ec45f20ac43eb373edc471348ac5cb485d6f5
Successfully built impyla
Installing collected packages: impyla
Successfully installed impyla-0.14.0
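Because both thrift and impyla have to be held at specific versions, it can help to confirm what actually ended up installed. The sketch below is one possible check, not part of the original procedure: the names EXPECTED and check_versions are hypothetical, the pins are the ones given in the notes above, and the pkg_resources fallback is only there for older interpreters such as the Python 2.7 used in this environment.

```python
try:
    # Available on Python 3.8 and later.
    from importlib.metadata import version, PackageNotFoundError
except ImportError:
    # Fallback for older interpreters (requires setuptools).
    from pkg_resources import get_distribution
    from pkg_resources import DistributionNotFound as PackageNotFoundError

    def version(name):
        return get_distribution(name).version

# Pins taken from the notes in this article; None means "any version".
EXPECTED = {
    "thrift": "0.9.3",
    "impyla": "0.13.8",
    "bitarray": None,
    "six": None,
    "thrift_sasl": None,
    "sasl": None,
}


def check_versions(expected=EXPECTED, lookup=version):
    """Return {package: (installed_version_or_None, ok)} for each expected package."""
    report = {}
    for name, pin in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        ok = installed is not None and (pin is None or installed == pin)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in sorted(check_versions().items()):
        print("%-12s %-10s %s" % (name, installed or "missing", "OK" if ok else "CHECK"))
```

A package reported as CHECK is either missing or installed at a version other than the pinned one.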
3. Writing the Python code
Connecting to Hive from Python (HiveTest.py)
from impala.dbapi import connect

# Connect to HiveServer2 (port 10000); this non-secure cluster uses PLAIN authentication
conn = connect(host='ip-172-31-21-45.ap-southeast-1.compute.internal', port=10000, database='default', auth_mechanism='PLAIN')
print(conn)
cursor = conn.cursor()
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
cursor.execute('SELECT * FROM test limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
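The script above prints the schema and the rows separately. As an illustration of how the two fit together, the sketch below combines cursor.description with fetchall() to return each row as a dict keyed by column name. The helper name fetch_dicts is hypothetical. SQLite is used here only as a stand-in DB-API driver so that the example runs without a cluster; the assumption is that an impyla connection, which follows the same DB-API 2.0 interface, can be passed in the same way.

```python
import sqlite3  # stand-in DB-API driver so the sketch runs without a cluster
from contextlib import closing


def fetch_dicts(conn, sql):
    """Run sql on a DB-API 2.0 connection and return rows as dicts keyed by column name."""
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        # The first field of each description entry is the column name.
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Demo data shaped like the article's test table (columns s1 and s2).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (s1 TEXT, s2 TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?)",
                 [("name1", "age1"), ("name2", "age2")])
rows = fetch_dicts(conn, "SELECT * FROM test ORDER BY s1 LIMIT 10")
print(rows)
conn.close()
```

Closing the cursor through contextlib.closing keeps the helper from leaking cursors when a query raises.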
Connecting to Impala from Python (ImpalaTest.py)
from impala.dbapi import connect

# Connect to an Impala Daemon over its HiveServer2-protocol port (21050)
conn = connect(host='ip-172-31-26-80.ap-southeast-1.compute.internal', port=21050)
print(conn)
cursor = conn.cursor()
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
cursor.execute('SELECT * FROM test limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
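The two scripts differ only in the arguments passed to connect. A rough sketch of keeping those differences in one place follows; SERVICE_DEFAULTS, connect_kwargs, and open_connection are hypothetical names, and the ports and auth_mechanism values are copied from HiveTest.py and ImpalaTest.py rather than from any other source.

```python
# Connection settings taken from HiveTest.py and ImpalaTest.py in this article.
SERVICE_DEFAULTS = {
    "hive": {"port": 10000, "database": "default", "auth_mechanism": "PLAIN"},
    "impala": {"port": 21050},
}


def connect_kwargs(service, host):
    """Build the keyword arguments for impala.dbapi.connect for the given service."""
    if service not in SERVICE_DEFAULTS:
        raise ValueError("unknown service: %r" % (service,))
    kwargs = dict(SERVICE_DEFAULTS[service])  # copy so the defaults stay untouched
    kwargs["host"] = host
    return kwargs


def open_connection(service, host):
    """Open a connection; impyla is imported lazily so connect_kwargs works without it."""
    from impala.dbapi import connect
    return connect(**connect_kwargs(service, host))


print(connect_kwargs("impala", "ip-172-31-26-80.ap-southeast-1.compute.internal"))
```

With impyla installed, open_connection("hive", host) would then stand in for the connect call in HiveTest.py.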
4. Testing the code
Run the Python scripts from the shell command line to test them
1. Test the Hive connection
[root@ip-172-31-22-86 ec2-user]# python HiveTest.py
<impala.hiveserver2.HiveServer2Connection object at 0x7f66eee00250>
[('database_name', 'STRING', None, None, None, None, None)]
[('default',)]
[('test.s1', 'STRING', None, None, None, None, None), ('test.s2', 'STRING', None, None, None, None, None)]
[('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')]
[root@ip-172-31-22-86 ec2-user]#
2. Test the Impala connection
[root@ip-172-31-22-86 ec2-user]# python ImpalaTest.py
<impala.hiveserver2.HiveServer2Connection object at 0x7f7e1f2cfad0>
[('name', 'STRING', None, None, None, None, None), ('comment', 'STRING', None, None, None, None, None)]
[('_impala_builtins', 'System database for Impala builtin functions'), ('default', 'Default Hive database')]
[('s1', 'STRING', None, None, None, None, None), ('s2', 'STRING', None, None, None, None, None)]
[('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')]
[root@ip-172-31-22-86 ec2-user]#
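Each entry printed from cursor.description is the seven-field tuple defined by the Python DB-API 2.0 specification: name, type_code, display_size, internal_size, precision, scale, null_ok. In the output above, Hive reports column names with a table prefix (test.s1) while Impala reports them bare (s1). The sketch below shows one way to normalize that difference; the Column tuple and the column_names helper are hypothetical, and the sample data is copied from the two test runs.

```python
from collections import namedtuple

# Field names follow the DB-API 2.0 (PEP 249) description layout.
Column = namedtuple(
    "Column",
    "name type_code display_size internal_size precision scale null_ok",
)

# Schema entries copied from the Hive and Impala test output in this article.
hive_description = [
    ('test.s1', 'STRING', None, None, None, None, None),
    ('test.s2', 'STRING', None, None, None, None, None),
]
impala_description = [
    ('s1', 'STRING', None, None, None, None, None),
    ('s2', 'STRING', None, None, None, None, None),
]


def column_names(description, strip_table_prefix=True):
    """Return the column names, optionally dropping a leading 'table.' prefix."""
    names = []
    for entry in description:
        name = Column(*entry).name
        if strip_table_prefix:
            name = name.split(".")[-1]
        names.append(name)
    return names


print(column_names(hive_description))
print(column_names(impala_description))
```

Stripping the prefix lets the same downstream code handle results from either service.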
5. Common problems
1. Error 1
building 'sasl.saslwrapper' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/sasl
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
unable to execute 'gcc': No such file or directory
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "/opt/cloudera/parcels/Anaconda/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-kD6tvP/sasl/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-WJFNeG-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-kD6tvP/sasl/
Solution:
[root@ip-172-31-22-86 ec2-user]# yum -y install gcc
[root@ip-172-31-22-86 ec2-user]# yum install gcc-c++
2. Error 2
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
In file included from sasl/saslwrapper.cpp:254:0:
sasl/saslwrapper.h:22:23: fatal error: sasl/sasl.h: No such file or directory
#include <sasl/sasl.h>
^
compilation terminated.
error: command 'gcc' failed with exit status 1
Solution:
[root@ip-172-31-22-86 ec2-user]# yum -y install python-devel.x86_64 cyrus-sasl-devel.x86_64
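Both errors come from build prerequisites that are missing when pip compiles the sasl extension: the compiler in the first case and the SASL header in the second. A small pre-flight check along the following lines could catch them before running pip. This is only a sketch: missing_build_prereqs is a hypothetical helper, the header path /usr/include/sasl/sasl.h is an assumption about where cyrus-sasl-devel places the file, and shutil.which requires Python 3.3 or later.

```python
import os
import shutil


def missing_build_prereqs(which=shutil.which, exists=os.path.exists,
                          sasl_header="/usr/include/sasl/sasl.h"):
    """Return the compilers and headers that appear to be missing for building sasl."""
    missing = []
    # Error 1: gcc (and the C++ compiler) must be on the PATH.
    for tool in ("gcc", "g++"):
        if which(tool) is None:
            missing.append(tool)
    # Error 2: the SASL development header must exist (path is an assumption).
    if not exists(sasl_header):
        missing.append(sasl_header)
    return missing


print(missing_build_prereqs())
```

An empty list suggests the compiler and header are in place; anything listed points back to the yum commands shown in the two solutions.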
Original article; reprints are welcome. When reprinting, please credit: reposted from the WeChat official account Hadoop實操