0039 - How to Connect to Hive and Impala with the Python Impyla Client

Published: 2020-07-26 10:35:03  Source: Internet  Views: 2096  Author: Hadoop实操  Category: Big Data


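1. Purpose of This Document


The previous chapter covered installing Anaconda on a CDH cluster and setting up a private Python package repository. This chapter explains how to use the Python Impyla client to connect to HiveServer2 and the Impala Daemon of a CDH cluster and run SQL statements.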

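  • Overview

1. Installing the dependency packages

2. Writing the code

3. Testing the code

  • Test environment

1. CM and CDH version 5.11.2

2. RedHat 7.2

  • Prerequisites

1. The CDH cluster is running normally

2. Anaconda is installed and its environment variables are configured

3. pip can install Python packages normally

4. Python 2.6+ or 3.3+

5. The cluster is a non-secure cluster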
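
The Python version prerequisite can be checked from the interpreter itself before installing anything. The following is a minimal sketch; the helper name `is_supported_python` is hypothetical and only encodes the "2.6+ or 3.3+" rule listed above.

```python
import sys


def is_supported_python(version_info=None):
    """Return True if the version satisfies the 2.6+ or 3.3+ prerequisite."""
    # Default to the running interpreter; a plain tuple can be passed for testing.
    major, minor = tuple(version_info or sys.version_info)[:2]
    if major == 2:
        return minor >= 6
    if major == 3:
        return minor >= 3
    # Anything older than 2.x is rejected; anything newer than 3.x is accepted.
    return major > 3


if __name__ == "__main__":
    print(is_supported_python())
```

Running the file with the Anaconda interpreter that will later run Impyla confirms that the right Python is first on the PATH.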

2. Installing the Impyla Dependencies


Python packages that Impyla depends on:

  • six
  • bitarray
  • thrift (on Python 2.x) or thriftpy (on Python 3.x)
  • thrift_sasl
  • sasl

1. First install the Python packages that Impyla depends on

[root@ip-172-31-22-86 ~]# pip install bitarray
[root@ip-172-31-22-86 ~]# pip install thrift==0.9.3
[root@ip-172-31-22-86 ~]# pip install six
[root@ip-172-31-22-86 ~]# pip install thrift_sasl
[root@ip-172-31-22-86 ~]# pip install sasl


Note: thrift must be version 0.9.3. A plain pip install thrift installs 0.10.0 by default; if that happened, uninstall it with pip uninstall thrift and then install 0.9.3.

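2. Install the Impyla package

By default pip installs impyla 0.14.0. That version must be uninstalled and 0.13.8 installed in its place. Note that the captured output below reports impyla-0.14.0, so it appears to come from a default install rather than from the pinned command.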

 [root@ip-172-31-22-86 ec2-user]# pip install impyla==0.13.8
Collecting impyla
  Downloading impyla-0.14.0.tar.gz (151kB)
    100% |████████████████████████████████| 153kB 1.0MB/s 
Requirement already satisfied: six in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: bitarray in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: thrift in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Building wheels for collected packages: impyla
  Running setup.py bdist_wheel for impyla ... done
  Stored in directory: /root/.cache/pip/wheels/96/fa/d8/40e676f3cead7ec45f20ac43eb373edc471348ac5cb485d6f5
Successfully built impyla
Installing collected packages: impyla
Successfully installed impyla-0.14.0
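
Because both thrift and impyla are pinned to specific versions in this article, it can help to confirm what is actually installed before moving on. The following is a minimal sketch, not part of the original article: the names `REQUIRED_PINS`, `installed_versions` and `pin_mismatches` are hypothetical, and the lookup assumes Python 3.8+ for `importlib.metadata` (on Python 2.7, `pip show thrift impyla` gives the same information).

```python
# Pins taken from this article; adjust them if your environment differs.
REQUIRED_PINS = {"thrift": "0.9.3", "impyla": "0.13.8"}


def installed_versions(names):
    """Look up installed versions; assumes Python 3.8+ for importlib.metadata."""
    from importlib import metadata

    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None  # the package is not installed
    return found


def pin_mismatches(installed, pins=REQUIRED_PINS):
    """Return {package: (installed, required)} for every pin that is not met."""
    return {
        name: (installed.get(name), required)
        for name, required in pins.items()
        if installed.get(name) != required
    }


if __name__ == "__main__":
    problems = pin_mismatches(installed_versions(REQUIRED_PINS))
    for name, (have, want) in sorted(problems.items()):
        print("%s: installed %s, article requires %s" % (name, have, want))
```

An empty result means both pins are satisfied; any entry it prints points at a package to uninstall and reinstall with the pinned version.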


3. Writing the Python Code


Connecting to Hive from Python (HiveTest.py)

from impala.dbapi import connect

# Connect to HiveServer2 (port 10000) with PLAIN authentication
conn = connect(host='ip-172-31-21-45.ap-southeast-1.compute.internal', port=10000,
               database='default', auth_mechanism='PLAIN')
print(conn)

cursor = conn.cursor()

# List the databases visible through HiveServer2
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)

# Read the first 10 rows of the test table
cursor.execute('SELECT * FROM test limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
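
The script above never closes its cursor or connection. One way to release them reliably is `contextlib.closing`, shown in the minimal sketch below. The helper name `run_query` is hypothetical; it takes the connect function as an argument (for example `impala.dbapi.connect`) so that the pattern is not tied to a live cluster.

```python
from contextlib import closing


def run_query(connect_fn, sql, **connect_kwargs):
    """Run one statement and return (description, rows), closing resources afterwards."""
    # closing() calls .close() on exit, even if execute() or fetchall() raises.
    with closing(connect_fn(**connect_kwargs)) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(sql)
            return cursor.description, cursor.fetchall()


# Example use against the HiveServer2 host from this article (not executed here):
#   from impala.dbapi import connect
#   description, rows = run_query(
#       connect, 'show databases',
#       host='ip-172-31-21-45.ap-southeast-1.compute.internal',
#       port=10000, database='default', auth_mechanism='PLAIN')
```

Passing the connect function in also makes the helper easy to exercise with a stand-in connection object when no cluster is available.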

Connecting to Impala from Python (ImpalaTest.py)

from impala.dbapi import connect

# Connect to the Impala Daemon (port 21050)
conn = connect(host='ip-172-31-26-80.ap-southeast-1.compute.internal', port=21050)
print(conn)

cursor = conn.cursor()

# List the databases visible through Impala
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)

# Read the first 10 rows of the test table
cursor.execute('SELECT * FROM test limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
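
The two scripts differ only in the arguments passed to `connect`: HiveServer2 uses port 10000 with a database and `auth_mechanism='PLAIN'`, while the Impala Daemon uses port 21050 with no other arguments. A small sketch that captures this difference in one place follows; the helper name `connection_kwargs` is hypothetical, and the values are simply the ones used in this article for a non-secure cluster.

```python
def connection_kwargs(service, host):
    """Build keyword arguments for impala.dbapi.connect, mirroring the two scripts above."""
    if service == "hive":
        # HiveServer2 in this article: port 10000, default database, PLAIN auth.
        return {"host": host, "port": 10000, "database": "default",
                "auth_mechanism": "PLAIN"}
    if service == "impala":
        # Impala Daemon in this article: port 21050, other arguments left at defaults.
        return {"host": host, "port": 21050}
    raise ValueError("unknown service: %r" % (service,))


# Example use (not executed here):
#   from impala.dbapi import connect
#   conn = connect(**connection_kwargs("impala", "ip-172-31-26-80.ap-southeast-1.compute.internal"))
```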

4. Testing the Code


Run the Python scripts from the shell command line.

1. Test the Hive connection

[root@ip-172-31-22-86 ec2-user]# python HiveTest.py
<impala.hiveserver2.HiveServer2Connection object at 0x7f66eee00250>
[('database_name', 'STRING', None, None, None, None, None)]
[('default',)]
[('test.s1', 'STRING', None, None, None, None, None), ('test.s2', 'STRING', None, None, None, None, None)]
[('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')]
[root@ip-172-31-22-86 ec2-user]#


2. Test the Impala connection

[root@ip-172-31-22-86 ec2-user]# python ImpalaTest.py
<impala.hiveserver2.HiveServer2Connection object at 0x7f7e1f2cfad0>
[('name', 'STRING', None, None, None, None, None), ('comment', 'STRING', None, None, None, None, None)]
[('_impala_builtins', 'System database for Impala builtin functions'), ('default', 'Default Hive database')]
[('s1', 'STRING', None, None, None, None, None), ('s2', 'STRING', None, None, None, None, None)]
[('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')]
[root@ip-172-31-22-86 ec2-user]#
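
In the two test runs the same table reports its columns as 'test.s1' and 'test.s2' through Hive but as 's1' and 's2' through Impala. When rows are turned into dictionaries, the table prefix can be stripped so that both sources produce the same keys. The sketch below shows one way to do that; the helper name `rows_to_dicts` is hypothetical, and it relies only on the DB-API convention that the first element of each `cursor.description` entry is the column name.

```python
def rows_to_dicts(description, rows):
    """Pair each row with column names taken from a DB-API cursor.description."""
    # Element 0 of each description entry is the column name. Hive may prefix it
    # with the table name ('test.s1'), so keep only the part after the last dot.
    names = [col[0].rsplit(".", 1)[-1] for col in description]
    return [dict(zip(names, row)) for row in rows]


# Sample values copied from the Hive test output in this article.
hive_description = [
    ("test.s1", "STRING", None, None, None, None, None),
    ("test.s2", "STRING", None, None, None, None, None),
]
sample_rows = [("name1", "age1"), ("name2", "age2")]
records = rows_to_dicts(hive_description, sample_rows)
print(records[0]["s1"], records[0]["s2"])
```

Stripping at the last dot assumes column names themselves contain no dots, which holds for the test table used here.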


5. Common Problems


1. Error 1

building 'sasl.saslwrapper' extension
    creating build/temp.linux-x86_64-2.7
    creating build/temp.linux-x86_64-2.7/sasl
    gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
    unable to execute 'gcc': No such file or directory
    error: command 'gcc' failed with exit status 1

    ----------------------------------------
Command "/opt/cloudera/parcels/Anaconda/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-kD6tvP/sasl/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-WJFNeG-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-kD6tvP/sasl/

Solution:

[root@ip-172-31-22-86 ec2-user]# yum -y install gcc 
[root@ip-172-31-22-86 ec2-user]# yum install gcc-c++ 

2. Error 2

gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
In file included from sasl/saslwrapper.cpp:254:0:
sasl/saslwrapper.h:22:23: fatal error: sasl/sasl.h: No such file or directory
#include <sasl/sasl.h>
                   ^
compilation terminated.
error: command 'gcc' failed with exit status 1

Solution:

[root@ip-172-31-22-86 ec2-user]# yum -y install python-devel.x86_64 cyrus-sasl-devel.x86_64
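
Both errors come from missing build prerequisites for the sasl package, so they can be detected before running pip. The following is a rough preflight sketch under stated assumptions: the helper name `missing_build_deps` is hypothetical, `/usr/include/sasl/sasl.h` is assumed to be where cyrus-sasl-devel places the header on RedHat-style systems, and `shutil.which` requires Python 3.3+. It does not check for the Python headers provided by python-devel.

```python
import os
import shutil

# Assumed header location for cyrus-sasl-devel on RHEL/CentOS; adjust if needed.
SASL_HEADER = "/usr/include/sasl/sasl.h"


def missing_build_deps(which=shutil.which, exists=os.path.exists):
    """Return the yum packages that appear to be missing for building sasl."""
    missing = []
    if which("gcc") is None:
        missing.append("gcc")  # error 1: unable to execute 'gcc'
    if which("g++") is None:
        missing.append("gcc-c++")  # saslwrapper.cpp needs a C++ compiler
    if not exists(SASL_HEADER):
        missing.append("cyrus-sasl-devel")  # error 2: sasl/sasl.h not found
    return missing


if __name__ == "__main__":
    print(missing_build_deps() or "build prerequisites look present")
```

The `which` and `exists` parameters exist only so the check can be exercised without touching the real system; in normal use the function is called with no arguments.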

Original article; reposts are welcome. When reposting, please credit the WeChat public account Hadoop实操.
