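In Python, just as the pymysql package is used to connect to a MySQL database, the impala.dbapi module (provided by the impyla package) can be used to connect to a Hive database. Once the connection and a cursor are created, queries are run with cursor.execute(sql), and the rest of the workflow is largely the same as with pymysql. The required libraries are installed as follows: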
Installation environment: Ubuntu 20.04 LTS
Run the following steps in order:
```
pip install six
pip install bit_array
pip install thriftpy
apt-get install python-dev libsasl2-dev gcc
pip install thrift_sasl
pip install impyla
```
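The apt-get install python-dev libsasl2-dev gcc step installs the compiler and development headers that pip install thrift_sasl needs in order to build its sasl dependency. Without them, the build stops because the header sasl/sasl.h cannot be found (the localized gcc message 没有那个文件或目录 means "No such file or directory"), producing the error below (for Windows, see here):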
```
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting thrift_sasl
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/73/d3/588654faef5511afadc1a091d32fcdbb24ae5f2d90b380874aee68a717f9/thrift_sasl-0.4.2.tar.gz
Collecting thrift>=0.10.0 (from thrift_sasl)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/97/1e/3284d19d7be99305eda145b8aa46b0c33244e4a496ec66440dac19f8274d/thrift-0.13.0.tar.gz (59kB)
     |████████████████████████████████| 61kB 684kB/s
Collecting sasl>=0.2.1 (from thrift_sasl)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/8e/2c/45dae93d666aea8492678499e0999269b4e55f1829b1e4de5b8204706ad9/sasl-0.2.1.tar.gz
Collecting six>=1.13.0 (from thrift_sasl)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/65/eb/1f97cb97bfc2390a276969c6fae16075da282f5058082d4cb10c6c5c1dba/six-1.14.0-py2.py3-none-any.whl
Building wheels for collected packages: thrift-sasl, thrift, sasl
  Building wheel for thrift-sasl (setup.py) ... done
  Created wheel for thrift-sasl: filename=thrift_sasl-0.4.2-cp37-none-any.whl size=4010 sha256=d9aff46bdb4423f147da5e2809198b9feff9d54259b627c6b6d716640b3cc842
  Stored in directory: /root/.cache/pip/wheels/92/22/93/59527f7435acb500da2c80d4eb038377e752009fa47e842fba
  Building wheel for thrift (setup.py) ... done
  Created wheel for thrift: filename=thrift-0.13.0-cp37-cp37m-linux_x86_64.whl size=414111 sha256=113f6ddd0744dea046e5d8764c5858c9d397ca9d910004719662e6ccda1c9792
  Stored in directory: /root/.cache/pip/wheels/dc/f4/14/0cd659ffc6431d0a24534f04087f6239494daf4fb3531c542a
  Building wheel for sasl (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /home/cuper/anaconda3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-zp2a5lr8/sasl/setup.py'"'"'; __file__='"'"'/tmp/pip-install-zp2a5lr8/sasl/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-fc4awf9y --python-tag cp37
       cwd: /tmp/pip-install-zp2a5lr8/sasl/
  Complete output (30 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.7
  creating build/lib.linux-x86_64-3.7/sasl
  copying sasl/__init__.py -> build/lib.linux-x86_64-3.7/sasl
  running egg_info
  writing sasl.egg-info/PKG-INFO
  writing dependency_links to sasl.egg-info/dependency_links.txt
  writing requirements to sasl.egg-info/requires.txt
  writing top-level names to sasl.egg-info/top_level.txt
  reading manifest file 'sasl.egg-info/SOURCES.txt'
  reading manifest template 'MANIFEST.in'
  writing manifest file 'sasl.egg-info/SOURCES.txt'
  copying sasl/saslwrapper.cpp -> build/lib.linux-x86_64-3.7/sasl
  copying sasl/saslwrapper.h -> build/lib.linux-x86_64-3.7/sasl
  copying sasl/saslwrapper.pyx -> build/lib.linux-x86_64-3.7/sasl
  running build_ext
  building 'sasl.saslwrapper' extension
  creating build/temp.linux-x86_64-3.7
  creating build/temp.linux-x86_64-3.7/sasl
  gcc -pthread -B /home/cuper/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/home/cuper/anaconda3/include/python3.7m -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-3.7/sasl/saslwrapper.o
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  In file included from sasl/saslwrapper.cpp:254:
  sasl/saslwrapper.h:22:10: fatal error: sasl/sasl.h: 没有那个文件或目录
     22 | #include <sasl/sasl.h>
        |          ^~~~~~~~~~~~~
  compilation terminated.
  error: command 'gcc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for sasl
  Running setup.py clean for sasl
Successfully built thrift-sasl thrift
Failed to build sasl
ERROR: astroid 2.3.1 requires typed-ast<1.5,>=1.4.0; implementation_name == "cpython" and python_version < "3.8", which is not installed.
ERROR: astroid 2.3.1 has requirement six==1.12, but you'll have six 1.14.0 which is incompatible.
Installing collected packages: six, thrift, sasl, thrift-sasl
  Found existing installation: six 1.12.0
    Uninstalling six-1.12.0:
      Successfully uninstalled six-1.12.0
    Running setup.py install for sasl ... error
    ERROR: Command errored out with exit status 1:
     command: /home/cuper/anaconda3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-zp2a5lr8/sasl/setup.py'"'"'; __file__='"'"'/tmp/pip-install-zp2a5lr8/sasl/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-fm0zp25n/install-record.txt --single-version-externally-managed --compile
         cwd: /tmp/pip-install-zp2a5lr8/sasl/
    Complete output (30 lines):
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.7
    creating build/lib.linux-x86_64-3.7/sasl
    copying sasl/__init__.py -> build/lib.linux-x86_64-3.7/sasl
    running egg_info
    writing sasl.egg-info/PKG-INFO
    writing dependency_links to sasl.egg-info/dependency_links.txt
    writing requirements to sasl.egg-info/requires.txt
    writing top-level names to sasl.egg-info/top_level.txt
    reading manifest file 'sasl.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    writing manifest file 'sasl.egg-info/SOURCES.txt'
    copying sasl/saslwrapper.cpp -> build/lib.linux-x86_64-3.7/sasl
    copying sasl/saslwrapper.h -> build/lib.linux-x86_64-3.7/sasl
    copying sasl/saslwrapper.pyx -> build/lib.linux-x86_64-3.7/sasl
    running build_ext
    building 'sasl.saslwrapper' extension
    creating build/temp.linux-x86_64-3.7
    creating build/temp.linux-x86_64-3.7/sasl
    gcc -pthread -B /home/cuper/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/home/cuper/anaconda3/include/python3.7m -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-3.7/sasl/saslwrapper.o
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    In file included from sasl/saslwrapper.cpp:254:
    sasl/saslwrapper.h:22:10: fatal error: sasl/sasl.h: 没有那个文件或目录
       22 | #include <sasl/sasl.h>
          |          ^~~~~~~~~~~~~
    compilation terminated.
    error: command 'gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /home/cuper/anaconda3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-zp2a5lr8/sasl/setup.py'"'"'; __file__='"'"'/tmp/pip-install-zp2a5lr8/sasl/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-fm0zp25n/install-record.txt --single-version-externally-managed --compile Check the logs for full command output.
```
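The fatal error in the log comes down to a missing header, sasl/sasl.h. As a rough pre-check before running pip, the sketch below looks for that header. The directory list is an assumption about a typical Debian/Ubuntu layout, and `find_header` is a hypothetical helper written for this post, not part of pip or impyla.

```python
import os

# Assumed default include directories on Debian/Ubuntu; adjust for other layouts.
DEFAULT_INCLUDE_DIRS = ["/usr/include", "/usr/local/include"]


def find_header(relative_path, include_dirs=DEFAULT_INCLUDE_DIRS):
    """Return the first existing path to the header, or None if it is not found."""
    for directory in include_dirs:
        candidate = os.path.join(directory, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    location = find_header("sasl/sasl.h")
    if location:
        print("Found SASL header at", location)
    else:
        print("sasl/sasl.h not found; install libsasl2-dev before building sasl")
```

If the header is reported as missing, installing libsasl2-dev as described above should provide it.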
After the installation finishes, test it with the following Python code:
```
from impala.dbapi import connect
```
If the import raises no error, the installation succeeded, and Python can now connect to the Hive database.
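To illustrate the pymysql-like workflow mentioned at the start, here is a minimal sketch of running one query through impala.dbapi. The helper `run_query` and its `connect_fn` parameter are hypothetical names introduced for this example. The host, port, user, password and auth_mechanism values in the commented call are placeholders that depend on how your HiveServer2 is configured (10000 is its usual default port, and PLAIN is a common but not universal authentication setting).

```python
def run_query(sql, connect_fn=None, **conn_kwargs):
    """Open a connection, run one statement, and return all rows.

    connect_fn defaults to impala.dbapi.connect; it is a parameter only so
    the helper can be exercised without a live Hive server.
    """
    if connect_fn is None:
        # Imported lazily so this module loads even where impyla is absent.
        from impala.dbapi import connect as connect_fn

    conn = connect_fn(**conn_kwargs)
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql)       # same DB-API call pattern as pymysql
            return cursor.fetchall()  # list of row tuples
        finally:
            cursor.close()
    finally:
        conn.close()


# Example call (hypothetical host and credentials; replace with your own):
# rows = run_query(
#     "SHOW TABLES",
#     host="hive.example.com",
#     port=10000,
#     auth_mechanism="PLAIN",
#     user="hive_user",
#     password="secret",
# )
# for row in rows:
#     print(row)
```

Closing the cursor and connection in finally blocks releases the server-side session even when the query fails.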