PyHive Thrift transport exception: read 0 bytes

I'm trying to connect to Hive server-2 running inside a Docker container (from outside the container) via Python (PyHive 0.5, Python 2.7), following the DB-API (asynchronous) example:

from pyhive import hive
conn = hive.connect(host='172.17.0.2', port='10001', auth='NOSASL')

However, I'm getting the following error:

Traceback (most recent call last):
  File "py_2.py", line 4, in <module>
    conn = hive.connect(host='172.17.0.2', port='10001', auth='NOSASL')
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/pyhive/hive.py", line 64, in connect
    return Connection(*args, **kwargs)
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/pyhive/hive.py", line 164, in __init__
    response = self._client.OpenSession(open_session_req)
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/TCLIService/TCLIService.py", line 187, in OpenSession
    return self.recv_OpenSession()
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/TCLIService/TCLIService.py", line 199, in recv_OpenSession
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 148, in readMessageBegin
    name = self.trans.readAll(sz)
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 161, in read
    self.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 132, in read
    message='TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

The Docker image that I'm using is rohitbarnwal7/spark (tag: mysql_corrected). It runs the following services (as reported by the jps command):

992 Master
1810 RunJar
259 DataNode
2611 Jps
584 ResourceManager
1576 RunJar
681 NodeManager
137 NameNode
426 SecondaryNameNode
1690 RunJar
732 HistoryServer

I'm launching the container using

docker run -it -p 8088:8088 -p 8042:8042 -p 4040:4040 -p 18080:18080 -p 10002:10002 -p 10000:10000 -e 3306 -e 9084 -h sandbox -v /home/foodie/docker/w1:/usr/tmp/test rohitbarnwal7/spark:mysql_corrected bash
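
One thing that may be worth checking here is which container ports the command above actually publishes. The following is a minimal sketch, not part of the original setup: the helper name `published_ports` is hypothetical, and it only handles single-port `-p` values (no port ranges). It matters only when connecting through the host's published ports rather than directly to the container IP.

```python
import shlex

def published_ports(docker_cmd):
    """Return the set of container ports published with -p/--publish.

    Hypothetical helper for illustration; port ranges are not handled.
    """
    tokens = shlex.split(docker_cmd)
    ports = set()
    for flag, value in zip(tokens, tokens[1:]):
        if flag in ("-p", "--publish"):
            # value looks like [ip:]host_port:container_port[/proto]
            container_part = value.rsplit(":", 1)[-1]
            ports.add(int(container_part.split("/")[0]))
    return ports

cmd = ("docker run -it -p 8088:8088 -p 8042:8042 -p 4040:4040 -p 18080:18080 "
       "-p 10002:10002 -p 10000:10000 -e 3306 -e 9084 -h sandbox "
       "-v /home/foodie/docker/w1:/usr/tmp/test "
       "rohitbarnwal7/spark:mysql_corrected bash")

ports = published_ports(cmd)
print(sorted(ports))   # [4040, 8042, 8088, 10000, 10002, 18080]
print(10001 in ports)  # False
```

With the command as written, port 10001 is not among the published ports, while 10000 and 10002 are.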

Furthermore, I perform the following steps to launch the Hive server inside the Docker container:

  1. Start mysql service: service mysqld start
  2. Switch to directory /usr/local/hive: cd $HIVE_HOME
  3. Launch Hive metastore server: nohup bin/hive --service metastore &
  4. Launch Hive server 2: hive --service hive-server2 (note that thrift-server port is already changed to 10001 in /usr/local/hive/conf/hive-site.xml)
  5. Launch beeline shell: beeline
  6. Connect beeline shell with Hive server-2: !connect jdbc:hive2://localhost:10001/default;transportMode=http;httpPath=cliservice
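
The beeline URL in step 6 carries `transportMode=http` and `httpPath=cliservice`, whereas the traceback shows the Python client reading from a plain `TSocket`. A mismatch between the two transports is one possible explanation worth ruling out. The sketch below only splits the JDBC URL into its parts so the session variables are easy to compare against the client settings; `parse_hive_jdbc_url` is a hypothetical helper, not a PyHive or Hive API, and it assumes the simple `host:port/db;key=value` URL shape used above.

```python
def parse_hive_jdbc_url(url):
    """Split a jdbc:hive2 URL into host, port, database and session variables.

    Hypothetical helper for illustration; it does not handle every URL form.
    """
    prefix = "jdbc:hive2://"
    if not url.startswith(prefix):
        raise ValueError("not a jdbc:hive2 URL: %r" % url)
    rest = url[len(prefix):]
    # Session variables follow the database name, separated by ';'
    location, _, var_part = rest.partition(";")
    hostport, _, database = location.partition("/")
    host, _, port = hostport.partition(":")
    session_vars = dict(
        item.split("=", 1) for item in var_part.split(";") if "=" in item
    )
    return {
        "host": host,
        "port": int(port) if port else None,
        "database": database or "default",
        "session_vars": session_vars,
    }

info = parse_hive_jdbc_url(
    "jdbc:hive2://localhost:10001/default;transportMode=http;httpPath=cliservice"
)
print(info["port"])                          # 10001
print(info["session_vars"]["transportMode"])  # http
print(info["session_vars"]["httpPath"])       # cliservice
```

If the server is indeed configured for HTTP transport, a client that opens a raw binary Thrift socket to the same port would not be speaking the protocol the server expects.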

I've already tried the following things without any luck

  1. Making Python 2.7.3 the default Python version inside the Docker container (the original default is Python 2.6.6; Python 2.7.3 is installed inside the container but isn't the default)
  2. Changing the Hive server port to its default value: 10000
  3. Trying to connect to the Hive server by running the same Python script inside the container (it still gives the same error)

Goshen answered 23/10, 2017 at 10:6

Comments (2)

On step 6, when trying to connect beeline shell with Hive server-2, hit Enter when prompted for username and password. (Goshen)
A minor correction: it's not Python v2.7.3 but v2.7.13. (Goshen)