Unable to load libhdfs when using pyarrow

I'm trying to connect to HDFS through pyarrow, but the connection fails because the libhdfs library cannot be loaded.

libhdfs.so is in $HADOOP_HOME/lib/native as well as in $ARROW_LIBHDFS_DIR.

import os
from pyarrow import hdfs
print(os.environ['ARROW_LIBHDFS_DIR'])
fs = hdfs.connect()


bash-3.2$ ls $ARROW_LIBHDFS_DIR
examples            libhadoop.so.1.0.0  libhdfs.a           libnativetask.a
libhadoop.a         libhadooppipes.a    libhdfs.so          libnativetask.so
libhadoop.so        libhadooputils.a    libhdfs.so.0.0.0    libnativetask.so.1.0.0
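
A quick way to double-check what is actually visible from Python is to list the libhdfs files in the two directories mentioned above. This is only a minimal sketch: the helper name `find_libhdfs_candidates` is made up for illustration, and it merely lists files; it does not reproduce how Arrow itself searches for the library. If this is macOS (the bash-3.2 prompt hints at it), it may also be worth checking whether the loader expects a `.dylib` rather than a `.so`; that is an assumption to verify, not something confirmed in this thread.

```python
import os


def find_libhdfs_candidates(environ=None):
    """List files whose names start with 'libhdfs' in the directories
    named by ARROW_LIBHDFS_DIR and $HADOOP_HOME/lib/native.

    Hypothetical helper for illustration; not part of pyarrow.
    """
    environ = os.environ if environ is None else environ

    search_dirs = []
    if environ.get('ARROW_LIBHDFS_DIR'):
        search_dirs.append(environ['ARROW_LIBHDFS_DIR'])
    if environ.get('HADOOP_HOME'):
        search_dirs.append(os.path.join(environ['HADOOP_HOME'], 'lib', 'native'))

    found = []
    for directory in search_dirs:
        # Skip entries that point at a missing or non-directory path.
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            if name.startswith('libhdfs'):
                found.append(os.path.join(directory, name))
    return found


if __name__ == '__main__':
    for path in find_libhdfs_candidates():
        print(path)
```

An empty result would suggest the environment variables seen by the Python process differ from the ones in the shell.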

The error I'm getting:

Traceback (most recent call last):
  File "wine-pred-ml.py", line 31, in <module>
    fs = hdfs.connect()
  File "/Users/PVZP/Library/Python/2.7/lib/python/site-packages/pyarrow/hdfs.py", line 183, in connect
    extra_conf=extra_conf)
  File "/Users/PVZP/Library/Python/2.7/lib/python/site-packages/pyarrow/hdfs.py", line 37, in __init__
    self._connect(host, port, user, kerb_ticket, driver, extra_conf)
  File "pyarrow/io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Unable to load libhdfs
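
The message "Unable to load libhdfs" does not say why loading failed. One way to get more detail is to try opening the file directly with `ctypes` and read the operating system's error. This is a diagnostic sketch only: `try_load_shared_library` is a hypothetical helper, and loading through `ctypes` is not the same code path Arrow uses, so a success here does not guarantee that pyarrow will succeed.

```python
import ctypes


def try_load_shared_library(path):
    """Try to load a shared library and report the loader's error, if any.

    Returns (True, '') on success, or (False, message) on failure.
    Hypothetical helper for illustration; not part of pyarrow.
    """
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        # The message typically names the underlying problem, e.g. a
        # missing file, a wrong binary format, or a missing dependency.
        return False, str(exc)
    return True, ''


if __name__ == '__main__':
    import os
    lib_dir = os.environ.get('ARROW_LIBHDFS_DIR', '')
    ok, message = try_load_shared_library(os.path.join(lib_dir, 'libhdfs.so'))
    print('loaded' if ok else 'failed: ' + message)
```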
Acrylonitrile answered 31/10, 2018 at 16:11 Comment(3)
Solved by using Conda to install libhdfs3 and pyarrow rather than trying to build it myself or using the libhdfs prepackaged with Hadoop. - Acrylonitrile
Thanks for the follow-up. Can you share more details? I installed libhdfs3 using conda and set driver='libhdfs3', but I still get "Unable to load libhdfs3". What am I missing? - Puccini
I would also like to know how you solved this, Pablo. - Androsphinx

This solved the issue for me:

conda install libhdfs3 pyarrow

Then, in your script.py:

import os
os.environ['ARROW_LIBHDFS_DIR'] = '/opt/cloudera/parcels/CDH/lib64/'

Here, the path is the directory that contains the libhdfs3 library. In my case, this is where Cloudera installs it. Set the variable before connecting.
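
Putting the two pieces together, the variable has to be set before the connection is attempted. Below is a minimal sketch under a few assumptions: `connect_with_libhdfs_dir` is a hypothetical wrapper, the `connect` parameter exists only so the function can be tried without a cluster, and it relies on the legacy `pyarrow.hdfs` module used in the question, which may not exist in newer pyarrow releases.

```python
import os


def connect_with_libhdfs_dir(lib_dir, connect=None, **kwargs):
    """Set ARROW_LIBHDFS_DIR, then open an HDFS connection.

    Hypothetical wrapper for illustration. `connect` defaults to the
    legacy pyarrow.hdfs.connect used in the question; any keyword
    arguments are passed through to it unchanged.
    """
    # The environment variable must be in place before connecting.
    os.environ['ARROW_LIBHDFS_DIR'] = lib_dir

    if connect is None:
        # Imported lazily so the function can be exercised without pyarrow.
        from pyarrow import hdfs
        connect = hdfs.connect

    return connect(**kwargs)


if __name__ == '__main__':
    # Path taken from the answer above; adjust it to your installation.
    fs = connect_with_libhdfs_dir('/opt/cloudera/parcels/CDH/lib64/')
```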

Hijoung answered 26/8, 2020 at 9:9 Comment(3)
How can I pip install this instead? - Gamic
Simply use Anaconda ;) - Hijoung
Unfortunately I can't in this environment. I did find a libhdfs.so file already installed elsewhere, though. - Gamic
