Import failure of s3fs library in AWS Glue

AWS Glue is not importing the s3fs module:

import s3fs

I expect the library to be imported, but AWS Glue reports:

ImportError: No module named s3fs

Preachy answered 9/4, 2019 at 8:53 Comment(4)
s3fs.readthedocs.io/en/latest - Selfexecuting
Thanks, but there is nothing about this in there. - Preachy
Do you have boto3? - Selfexecuting
Are you running this import in Databricks? In that case you might want to execute the following: %sh /databricks/python/bin/pip install s3fs - Lisk

AWS Glue jobs come with some common libraries preinstalled, but for anything beyond those you need to download the library's .whl file from PyPI; the s3fs wheel is available on its PyPI project page.

Once you have that, upload it to an S3 bucket, e.g. s3://my-libraries/, and reference it in the Python library path field in the console.


This will prompt Glue to install the libraries in this bucket before running the script. Note that only pure-Python libraries are currently supported.
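The upload-and-reference steps above could also be scripted. What follows is only a rough sketch, not something taken from the Glue docs verbatim: the helper names are made up for illustration, the wheel filename is a placeholder, and it assumes that the console's Python library path field corresponds to the `--extra-py-files` job argument and that boto3 is available wherever the upload runs.

```python
import posixpath


def wheel_s3_uri(bucket, wheel_filename, prefix=""):
    """Return the s3:// URI under which the wheel would be stored (hypothetical helper)."""
    key = posixpath.join(prefix, wheel_filename) if prefix else wheel_filename
    return "s3://{}/{}".format(bucket, key)


def extra_py_files_argument(s3_uris):
    """Build the job argument assumed to match the Python library path field.

    Multiple wheels are passed as one comma-separated string.
    """
    return {"--extra-py-files": ",".join(s3_uris)}


def upload_wheel(local_path, bucket, key):
    """Upload a locally downloaded wheel to S3 (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    boto3.client("s3").upload_file(local_path, bucket, key)


# Placeholder names: substitute your own bucket and the exact wheel you downloaded.
uri = wheel_s3_uri("my-libraries", "s3fs-X.Y.Z-py3-none-any.whl")
job_arguments = extra_py_files_argument([uri])
```

The resulting `job_arguments` dictionary could then be merged into the job's default arguments, or the same S3 URI pasted into the console field by hand.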

Investiture answered 14/5, 2020 at 23:57 Comment(1)
This fixed my problem for Glue 4. I used pip download to get that exact version. FWIW, I was trying to use awswrangler in a pyshell job. - Kyrakyriako

s3fs is included only in Glue 2.0 and later. If you are trying to use it in a Python shell Glue job that runs on Glue 1.0, you'll have to provide the .whl file for s3fs, as mentioned in the answer above.

Here is a list of the default packages for Python Shell jobs: https://docs.aws.amazon.com/glue/latest/dg/add-job-python.html#python-shell-supported-library
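To see what a particular job environment actually provides, something like the following could be run at the top of the script. It is just a sketch, assuming a Python 3 job (importlib.util is not available on Python 2); the function names are illustrative only.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


def missing_modules(module_names):
    """Return the subset of module names that cannot be imported here."""
    return [name for name in module_names if not is_importable(name)]


# Example: print whichever of these the job environment lacks.
print(missing_modules(["s3fs", "boto3"]))
```

If s3fs shows up in the printed list, the job environment does not ship it and the wheel has to be supplied separately.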

Swaraj answered 20/7, 2022 at 14:21 Comment(0)
