Databricks (Spark): .egg dependencies not installed automatically?

I have a locally created .egg package that depends on boto==2.38.0, built as a distribution with setuptools. Everything works in my local environment, where boto is fetched correctly from PyPI. On Databricks, however, dependencies are not fetched automatically when I attach the library to the cluster.

I have struggled for a few days trying to get the dependency installed automatically when the library is loaded on Databricks. The relevant setuptools field is install_requires=['boto==2.38.0'].

When I install boto directly from PyPI on the Databricks server (so not relying on the install_requires field) and then call my own .egg, boto is recognized as a package, but none of its modules are recognized (perhaps because it is not imported in my own .egg's namespace?). So I cannot get my .egg to work. If this problem persists without a solution, it seems like a significant problem for Databricks users. There should be a solution, of course...
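For anyone trying to reproduce the "package found, modules not found" symptom, here is a minimal diagnostic sketch that could be run in a notebook cell on the cluster. It only uses the standard library; the helper name describe_module is made up for illustration, and the boto submodule names are examples of what one might check, not a confirmed list of what fails.

```python
import importlib.util


def describe_module(name):
    """Report whether `name` can be located by the import system, and where."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # ImportError: a parent package could not be imported.
        # ValueError: the module is loaded but has no usable spec.
        spec = None
    if spec is None:
        return {"name": name, "found": False, "origin": None}
    return {"name": name, "found": True, "origin": spec.origin}


# Example names only: compare the top-level package with its submodules.
for candidate in ("boto", "boto.s3", "boto.s3.connection"):
    print(describe_module(candidate))
```

If the top-level package is found but its submodules are not, the origin path can hint at whether a different, incomplete copy of the package is shadowing the one installed from PyPI.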

Thank you!

Domestic answered 20/8, 2015 at 13:12 Comment(3)
Loek, did you ever determine a solution? - Delaine
@JohnA.Ramey I have not, but I also haven't been working on this issue anymore. I remember that the Databricks team told me that they're in the process of resolving this. I assume you're currently running into the same issues? Sorry to hear that. Let me know when you've found a solution yourself :) - Domestic
Any progress on this problem yet? - Frost

In general, your application's dependencies will not work properly if they do not all support the same Python versions. The Databricks docs explain that:

Databricks will install the correct version if the library supports both Python 2 and 3. If the library does not support Python 3 then library attachment will fail with an error.

In that case, Databricks will not automatically fetch dependencies when you attach a library to the cluster.
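Since the dependencies are not fetched for you, one option is to check on the cluster which pinned requirements are actually present, and then attach the missing ones explicitly. The following is a minimal sketch, not an official Databricks mechanism: the helper name missing_requirements is hypothetical, it assumes Python 3.8+ (for importlib.metadata), and it only understands exact pins of the form name==version.

```python
from importlib.metadata import PackageNotFoundError, version


def missing_requirements(pins):
    """Return the pins from `pins` that are absent or installed at another version.

    Only exact pins such as 'boto==2.38.0' are compared by version;
    a bare name like 'boto' is only checked for presence.
    """
    missing = []
    for pin in pins:
        name, _, wanted = pin.partition("==")
        try:
            installed = version(name.strip())
        except PackageNotFoundError:
            # The distribution is not installed in this environment at all.
            missing.append(pin)
            continue
        if wanted and installed != wanted.strip():
            # Installed, but not at the pinned version.
            missing.append(pin)
    return missing


# The pin below is the one from the question's install_requires field.
print(missing_requirements(["boto==2.38.0"]))
```

Any pin this reports would then need to be attached to the cluster as its own library, rather than relying on install_requires.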

Alikee answered 2/1, 2018 at 2:37 Comment(0)
