I have received several recommendations to use virtualenv to clean up my Python modules. I am concerned because it seems too good to be true. Has anyone found downsides related to performance or memory when working with multicore settings, StarCluster, numpy, scikit-learn, pandas, or the IPython notebook?
Virtualenv is the best and easiest way to keep some sort of order when it comes to dependencies. Python is really behind Ruby (bundler!) when it comes to dealing with installing and keeping track of modules. The best tool you have is virtualenv.
So I suggest you create a virtualenv directory for each of your applications, put together a file where you list all the 'pip install' commands you need to build the environment and ensure that you have a clean repeatable process for creating this environment.
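As a rough illustration of the repeatable process described above, here is a minimal sketch. It makes some assumptions that are not in the answer: the packages are listed in a file named requirements.txt (one per line) rather than as separate 'pip install' commands, and the environment is created with the standard-library venv module, which is a stand-in for virtualenv. The helper names are hypothetical.

```python
import os
import venv
from pathlib import Path


def env_python(env_dir):
    """Return the path of the interpreter inside an environment directory."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        # Windows environments keep their executables under Scripts\
        return env_dir / "Scripts" / "python.exe"
    # POSIX environments keep them under bin/
    return env_dir / "bin" / "python"


def pip_install_command(env_dir, requirements_file="requirements.txt"):
    """Build the command that installs every listed package into the environment."""
    return [
        str(env_python(env_dir)),
        "-m", "pip", "install",
        "-r", str(requirements_file),
    ]


def create_env(env_dir):
    """Create the environment (assumption: stdlib venv instead of virtualenv)."""
    venv.create(env_dir, with_pip=True)
```

To rebuild an environment from scratch, you would call create_env("env") and then pass pip_install_command("env") to subprocess.check_call. Keeping the package list in a file under version control is what makes the process repeatable.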
I think that the nature of the application makes little difference. There should not be any performance issue since all that virtualenv does is to load libraries from a specific path rather than load them from the directory where they are saved by default.
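If you want to check this for yourself, the following sketch prints where the running interpreter looks for modules. It relies on sys.prefix and sys.base_prefix differing inside an environment on Python 3.3+, and on older virtualenv releases setting sys.real_prefix; the function names are only illustrative.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter appears to be inside an environment."""
    # Python 3.3+ exposes base_prefix; fall back to prefix if it is missing.
    base = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return hasattr(sys, "real_prefix") or sys.prefix != base


def describe_environment():
    """Collect the locations that determine where modules are imported from."""
    return {
        "in_virtual_environment": in_virtual_environment(),
        "prefix": sys.prefix,
        "search_path": list(sys.path),
    }


info = describe_environment()
print("inside an environment:", info["in_virtual_environment"])
print("prefix:", info["prefix"])
for entry in info["search_path"]:
    print("  ", entry)
```

Running it inside and outside an activated environment should show the same interpreter behaviour with a different prefix and search path, which is the only thing that changes.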
In any case (and this may be completely irrelevant), if performance is an issue, then perhaps you ought to be looking at a compiled language. Most likely, though, any performance bottlenecks could be improved with better coding.
There's no performance overhead to using virtualenv. All it's doing is using different locations in the filesystem.
The only "overhead" is the time it takes to set it up. You'd need to install each package (numpy, pandas, etc.) in your virtualenv.
Virtualenvs do not deal with C dependencies, which may be an issue depending on how keen you are about reproducible builds and capturing all of the machine setup in one process. You might end up needing to install C libraries through another package manager such as brew, apt, or rpm, and these dependencies can differ between machines or change over time. To avoid this, you might end up using docker and friends, which then adds another layer of complexity.
conda tries to address the non-Python dependencies. The issue is that it is bigger and slower.