I'm sending code to Amazon's EMR via the mrjob/boto modules. I've got some external Python dependencies (e.g. numpy, boto) and currently have to download the source of each package and send them over as a tarball in the "python_archives" field of the mrjob.conf file.
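For reference, the relevant part of the config looks roughly like the sketch below. The archive name and path are placeholders made up for illustration, and the layout assumes the usual runners/emr nesting of an mrjob config file.

```yaml
# sketch only: "deps/python-deps.tar.gz" is a hypothetical path
runners:
  emr:
    python_archives:
      - deps/python-deps.tar.gz  # tarball of the downloaded package sources
```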
This makes dependency management messier than I would like, and I'm wondering if I can somehow use the same requirements.txt file I use for my virtualenv setup to bootstrap the EMR instances with my dependencies. Is it possible to set up virtualenvs on EMR instances and do something like:
pip install -r requirements.txt
as I would locally?