Parallel fetching of files

In order to download files, I'm creating a urlopen object (from the urllib2 module) and reading it in chunks.

I would like to connect to the server several times and download the file in six separate sessions, which should make the download faster. Many download managers have this feature.

I thought about specifying, for each session, the part of the file I would like to download, and somehow processing all the sessions at the same time, but I'm not sure how to achieve this.
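
For reference, here is a minimal sketch of the single-session download I have now (the URL and chunk size are placeholders):

    import urllib2

    # Open one connection and stream the body in fixed-size chunks.
    response = urllib2.urlopen('http://example.com/bigfile.bin')
    with open('bigfile.bin', 'wb') as out:
        while True:
            chunk = response.read(16 * 1024)  # 16 KiB per read
            if not chunk:
                break
            out.write(chunk)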

Swearword answered 25/1, 2012 at 17:45 Comment(0)

Sounds like you want to use one of the flavors of HTTP Range that are available.

edit Updated link to point to the w3.org stored RFC
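
For example, here is a minimal sketch of a single Range request with urllib2 (the URL is a placeholder; a real downloader would issue six such requests in parallel, one per byte range, and stitch the parts together):

    import urllib2

    url = 'http://example.com/bigfile.bin'  # placeholder URL

    # Find the total size so the file can be split into byte ranges.
    size = int(urllib2.urlopen(url).info().getheader('Content-Length'))

    # Ask for only the first half of the file. A server that supports
    # ranges answers 206 Partial Content with just those bytes.
    req = urllib2.Request(url)
    req.add_header('Range', 'bytes=0-%d' % (size // 2 - 1))
    first_half = urllib2.urlopen(req).read()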

Jaret answered 25/1, 2012 at 17:51 Comment(1)
Thanks for mentioning this - updated the link to point to the w3.org RFC, which should be less transient. – Jaret

As we've already discussed, I made one using PycURL.

The one and only thing I had to do was call pycurl_instance.setopt(pycurl_instance.NOSIGNAL, 1) to prevent crashes.

I used APScheduler to fire the requests in separate threads. Thanks to your advice to change the busy wait while True: pass to while True: time.sleep(3) in the main thread, the code behaves quite nicely, and with the Runner module from the python-daemon package the application is almost ready to be used as a typical UN*X daemon.
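
For anyone curious, here is a minimal sketch of the idea (plain threading stands in for APScheduler here, and the URL and file size are placeholders):

    import threading
    import pycurl

    URL = 'http://example.com/bigfile.bin'  # placeholder URL
    SIZE = 6 * 1024 * 1024                  # placeholder; get it from Content-Length in real code
    PART = SIZE // 6

    def fetch_part(index):
        # Each thread gets its own Curl handle; handles are not thread-safe.
        c = pycurl.Curl()
        c.setopt(pycurl.URL, URL)
        start = index * PART
        end = SIZE - 1 if index == 5 else start + PART - 1
        c.setopt(pycurl.RANGE, '%d-%d' % (start, end))
        # The essential option: stop libcurl from using signals for
        # timeouts, which otherwise crashes a multi-threaded program.
        c.setopt(pycurl.NOSIGNAL, 1)
        with open('part-%d' % index, 'wb') as f:
            c.setopt(pycurl.WRITEFUNCTION, f.write)
            c.perform()
        c.close()

    threads = [threading.Thread(target=fetch_part, args=(i,)) for i in range(6)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Afterwards, concatenate part-0 .. part-5 in order to rebuild the file.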

Peers answered 21/11, 2012 at 21:39 Comment(0)
