Python (3.5) - urllib.request.urlopen - Progress Bar Available?

Asked 12/12, 2016 at 17:50 Answered 25/7, 2020 at 17:45

I've been searching the web for an answer to this, but I suspect the answer may be no. I'm using Python 3.5 and the urllib.request module, calling urllib.request.urlopen(url) to open a link and download a file.

It would be nice to have some kind of measure of progress for this, as the file(s) are over 200MB. I'm looking at the API here, and don't see any kind of parameter with a hook.

Here's my code:

downloadURL = results[3] #got this from code earlier up
rel_path = account + '/' + eventID + '_' + title + '.mp4'

filename_abs_path = os.path.join(script_dir, rel_path)
print('>>> Downloading >>> ' + title)

# Download the .mp4 from the URL and save it locally at `filename_abs_path`:
with urllib.request.urlopen(downloadURL) as response, open(filename_abs_path, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)

Can anyone tell me whether I can get a progress bar with this approach, or is the only way to use a different library? I'd like to keep the code short and simple; I just need some indication that the file is being downloaded. Thanks for any help provided!

Burtburta answered 12/12, 2016 at 17:50 Comment(3)

This doesn't really pertain to urllib, does it? Rather the problem is that copyfileobj() doesn't provide a progress callback. So dup? – Cedrickceevah 12/12, 2016 at 17:53

I would recommend switching to python requests – Readytowear 12/12, 2016 at 17:55

In order to build a progress bar, you need to know the total size of the data you're about to download and the current size of its downloaded portion. The first one can be retrieved from the Content-Length header. You can calculate the second one by reading chunks of data of a specified size and summing the chunks' lengths. The requests library lets you read chunks of data from a connection. – Conatus 12/12, 2016 at 18:1
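
The idea in the comments above (copyfileobj() has no progress callback, so read chunks yourself and sum their lengths) can be sketched as a small replacement for shutil.copyfileobj(). This is only an illustration: the name copy_with_progress and its on_progress parameter are hypothetical, not part of shutil or urllib, and the total is assumed to come from a Content-Length header when the server sends one.

```python
import io


def copy_with_progress(src, dst, total=None, chunk_size=64 * 1024, on_progress=None):
    """Copy src to dst in chunks, calling on_progress(copied, total) after each chunk.

    Hypothetical helper: behaves like shutil.copyfileobj() plus a callback.
    """
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        dst.write(chunk)
        copied += len(chunk)  # running sum of the chunk lengths
        if on_progress is not None:
            on_progress(copied, total)
    return copied


if __name__ == "__main__":
    # Stand-in for a network response: one million bytes held in memory.
    source = io.BytesIO(b"x" * 1000000)
    target = io.BytesIO()
    copy_with_progress(
        source, target, total=1000000,
        on_progress=lambda done, total: print("\r{:.0%}".format(done / total), end=""),
    )
    print()
```

Because the helper only needs objects with read() and write(), the response returned by urlopen() and a file opened with 'wb' can be passed in directly.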

If the response includes a content-length you can read the incoming data in blocks and calculate percent done. Unfortunately, web servers that "chunk" responses don't always provide a content length, so it doesn't always work. Here is an example to test.

import urllib.request
import sys
import io

try:
    url = sys.argv[1]
except IndexError:
    print("usage: test.py url")
    sys.exit(2)

resp = urllib.request.urlopen(url)
length = resp.getheader('content-length')
if length:
    length = int(length)
    blocksize = max(4096, length//100)
else:
    blocksize = 1000000 # just made something up

print(length, blocksize)

buf = io.BytesIO()
size = 0
while True:
    buf1 = resp.read(blocksize)
    if not buf1:
        break
    buf.write(buf1)
    size += len(buf1)
    if length:
        print('\r{:.2%} done'.format(size/length), end='')
print()
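
For the download in the question, the same loop can write straight to the output file instead of an in-memory buffer, which matters for files over 200MB. The following is a minimal sketch under some assumptions: the function names download and show_progress are made up for illustration, and it reads the length through response.headers.get() rather than getheader() so that it also works with non-HTTP responses such as file:// URLs.

```python
import urllib.request


def show_progress(done, total):
    """Rewrite a single console line with the current progress."""
    if total:
        print("\r{:.1%} done".format(done / total), end="", flush=True)
    else:
        # No Content-Length (for example a chunked response): show bytes so far.
        print("\r{} bytes".format(done), end="", flush=True)


def download(url, dest_path, block_size=1024 * 1024, report=show_progress):
    """Stream url into dest_path, calling report(done, total) after each block."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out_file:
        length = response.headers.get("Content-Length")
        total = int(length) if length else None
        done = 0
        while True:
            block = response.read(block_size)
            if not block:
                break
            out_file.write(block)
            done += len(block)
            report(done, total)
    return done
```

With the variables from the question, the call would be download(downloadURL, filename_abs_path), followed by a bare print() to end the progress line.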

Laminitis answered 12/12, 2016 at 18:30 Comment(8)

Hey, thanks for the response. I tried this in a separate file and the error I'm getting is line 24 TypeError: 'float' object is not iterable. – Burtburta 12/12, 2016 at 18:39

I should have done floor division. I've changed the example to blocksize = max(4096, length//100). (I only tested with a small file!) – Laminitis 12/12, 2016 at 19:7

Wonderful, that code is working. Do you know how I can apply it to my download? Would I tweak it and put it under the with urllib.request.urlopen? – Burtburta 12/12, 2016 at 19:12

Yes, it replaces the shutil stuff. – Laminitis 12/12, 2016 at 19:22

Great, this seems to work now! As a final question, is it possible to instead of print on a new line the new percentage, update the same line? If not, it's okay, I'm very grateful for your help and time. – Burtburta 12/12, 2016 at 19:25

I updated the print statement to keep it on one line. – Laminitis 12/12, 2016 at 19:27

Is this possible for POST request? – Oly 27/9, 2022 at 11:30

@Oly - Yes, it should work with any request that gets a response header. – Laminitis 27/9, 2022 at 22:4
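
As a rough sketch of the POST case: with urllib, passing a bytes body as data makes the request a POST, and the response can then be read in blocks exactly as before. The helper names build_post_request and post_and_read, and the form-encoded body, are assumptions for illustration. Note that this tracks the download of the response, not the upload of the request body.

```python
import urllib.parse
import urllib.request


def build_post_request(url, fields):
    """Return a Request that urlopen() will send as a form-encoded POST."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(url, data=body)


def post_and_read(url, fields, block_size=64 * 1024):
    """POST fields to url and read the response in blocks, printing progress."""
    with urllib.request.urlopen(build_post_request(url, fields)) as response:
        length = response.headers.get("Content-Length")
        total = int(length) if length else None
        received = bytearray()
        while True:
            block = response.read(block_size)
            if not block:
                break
            received.extend(block)
            if total:  # a percentage is only possible when the length is known
                print("\r{:.1%} done".format(len(received) / total),
                      end="", flush=True)
    print()
    return bytes(received)
```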

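@tdelaney's answer is great. One note: to get the downloaded bytes back out of the io.BytesIO buffer afterwards, call getvalue() rather than read(). After the writes, the stream position sits at the end of the buffer, so read() returns nothing. Here is the version I used on Python 3.8: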

    import io, urllib.request

    with urllib.request.urlopen(Url) as Response:
        Length = Response.getheader('content-length')
        BlockSize = 1000000  # default value

        if Length:
            Length = int(Length)
            BlockSize = max(4096, Length // 20)

        print("UrlLib len, blocksize: ", Length, BlockSize)

        BufferAll = io.BytesIO()
        Size = 0
        while True:
            BufferNow = Response.read(BlockSize)
            if not BufferNow:
                break
            BufferAll.write(BufferNow)
            Size += len(BufferNow)
            if Length:
                Percent = int((Size / Length)*100)
                print(f"download: {Percent}% {Url}")

        print("Buffer All len:", len(BufferAll.getvalue()))

Homs answered 25/7, 2020 at 17:45 Comment(0)
