Get file size from "Content-Length" value from a file in python 3.2

Asked 21/10, 2012 at 8:42 Answered 23/7, 2015 at 0:28

I want to get the Content-Length value from the meta variable. I need to get the size of the file that I want to download. But the last line returns an error, HTTPMessage object has no attribute getheaders.

import urllib.request
import http.client

#----HTTP HANDLING PART----
url = "http://client.akamai.com/install/test-objects/10MB.bin"

file_name = url.split('/')[-1]
d = urllib.request.urlopen(url)
f = open(file_name, 'wb')

#----GET FILE SIZE----
meta = d.info()

print ("Download Details", meta)
file_size = int(meta.getheaders("Content-Length")[0])

Acrobatics answered 21/10, 2012 at 8:42 Comment(0)

It looks like you are using Python 3 but followed code or documentation written for Python 2.x. It is poorly documented, but in Python 3 the HTTPMessage object returned by d.info() has no getheaders method, only a get_all method.

See this bug report.
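
To illustrate the difference without touching the network, here is a minimal sketch. It builds an http.client.HTTPMessage by hand (the same class that d.info() returns in Python 3) and fills in a made-up Content-Length value; the header value is an assumption for the example, not something fetched from the URL in the question.

```python
import http.client

# Stand-in for d.info(): an empty HTTPMessage with one hand-written header.
meta = http.client.HTTPMessage()
meta["Content-Length"] = "10485760"  # made-up value for illustration

# The Python 2 style method is gone in Python 3...
has_getheaders = hasattr(meta, "getheaders")

# ...but get_all() returns a list of every value for the given header name.
file_size = int(meta.get_all("Content-Length")[0])

print(has_getheaders, file_size)
```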

Fustic answered 21/10, 2012 at 8:51 Comment(2)

For the benefit of people from Google, it seems you can now do file_size = int(d.getheader('Content-Length')) in Python 3 (tested in 3.4.1). d.getheaders() also seems to have been added. – Calumet 19/6, 2014 at 1:21

@freshtop: Both d.getheader() and d.getheaders() work even on Python 3.2. Note: OP uses d.info() instead of d here. d.info().getheader() and d.info().getheaders() are Python 2 code. To support both Python 2 and 3, d.headers['Content-Length'] could be used. – Wnw 23/7, 2015 at 1:3

For Content-Length, call getheader on the response object d itself (not on d.info()):

file_size = int(d.getheader('Content-Length'))

Sacellum answered 21/10, 2012 at 14:56 Comment(2)

I think they are looking for a python3 solution, (at least I am and this is the top google hit) – Felishafelita 29/4, 2014 at 5:56

@ThorSummoner: d.getheader() works on Python 3 only. The question has python-3.x tag and therefore Python 3 only solution is appropriate. – Wnw 23/7, 2015 at 1:4

Change the final line to:

file_size = int(meta.get_all("Content-Length")[0])
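
One caveat: get_all returns None when the header is missing (a server is not required to send Content-Length), so indexing with [0] can fail. A rough sketch of a more defensive version follows; the helper name content_length is hypothetical, and the sample header values are made up.

```python
import http.client


def content_length(headers):
    """Return Content-Length as an int, or None if it is absent or not a number."""
    values = headers.get_all("Content-Length") or []
    if not values:
        return None
    try:
        return int(values[0])
    except ValueError:
        return None


# Hand-built messages standing in for d.info(); the values are made up.
with_length = http.client.HTTPMessage()
with_length["Content-Length"] = "10485760"
without_length = http.client.HTTPMessage()

print(content_length(with_length), content_length(without_length))
```

With the code from the question, this would be called as content_length(meta).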

Breakage answered 22/12, 2014 at 5:42 Comment(0)

You should consider using Requests:

import requests

url = "http://client.akamai.com/install/test-objects/10MB.bin"
resp = requests.get(url)

print resp.headers['content-length']
# '10485760'

For Python 3, use:

print(resp.headers['content-length'])

instead.

Mousey answered 21/10, 2012 at 8:51 Comment(7)

+1, If you only expect one header, go with the item operator. However, I fear there is no headers attribute in Python3, so it should probably be resp.get("Content-Length") or maybe resp["Content-Length"] (didn't try this) – Fustic 21/10, 2012 at 8:56

seems to be no requests libraries in python 3.2...think i should switch versions...which version you guys using ? – Acrobatics 21/10, 2012 at 9:2

@Acrobatics Requests recently added 3.3 support. I am running 2.7.3. – Mousey 21/10, 2012 at 9:3

@Fustic That wasn't an issue, as resp is a Requests response dict. There's one thing I need to change though.. it should be print(resp.headers) instead for Python3. – Mousey 21/10, 2012 at 9:28

@Acrobatics You are welcome! I forgot to change print statement to python3's format in the original answer. – Mousey 21/10, 2012 at 9:30

@KayZhu, yes of course. Overlooked that you had removed the info() call :) – Fustic 21/10, 2012 at 10:50

@Fustic ah ok, though I didn't really remove anything in my post edit. There was never a info() call, I suppose you meant you mislooked? :) – Mousey 21/10, 2012 at 11:5

response.headers['Content-Length'] works on both Python 2 and 3:

#!/usr/bin/env python
from contextlib import closing

try:
    from urllib2 import urlopen
except ImportError: # Python 3
    from urllib.request import urlopen


with closing(urlopen('https://mcmap.net/q/821686/-get-file-size-from-quot-content-length-quot-value-from-a-file-in-python-3-2')) as response:
    print("File size: " + response.headers['Content-Length'])

Wnw answered 23/7, 2015 at 0:28 Comment(4)

This doesn't work if a header is repeated. You only get the first one when using the headers attribute. The only reliable way is to use info().get_all(). In Python2 info().get() would concatenate all duplicate headers but this fragile behavior has been removed for Py3. Unfortunately get_all() hasn't been backported to Py2 so we are stuck having to wrestle with this poorly documented library for more years to come. – Benge 20/7, 2016 at 14:43

@KevinThibedeau: 1- duplicate Content-Length headers with different values are not supported in http 2- info() is implemented as return self.headers. – Wnw 20/7, 2016 at 15:34

From RFC-6265: "Origin servers SHOULD NOT fold multiple Set-Cookie header fields into a single header field". It is not at all unusual to receive duplicate headers. Python's libraries need to support this behavior properly. – Benge 20/7, 2016 at 16:6

@KevinThibedeau: Set-Cookie is a well-known exception -- you should not use it as an example for other http headers. rfc7230 specifies the behavior for the Content-Length header explicitly (read the link from my previous comment). – Wnw 20/7, 2016 at 16:16
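
For anyone following the repeated-header discussion above, here is a small sketch of what the two access styles return in Python 3. It uses a hand-built http.client.HTTPMessage with two made-up Set-Cookie values rather than a real response.

```python
import http.client

# Hand-built headers with a repeated field; the cookie values are made up.
headers = http.client.HTTPMessage()
headers["Set-Cookie"] = "a=1"
headers["Set-Cookie"] = "b=2"  # item assignment appends, it does not overwrite

first_only = headers["Set-Cookie"]          # item access yields a single value
all_values = headers.get_all("Set-Cookie")  # get_all yields every value

print(first_only)
print(all_values)
```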

import urllib.request

link = "<url here>"

f = urllib.request.urlopen(link)
meta = f.info()
print (meta.get("Content-length"))
f.close()

This works with Python 3.x.
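
If only the size is needed, a HEAD request asks the server for the headers without the body. The following is a sketch under a few assumptions: the helper name remote_size and the timeout default are made up, the method argument of urllib.request.Request needs Python 3.3 or newer, and not every server answers HEAD or reports Content-Length.

```python
import urllib.request


def remote_size(url, timeout=10):
    """Return Content-Length from a HEAD request, or None if the server omits it."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        value = response.headers.get("Content-Length")
    return int(value) if value is not None else None
```

It would be called as remote_size(link), with link being the URL from the snippet above.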

Overzealous answered 22/7, 2015 at 17:31 Comment(0)
