Modify URL components in Python 2

Asked 13/6, 2014 at 8:33 Answered 24/8, 2017 at 7:33

Is there a cleaner way to modify some parts of a URL in Python 2?

For example

http://foo/bar -> http://foo/yah

At present, I'm doing this:

import urlparse

url = 'http://foo/bar'

# Modify path component of URL from 'bar' to 'yah'
# Use nasty convert-to-list hack due to urlparse.ParseResult being immutable
parts = list(urlparse.urlparse(url))
parts[2] = 'yah'

url = urlparse.urlunparse(parts)

Is there a cleaner solution?

Marlanamarlane answered 13/6, 2014 at 8:33 Comment(1)

What exactly do you mean by 'clean'? – Mannose 13/6, 2014 at 8:35

Unfortunately, the documentation is out of date; the results produced by urlparse.urlparse() (and urlparse.urlsplit()) use a collections.namedtuple()-produced class as a base.

Don't turn this namedtuple into a list, but make use of the utility method provided for just this task:

parts = urlparse.urlparse(url)
parts = parts._replace(path='yah')

url = parts.geturl()

The namedtuple._replace() method returns a new copy of the tuple with the specified fields replaced; the original is left untouched. The ParseResult.geturl() method then rejoins the parts into a URL for you.

Demo:

>>> import urlparse
>>> url = 'http://foo/bar'
>>> parts = urlparse.urlparse(url)
>>> parts = parts._replace(path='yah')
>>> parts.geturl()
'http://foo/yah'
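Since _replace() accepts several keyword arguments at once, the same pattern extends to changing more than one component in a single call. The following is a minimal sketch, not part of the original answer: replace_url_parts is a hypothetical helper name, and the try/except import is only there so the snippet runs on both Python 2 (urlparse module) and Python 3 (urllib.parse).

```python
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 location


def replace_url_parts(url, **changes):
    """Hypothetical helper: return url with the named components replaced.

    Valid keyword names are the ParseResult fields:
    scheme, netloc, path, params, query, fragment.
    """
    return urlparse(url)._replace(**changes).geturl()


# Replace the path and the query string in one call.
new_url = replace_url_parts('http://foo/bar?x=1', path='/yah', query='x=2')
print(new_url)
```

Passing a name that is not one of the six fields makes _replace() raise an error rather than silently ignoring it, which is one reason to prefer it over indexing into a list.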

mgilson filed a bug report (with patch) to address the documentation issue.

Skiles answered 13/6, 2014 at 8:35 Comment(10)

I was going to point this out. The utility methods are provided to urlparse.ParseResult by the subclass returned by namedtuple. I think that this should be pointed out in the 2.7 docs, because without knowing that, you have no way of knowing that _replace actually is part of the public API for this class... – Diffractive 13/6, 2014 at 8:38

Even more interesting is the mention of BaseResult in the docs which doesn't appear in the source at all ... (sorry about the digression ... It's late ... +1 anyway) – Diffractive 13/6, 2014 at 8:43

@mgilson: heh, indeed, that must be a leftover from before namedtuple was used. – Skiles 13/6, 2014 at 8:45

Thanks - that's a nicer solution. Although, as pointed out in the other comments, there seems to be no way to know about it, based on the docs alone. – Marlanamarlane 13/6, 2014 at 8:56

@GarethStockwell: yeah, looks like a doc bug; none filed yet, I'll do that later. – Skiles 13/6, 2014 at 9:2

@MartijnPieters -- Ninja'd you on the doc bug – Diffractive 13/6, 2014 at 9:11

@mgilson: \o/ less work for me! :-P Can I push you to add a decent example to the docs as well, along the lines of what I did in this answer? – Skiles 13/6, 2014 at 9:15

Whoa there buddy, let's not get carried away now. An example?!?! Who's gonna use that :-) – Diffractive 13/6, 2014 at 9:22

Sadly, this does not allow setting attributes that are not always available, for example the username and password attributes, which are only populated when they were part of the URL that was parsed. – Sansculotte 23/11, 2017 at 16:16

@TimVisée: those extra attributes are derived values. They are part of the netloc value; use _replace() to set a netloc string that includes the login info: parts._replace(netloc='{}:{}@{}'.format(newusername, newpassword, parts.netloc.rpartition('@')[-1])) – Skiles 23/11, 2017 at 16:32
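The netloc approach from the comment above can be spelled out as a small function. This is only a sketch under stated assumptions: with_credentials is a hypothetical name, the example URLs are made up, and the credentials are inserted as-is, so values containing characters such as '@' or ':' would need percent-encoding first. The try/except import lets it run on Python 2 and Python 3.

```python
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 location


def with_credentials(url, username, password):
    """Hypothetical helper: return url with the login info in netloc replaced."""
    parts = urlparse(url)
    # Everything after the last '@' is host[:port]; if there is no '@',
    # rpartition() returns the whole netloc as its last element.
    host = parts.netloc.rpartition('@')[-1]
    netloc = '{}:{}@{}'.format(username, password, host)
    return parts._replace(netloc=netloc).geturl()


replaced = with_credentials('http://old:secret@foo:8080/bar', 'alice', 'pw')
added = with_credentials('http://foo/bar', 'alice', 'pw')
print(replaced)
print(added)
```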

-1

I guess the proper way to do it is as follows, since using private methods or variables such as _replace is not recommended.

from urlparse import urlparse, urlunparse

res = urlparse('http://www.goog.com:80/this/is/path/;param=paramval?q=val&foo=bar#hash')
l_res = list(res)
# l_res is now ['http', 'www.goog.com:80', '/this/is/path/', 'param=paramval', 'q=val&foo=bar', 'hash']
l_res[2] = '/new/path'
urlunparse(l_res)
# outputs 'http://www.goog.com:80/new/path;param=paramval?q=val&foo=bar#hash'

Hitchhike answered 24/8, 2017 at 7:33 Comment(1)

It's part of the public interface; it's just prefixed with an underscore so that it cannot clash with actual field names. – Leaves 29/8, 2017 at 13:47
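To illustrate that point: the underscore-prefixed names are documented namedtuple methods and attributes, and they exist on any namedtuple class, not only on ParseResult. A short sketch follows; Point is a made-up example type, and the try/except import is only there so the snippet runs on both Python 2 and Python 3.

```python
from collections import namedtuple

try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 location

# A made-up namedtuple, to show _replace() is generic namedtuple API.
Point = namedtuple('Point', ['x', 'y'])
moved = Point(1, 2)._replace(x=10)

# _fields lists the names that _replace() accepts for a parsed URL.
fields = urlparse('http://foo/bar')._fields
print(moved)
print(fields)
```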
