Example:
http://example.com/?a=text&q2=text2&q3=text3&q2=text4
After removing "q2", it will return:
http://example.com/?a=text&q3=text3
In this case, there were multiple "q2" and all have been removed.
import sys
if sys.version_info.major == 3:
    from urllib.parse import urlencode, urlparse, urlunparse, parse_qs
else:
    from urllib import urlencode
    from urlparse import urlparse, urlunparse, parse_qs
url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4&b#q2=keep_fragment'
u = urlparse(url)
query = parse_qs(u.query, keep_blank_values=True)
query.pop('q2', None)
u = u._replace(query=urlencode(query, True))
print(urlunparse(u))
Output:
http://example.com/?a=text&q3=text3&b=#q2=keep_fragment
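The same idea can be wrapped in a small helper. This is only a sketch, assuming Python 3; the name remove_query_params is made up for illustration and is not part of urllib. It uses parse_qsl rather than parse_qs so the remaining parameters keep their original order, and repeated keys that are not removed survive as separate pairs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def remove_query_params(url, names):
    """Return url without any query parameter whose name is in names (hypothetical helper)."""
    parts = urlsplit(url)
    # parse_qsl yields (name, value) pairs in order, including repeated names
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(k, v) for k, v in pairs if k not in names]
    # Rebuild the URL; scheme, host, path and fragment are left untouched
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(remove_query_params(
    'http://example.com/?a=text&q2=text2&q3=text3&q2=text4&b#q2=keep_fragment',
    {'q2'},
))
# http://example.com/?a=text&q3=text3&b=#q2=keep_fragment
```

Passing a set of names makes it easy to drop several parameters in one call.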
Would
u = u._replace(query='')
also work? We could avoid an extra import this way. – Apuleius
from urllib.parse import urlencode, urlparse, urlunparse, parse_qs
– Burnham
To remove all query string parameters:
from urllib.parse import urljoin, urlparse
url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
urljoin(url, urlparse(url).path) # 'http://example.com/'
For Python2, replace the import with:
from urlparse import urljoin, urlparse
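Note that the urljoin approach also drops any fragment. If the fragment should be kept, one alternative is to blank out only the query component. A minimal sketch, assuming Python 3; strip_query is a hypothetical name used here for illustration:

```python
from urllib.parse import urlparse

def strip_query(url):
    """Drop the whole query string but keep the rest of the URL, fragment included."""
    # _replace returns a copy of the parsed result with an empty query;
    # geturl() reassembles the components into a string
    return urlparse(url)._replace(query='').geturl()

print(strip_query('http://example.com/?a=text&q2=text2#section'))
# http://example.com/#section
```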
Isn't this just a matter of splitting a string on a character?
>>> url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
>>> url.split('?')[0]
'http://example.com/'
Using furl, a third-party URL manipulation library for Python:
import furl
f = furl.furl("http://example.com/?a=text&q2=text2&q3=text3&q2=text4")
f.remove(['q2'])
print(f.url)
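query_string = "https://example.com/api/api.php?user=chris&auth=true"
# partition() returns the string unchanged when there is no '?';
# slicing with find() would get -1 and silently drop the last character
url = query_string.partition('?')[0]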
Or, more simply, use url_query_cleaner() from w3lib.url:
from w3lib.url import url_query_cleaner
url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
url_query_cleaner(url, ('q2',), remove=True)
Output: http://example.com/?a=text&q3=text3
Another method that gives you more control over the result is urlunparse(), which takes a tuple of the parts returned by urlparse().
For example, recently I needed to change the path but keep the query:
from urllib.parse import urlparse, urlunparse
url = 'https://test.host.com/some/path?type_id=7'
parsed_url = urlparse(url)
modified_path = f'{parsed_url.path}/new_path_ending'
output_url = urlunparse((
    parsed_url.scheme,
    parsed_url.netloc,
    modified_path,
    parsed_url.params,
    parsed_url.query,
    parsed_url.fragment,
))
print(output_url)
'https://test.host.com/some/path/new_path_ending?type_id=7'
This method preserves all of the URL and gives you granular control of what you want to keep, change, and remove.
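When only one component changes, the result of urlparse() is a named tuple, so _replace() can express the same edit without listing all six parts. A short sketch of that variant, assuming Python 3 and reusing the URL from the example above:

```python
from urllib.parse import urlparse

url = 'https://test.host.com/some/path?type_id=7'
parsed_url = urlparse(url)
# Swap in the new path; every other component is carried over unchanged
output_url = parsed_url._replace(path=f'{parsed_url.path}/new_path_ending').geturl()
print(output_url)
# https://test.host.com/some/path/new_path_ending?type_id=7
```

Building the tuple by hand remains the more explicit choice when several components change at once.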
import re
q = "http://example.com/?a=text&q2=text2&q3=text3&q2=text4"
todelete = "q2"
# Delete every "q2=value" pair, along with the "&" that follows it, if any
r = re.sub(r'\b' + re.escape(todelete) + r'=[a-zA-Z_0-9]*&*', '', q)
# Delete the possible trailing "&"
r = re.sub(r'&$', '', r)
print(r)
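Or, if only a trailing parameter has to go, you could use strip. Note that strip() treats its argument as a set of characters to trim from both ends, not as a literal suffix, so it happens to work for this URL but is not safe in general:
>>> l='http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
>>> l.strip('&q2=text4')
'http://example.com/?a=text&q2=text2&q3=text3'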
print(u.geturl()) – Flofloat