How to remove a path prefix in Python?
S

5

97

I'd like to know the Pythonic way to do this:

I want to remove everything before the wa component of a path. Currently I do it like this:

p = path.split('/')
counter = 0
while True:
    if p[counter] == 'wa':
        break
    counter += 1
path = '/'+'/'.join(p[counter:])

For instance, I want '/book/html/wa/foo/bar/' to become '/wa/foo/bar/'.

Septal answered 1/1, 2012 at 12:11 Comment(1)
FYI, when dealing with paths it is better to use the split/join functions from the os.path module. - Augustine
A
239

A better answer would be to use os.path.relpath:

http://docs.python.org/3/library/os.path.html#os.path.relpath

>>> import os
>>> full_path = '/book/html/wa/foo/bar/'
>>> relative_path = '/book/html'
>>> print(os.path.relpath(full_path, relative_path))
wa/foo/bar
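
Note that the result has neither the leading nor the trailing slash, whereas the question asked for '/wa/foo/bar/'. Below is a minimal sketch of one way to put them back, assuming POSIX-style paths and that full_path really lies below the start directory. The helper name strip_prefix_keep_slashes is hypothetical, and posixpath is used in place of os.path only so the result is the same on every platform.

```python
import posixpath


def strip_prefix_keep_slashes(full_path, start):
    """Return full_path relative to start, rooted at '/' again.

    Hypothetical helper: assumes POSIX-style paths and that
    full_path is strictly below start.
    """
    rel = posixpath.relpath(full_path, start)
    result = '/' + rel
    # relpath normalises away a trailing slash; restore it if the input had one.
    if full_path.endswith('/'):
        result += '/'
    return result


print(strip_prefix_keep_slashes('/book/html/wa/foo/bar/', '/book/html'))
```

Keep in mind that relpath does not raise an error when full_path is outside start; it returns a path containing '..' components instead, so this sketch is only meaningful when the prefix is known to match.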
Aleppo answered 8/11, 2013 at 10:35 Comment(2)
This is a much better answer because it avoids any issues with different path separators. - Decanter
Totally agree with @intrepidhero's comment. In addition, this works whether or not full_path ends with a trailing / character, so it's even more general than that. - Stately
R
42

For Python 3.4+, you should use pathlib.PurePath.relative_to. From the documentation:

>>> from pathlib import PurePosixPath
>>> p = PurePosixPath('/etc/passwd')
>>> p.relative_to('/')
PurePosixPath('etc/passwd')

>>> p.relative_to('/etc')
PurePosixPath('passwd')

>>> p.relative_to('/usr')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pathlib.py", line 694, in relative_to
    .format(str(self), str(formatted)))
ValueError: '/etc/passwd' does not start with '/usr'

Also see this StackOverflow question for more answers to your question.
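
Since relative_to raises ValueError when the prefix does not match, calling code usually has to decide what to do in that case. A small sketch of one option follows; the wrapper name relative_or_none is hypothetical and not part of pathlib.

```python
from pathlib import PurePosixPath


def relative_or_none(path, prefix):
    """Return path relative to prefix, or None if path is not under prefix."""
    try:
        return PurePosixPath(path).relative_to(prefix)
    except ValueError:
        # relative_to raises ValueError when prefix is not a leading part of path.
        return None


print(relative_or_none('/book/html/wa/foo/bar/', '/book/html'))
print(relative_or_none('/book/html/wa/foo/bar/', '/usr'))
```

Returning None is just one choice; re-raising or falling back to the original path may suit other callers better.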

Rotative answered 24/4, 2017 at 23:6 Comment(1)
While the pathlib module is very "user friendly", it wasn't created until very late in the game. Personally I still prefer using os.path.relpath() as shown in the accepted answer because it will work in most versions of Python (including Python 2). - Stately
A
27
>>> path = '/book/html/wa/foo/bar/'
>>> path[path.find('/wa'):]
'/wa/foo/bar/'
Airel answered 1/1, 2012 at 12:20 Comment(4)
+1: compared to using a regular expression, this is simpler, and probably about as fast. - Ferrell
This returns the last character if the string doesn't contain /wa (path[-1:]), so if that might happen you'd want to check if "/wa" in path first. - Banker
Alternatively, you can use str.index instead of str.find to raise an exception when the needle is not in the haystack. - Jacaranda
This doesn't work when several folders have the same name. - Longobard
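
The concerns raised in these comments can be addressed by matching whole path components instead of a substring. This is a sketch only, assuming POSIX-style absolute paths; the function name from_component is hypothetical. Unlike path.find('/wa'), it will not match a folder such as /water, and it raises ValueError when the component is missing. It returns the first matching component and drops any trailing slash.

```python
from pathlib import PurePosixPath


def from_component(path, name):
    """Return the part of path starting at the first component equal to name."""
    # For '/book/html/wa/foo/bar/' parts is ('/', 'book', 'html', 'wa', 'foo', 'bar').
    parts = PurePosixPath(path).parts
    idx = parts.index(name)  # raises ValueError if name is not a component
    return '/' + '/'.join(parts[idx:])


print(from_component('/book/html/wa/foo/bar/', 'wa'))
```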
D
0
import re

path = '/book/html/wa/foo/bar/'
m = re.match(r'.*(/wa/[a-z/]+)', path)
print(m.group(1))
Discern answered 1/1, 2012 at 12:29 Comment(1)
This helps with my second question, which was how to remove the last path component if it is an integer. Nice :) - Septal
N
0

Python 3.9 and later provide two new string methods, str.removeprefix and str.removesuffix.

https://peps.python.org/pep-0616/

These built-in methods behave like the following:

def removeprefix(self: str, prefix: str, /) -> str:
    if self.startswith(prefix):
        return self[len(prefix):]
    else:
        return self[:]

def removesuffix(self: str, suffix: str, /) -> str:
    # suffix='' should not call self[:-0].
    if suffix and self.endswith(suffix):
        return self[:-len(suffix)]
    else:
        return self[:]

This does not find the wa component for you as the question asks, but if you know the full prefix it can help.
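
A short usage sketch with the path from the question, assuming Python 3.9 or later:

```python
path = '/book/html/wa/foo/bar/'

# The prefix matches, so it is stripped.
print(path.removeprefix('/book/html'))

# The prefix does not match, so the string is returned unchanged.
print(path.removeprefix('/other'))
```

Because this is a plain string operation, it knows nothing about path components: removing '/book/htm' would also succeed and leave 'l/wa/foo/bar/'.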

Norenenorfleet answered 12/2 at 22:20 Comment(0)
