How to remove all characters after a specific character in python?

C

11

241

I have a string. How do I remove all text after a certain character? (In this case ...)
The text after the ... will change, which is why I want to remove all characters after a certain one.

Consuetudinary answered 24/5, 2009 at 21:56 Comment(0)

O

402

Split on your separator at most once, and take the first piece:

sep = '...'
stripped = text.split(sep, 1)[0]

You didn't say what should happen if the separator isn't present. Both this and Alex's solution will return the entire string in that case.
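To illustrate the missing-separator case, here is a minimal sketch. The helper name strip_after and the sample strings are made up for this example.

```python
def strip_after(text, sep):
    """Return the part of text before the first occurrence of sep.

    If sep does not occur, the whole string is returned unchanged.
    """
    return text.split(sep, 1)[0]


with_sep = strip_after("some string... this part will be removed.", "...")
without_sep = strip_after("no separator here", "...")
print(repr(with_sep))     # 'some string'
print(repr(without_sep))  # 'no separator here'
```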

Obligatory answered 24/5, 2009 at 22:1 Comment(4)

Request is "remove all the text after" the separator, not "get" that text, so I think you want [0], not [-1], in your otherwise excellent solution. – Breckenridge 24/5, 2009 at 22:9

Worked perfectly thanks, as I'm sure Ayman & Alex's did as well, so thank you all. – Consuetudinary 24/5, 2009 at 22:51

Use rsplit() if you need to split by a character starting from the end of the string. – Nightmare 16/12, 2014 at 0:3

rsplit() actually answers the question if there are multiple occurrences of the separator – Mensal 1/5, 2015 at 15:49

S

139

This assumes your separator is '...', but it can be any string.

text = 'some string... this part will be removed.'
head, sep, tail = text.partition('...')

>>> print(head)
some string

If the separator is not found, head will contain all of the original string.

The partition function was added in Python 2.5.

S.partition(sep) -> (head, sep, tail)

Searches for the separator sep in S, and returns the part before it, the separator itself, and the part after it. If the separator is not found, returns S and two empty strings.
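A short sketch of how partition behaves next to its sibling rpartition, which splits at the last occurrence instead of the first. The sample strings are made up for illustration.

```python
text = "a...b...c"

# partition splits at the FIRST occurrence of the separator.
first = text.partition("...")
print(first)   # ('a', '...', 'b...c')

# rpartition splits at the LAST occurrence.
last = text.rpartition("...")
print(last)    # ('a...b', '...', 'c')

# When the separator is missing, the two methods differ in where the
# original string ends up.
print("abc".partition("..."))   # ('abc', '', '')
print("abc".rpartition("..."))  # ('', '', 'abc')
```

Note that with rpartition the head is empty when the separator is absent, so taking the first element there would lose the whole string.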

Segment answered 24/5, 2009 at 22:2 Comment(6)

Yet another excellent solution -- are we violating TOOOWTDI?-) Maybe worth a timeit run to check... – Breckenridge 24/5, 2009 at 22:11

.partition wins -- 0.756 usec per loop, vs 1.13 for .split (comment formatting doesn't really let me show the exact tests, but I'm using @Ayman's text and separator) -- so, +1 for @Ayman's answer! – Breckenridge 24/5, 2009 at 22:15

and btw, for completeness, the RE-based solution is 2.54 usec, i.e., way slower than either @Ayman's or @Ned's. – Breckenridge 24/5, 2009 at 22:58

partition wins if you're in 2.5 land :) For us suckers stuck in 2.4, we have to live with relatively glacial slowness of split. – Thickknee 27/5, 2009 at 16:15

Example is really helpful. – Carbamate 16/10, 2019 at 18:14

Small improvement, you can simply discard the other values if you don't need them: head, *_ = text.partition('...') – Tidwell 27/3, 2023 at 15:13

M

34

If you want to remove everything after the last occurrence of separator in a string I find this works well:

<separator>.join(string_to_split.split(<separator>)[:-1])

For example, if string_to_split is a path like root/location/child/too_far.exe and you only want the folder path, you can use "/".join(string_to_split.split("/")[:-1]) and you'll get root/location/child.
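As a possible alternative for the last-occurrence case, rsplit with a maxsplit of 1 gives the same result for this path. This is a sketch rather than a replacement, and the two approaches differ when the separator is missing.

```python
string_to_split = "root/location/child/too_far.exe"

joined = "/".join(string_to_split.split("/")[:-1])
rsplit_head = string_to_split.rsplit("/", 1)[0]
print(joined)        # root/location/child
print(rsplit_head)   # root/location/child

# The two differ when the separator is absent:
no_sep = "too_far.exe"
print(repr("/".join(no_sep.split("/")[:-1])))  # ''
print(repr(no_sep.rsplit("/", 1)[0]))          # 'too_far.exe'
```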

Meridith answered 14/9, 2015 at 22:18 Comment(2)

additionally, you can change that -1 to any index to be the occurrence at which you drop text. – Meridith 14/9, 2015 at 22:19

this is the most flexible solution – Melinite 1/12, 2023 at 15:29

B

11

Without a regular expression (which I assume is what you want):

def remafterellipsis(text):
  where_ellipsis = text.find('...')
  if where_ellipsis == -1:
    return text
  return text[:where_ellipsis + 3]

or, with a regular expression:

import re

def remwithre(text, there=re.compile(re.escape('...')+'.*')):
  return there.sub('', text)
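A sketch of the non-regex version generalized to any separator, using len(sep) instead of a hard-coded 3. The names remove_after and keep_sep are hypothetical; keep_sep=True mirrors the slice above, which keeps the separator itself.

```python
def remove_after(text, sep="...", keep_sep=True):
    """Cut text at the first occurrence of sep.

    keep_sep=True keeps the separator (like text[:where + 3] for '...');
    keep_sep=False drops it as well. Text without sep is returned as-is.
    """
    where = text.find(sep)
    if where == -1:
        return text
    end = where + len(sep) if keep_sep else where
    return text[:end]


print(remove_after("some string... tail"))                  # some string...
print(remove_after("some string... tail", keep_sep=False))  # some string
print(remove_after("key=value=more", sep="="))              # key=
```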

Breckenridge answered 24/5, 2009 at 22:0 Comment(2)

Might want to use sep='...' as a kwarg and use len(sep) instead of hard-coding the 3 to make it slightly more future-proof. – Wingo 24/5, 2009 at 22:49

Yep, but then you need to recompile the RE on each call, so performance suffers for the RE solution (no real difference for the non-RE solution). Some generality is free, some isn't...;-) – Breckenridge 24/5, 2009 at 22:56

S

6

import re
test = "This is a test...we should not be able to see this"
res = re.sub(r'\.\.\..*',"",test)
print(res)

Output: "This is a test"
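A commented sketch of what the pattern does, built with re.escape so the separator does not have to be escaped by hand. The re.DOTALL flag is an addition not present in the snippet above; it is included here on the assumption that text after a newline should be removed too.

```python
import re

sep = "..."
# re.escape turns "..." into a pattern matching three literal dots;
# an unescaped dot would match any character.
# ".*" then matches everything after the separator; re.DOTALL lets it
# continue across newlines as well.
pattern = re.compile(re.escape(sep) + r".*", re.DOTALL)

test = "This is a test...we should not be able to see this"
res = pattern.sub("", test)
print(res)  # This is a test
```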

Solangesolano answered 26/2, 2020 at 14:12 Comment(1)

kindly please explain – Ambiguous 3/4, 2020 at 2:47

E

4

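The methods find and index both return the position of a character in a string (find returns -1 if the character is missing, while index raises ValueError). Then, if you want to remove everything from that character onward, do this: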

mystring = "123⋯567"
mystring[ 0 : mystring.index("⋯")]

>> '123'

If you want to keep the character, add 1 to the character position.
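A small sketch of how find and index behave when the character is missing, since that case changes the result of the slice. The no_match string is made up for illustration.

```python
mystring = "123⋯567"

# find returns the index of the first match, or -1 if there is none.
pos = mystring.find("⋯")
before = mystring[:pos] if pos != -1 else mystring
print(before)  # 123

# Pitfall: slicing with an unchecked -1 silently drops the last character.
no_match = "1234567"
print(no_match[:no_match.find("⋯")])  # 123456

# index raises ValueError instead of returning -1.
try:
    no_match.index("⋯")
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```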

Epistaxis answered 18/6, 2020 at 2:42 Comment(0)

R

3

From a file:

sep = '...'

with open("requirements.txt") as file_in:
    for line in file_in:
        # Drop the trailing newline, then keep only the text before sep.
        res = line.rstrip('\n').split(sep, 1)[0]
        print(res)

Requiescat answered 19/3, 2020 at 23:42 Comment(0)

T

1

Oneliner for in-place replacement:

text, *_ = text.partition('...')

Credits to original answer: https://mcmap.net/q/116547/-how-to-remove-all-characters-after-a-specific-character-in-python

Tidwell answered 27/3, 2023 at 15:18 Comment(0)

B

0

This works for me in Python 3.7. In my case I need to remove everything after the dot in my string variable fees:

fees = "45.05"
split_string = fees.split(".", 1)

substring = split_string[0]

print(substring)

Bernadette answered 3/3, 2021 at 10:54 Comment(0)

B

0

Yet another way to remove all characters after the last occurrence of a character in a string (assume that you want to remove all characters after the final '/').

path = 'I/only/want/the/containing/directory/not/the/file.txt'

while path[-1] != '/':
    path = path[:-1]
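A sketch of a single-slice alternative using rfind, which avoids shrinking the string one character at a time. Like the loop, it keeps the trailing '/'. Returning the path unchanged when it contains no '/' is an assumption about the desired behavior.

```python
path = 'I/only/want/the/containing/directory/not/the/file.txt'

# rfind returns the index of the last '/', or -1 if there is none.
last_slash = path.rfind('/')
directory = path[:last_slash + 1] if last_slash != -1 else path
print(directory)  # I/only/want/the/containing/directory/not/the/
```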

Baur answered 25/5, 2021 at 22:18 Comment(1)

I think this will create a new copy of path at each iteration, so it's not a particularly efficient solution, although I agree it should work. – Castle 26/5, 2021 at 2:27

D

-1

Another easy way, using re:

import re

text = 'some string... this part will be removed.'

# Note: re.search returns None if '...' is absent, so .group(1) would raise AttributeError.
text = re.search(r'(\A.*)\.\.\..+', text, re.DOTALL | re.IGNORECASE).group(1)

# text is now 'some string'

Discophile answered 20/5, 2015 at 10:42 Comment(1)

Regular expressions are difficult because of the shorthand. This might be a useful answer, but you didn't explain any of it. – Wacke 27/10, 2021 at 18:30
