How to remove all characters after a specific character in python?
C

11

241

I have a string. How do I remove all text after a certain character? (In this case, ...)
The text after the ... will change, which is why I want to remove all characters after a certain one.

Consuetudinary answered 24/5, 2009 at 21:56 Comment(0)
O
402

Split on your separator at most once, and take the first piece:

sep = '...'
stripped = text.split(sep, 1)[0]

You didn't say what should happen if the separator isn't present. Both this and Alex's solution will return the entire string in that case.
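
To make both cases concrete, here is a minimal sketch. The sample strings are made up for illustration and are not from the question.

```python
sep = '...'

# Separator present: everything from the first occurrence onward is dropped.
with_sep = 'some string... this part will be removed.'
head_found = with_sep.split(sep, 1)[0]

# Separator absent: split() returns a one-element list,
# so [0] is simply the whole string.
without_sep = 'no separator here'
head_missing = without_sep.split(sep, 1)[0]

print(head_found)
print(head_missing)
```

The maxsplit argument of 1 stops the scan at the first separator, so later occurrences are never split.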

Obligatory answered 24/5, 2009 at 22:1 Comment(4)
Request is "remove all the text after" the separator, not "get" that text, so I think you want [0], not [-1], in your otherwise excellent solution.Breckenridge
Worked perfectly thanks, as I'm sure Ayman & Alex's did as well, so thank you all.Consuetudinary
Use rsplit() if you need to split by a character starting from the end of the string.Nightmare
rsplit() actually answers the question if there are multiple occurrences of the separatorMensal
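
The difference between split() and rsplit() only shows up when the separator occurs more than once. A small sketch, with a hypothetical sample string:

```python
sep = '...'
text = 'first... second... third'

# split() counts from the left: cut at the FIRST occurrence.
before_first = text.split(sep, 1)[0]

# rsplit() counts from the right: cut at the LAST occurrence.
before_last = text.rsplit(sep, 1)[0]

print(repr(before_first))
print(repr(before_last))
```

Which of the two "answers the question" depends on whether the text to drop starts at the first or the last separator.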
S
139

Assuming your separator is '...', but it can be any string.

text = 'some string... this part will be removed.'
head, sep, tail = text.partition('...')

>>> print(head)
some string

If the separator is not found, head will contain all of the original string.

The partition function was added in Python 2.5.

S.partition(sep) -> (head, sep, tail)

Searches for the separator sep in S, and returns the part before it, the separator itself, and the part after it. If the separator is not found, returns S and two empty strings.
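
A short sketch of the three-tuple that partition returns, including the not-found case. The '###' separator is just a stand-in for something that does not occur in the text. rpartition is shown as well because its not-found result is ordered differently, which is easy to trip over.

```python
text = 'some string... this part will be removed.'

# Found: (head, sep, tail)
found = text.partition('...')

# Not found: (original string, '', '')
missing = text.partition('###')

# rpartition searches from the right. When nothing is found the
# original string comes LAST: ('', '', original string)
rmissing = text.rpartition('###')

print(found)
print(missing)
print(rmissing)
```

So with rpartition, taking element [0] blindly gives an empty string when the separator is absent.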

Segment answered 24/5, 2009 at 22:2 Comment(6)
Yet another excellent solution -- are we violating TOOOWTDI?-) Maybe worth a timeit run to check...Breckenridge
.partition wins -- 0.756 usec per loop, vs 1.13 for .split (comment formatting doesn't really let me show the exact tests, but I'm using @Ayman's text and separator) -- so, +1 for @Ayman's answer!Breckenridge
and btw, for completeness, the RE-based solution is 2.54 usec, i.e., way slower than either @Ayman's or @Ned's.Breckenridge
partition wins if you're in 2.5 land :) For us suckers stuck in 2.4, we have to live with relatively glacial slowness of split.Thickknee
Example is really helpful.Carbamate
Small improvement, you can simply discard the other values if you don't need them: head, *_ = text.partition('...') Tidwell
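
For anyone who wants to repeat the timing comparison from the comments, here is one possible way to set it up with timeit. This is a sketch, not the original test: the exact statements used back then were not posted, the lambdas add a little call overhead to every candidate, and the numbers will differ by machine and Python version.

```python
import re
import timeit

text = 'some string... this part will be removed.'
pattern = re.compile(re.escape('...') + '.*')

# Three ways to get the text before the separator.
candidates = {
    'partition': lambda: text.partition('...')[0],
    'split': lambda: text.split('...', 1)[0],
    're.sub': lambda: pattern.sub('', text),
}

loops = 100_000
timings = {name: timeit.timeit(fn, number=loops) for name, fn in candidates.items()}

# Print fastest first, in microseconds per loop.
for name, secs in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f'{name:10s} {secs / loops * 1e6:.3f} usec per loop')
```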
M
34

If you want to remove everything after the last occurrence of separator in a string I find this works well:

<separator>.join(string_to_split.split(<separator>)[:-1])

For example, if string_to_split is a path like root/location/child/too_far.exe and you only want the folder path, you can use "/".join(string_to_split.split("/")[:-1]) and you'll get root/location/child.

Meridith answered 14/9, 2015 at 22:18 Comment(2)
additionally, you can change that -1 to any index to be the occurrence at which you drop text.Meridith
this is the most flexible solutionMelinite
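
The "change that -1 to any index" idea can be wrapped in a small helper. This is only a sketch: drop_after_nth is a made-up name, and returning the text unchanged when there are fewer than n separators is a choice made here. Note that the plain join/split[:-1] form returns an empty string when the separator is missing entirely.

```python
def drop_after_nth(text, sep, n):
    """Keep everything before the n-th occurrence of sep (1-based).

    Returns text unchanged if sep occurs fewer than n times.
    """
    parts = text.split(sep)
    if n < 1 or len(parts) <= n:
        return text
    return sep.join(parts[:n])


path = 'root/location/child/too_far.exe'
print(drop_after_nth(path, '/', 2))
print(drop_after_nth(path, '/', 3))

# The join/split[:-1] form on a string without the separator:
print(repr('/'.join('no_slash_here'.split('/')[:-1])))
```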
B
11

Without a regular expression (which I assume is what you want):

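def remafterellipsis(text):
  where_ellipsis = text.find('...')
  if where_ellipsis == -1:
    return text  # no ellipsis: return the string unchanged
  # + 3 keeps the '...' itself; slice to where_ellipsis alone to drop it too
  return text[:where_ellipsis + 3]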

or, with a regular expression (this version removes the '...' as well):

import re

def remwithre(text, there=re.compile(re.escape('...')+'.*')):
  return there.sub('', text)
Breckenridge answered 24/5, 2009 at 22:0 Comment(2)
Might want to use sep='...' as a kwarg and use len(sep) instead of hard-coding the 3 to make it slightly more future-proof.Wingo
Yep, but then you need to recompile the RE on each call, so performance suffers for the RE solution (no real difference for the non-RE solution). Some generality is free, some isn't...;-)Breckenridge
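
One way to get the sep keyword without recompiling the pattern on every call is to cache the compiled patterns. The sketch below is an assumption-laden variant, not the code from the answer: the function names are made up, the non-regex version drops the separator rather than keeping it, and re.DOTALL is added so that text on following lines is removed too.

```python
import re
from functools import lru_cache


def remove_after(text, sep='...'):
    """Non-regex version: cut at the first occurrence of sep."""
    where = text.find(sep)
    if where == -1:
        return text
    return text[:where]  # use where + len(sep) to keep the separator


@lru_cache(maxsize=None)
def _pattern_for(sep):
    # Compiled once per distinct separator, then served from the cache.
    return re.compile(re.escape(sep) + '.*', re.DOTALL)


def remove_after_re(text, sep='...'):
    """Regex version: same result, pattern cached per separator."""
    return _pattern_for(sep).sub('', text)


print(remove_after('some string... gone'))
print(remove_after_re('a/b/c', sep='/'))
```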
S
6
import re
test = "This is a test...we should not be able to see this"
res = re.sub(r'\.\.\..*',"",test)
print(res)

Output: "This is a test"

Solangesolano answered 26/2, 2020 at 14:12 Comment(1)
kindly please explainAmbiguous
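
Since an explanation was requested: the same pattern can be written with re.VERBOSE so each piece carries its own comment. This is just an annotated restatement of the pattern above.

```python
import re

pattern = re.compile(r"""
    \.\.\.   # three literal dots; a bare . would match any character
    .*       # then everything up to the end of the line
""", re.VERBOSE)

test = "This is a test...we should not be able to see this"
res = pattern.sub("", test)  # replace the whole match with nothing
print(res)
```

Without the re.DOTALL flag, . does not match a newline, so only the rest of the current line is removed.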
E
4

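The method index returns the position of a character in a string (find does the same, but returns -1 instead of raising ValueError when the character is absent). If you want to remove everything from that character onward, do this: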

mystring = "123⋯567"
mystring[ 0 : mystring.index("⋯")]

>> '123'

If you want to keep the character, add 1 to the character position.
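
What happens when the character is missing differs between find and index, and that matters for the slice. A small sketch, where "#" stands in for a character that is not in the string:

```python
mystring = "123⋯567"

pos = mystring.find("⋯")       # position of the character
missing = mystring.find("#")   # find() signals "not found" with -1

# Slicing with -1 would silently drop the last character,
# so check for it before slicing.
cut = mystring[:pos] if pos != -1 else mystring

# index() raises instead of returning -1.
try:
    mystring.index("#")
    index_raised = False
except ValueError:
    index_raised = True

print(cut)
print(mystring[:missing])
```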

Epistaxis answered 18/6, 2020 at 2:42 Comment(0)
R
3

From a file:

sep = '...'

with open("requirements.txt") as file_in:
    for line in file_in:
        # keep only the part of each line before the separator
        res = line.split(sep, 1)[0]
        print(res)
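
One detail when reading lines from a file: a line without the separator keeps its trailing newline, while a line that was cut does not. A sketch that strips the newline first and collects the results in a list; io.StringIO stands in for the real file here, and its contents are made up.

```python
import io

sep = '...'

# io.StringIO is a stand-in for an open text file.
file_in = io.StringIO('first... drop me\nsecond line\nthird...also dropped\n')

lines = []
for line in file_in:
    # Strip the newline before splitting so every result looks the same.
    lines.append(line.rstrip('\n').split(sep, 1)[0])

print(lines)
```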
Requiescat answered 19/3, 2020 at 23:42 Comment(0)
T
1

One-liner that rebinds text to the part before the separator:

text, *_ = text.partition('...')

Credits to original answer: https://mcmap.net/q/116547/-how-to-remove-all-characters-after-a-specific-character-in-python

Tidwell answered 27/3, 2023 at 15:18 Comment(0)
B
0

This works for me in Python 3.7. In my case, I needed to remove everything after the dot in my string variable fees:

fees = "45.05"  # must be a str; a float has no split() method
split_string = fees.split(".", 1)

substring = split_string[0]

print(substring)
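
If the value is a float rather than a string, it has no split method, so it needs converting first. A minimal sketch of two options, assuming a plain value like 45.05:

```python
import math

fees = 45.05

# Option 1: convert to text, then keep the part before the dot.
as_text = str(fees).partition('.')[0]

# Option 2: stay numeric and truncate toward zero.
as_int = math.trunc(fees)

print(as_text)
print(as_int)
```

The first gives a string and the second an int, so pick whichever the rest of the code expects.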

Bernadette answered 3/3, 2021 at 10:54 Comment(0)
B
0

Yet another way to remove all characters after the last occurrence of a character in a string (assume that you want to remove all characters after the final '/').

path = 'I/only/want/the/containing/directory/not/the/file.txt'

# "path and" stops the loop on an empty string if there is no '/' at all
while path and path[-1] != '/':
    path = path[:-1]
Baur answered 25/5, 2021 at 22:18 Comment(1)
I think this will create a new copy of path at each iteration, so it's not a particularly efficient solution, although I agree it should work.Castle
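
On the efficiency point: rfind locates the last '/' in a single pass, so one slice is enough. A sketch that keeps the trailing slash like the loop does. Returning the path unchanged when there is no '/' is a choice made here.

```python
path = 'I/only/want/the/containing/directory/not/the/file.txt'

cut = path.rfind('/')  # index of the last '/', or -1 if there is none

# cut + 1 keeps the trailing slash, matching the loop's result.
directory = path[:cut + 1] if cut != -1 else path

print(directory)
```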
D
-1

Another easy way, using re:

import re

text = 'some string... this part will be removed.'

# group(1) captures everything before the '...'
text = re.search(r'(\A.*)\.\.\..+', text, re.DOTALL | re.IGNORECASE).group(1)

# text == 'some string'
Discophile answered 20/5, 2015 at 10:42 Comment(1)
Regular expressions are difficult because of the shorthand. This might be a useful answer, but you didn't explain any of it.Wacke
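
To unpack the pattern a little: \A anchors at the start of the string, (.*) captures what comes before the dots, \.\.\. matches the three literal dots, and .+ requires at least one more character after them. Two things follow from that, shown in the sketch below with made-up sample strings. The greedy .* cuts at the last '...' rather than the first, and re.search returns None when nothing matches, so calling .group(1) directly would raise AttributeError.

```python
import re

text = 'one... two... three'

# Greedy .* grabs as much as it can: the cut lands at the LAST '...'.
greedy = re.search(r'(\A.*)\.\.\..+', text, re.DOTALL).group(1)

# Lazy .*? grabs as little as it can: the cut lands at the FIRST '...'.
lazy = re.search(r'(\A.*?)\.\.\..+', text, re.DOTALL).group(1)

# No '...' in the input: search() returns None, so check before .group().
no_match = re.search(r'(\A.*)\.\.\..+', 'no separator', re.DOTALL)

print(repr(greedy))
print(repr(lazy))
print(no_match)
```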
