Analyzing string input until it reaches a certain character in Python

8

47

I need help in trying to write a certain part of a program. The idea is that a person would input a bunch of gibberish and the program will read it till it reaches an "!" (exclamation mark) so for example:

input("Type something: ")

Person types: wolfdo65gtornado!salmontiger223

If I ask the program to print the input, it should only print wolfdo65gtornado and cut off everything once it reaches the "!". The rest of the program is analyzing and counting the letters, but those parts I already know how to do. I just need help with the first part. I've been trying to look through the book, but it seems I'm missing something.

I'm thinking of maybe using a for loop and then placing a restriction on it, but I can't figure out how to make the arbitrary input string be checked for a certain character and then get rid of the rest.

If you could help, I'd truly appreciate it. Thanks!

Selene answered 17/11, 2011 at 4:12 Comment(0)
72

The built-in str.partition() method will do this for you. Unlike str.split() it won't bother to cut the rest of the str into different strs.

text = input("Type something: ")  # on Python 2, use raw_input() instead
left_text = text.partition("!")[0]

Explanation

str.partition() returns a 3-tuple containing the part before the separator, the separator itself, and the part after it. The [0] gets the first item, which is all you want in this case. E.g.:

"wolfdo65gtornado!salmontiger223".partition("!")

returns

('wolfdo65gtornado', '!', 'salmontiger223')
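
As a minimal sketch of how this behaves at the edges (the helper name text_before is made up for illustration and is not part of the answer above): when the separator is missing, partition() returns the whole string followed by two empty strings, so taking [0] is still safe. Only the first separator is used, even if there are several.

```python
def text_before(text, sep="!"):
    """Return the part of text before the first sep (all of text if sep is absent)."""
    head, _found, _tail = text.partition(sep)
    return head

print(text_before("wolfdo65gtornado!salmontiger223"))  # wolfdo65gtornado
print(text_before("no separator here"))                # no separator here
print("a!b!c".partition("!"))                          # ('a', '!', 'b!c')
```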
Butterflies answered 17/11, 2011 at 4:18 Comment(7)
Of course it will. The difference is that partition keeps the "!" separator (in this case) in its result, which is a tuple, while split returns a list without it: >>> s.partition('!') ('wolfdo65gtornado', '!', 'salmontiger223') >>> s.split('!') ['wolfdo65gtornado', 'salmontiger223'] - Reiser
I'm saying that if there are multiple "!" characters, you won't get a list with the split at every single one of them. - Butterflies
Well, yes you will: s = "wolfdo!65gtornado!salmo!ntig!er223" s.split('!') ['wolfdo', '65gtornado', 'salmo', 'ntig', 'er223'] and you won't have to skip every second position if you iterate over it afterwards, as you would with rpartition. - Reiser
I'm talking about the case where you use str.partition() to do this, not str.split(). This is exactly the use case for which str.partition() was added to the language. - Butterflies
Well, in this particular use case I see no reason to use str.partition over str.split, but please do feel free to enlighten me. Is it faster or something? - Reiser
I had some type problems, and I found that you can also convert to str first and then use partition() or split(). Maybe someone finds it useful: str(e).partition(":")[0] - Glasgo
@MichaelHoffman What is that [0] at the end? What if I want to take a string from the end of the long string? In the above case it would be salmontiger223. - Shithead
19
>>> s = "wolfdo65gtornado!salmontiger223"
>>> s.split('!')[0]
'wolfdo65gtornado'
>>> s = "wolfdo65gtornadosalmontiger223"
>>> s.split('!')[0]
'wolfdo65gtornadosalmontiger223'

If it doesn't encounter a "!" character, it will just grab the entire text. If you would like to report an error when there is no "!", you can do something like this:

s = "something!something"
if "!" in s:
    print("there is a '!' character in the context")
else:
    print("blah, you aren't using it right :(")
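
If splitting the whole string is a concern, one possible variation (a sketch only; the helper name prefix_or_none is made up for illustration) passes maxsplit=1 so that split() stops after the first "!", and uses the length of the result to tell whether a "!" was present at all:

```python
def prefix_or_none(s, sep="!"):
    """Return the text before the first sep, or None if sep does not occur."""
    parts = s.split(sep, 1)  # maxsplit=1: at most one split, so at most two pieces
    if len(parts) == 1:      # no separator was found
        return None
    return parts[0]

print(prefix_or_none("wolfdo65gtornado!salmontiger223"))  # wolfdo65gtornado
print(prefix_or_none("wolfdo65gtornadosalmontiger223"))   # None
```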
Reiser answered 17/11, 2011 at 4:16 Comment(1)
The split() function is not efficient for this application. Suppose that the string is 10,000 characters long and the 53rd character is an exclamation mark. If you read the string from left to right, you can stop as soon as you encounter the delimiter, because only the beginning (prefix) of the string is needed. Without a maxsplit argument, split() keeps scanning and splitting the rest of the string. - Valorous
8

You want itertools.takewhile().

>>> import itertools
>>> s = "wolfdo65gtornado!salmontiger223"
>>> '-'.join(itertools.takewhile(lambda x: x != '!', s))
'w-o-l-f-d-o-6-5-g-t-o-r-n-a-d-o'



>>> s = "wolfdo65gtornado!salmontiger223!cvhegjkh54bgve8r7tg"
>>> i = iter(s)
>>> '-'.join(itertools.takewhile(lambda x: x != '!', i))
'w-o-l-f-d-o-6-5-g-t-o-r-n-a-d-o'
>>> '-'.join(itertools.takewhile(lambda x: x != '!', i))
's-a-l-m-o-n-t-i-g-e-r-2-2-3'
>>> '-'.join(itertools.takewhile(lambda x: x != '!', i))
'c-v-h-e-g-j-k-h-5-4-b-g-v-e-8-r-7-t-g'
Savonarola answered 17/11, 2011 at 5:9 Comment(2)
I have a variable f_name='file.txt'. Using your idea of itertools.takewhile(), I tried bck_f_name='backup'.join(itertools.takewhile(lambda x: x == ".",f_name)), expecting to get bck_f_name as 'filebackup.txt', but I couldn't achieve this. Any help would be appreciated. - Scherle
This solution is ~10x slower than the .split() or .partition() solutions. - Titanothere
7

Try this:

s = "wolfdo65gtornado!salmontiger223"
m = s.index('!')
l = s[:m]
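
Note that index() raises ValueError when there is no "!" in the string. A possible variation (a sketch; the helper name cut_at is hypothetical) uses str.find(), which returns -1 instead of raising, and falls back to the whole string:

```python
def cut_at(s, sep="!"):
    """Return s up to (not including) the first sep, or all of s if sep is absent."""
    m = s.find(sep)  # find() returns -1 when sep is not in s
    return s if m == -1 else s[:m]

print(cut_at("wolfdo65gtornado!salmontiger223"))  # wolfdo65gtornado
print(cut_at("wolfdo65gtornado"))                 # wolfdo65gtornado
```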
Nakada answered 20/9, 2013 at 19:33 Comment(2)
What if there is no ! in the string? Then an exception, ValueError, is raised. - Valorous
Surprisingly, this solution is a tiny bit slower than the one with .partition(). - Titanothere
4

To explain the accepted answer:

Splitting

The partition() method splits a string into a tuple with 3 elements:

mystring = "123splitABC"
x = mystring.partition("split")
print(x)

will give:

('123', 'split', 'ABC')

Access them by index, like list elements:

print(x[0])  # 123

print(x[1])  # split

print(x[2])  # ABC
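
The three parts can also be unpacked into names directly, and the last element gives the text after the separator. The sketch below (variable names are only illustrative) also shows str.rpartition(), which splits at the last occurrence of the separator instead of the first:

```python
mystring = "123splitABC"
before, sep, after = mystring.partition("split")
print(before)  # 123
print(after)   # ABC

multi = "a!b!c"
print(multi.partition("!")[2])   # b!c  (everything after the first "!")
print(multi.rpartition("!")[2])  # c    (everything after the last "!")
```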

Colombes answered 12/7, 2019 at 5:23 Comment(0)
0

Suppose we have:

s = "wolfdo65gtornado!salmontiger223" + some_other_string

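s.split("!")[0] scans the entire string and builds a list of every piece, and s.partition("!")[0] still copies everything after the first "!", so both do unnecessary work if some_other_string contains a million strings, each a million characters long, separated by exclamation marks. I recommend the following instead. It reads only up to the first delimiter, although calling a Python function for every character makes it slower on short inputs.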

import itertools as itts
get_start_of_string = lambda stryng, last, *, itts=itts:\
                          "".join(itts.takewhile(lambda ch: ch != last, stryng))
###########################################################
s = "wolfdo65gtornado!salmontiger223"
start_of_string = get_start_of_string(s, "!")

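Why the itts=itts?

Without the keyword default, itts inside the body of a function such as get_start_of_string would be a global name.
A global name is looked up when the function is called, not when the function is defined, so rebinding itts later would change what the function does. Writing itts=itts as a default captures the value at definition time.
Consider the following example: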

color = "white"
get_fleece_color = lambda shoop: shoop + ", whose fleece was as " + color + " as snow."

print(get_fleece_color("Igor"))

# [... many lines of code later...]

color = "pink polka-dotted"
print(get_fleece_color("Igor's cousin, 3 times removed"))

The output is:

Igor, whose fleece was as white as snow.
Igor's cousin, 3 times removed, whose fleece was as pink polka-dotted as snow.
Valorous answered 25/10, 2019 at 19:5 Comment(0)
0

You can extract the beginning of a string, up until the first delimiter is encountered, by using regular expressions.

import re

slash_if_special = lambda ch:\
    "\\" if ch in "\\^$.|?*+()[{" else ""

prefix_slash_if_special = lambda ch, *, _slash=slash_if_special: \
    _slash(ch) + ch

make_pattern_from_char = lambda ch, *, c=prefix_slash_if_special:\
    "^([^" + c(ch) + "]*)"

def get_string_up_untill(x_stryng, x_ch):
    i_stryng = str(x_stryng)
    i_ch = str(x_ch)
    assert(len(i_ch) == 1)
    # Use the converted values; the bare name ch is not defined here.
    pattern = make_pattern_from_char(i_ch)
    m = re.match(pattern, i_stryng)
    return m.groups()[0]

An example of the code above being used:

s = "wolfdo65gtornado!salmontiger223"
result = get_string_up_untill(s, "!")
print(result)
# wolfdo65gtornado
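
A shorter variation, offered only as a sketch and not as the code above (the function name prefix_before is hypothetical), lets the standard library's re.escape() handle the escaping of the delimiter instead of a hand-written list of special characters:

```python
import re

def prefix_before(text, delimiter):
    """Return the part of text before the first occurrence of a one-character delimiter."""
    assert len(delimiter) == 1
    # re.escape() backslash-escapes regex metacharacters such as ] ^ \ and -
    pattern = "^([^" + re.escape(delimiter) + "]*)"
    # The group may match an empty string, so re.match() always succeeds here.
    return re.match(pattern, text).group(1)

print(prefix_before("wolfdo65gtornado!salmontiger223", "!"))  # wolfdo65gtornado
```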
Valorous answered 25/10, 2019 at 21:55 Comment(2)
The originally posted code raised a NameError at pattern = make_pattern_from_char(ch), since ch is not defined inside the function. - Niggardly
Surprisingly, this bulky-looking solution is only 2x slower than .partition(). - Titanothere
-2

We can use itertools

import itertools

s = "wolfdo65gtornado!salmontiger223"
result = "".join(itertools.takewhile(lambda x: x != '!', s))
print(result)  # wolfdo65gtornado
Reliquiae answered 21/6, 2019 at 8:5 Comment(1)
Isn't this the same as one of the other answers? - Voroshilovsk
