How to check if a string contains only characters from a given set in Python [duplicate]
P

6

11

I have a user-inputted polynomial and I only want to use it if it contains only characters from the string 1234567890^-+x.

How can I check if it does or not without using external packages? I only want to use built-in Python 2.5 functions.

I am writing a program that runs on any Mac without needing external packages.

Plato answered 22/12, 2013 at 3:40 Comment(0)
G
18

Here are some odd ;-) ways to do it:

good = set('1234567890^-+x')

if set(input_string) <= good:
    pass  # it's good
else:
    pass  # it's bad

or

if input_string.strip('1234567890^-+x'):
    pass  # it's bad!
else:
    pass  # it's good
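Both checks above can be wrapped as small functions. This is only a minimal sketch: the names ALLOWED, only_allowed_set and only_allowed_strip are hypothetical and not part of the answer. Note that both variants treat an empty string as good.

```python
# Hypothetical helper names; the allowed characters are the ones from the question.
ALLOWED = '1234567890^-+x'

def only_allowed_set(input_string):
    # Subset test: every distinct character must be in the allowed set.
    return set(input_string) <= set(ALLOWED)

def only_allowed_strip(input_string):
    # strip() removes allowed characters from both ends and stops at the
    # first disallowed one, so anything left over means a bad character.
    return not input_string.strip(ALLOWED)

print(only_allowed_set('x^2+2x+1'))    # True
print(only_allowed_strip('x**2 + 1'))  # False
```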
Gusty answered 22/12, 2013 at 3:51 Comment(6)
I love the set subset approach :-) Care to take a timeit potshot at it? – Pandit
@MartijnPieters, not a chance - there are sooooo many ways to do this I don't want to spend half an hour organizing them all ;-) – Gusty
@user2357112, just because I can never remember what .issuperset() does, exactly, without looking it up :-( But I understand <= at once, and as Martijn says, they're the same thing in the end. .issuperset() is likely slower due to the method lookup expense. – Gusty
issuperset() is simply the inverse of issubset(); the >= to your <=. – Pandit
@MartijnPieters, sure, but with the method spellings I can never remember which one is <= and which >= without looking it up. So I avoid them. <= and >= are obvious to me. – Gusty
@MartijnPieters: issuperset would... wait, apparently it doesn't short-circuit. I thought it would do the test without building a set from the input_string, but when I tried {0}.issuperset(xrange(1000000000)) a few seconds ago, it started eating memory in a way that seems to indicate it turns the input into a set. I guess there's no performance advantage. – Binion
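For anyone who does want to time the two approaches, here is a rough sketch using the standard timeit module. The sample input and the repeat count are arbitrary assumptions, and results vary by machine and Python version, so no numbers are shown.

```python
import timeit

# Arbitrary sample input and repeat count, chosen only for illustration.
setup = "good = set('1234567890^-+x'); input_string = 'x^2+2x+1' * 10"

# timeit.Timer(stmt, setup).timeit(number) returns total seconds for `number` runs.
subset_time = timeit.Timer("set(input_string) <= good", setup).timeit(number=10000)
strip_time = timeit.Timer("not input_string.strip('1234567890^-+x')", setup).timeit(number=10000)

print("subset: %.4f s" % subset_time)
print("strip:  %.4f s" % strip_time)
```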
P
10

Use a regular expression:

import re

if re.match('^[-0-9^+x]*$', text):
    pass  # Valid input

The re module comes with Python 2.5, and is likely your fastest option.

Demo:

>>> re.match('^[-0-9^+x]*$', '1x2^4-2')
<_sre.SRE_Match object at 0x10f0b6780>
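One caveat with the pattern above: `$` also matches just before a trailing newline, so input such as 'x^2\n' would pass. A hedged variant, assuming such input should be rejected, anchors with \Z instead. The names VALID_POLY and is_valid_poly are hypothetical.

```python
import re

# \Z matches only at the very end of the string, unlike $ which also
# matches just before a trailing newline.
VALID_POLY = re.compile(r'[-0-9^+x]*\Z')

def is_valid_poly(text):
    # re.match already anchors at the start, so no leading ^ is needed.
    return VALID_POLY.match(text) is not None

print(is_valid_poly('1x2^4-2'))                        # True
print(is_valid_poly('x^2\n'))                          # False
print(re.match('^[-0-9^+x]*$', 'x^2\n') is not None)   # True
```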
Pandit answered 22/12, 2013 at 3:42 Comment(0)
A
4
  1. You can convert the valid characters to a set, as sets offer faster lookup.
  2. Then you can use the all() function like this:

    valid_chars = set("1234567890^-+x")  # Converting to a set
    if all(char in valid_chars for char in input_string):
        pass  # Do stuff if input is valid
    
  3. Alternatively, convert the input string to a set as well and check that it is a subset of the valid characters:

    valid_chars = set("1234567890^-+x")  # Converting to a set
    if set(input_string).issubset(valid_chars):
        pass  # Do stuff if input is valid
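
If it helps to tell the user which characters were rejected, the set approach extends naturally with a set difference. This is a sketch only; invalid_chars is a hypothetical helper name.

```python
valid_chars = set("1234567890^-+x")

def invalid_chars(input_string):
    # Set difference leaves only the characters not in valid_chars;
    # sorted() gives a stable order for display.
    return sorted(set(input_string) - valid_chars)

print(invalid_chars("x^2+2x+1"))  # []
print(invalid_chars("x**2 + 1"))  # [' ', '*']
```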
    
Afford answered 22/12, 2013 at 3:42 Comment(6)
Sets have an issuperset method for this. – Binion
@Binion Please check my answer now, I have used issubset – Afford
That's not the syntax for turning a string into a set; you want set("1234567890^-+x"). (Why did I see the other thing first?) – Binion
@Binion That's called set comprehension. – Afford
No, it's not. It's a set literal with 1 element. – Binion
@TimPeters Ya, he is correct :) I always get confused with that notation :( – Afford
C
4

What about just converting both strings into sets and checking that input_set is a subset of good_set, as below:

>>> good_set = set('1234567890^-+x')
>>> input_set1 = set('xajfb123')
>>> input_set2 = set('122-32+x')
>>> input_set1.issubset(good_set)
False
>>> input_set2.issubset(good_set)
True
>>>
Cerotype answered 22/12, 2013 at 3:57 Comment(0)
M
1

Yet another way to do it, now using string.translate():

>>> import string
>>> all_chars = string.maketrans('', '')
>>> has_only = lambda s, valid_chars: not s.translate(all_chars, valid_chars)
>>> has_only("abc", "1234567890^-+x.")
False
>>> has_only("x^2", "1234567890^-+x.")
True

It is not the most readable way, but it should be one of the fastest if you need speed.
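
The translate(table, deletechars) call above is the Python 2 byte-string form. On Python 3, a comparable sketch builds the deletion table with str.maketrans; has_only_py3 is a hypothetical name, not part of the answer.

```python
def has_only_py3(s, valid_chars):
    # The third argument to str.maketrans lists characters to delete.
    delete_valid = str.maketrans('', '', valid_chars)
    # If nothing is left after deleting the valid characters, s is clean.
    return not s.translate(delete_valid)

print(has_only_py3("abc", "1234567890^-+x."))  # False
print(has_only_py3("x^2", "1234567890^-+x."))  # True
```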

Mesoderm answered 22/12, 2013 at 4:1 Comment(2)
You can pass None as the first argument to translate() instead. – Gusty
Does anyone even have a Python 2.5 anymore to check? LOL ;-) It works in all current Pythons :-) – Gusty
D
0
>>> whitelist = '1234567890^-+x'
>>> text = 'x^2+2x+1'
>>> min([ch in whitelist for ch in text])
True
>>> text = 'x**2 + 1'
>>> min([ch in whitelist for ch in text])
False
Drongo answered 22/12, 2013 at 3:47 Comment(2)
min() is hardly the best function for this task. – Pandit
Agree that all would be the best choice. Hangover from other languages. – Drongo
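
As the comments suggest, all() reads more directly for this check. It also behaves differently on empty input: min() raises ValueError on an empty list, while all() returns True. A small sketch, with is_clean as a hypothetical name:

```python
whitelist = '1234567890^-+x'

def is_clean(text):
    # all() stops at the first character that is not in the whitelist.
    return all(ch in whitelist for ch in text)

print(is_clean('x^2+2x+1'))  # True
print(is_clean('x**2 + 1'))  # False
print(is_clean(''))          # True; min([]) would raise ValueError here
```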
