Defining the alphabet to any letter string to then later use to check if a word has a certain amount of characters

This is what I have so far:

alphabet = "a" or "b" or "c" or "d" or "e" or "f" or \
           "g" or "h" or "i" or "j" or "k" or "l" or \
           "m" or "n" or "o" or "p" or "q" or "r" or \
           "s" or "t" or "u" or "v" or "w" or "x" or \
           "y" or "z"

letter_word_3 = any(alphabet + alphabet + alphabet)

print("Testing: ice")

if "ice" == letter_word_3:

    print("Worked!")

else:

    print("Didn't work")

print(letter_word_3) # just to see

I eventually want to scan a document and have it pick out the 3-letter words, but I can't get this portion to work. I am new to coding in general, and Python is the first language I've learned, so I am probably making a big stupid mistake.

Gorky answered 13/8, 2017 at 18:46 Comment(2)
Have you tried printing the variable alphabet? - Malindamalinde
@Malindamalinde yes, and it always just prints "a". I'm pretty sure with python if you have an 'or' statement, it will always just do the first 'or' if given the option. I want to be able to scan through the entire alphabet to see if it can find a letter, and then if so, see if it can find another letter in the alphabet next to it, and then one more time to find a 3 letter word. - Gorky
G
0
from string import ascii_lowercase  # a-z lowercase; `line` is one line of the document
words = [word for word in line.split() if len(word) == 3 and all(ch in ascii_lowercase for ch in word)]
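
For what it's worth, here is a minimal sketch of how that one-liner might be applied to a whole document. The names three_letter_words and sample are made up for illustration, and the text is assumed to already be in memory as a string. Splitting on whitespace is an assumption too: punctuation stays attached to words, so "cat," counts as four characters and is skipped.

```python
from string import ascii_lowercase

def three_letter_words(text):
    """Return the lowercase three-letter words in text, in order of appearance."""
    found = []
    for line in text.splitlines():
        # Same filter as the one-liner above, applied line by line.
        found.extend(
            word for word in line.split()
            if len(word) == 3 and all(ch in ascii_lowercase for ch in word)
        )
    return found

sample = "the ice was too cold\nfor a cat, but not for Bob"
print(three_letter_words(sample))
# ['the', 'ice', 'was', 'too', 'for', 'but', 'not', 'for']
```

"Bob" is dropped because of the capital letter; lowercasing each word first would be one way around that.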
Gorky answered 17/8, 2017 at 5:18 Comment(0)
Y
5

You've got some good ideas, but that kind of composition of functions is really reserved for functional languages (e.g. syntax like this would work well in Haskell!).

In Python, "a" or "b" or ... evaluates to just one value; it isn't a function, which is how you're trying to use it. All values have a "truthiness" to them. All strings are "truthy" if they're not empty (e.g. bool("a") == True, but bool("") == False). or doesn't change anything here: since the first value is "truthy", alphabet evaluates to that first value, "a".

letter_word_3 then tries to do any("a" + "a" + "a"), i.e. any("aaa"), which is always True (since "a" is truthy).
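
To make that concrete, here is a small sketch that reruns the question's expressions (shortened to three letters, which doesn't change the result) and prints what each one actually evaluates to:

```python
# `or` returns its first truthy operand, so only "a" is ever used.
alphabet = "a" or "b" or "c"
print(repr(alphabet))            # 'a'

# alphabet + alphabet + alphabet is "aaa"; any() of a non-empty string is True.
letter_word_3 = any(alphabet + alphabet + alphabet)
print(letter_word_3)             # True

# Comparing a str to a bool is never equal, so the "Worked!" branch is unreachable.
print("ice" == letter_word_3)    # False
```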


What you SHOULD do instead is to length-check each word, then check each letter to make sure it's in "abcdefghijklmnopqrtuvwxyz". Wait a second, did you notice the error I just introduced? Read that string again. I forgot an "s", and so might you! Luckily Python's stdlib has this string somewhere handy for you.

from string import ascii_lowercase  # a-z lowercase.

def is_three_letter_word(word):
    if len(word) == 3:
        if all(ch in ascii_lowercase for ch in word):
            return True
    return False

# or more concisely:
# def is_three_letter_word(word):
#     return len(word) == 3 and all(ch in ascii_lowercase for ch in word)
Yelena answered 13/8, 2017 at 18:56 Comment(0)
D
5

There are a couple things wrong. First off alphabet is always being evaluated to "a".

The or in the declaration just means "if the previous thing is false, use this instead." Since "a" is truthy, it stops there. The rest of the letters aren't even looked at by Python.

Next is the any. any just checks if something in an iterable is true. alphabet + alphabet + alphabet is being evaluated as "aaa", so letter_word_3 is always True.

When you check if "ice" == letter_word_3, it's being evaluated as "ice" == True, which is False.

To check if an arbitrary word is three letters, the easiest way is using the following:

import re
def is_three_letters(word):
    return bool(re.match(r"[a-zA-Z]{3}$", word))

You can then use

is_three_letters("ice") # True
is_three_letters("ICE") # True
is_three_letters("four") # False
is_three_letters("to") # False
is_three_letters("111") # False (numbers not allowed)

To also allow numbers, use

import re
def is_three_letters(word):
    return bool(re.match(r"[a-zA-Z\d]{3}$", word))

That'll allow stuff like "h2o" to also be considered a three letter word.

EDIT:

import re
def is_three_letters(word):
    return bool(re.match(r"[a-z]{3}$", word))

The above code will allow only lowercase letters (no numbers or capitals).

import re
def is_three_letters(word):
    return bool(re.match(r"[a-z\d]{3}$", word))

That'll allow only lowercase letters and numbers (no capitals).

EDIT:

To check for n amount of letters, simply change the "{3}" to whatever length you want in the strings in the code above. e.g.

import re
def is_eight_letters(word):
    return bool(re.match(r"[a-zA-Z\d]{8}$", word))

The above will match eight-character words made up of capitals, lowercase letters, and numbers.
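
One caveat with the patterns above: with re.match, "$" also matches just before a trailing newline, so "ice\n" would pass. Here is a hedged sketch of a length-parameterised variant that uses re.fullmatch instead. The name is_n_letters and the allow_digits flag are made up for this example. It also uses 0-9 rather than \d, because in Python 3 \d on str patterns matches non-ASCII digits as well.

```python
import re

def is_n_letters(word, n, allow_digits=False):
    """True if word is exactly n ASCII letters (optionally letters or digits)."""
    chars = "a-zA-Z0-9" if allow_digits else "a-zA-Z"
    # Doubled braces are literal braces in an f-string: builds e.g. "[a-zA-Z]{3}".
    pattern = rf"[{chars}]{{{n}}}"
    # fullmatch requires the whole string to match, so no "$" anchor is needed.
    return re.fullmatch(pattern, word) is not None

print(is_n_letters("ice", 3))                     # True
print(is_n_letters("ice\n", 3))                   # False
print(is_n_letters("h2o", 3))                     # False
print(is_n_letters("h2o", 3, allow_digits=True))  # True
print(is_n_letters("password", 8))                # True
```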

Downwash answered 13/8, 2017 at 19:7 Comment(2)
N.B. that [a-zA-Z] will allow uppercase letters - Yelena
Fixed for all cases now. @Adam Smith - Downwash
G
3

It is more logical that letter_word_3 is a function, and not a variable. Here's how you can implement letter_word_3 and use it in your code:

alphabet = 'abcdefghijklmnopqrstuvwxyz'

def letter_word_3(word):
    return len(word) == 3 and all(x in alphabet for x in word)

print("Testing: ice")

if letter_word_3("ice"):
    print("Worked!")
else:
    print("Didn't work")

I removed the last line, which printed letter_word_3, because it would not make much sense to print the function object.

Initially, I incorrectly assumed that your code had to generate all 3-letter strings and check if "ice" is amongst those, and fixed it as follows:

alphabet = "abcdefghijklmnopqrstuvwxyz"

letter_word_3 = [a+b+c for a in alphabet for b in alphabet for c in alphabet]

print("Testing: ice")

if "ice" in letter_word_3: # it will search amongst 17000+ strings!
    print("Worked!")
else:
    print("Didn't work")

print(letter_word_3) # it will print 17000+ strings!

This is of course very inefficient, so don't do it. But since it has been discussed, I'll leave it here.
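
To put a number on "inefficient", here is a rough sketch that builds the same list with itertools.product and shows how fast the count grows with word length. The helper name all_words is made up for this example.

```python
from itertools import product
from string import ascii_lowercase

def all_words(n):
    """Every lowercase string of length n (there are 26 ** n of them)."""
    return ["".join(letters) for letters in product(ascii_lowercase, repeat=n)]

three = all_words(3)
print(len(three))       # 17576, i.e. 26 ** 3
print("ice" in three)   # True, but only after a linear scan of the list

# The count is multiplied by 26 for every extra letter.
for n in range(1, 8):
    print(n, 26 ** n)
```

By n = 7 the list would already hold about 8 billion strings, whereas the length-and-membership check costs the same tiny amount for any n.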

Some useful things you should know about Python:

  • strings are sequences, so they can be iterated (character by character)
  • a character is a string itself
  • x in sequence returns True if x is contained in sequence
  • a or b evaluates to a if a is truthy, otherwise it evaluates to b
  • a (non-empty) string is truthy, i.e. bool() of it is True
  • two strings can be concatenated with +

However, I recommend reading a good introduction to the Python language.

Gounod answered 13/8, 2017 at 19:1 Comment(3)
Note that letter_word_3 becomes very very long, and then checking it for membership over and over again becomes quite expensive! (I know you know this, but for OP...) - Yelena
Yep, this solution scales very poorly. The size of letter_word_n will be 26^n, so it quickly becomes impossible to use this solution as n grows. - Malindamalinde
You're right, I misinterpreted what OP wanted to do. I added a second solution on top which is more plausible, but left the old one for reference. - Gounod
S
3

The most straightforward implementation of this is to use the following function:

def is_three_letters(word):
    return len(word) == 3 and word.isalpha()

So, for example:

>>> is_three_letters("ice") # True
True
>>> is_three_letters("ICE") # True
True
>>> is_three_letters("four") # False
False
>>> is_three_letters("to") # False
False
>>> is_three_letters("111") # False (numbers not allowed)
False

Using all is fine, but won't be faster than using built-in string methods. Plus, you shouldn't reinvent the wheel. If the language provides an adequate method, you should use it.
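
One difference worth knowing about: str.isalpha() is Unicode-aware, so it also accepts accented and non-Latin letters, which the ascii_lowercase and [a-zA-Z] checks in the other answers reject. If ASCII-only is what you want, a possible variant (the name is_three_ascii_letters is made up here, and str.isascii() needs Python 3.7 or later) is:

```python
def is_three_ascii_letters(word):
    """Three characters, all of them ASCII letters (either case)."""
    return len(word) == 3 and word.isascii() and word.isalpha()

accented = "\u00e9t\u00e9"   # "été", written with escapes to avoid encoding issues

print(accented.isalpha())                 # True: isalpha() accepts accented letters
print(is_three_ascii_letters(accented))   # False
print(is_three_ascii_letters("ice"))      # True
print(is_three_ascii_letters("ICE"))      # True
print(is_three_ascii_letters("111"))      # False
```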

Socio answered 13/8, 2017 at 19:31 Comment(0)
