Find whether there is an emoji in a string in Python 3 [duplicate]
F

2

19

I want to check whether a string contains exactly one emoji, using Python 3. For example, an is_emoji function should return True only when the string has a single emoji:

def is_emoji(s):
    pass

is_emoji("😘") #True
is_emoji("😘◼️") #False

I tried to use regular expressions, but emojis don't have a fixed length. For example:

print(len("◼️".encode("utf-8"))) # 6 
print(len("😘".encode("utf-8"))) # 4
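
The byte counts differ because the two strings hold a different number of Unicode code points, and each code point takes its own number of UTF-8 bytes. A minimal sketch using only built-ins (the helper name `describe` is made up for illustration):

```python
def describe(s):
    """Return (code point, UTF-8 byte count) pairs for each code point in s."""
    return [(f"U+{ord(ch):04X}", len(ch.encode("utf-8"))) for ch in s]

# Black medium square followed by variation selector-16: two code points
print(describe("\u25fc\ufe0f"))   # [('U+25FC', 3), ('U+FE0F', 3)]

# Face blowing a kiss: a single code point
print(describe("\U0001f618"))     # [('U+1F618', 4)]
```

So the 6-byte string is two code points of 3 bytes each, while the 4-byte string is one code point. Counting code points with `len(s)` is closer to what is needed than counting bytes, although one visible emoji can still span several code points.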
Farleigh answered 25/3, 2016 at 8:39 Comment(7)
Welcome to Stack Overflow! Please add some code that you've already tried. – Bulldoze
@OrangeFlash81 Thanks. I tried to use regular expressions, but I think there is no pattern for it, so I encoded the string in UTF-8, for example "◼️".encode("utf-8"), but there is no fixed length for emojis. – Farleigh
Why does the length matter? Have you considered whether the encoded version has any patterns you can use? – Effrontery
In principle you could use unicodedata but 😘 did not exist in the unicodedata db on Python 2.7 so YMMV. – Onetoone
@Effrontery There are patterns like b'\xf0\x9f\x98\x98' and b'\xe2\x97\xbc\xef\xb8\x8f', but how can I tell that there is only one emoji? – Farleigh
m.youtube.com/watch?v=sgHbC6udIqc. This talk on Unicode might help you understand the basic concepts of encoding. – Bosporus
The package was updated. emoji.is_emoji('N') – Kokoschka
E
6

This works in Python 3:

def is_emoji(s):
    emojis = "😘◼️" # add more emojis here
    count = 0
    for emoji in emojis:
        count += s.count(emoji)
        if count > 1:
            return False
    return bool(count)

Test:

>>> is_emoji("😘")
True
>>> is_emoji('◼')
True
>>> is_emoji("😘◼️")
False

Combine with Dunes' answer to avoid typing all emojis:

from emoji import UNICODE_EMOJI

def is_emoji(s):
    count = 0
    for emoji in UNICODE_EMOJI:
        count += s.count(emoji)
        if count > 1:
            return False
    return bool(count)

This is not terribly fast because UNICODE_EMOJI contains nearly 1330 items, but it works.
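
One possible refinement, sketched under assumptions: walk the string once and look up substrings in a set, trying the longest candidate first so that an emoji made of several code points counts as one. `KNOWN_EMOJI` is a hypothetical four-entry stand-in for a real emoji table, and `count_emoji` is a name invented here, not part of any package.

```python
# Hypothetical stand-in for a full emoji table; a real one is far longer.
KNOWN_EMOJI = {
    "\U0001f618",      # face blowing a kiss
    "\u25fc\ufe0f",    # black medium square + variation selector-16
    "\u25fc",          # black medium square without the selector
    "1\ufe0f\u20e3",   # keycap one: digit, selector, combining keycap
}
MAX_LEN = max(len(e) for e in KNOWN_EMOJI)

def count_emoji(s):
    """Count known emoji in s, preferring the longest match at each position."""
    count = 0
    i = 0
    while i < len(s):
        # Try the longest candidate first so combined sequences win
        for size in range(min(MAX_LEN, len(s) - i), 0, -1):
            if s[i:i + size] in KNOWN_EMOJI:
                count += 1
                i += size
                break
        else:
            i += 1  # no emoji starts here, skip one code point
    return count

def is_emoji(s):
    """True if s contains exactly one known emoji (other text is ignored)."""
    return count_emoji(s) == 1

print(is_emoji("\U0001f618"))               # True
print(is_emoji("\U0001f618\u25fc\ufe0f"))   # False
print(is_emoji("1\ufe0f\u20e3"))            # True
```

Each position costs at most `MAX_LEN` set lookups, so the work grows with the length of the string rather than with the size of the emoji table.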

Erek answered 25/3, 2016 at 8:58 Comment(4)
I think it's good ;) but I have to Type all emojis there :D – Farleigh
Sorry to necropost. The last code example here doesn't seem to work with characters that are combined. Such as the normal one symbol which is a combination of digit one and the keycap modifier? – Metrical
There is a better way of doing it without writing a custom function. Use emoji package! – Blackpoll
The second part of this answer is using the package emoji already. ;). – Nova
L
35

You could try using this emoji package. It's primarily used to convert escape sequences into Unicode emoji, and as a result it contains an up-to-date list of emojis.

from emoji import UNICODE_EMOJI

def is_emoji(s):
    return s in UNICODE_EMOJI

There are complications though, as sometimes two unicode code points can map to one printable glyph. For instance, human emoji followed by an "emoji modifier fitzpatrick type" should modify the colour of the preceding emoji; and certain emoji separated by a "zero width joiner" should be treated like a single character.
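
To make those complications concrete, here is a small sketch using only the standard library that lists the code points behind two such glyphs. The helper name `code_point_names` is made up for illustration.

```python
import unicodedata

def code_point_names(s):
    """Return the Unicode name of every code point in s."""
    return [unicodedata.name(ch, "<unnamed>") for ch in s]

# Thumbs up followed by a Fitzpatrick skin tone modifier
thumbs_up_medium = "\U0001f44d\U0001f3fd"
# Man, zero width joiner, woman, zero width joiner, girl
family = "\U0001f468\u200d\U0001f469\u200d\U0001f467"

print(len(thumbs_up_medium), code_point_names(thumbs_up_medium))
print(len(family), code_point_names(family))
```

Both strings usually render as one glyph, yet `len` reports 2 and 5. A membership test like the one above only recognises them if the table it checks contains the full sequence as a key.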

Lewallen answered 25/3, 2016 at 9:51 Comment(4)
This will check if the character is an emoji or not. Is there any way to check if a whole string contains an emoji? – Supramolecular
Note that UNICODE_EMOJI has 4 keys representing supported languages. This means that instead of the return statement above, use return s in UNICODE_EMOJI['en'] – Gravimeter
To check if a whole string contains an emoji, make a regexp like this: 1. r = '|'.join(list(UNICODE_EMOJI['en'].keys())) 2. r = r.replace('|*', '|\\*') 3. r = re.compile(r) – Eugene
The UNICODE_EMOJI dictionary has been removed in 2.0. I would suggest using bool(emoji.emoji_count(s)) – Zymo
