How to query only the data that has emojis (postgresql)
Asked Answered
C

1

8

I have data that contains emojis within a database column, i.e.

message_text
-------
πŸ™‚
πŸ˜€
Hi πŸ˜€

I want to query only the rows that have data containing emojis. Is there a simple way to do this in postgres?

select data from table where data_col contains emoji

Currently I use a simple query

select message_text from cb_messages_v1 cmv
 where message_text IN ('πŸ‘πŸ»','πŸ˜€','😐','πŸ™‚', '😧')

but I want it to be more dynamic, where if future emotions are added it will capture the data.

Combination answered 11/2, 2021 at 15:27 Comment(2)
The same way you would search for filed containing single character "A" or "B" or ... Try this: #12958493 – Aphanite
@Aphanite that is helpful, but I what if a new emoji character is introduced? i would have to update my code and include it. Is there a way to do this without having to hard code a list of all possible emojis? – Combination
P
20

From your example it seems like you are not only interested in emoticons (U+1F601 - U+1F64F), but also in Miscellaneous Symbols And Pictographs (U+1F300 - U+1F5FF) and Transport And Map Symbols (U+1F680 - U+1F6C5).

You can find values that contain one of these with

WHERE textcol ~ '[\U0001F300-\U0001F6FF]'

~ is the regular expression matching operator, and the pattern is a range of Unicode characters.

Polish answered 11/2, 2021 at 17:49 Comment(5)
Oh wow... so simple. No need for my long answer anymore :D Thanks, lesson learned. – Scratch
Would you happen to know how to do the same in python, as well? – Combination
Just like that - SQL is SQL. – Polish
FYI, this doesn't support all emojis. For instance, the flag emojis aren't detected by the regex. – Draftee
@Draftee Well, these are. But I guess this is a moving target. I have extended the range to 1F6FF. – Polish

© 2022 - 2024 β€” McMap. All rights reserved.