This question has been asked here before (Python: How to remove all emojis) without a solution. I have a step towards the solution, but I need help finishing it off.
I downloaded all the emoji hex code points from the Unicode site: https://www.unicode.org/emoji/charts/emoji-ordering.txt
I then read in the file like so:
file = open('emoji-ordering.txt')
temp = file.readline()
final_list = []
while temp != '':
    #print(temp)
    if not temp[0] == '#' :
        utf_8_values = ((temp.split(';')[0]).rstrip()).split(' ')
        values = ["u\\"+(word[0]+((8 - len(word[2:]))*'0' + word[2:]).rstrip()) for word in utf_8_values]
        #print(values[0])
        final_list = final_list + values
    temp = file.readline()
print(final_list)
I hoped this would give me Unicode literals, but it does not. My goal is to get Unicode literals so that I can use part of the solution from the previous question to exclude all emojis. Any ideas on what is needed to get to a solution?
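One possible direction, as a minimal sketch rather than a definitive answer: escape sequences such as \U0001F600 are only interpreted when they appear in Python source code, so concatenating "u\\" with hex digits at runtime just produces a string containing a literal backslash. Converting each hex value with int(..., 16) and chr() yields the actual characters instead. The function names parse_emoji_sequences and build_emoji_pattern are hypothetical, and the sample lines are assumed to follow the "U+XXXX ... ; version # name" layout of emoji-ordering.txt; in practice the lines would come from the opened file.

```python
import re

def parse_emoji_sequences(lines):
    """Turn lines like 'U+1F468 U+200D U+1F469 ; 2.0 # ...' into real strings."""
    sequences = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith('#'):
            continue
        code_points = line.split(';')[0].split()
        # 'U+1F600' -> int('1F600', 16) -> chr(...) gives the actual character.
        sequences.append(''.join(chr(int(cp[2:], 16)) for cp in code_points))
    return sequences

def build_emoji_pattern(sequences):
    """Compile one regex that matches any of the given emoji sequences."""
    # Longest first, so multi-code-point sequences win over their prefixes.
    ordered = sorted(set(sequences), key=len, reverse=True)
    return re.compile('|'.join(re.escape(s) for s in ordered))

# Assumed sample input; replace with open('emoji-ordering.txt', encoding='utf-8').
sample_lines = [
    "# comment line",
    "U+1F600 ; 6.1 # grinning face",
    "U+1F468 U+200D U+1F469 U+200D U+1F467 ; 6.0 # family",
]

emoji_pattern = build_emoji_pattern(parse_emoji_sequences(sample_lines))
cleaned = emoji_pattern.sub('', "hi \U0001F600 there")
```

Sorting the alternatives longest-first is a design choice: without it, a single-code-point entry could match the start of a longer joined sequence and leave stray joiner characters behind.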