I am using Python 2.7.10 on a Mac. Flags in emoji are indicated by a pair of Regional Indicator Symbols. I would like to write a Python regex to insert spaces between a string of emoji flags.
For example, this string is two Brazilian flags:
u"\U0001F1E7\U0001F1F7\U0001F1E7\U0001F1F7"
which will render like this: 🇧🇷🇧🇷
I'd like to insert spaces between any pair of regional indicator symbols. Something like this:
re.sub(re.compile(u"([\U0001F1E6-\U0001F1FF][\U0001F1E6-\U0001F1FF])"),
       r"\1 ",
       u"\U0001F1E7\U0001F1F7\U0001F1E7\U0001F1F7")
...which would result in:
u"\U0001F1E7\U0001F1F7 \U0001F1E7\U0001F1F7 "
...but that code gives me an error:
sre_constants.error: bad character range
A hint (I think) at what's going wrong is the following, which shows that \U0001F1E7 is turning into two "characters" in the regex:
re.search(re.compile(u"([\U0001F1E7])"),
          u"\U0001F1E7\U0001F1F7\U0001F1E7\U0001F1F7").group(0)
This results in:
u'\ud83c'
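That `u'\ud83c'` looks like the high half of a UTF-16 surrogate pair, which (if I understand correctly) would mean I'm on a narrow build, where each astral code point is stored as two code units. A quick check, assuming `sys.maxunicode` distinguishes the two build types:

```python
import sys

s = u"\U0001F1E7"  # REGIONAL INDICATOR SYMBOL LETTER B
if sys.maxunicode > 0xFFFF:
    # wide build (or Python 3): one code point is one "character"
    assert len(s) == 1
else:
    # narrow build: stored as a UTF-16 surrogate pair
    assert len(s) == 2 and s[0] == u"\ud83c"
```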
Sadly, my understanding of Unicode is too weak for me to make further progress.
Check sys.maxunicode: it is 1114111 on wide builds but 65535 on narrow builds. See Unicode in Python - just UTF-16? – Skied
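Following that hint, if I really am on a narrow build, I imagine something like this sketch might work: branch on `sys.maxunicode` and, on narrow builds, match the surrogate pairs directly. The narrow-build pattern below is my guess at the surrogate range for Regional Indicator Symbols, not something I've verified on a narrow build:

```python
import re
import sys

flags = u"\U0001F1E7\U0001F1F7\U0001F1E7\U0001F1F7"  # two Brazilian flags

if sys.maxunicode > 0xFFFF:
    # wide build (or Python 3): ranges over astral code points work directly
    pattern = re.compile(u"([\U0001F1E6-\U0001F1FF][\U0001F1E6-\U0001F1FF])")
else:
    # narrow build: each regional indicator symbol is the surrogate pair
    # \ud83c + \udde6..\uddff, so match two such pairs in a row
    pattern = re.compile(u"(\ud83c[\udde6-\uddff]\ud83c[\udde6-\uddff])")

spaced = pattern.sub(u"\\1 ", flags)
```

On a wide build this should produce `u"\U0001F1E7\U0001F1F7 \U0001F1E7\U0001F1F7 "` (note the trailing space, which I could strip afterwards).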