As you all know, emoji symbols are encoded with 3 or 4 bytes, so a single one may occupy 2 characters (UTF-16 code units) in my string. For example, '😁wew😁'.length is 7. I want to find those symbols in my text and replace each one with a value that depends on its code. Reading SO, I came across the XRegExp library with its Unicode plugin, but I have not found a way to make it work.
var str = '😁wew😁'; // U+1F601 symbol
var reg = XRegExp('[\u1F601-\u1F64F]', 'g'); // becomes /[ὠ1-ὤF]/g, which doesn't make a lot of sense
//var reg = XRegExp('[\uD83D\uDE01-\uD83D\uDE4F]', 'g'); // "Range out of order in character class"
//var reg = XRegExp('\\p{L}', 'g'); // doesn't match my symbols
console.log(XRegExp.replace(str, reg, function (match) {
  // here I want something like %F0%9F%98%84, so that I can map any value I want to it and use that as the replacement
  return encodeURIComponent(match);
}));
I really don't want to brute-force the string looking for sequences of characters from my range. Could someone help me find a way to do this with regular expressions?
EDITED: I just came up with the idea of enumerating all the emoji symbols. It is better than brute force, but I am still looking for a better idea.
var reg = XRegExp('\uD83D\uDE01|\uD83D\uDE4F|...','g');
'[\u1F601-\u1F64F]' is the correct way to match these points (although the block is U+1F300-U+1F5FF). – Modena
\u1F601 in a JavaScript string encodes two characters, U+1F60 followed by ASCII '1'. There's no way to use U+1F601 in a character class. – Exaggerate
/[\uD800-\uDBFF][\uDC00-\uDFFF]/g solved my problem. It includes not only emojis but also special characters. Referred #3745221 – Decibel
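For reference, here is a minimal sketch of the surrogate-pair approach from the last comment, using only the built-in String.prototype.replace (no XRegExp). The names encodeAstral, anyAstral and emoticons are hypothetical, chosen for illustration. The narrower emoticons pattern is an assumption that follows from the arithmetic: every code point from U+1F601 to U+1F64F has the same high surrogate \uD83D, so only the low surrogate needs a range.

```javascript
// U+1F601 written explicitly as its UTF-16 surrogate pair.
var str = '\uD83D\uDE01wew\uD83D\uDE01';

// '\u1F601' is parsed as U+1F60 followed by the digit '1',
// because \u takes exactly four hex digits.
var misparsed = '\u1F601';
console.log(misparsed.length); // 2

// Any high surrogate followed by any low surrogate (the pattern from the comment).
// This matches every astral symbol, not only emoji.
var anyAstral = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;

// Narrower pattern: U+1F601..U+1F64F all share the high surrogate \uD83D,
// so the range goes on the low surrogate only.
var emoticons = /\uD83D[\uDE01-\uDE4F]/g;

// Hypothetical helper: replace each match with its percent-encoded UTF-8 bytes.
function encodeAstral(text, pattern) {
  return text.replace(pattern, function (match) {
    return encodeURIComponent(match);
  });
}

console.log(encodeAstral(str, anyAstral)); // %F0%9F%98%81wew%F0%9F%98%81
console.log(encodeAstral(str, emoticons)); // %F0%9F%98%81wew%F0%9F%98%81
```

The encoded string returned inside the callback can then be used as a lookup key for whatever replacement value is needed. In engines that support the ES2015 u flag, the same range can be written directly as /[\u{1F601}-\u{1F64F}]/gu.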