How to remove emoji code using javascript?
S

20

70

How do I remove emoji code using JavaScript? I thought I had taken care of it using the code below, but I still have characters like 🔴.

function removeInvalidChars() {
    return this.replace(/[\uE000-\uF8FF]/g, '');
}
Stitching answered 12/6, 2012 at 8:22 Comment(3)
There's a lot of characters in that range -- perhaps you should instead remove individual codepoints you dislike? – Multivocal
I think #3745221 answers your question. – Derose
Here's a good article that also deals with those ranges: crocodillon.com/blog/parsing-emoji-unicode-in-javascript – Anopheles
M
67

The range you have selected is the Private Use Area, containing non-standard characters. Carriers used to encode emoji as different, inconsistent values inside this range.

More recently, the emoji have been given standardised 'unified' codepoints. Many of these are outside of the Basic Multilingual Plane, in the block U+1F300–U+1F5FF, including your example 🔴 U+1F534 Large Red Circle.

You could detect these characters with [\U0001F300-\U0001F5FF] in a regex engine that supported non-BMP characters, but JavaScript's RegExp is not such a beast. Unfortunately the JS string model is based on UTF-16 code units, so you'd have to work with the UTF-16 surrogates in a regexp:

return this.replace(/([\uE000-\uF8FF]|\uD83C[\uDF00-\uDFFF]|\uD83D[\uDC00-\uDDFF])/g, '')
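In engines that implement the ES2015 u flag (an assumption about the runtime, since the flag postdates this answer), the same ranges can be written as code points instead of surrogate pairs. A minimal sketch; stripPictographBlocks is a hypothetical name:

```javascript
// Sketch: same ranges as the surrogate-pair regex above, written as code points.
// Requires the `u` flag (ES2015+). The function name is illustrative only.
function stripPictographBlocks(text) {
  // U+E000-U+F8FF: Private Use Area; U+1F300-U+1F5FF: the block named above.
  return text.replace(/[\uE000-\uF8FF\u{1F300}-\u{1F5FF}]/gu, '');
}

console.log(stripPictographBlocks('stop \u{1F534} now')); // "stop  now"
```

With the u flag the regex works on whole code points, so no surrogate arithmetic is needed.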

However, note that there are other characters in the Basic Multilingual Plane that are used as emoji by phones but which long predate emoji. For example U+2665 is the traditional Heart Suit character ♥, but it may be rendered as an emoji graphic on some devices. It's up to you whether you treat this as emoji and try to remove it. See this list for more examples.

Marcasite answered 12/6, 2012 at 15:32 Comment(2)
Also take into account that if the string is later inserted into a database, replacing with an empty string could expose a security issue. Instead, replace with the replacement character U+FFFD; see: unicode.org/reports/tr36/#Deletion_of_Noncharacters – Cowry
How would you handle emojis with multiple codepoints such as the warning emoji? That one uses U+26A0 followed by U+FE0F. Your regex would leave the second codepoint untouched – Quadroon
C
115

For me, none of the answers completely removed all emojis, so I had to do some work myself, and this is what I got:

text.replace(/([\u2700-\u27BF]|[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2011-\u26FF]|\uD83E[\uDD10-\uDDFF])/g, '');

Also take into account that if the string is later inserted into a database, replacing with an empty string could expose a security issue. Instead, replace with the replacement character U+FFFD; see: http://www.unicode.org/reports/tr36/#Deletion_of_Noncharacters
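The replacement-character advice can be sketched as follows, reusing the ranges from this answer. The names EMOJI_RANGES and replaceEmojiWithMarker are assumptions made for illustration:

```javascript
// Sketch: substitute U+FFFD instead of deleting, using the ranges from this answer.
// Both identifiers are hypothetical names chosen for the example.
const EMOJI_RANGES = /([\u2700-\u27BF]|[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2011-\u26FF]|\uD83E[\uDD10-\uDDFF])/g;

function replaceEmojiWithMarker(text) {
  // Each match becomes one U+FFFD, so the text on either side is never joined together.
  return text.replace(EMOJI_RANGES, '\uFFFD');
}

console.log(replaceEmojiWithMarker('ok \uD83D\uDE00')); // "ok \uFFFD"
```

Keeping a visible marker in place of each removed character is the point of the design: the surrounding text cannot silently merge into a new token.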

Cowry answered 9/1, 2017 at 8:20 Comment(10)
Tried many solutions, but this one was a great success! A note to anyone working with the Twitter API - this worked for me! – Fleisher
This covers a pretty solid range, but I had to make a few edits to cover some omissions. Specifically, I extended the existing character set [\u2694-\u2697] to [\u2580-\u27BF] to include some additional shapes and dingbats, which now matches the common ❤️ character (\u2764\uFE0F). I also extended \uD83E[\uDD10-\uDD5D] to \uD83E[\uDD10-\uDDFF] to catch a handful of emoji such as 🧠, 🦄, 🦊, 🥦, and 🥪. – Governorship
@CalebMiller, would you care to post your final regex? – Presence
Hi @avalanche1, yeah this is what I used, I ended up making additional improvements as well: /[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2580-\u27BF]|\uD83E[\uDD10-\uDDFF]/g – Governorship
This doesn't remove this kind: 1️⃣ (1\uFE0F\u20E3) – Shien
Awesome answer. I was trying to use the one found here regextester.com/106421, but it was disallowing certain Japanese characters. The one you guys came up with here is great. Thanks so much! – Spanos
Add "| " (a pipe followed by a space) before /g and it will remove spaces too. In case anyone needs it: /[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2580-\u27BF]|\uD83E[\uDD10-\uDDFF]| /g – Nervous
Will this block all current and future emojis? New emojis are added every year to unicode: washingtonpost.com/kidspost/2022/01/31/… – Kinlaw
This doesn't seem to handle '👩‍👩‍👦‍👦' – Lighter
I'd also add |[\uFE00-\uFE0F] for the "variation selectors" codepoints.net/variation_selectors. They don't show up as anything, but it was making it so that my .trim() didn't work since the string had one at the beginning after removing the emoji proper. – Thibodeaux
W
52

I solved it by using a regex with Unicode property escapes. I got it from this article, it's for Java but still very helpful - Remove Emojis from a Java String.

'Smile😀'.replace(/[^\p{L}\p{N}\p{P}\p{Z}^$\n]/gu, '');

It removes all symbols except:

  • \p{L} - all letters from any language
  • \p{N} - numbers
  • \p{P} - punctuation
  • \p{Z} - whitespace separators
  • ^$\n - add any symbols you want to keep

This one should be more correct and it works, but for me it leaves some trash symbols in the string:

    'Smile😀'.replace(/\p{Emoji}/gu, '');
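The "trash symbols" and the digit problem raised in the comments can be reproduced with a short sketch. It assumes an engine with ES2018 Unicode property escapes; the variable names are illustrative:

```javascript
// Sketch: what \p{Emoji} does and does not match (needs ES2018 property escapes).
const emojiProperty = /\p{Emoji}/gu;

// Digits, '#' and '*' carry the Emoji property, so they are removed as well.
const digitsGone = 'Test123\u{1F60A}'.replace(emojiProperty, '');
console.log(digitsGone); // "Test"

// In a ZWJ sequence (man + skin tone + ZWJ + microphone) the joiner U+200D
// is not matched, so it stays behind as an invisible character.
const leftover = '\u{1F468}\u{1F3FF}\u200D\u{1F3A4}'.replace(emojiProperty, '');
console.log(leftover.length, leftover.charCodeAt(0)); // 1 8205
```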

Edit: added symbols from comments

Whosoever answered 18/8, 2020 at 8:3 Comment(9)
That's very elegant – Presence
Unfortunately it removes ^ and $. Should be /[^\p{L}\p{N}\p{P}\p{Z}{\^\$}]/gu – Presence
Works fine, but disables enter key. /[^\p{L}\p{N}\p{P}\p{Z}\n]/gu - this enables enter – Leanoraleant
What's the trash symbols? – Acuna
In the case of complex emoji. For example: '👨🏿‍🎤'.replace(/\p{Emoji}/gu, '').charCodeAt(0) – Whosoever
Clear and concise – Immune
I tested /\p{Emoji}/gu and it removes numeric values: "Test123😊" becomes "Test". – Troyes
@Troyes replacing "Emoji_Presentation" instead of "Emoji" appears to remove emojis but not numbers. – Uredo
@Whosoever Saved me a lot of time. Thanks :) – Lentil
H
22

I've found many suggestions around, but the regex that solved my problem is:

/(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|\ud83c[\ude32-\ude3a]|\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])/g

A short example

function removeEmojis (string) {
  var regex = /(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|\ud83c[\ude32-\ude3a]|\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])/g;
  return string.replace(regex, '');
}

Hope it can help you

Haslett answered 15/12, 2016 at 12:32 Comment(3)
Great answer for me. However a mistake in the regex causes this to also match right brackets ([). Looks like just a mistake from copying and pasting code, but Lucas please fix. Fixed regex here: pastebin.com/0VZZKfWf – Anticlinal
Thank you for your suggestion, @MarcGuiselin – Haslett
I would extend given regex with [\u200d] and [\ufe0f]. They are both special characters, which helps create emoji sequences. If you use just the regex above to remove emojis, your text will contain a lot of these whitespace chars. See evgenyzborovsky.com/2018/04/07/the-ultimate-guide-to-emojis – Elissaelita
H
16

Just an addition to @hababr's answer.

If you need to get rid of complicated emojis, you also have to remove additional things such as modifiers and components:

'👨🏿‍🎤'.replace(/[\p{Emoji}\p{Emoji_Modifier}\p{Emoji_Component}\p{Emoji_Modifier_Base}\p{Emoji_Presentation}]/gu, '')

But note that *#0-9 are technically considered Emoji characters with a text representation by default, per the Unicode Standard.

So to remove emojis without removing those characters:

'👨🏿‍🎤'.replace(/(?![*#0-9]+)[\p{Emoji}\p{Emoji_Modifier}\p{Emoji_Component}\p{Emoji_Modifier_Base}\p{Emoji_Presentation}]/gu, '')
Hildegard answered 21/10, 2021 at 11:29 Comment(1)
This seems to be the best answer as of 2022. – Rhu
W
7

After searching and trying lots of Unicode regexes, I suggest trying this one; it should cover most emojis:

function removeEmoji(str) {
  let strCopy = str;
  const emojiKeycapRegex = /[\u0023-\u0039]\ufe0f?\u20e3/g;
  const emojiRegex = /\p{Extended_Pictographic}/gu;
  const emojiComponentRegex = /\p{Emoji_Component}/gu;
  if (emojiKeycapRegex.test(strCopy)) {
    strCopy = strCopy.replace(emojiKeycapRegex, '');
  }
  if (emojiRegex.test(strCopy)) {
    strCopy = strCopy.replace(emojiRegex, '');
  }
  if (emojiComponentRegex.test(strCopy)) {
    // eslint-disable-next-line no-restricted-syntax
    for (const emoji of (strCopy.match(emojiComponentRegex) || [])) {
      if (/[\d|*|#]/.test(emoji)) {
        continue;
      }
      strCopy = strCopy.replace(emoji, '');
    }
  }

  return strCopy;
}
let a = "1️⃣aa🤹‍♂️b#️⃣🔤✅❎23#!^*bb🤹🏾🤹‍♀️🚴🏻ccc";
console.log(removeEmoji(a))

Reference: Unicode Emoji Document

Widgeon answered 11/5, 2021 at 7:31 Comment(0)
P
6

@bobince's solution didn't work for me. Either the Emojis stayed there or they were swapped by a different Emoji.

This solution did the trick for me:

var ranges = [
  '\ud83c[\udf00-\udfff]', // U+1F300 to U+1F3FF
  '\ud83d[\udc00-\ude4f]', // U+1F400 to U+1F64F
  '\ud83d[\ude80-\udeff]' // U+1F680 to U+1F6FF
];


$('#mybtn').on('click', function() {
  removeInvalidChars();
})

function removeInvalidChars() {
  var str = $('#myinput').val();

  str = str.replace(new RegExp(ranges.join('|'), 'g'), '');
  $("#myinput").val(str);
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<input type="text" id="myinput"/>
<input type="submit" id="mybtn" value="clear"/>

Source

Polly answered 17/8, 2016 at 2:26 Comment(2)
This solution also doesn't work for many characters, like 🤓🤔🥓🥓🤔🤔🤔🤔🤔 – Circumjacent
I entered the whole emoji list, but only some were cleared. The result was: ✌🤞🤦‍♂️🤦‍♀️‍🤣🤷‍♀️🤷‍♂️‍‍🤳‍‍‍‍✔✨🤒🤔 – Injustice
F
6

I know this post is a bit old, but I stumbled across this very problem at work and a colleague came up with an interesting idea: instead of stripping emoji characters, only allow valid characters in. Consulting this ASCII table:

http://www.asciitable.com/

A function such as this keeps only the allowed characters (the range itself depends on what you are after):

function keepPrintableAscii(input) {
    var result = '';
    for (var i = 0; i < input.length; i++) {
        var code = input.charCodeAt(i);
        // Keep only printable ASCII: space (32) through tilde (126).
        if (code >= 32 && code <= 126) {
            result += input[i];
        }
    }
    return result;
}

This preserves the digits, the letters of the English alphabet, and the ASCII punctuation and symbols, for situations where that is all you want to keep. Hope it helps someone :)
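The same allow-list idea can be sketched as a single regex. Here line feed and carriage return are kept as well, as suggested in the comments; keepAsciiText is a hypothetical name:

```javascript
// Sketch: regex form of the allow-list idea; the helper name is illustrative.
function keepAsciiText(input) {
  // Remove everything except printable ASCII (0x20-0x7E), "\n" and "\r".
  return input.replace(/[^\x20-\x7E\n\r]/g, '');
}

console.log(keepAsciiText('Hi \u{1F600}!\nBye')); // "Hi !\nBye"
```

Without the u flag the regex sees an emoji as two surrogate code units, and both fall outside the allowed range, so both are removed.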

Frumenty answered 8/11, 2016 at 15:46 Comment(2)
Great. Would add new line and carriage return to that though (ASCII 10 and 13) – Ancient
What if I need to keep locale-specific characters like cyrillic, hebrew, etc – Presence
A
4

None of the answers here worked for all the Unicode characters I tested (specifically characters in the miscellaneous range such as ⛽ or ☯️).

Here is one that worked for me, (heavily) inspired from this SO PHP answer:

function _removeEmojis(str) {
  return str.replace(/([#0-9]\u20E3)|[\xA9\xAE\u203C\u2047-\u2049\u2122\u2139\u3030\u303D\u3297\u3299][\uFE00-\uFEFF]?|[\u2190-\u21FF][\uFE00-\uFEFF]?|[\u2300-\u23FF][\uFE00-\uFEFF]?|[\u2460-\u24FF][\uFE00-\uFEFF]?|[\u25A0-\u25FF][\uFE00-\uFEFF]?|[\u2600-\u27BF][\uFE00-\uFEFF]?|[\u2900-\u297F][\uFE00-\uFEFF]?|[\u2B00-\u2BF0][\uFE00-\uFEFF]?|(?:\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDEFF])[\uFE00-\uFEFF]?/g, '');
}

(My use case is sorting in a data grid where emojis can come first in a string but users want the text ordered by the actual words.)
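The sorting use case might look roughly like the sketch below. stripForSort is a simplified stand-in (Extended_Pictographic plus ZWJ and U+FE0F), not the regex from this answer, and the sample labels are made up:

```javascript
// Sketch of the sorting use case. stripForSort is a simplified stand-in
// (Extended_Pictographic plus ZWJ and VS16), not the regex from this answer.
function stripForSort(label) {
  return label.replace(/[\p{Extended_Pictographic}\u200D\uFE0F]/gu, '').trim();
}

const labels = ['\u{1F34E} banana', 'apple', '\u26FD cherry'];
// Compare on the emoji-free text, but keep the original labels for display.
const sorted = [...labels].sort((a, b) => stripForSort(a).localeCompare(stripForSort(b)));
console.log(sorted); // apple first, then banana, then cherry (emoji prefixes intact)
```

Stripping only inside the comparator means the grid still shows the emoji while the order follows the words.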

Assiduous answered 27/11, 2016 at 3:8 Comment(2)
Thank you for this. One thing I noticed is that it wasn't catching all emojis. I found another regex string but it is doing something funky like deleting the character before and adding characters. I can't seem to figure out the difference. Here is a comparison in JSbin between yours and the other: link – Greensward
Thank you. This saves me a lot of time. Don't forget to add .trim() after the end to remove the empty spaces. – Lydialydian
A
3

sandre89's answer is good but not perfect. I spent some time on the subject and have a working solution.

var ranges = [
  '[\u00A0-\u269f]',
  '[\u26A0-\u329f]',
  // The following characters could not be minified correctly
  // if specified with the ES6 syntax \u{1F400}
  '[🀄-🧀]'
  //'[\u{1F004}-\u{1F9C0}]'
];


$('#mybtn').on('click', function() {
  removeInvalidChars();
});

function removeInvalidChars() {
  var str = $('#myinput').val();
  str = str.replace(new RegExp(ranges.join('|'), 'ug'), '');
  $("#myinput").val(str);
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<input type="text" id="myinput" />
<input type="submit" id="mybtn" value="clear" />

Here is my CodePen

There are some points to note, though.

  1. Unicode characters from U+1F000 up need a special notation, so you can use sandre89's way, or opt for the \u{1F000} ES6 notation, which may or may not work with your minifier. I succeeded by pasting the emojis directly into the UTF-8 encoded script.

  2. Don't forget the u flag in the regex, or your JavaScript engine may throw an error.

Beware that things may not work because of the file encoding, character set, or minifier. In my case nothing worked until I moved the script out of an .isml file (Demandware) and pasted it into a .js file.
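If neither the \u{...} notation nor pasted literals survive the build, the surrogate-pair escapes can be computed from the code point. A small sketch; toSurrogateEscape is a hypothetical helper, not something from this answer:

```javascript
// Sketch: derive the UTF-16 surrogate pair for a code point above U+FFFF.
// toSurrogateEscape is a hypothetical helper, not part of the answer.
function toSurrogateEscape(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  const hex = (n) => '\\u' + n.toString(16).toUpperCase();
  return hex(high) + hex(low);
}

console.log(toSurrogateEscape(0x1F004)); // \uD83C\uDC04
console.log(toSurrogateEscape(0x1F9C0)); // \uD83E\uDDC0
```

The printed escapes are plain ASCII, so they are unaffected by file encoding or minification.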

You may gain some insight by referring to Wikipedia Emoji page and How many bytes does one Unicode character take?, and by tinkering with this Online Unicode converter, as I did.

Arica answered 12/10, 2017 at 11:9 Comment(0)
C
2
var emoji =/([#0-9]\u20E3)|[\xA9\xAE\u203C\u2047-\u2049\u2122\u2139\u3030\u303D\u3297\u3299][\uFE00-\uFEFF]?|[\u2190-\u21FF][\uFE00-\uFEFF]?|[\u2300-\u23FF][\uFE00-\uFEFF]?|[\u2460-\u24FF][\uFE00-\uFEFF]?|[\u25A0-\u25FF][\uFE00-\uFEFF]?|[\u2600-\u27BF][\uFE00-\uFEFF]?|[\u2900-\u297F][\uFE00-\uFEFF]?|[\u2B00-\u2BF0][\uFE00-\uFEFF]?|(?:\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDEFF])[\uFE00-\uFEFF]?|[\u20E3]|[\u26A0-\u3000]|\uD83E[\udd00-\uddff]|[\u00A0-\u269F]/g;

str.replace(emoji, "");

I added '\uD83E[\udd00-\uddff]' to cover the emojis introduced in the June 2018 update.

If you also want to block emojis added by later updates, use an allow-list instead:

str.replace(/[^0-9a-zA-Zㄱ-힣+×÷=%♀♡☆♧)(*&^/~#@!-:;,?`_|<>{}¥£€$◇■□●○•°※¤《》¡¿₩\[\]\"\' \\]/g ,"");

This blocks all emojis and allows only English letters, digits, Hangul, and a few symbols. Thanks :)

Chris answered 2/5, 2019 at 3:35 Comment(3)
Please add explanation about your answer, for better understanding of readers. – Crusty
This doesn't support any language. – Carabao
Please explain your answer. – Echikson
C
2

There is a modern solution using Unicode properties

Modern browsers support Unicode property escapes, which let you match emojis by Unicode property. For example, \p{Emoji} matches emoji characters and \P{Emoji} matches everything else. Note that 0123456789#* and some other characters also carry the Emoji property, so \p{Emoji} matches them too. A better choice is therefore the Extended_Pictographic property, which covers the characters typically understood as emojis and excludes those digits and symbols.

const withEmojis = /\p{Extended_Pictographic}/u

withEmojis.test('😀😀');
//true

withEmojis.test('ab');
//false

withEmojis.test('1');
//false
Coinsure answered 14/11, 2022 at 10:51 Comment(1)
This won't match the keycap variations of digits: 1️⃣, 2️⃣, *️⃣, etc. @grabus' answer will, and will convert those to the non-keycap ASCII variation, which might be desirable for some. – Earthaearthborn
E
2

You can use mathiasbynens/emoji-regex package to remove or replace emojis.

You can see the latest build's content to grab the regex by visiting following url:

http://unpkg.com/emoji-regex/index.js

Elude answered 9/2, 2023 at 14:48 Comment(0)
S
1

You can use this function to replace emojis with nothing:

function msgAfterClearEmojis(msg)
{
    var new_msg = msg.replace(/([#0-9]\u20E3)|[\xA9\xAE\u203C\u2047-\u2049\u2122\u2139\u3030\u303D\u3297\u3299][\uFE00-\uFEFF]?|[\u2190-\u21FF][\uFE00-\uFEFF]?|[\u2300-\u23FF][\uFE00-\uFEFF]?|[\u2460-\u24FF][\uFE00-\uFEFF]?|[\u25A0-\u25FF][\uFE00-\uFEFF]?|[\u2600-\u27BF][\uFE00-\uFEFF]?|[\u2900-\u297F][\uFE00-\uFEFF]?|[\u2B00-\u2BF0][\uFE00-\uFEFF]?|(?:\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDEFF])[\uFE00-\uFEFF]?|[\u20E3]|[\u26A0-\u3000]|\uD83E[\udd00-\uddff]|[\u00A0-\u269F]/g, '').trim();
    return new_msg;
}
Stopple answered 5/8, 2019 at 9:36 Comment(0)
E
1

You can check here with emoji..

😊  , 😌  ,  👽

function removeEmoji() {
  var y = document.getElementById('textbox_id1');
  y.value = y.value.replace(/([\u2700-\u27BF]|[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2011-\u26FF]|\uD83E[\uDD10-\uDDFF])/g, '');
}
input {
  padding: 5px;
}
<input type="text" id="textbox_id1" placeholder="Remove emoji..." oninput="removeEmoji()">

You can take more emojis from here: Emoji Keyboard Online

Ecker answered 21/1, 2021 at 7:30 Comment(0)
P
1

This is an iteration on @hababr's answer.
His answer removes lots of standard chars like $, +, < and so on.
This version keeps all of them, including the backslash: it has to be written as \\ both in the string literal and inside the character class.

"hey😁 hau💓 ahoy🏴‍☠️ !@#$%^&*()-_=+±§;:'\\|`~/?[]{},.<>".replace(/[^\p{L}\p{N}\p{P}\p{Z}{^$=+±\\'|`\\~<>}]/gu, "")
// "hey hau ahoy !@#$%^&*()-_=+±§;:'\|`~/?[]{},.<>"
Presence answered 15/3, 2021 at 18:18 Comment(0)
W
1

I have this regex, and it works for all the emojis I found on this page.

Try this regex:

<:[^:\s]+:\d+>|<a:[^:\s]+:\d+>|(\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff]|\ufe0f)
Wrongful answered 19/12, 2022 at 15:50 Comment(0)
B
0
var emojiRegex = /\uD83C\uDFF4(?:\uDB40\uDC67\uDB40\uDC62(?:\uDB40\uDC65\uDB40\uDC6E\uDB40\uDC67|\uDB40\uDC77\uDB40\uDC6C\uDB40\uDC73|\uDB40\uDC73\uDB40\uDC63\uDB40\uDC74)\uDB40\uDC7F|\u200D\u2620\uFE0F)|\uD83D\uDC69\u200D\uD83D\uDC69\u200D(?:\uD83D\uDC66\u200D\uD83D\uDC66|\uD83D\uDC67\u200D(?:\uD83D[\uDC66\uDC67]))|\uD83D\uDC68(?:\u200D(?:\u2764\uFE0F\u200D(?:\uD83D\uDC8B\u200D)?\uD83D\uDC68|(?:\uD83D[\uDC68\uDC69])\u200D(?:\uD83D\uDC66\u200D\uD83D\uDC66|\uD83D\uDC67\u200D(?:\uD83D[\uDC66\uDC67]))|\uD83D\uDC66\u200D\uD83D\uDC66|\uD83D\uDC67\u200D(?:\uD83D[\uDC66\uDC67])|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E[\uDDB0-\uDDB3])|(?:\uD83C[\uDFFB-\uDFFF])\u200D(?:\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E[\uDDB0-\uDDB3]))|\uD83D\uDC69\u200D(?:\u2764\uFE0F\u200D(?:\uD83D\uDC8B\u200D(?:\uD83D[\uDC68\uDC69])|\uD83D[\uDC68\uDC69])|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E[\uDDB0-\uDDB3])|\uD83D\uDC69\u200D\uD83D\uDC66\u200D\uD83D\uDC66|(?:\uD83D\uDC41\uFE0F\u200D\uD83D\uDDE8|\uD83D\uDC69(?:\uD83C[\uDFFB-\uDFFF])\u200D[\u2695\u2696\u2708]|\uD83D\uDC68(?:(?:\uD83C[\uDFFB-\uDFFF])\u200D[\u2695\u2696\u2708]|\u200D[\u2695\u2696\u2708])|(?:(?:\u26F9|\uD83C[\uDFCB\uDFCC]|\uD83D\uDD75)\uFE0F|\uD83D\uDC6F|\uD83E[\uDD3C\uDDDE\uDDDF])\u200D[\u2640\u2642]|(?:\u26F9|\uD83C[\uDFCB\uDFCC]|\uD83D\uDD75)(?:\uD83C[\uDFFB-\uDFFF])\u200D[\u2640\u2642]|(?:\uD83C[\uDFC3\uDFC4\uDFCA]|\uD83D[\uDC6E\uDC71\uDC73\uDC77\uDC81\uDC82\uDC86\uDC87\uDE45-\uDE47\uDE4B\uDE4D\uDE4E\uDEA3\uDEB4-\uDEB6]|\uD83E[\uDD26\uDD37-\uDD39\uDD3D\uDD3E\uDDB8\uDDB9\uDDD6-\uDDDD])(?:(?:\uD83C[\uDFFB-\uDFFF])\u200D[\u2640\u2642]|\u200D[\u2640\u2642])|\uD83D\uDC69\u200D[\u2695\u2696\u2708])\uFE0F|\uD83D\uDC69\u200D\uD83D\uDC67\u200D(?:\uD83D[\uDC66\uDC67])|\uD83D\uDC69\u200D\uD83D\uDC69\u200D(?:\uD83D[\uDC66\uDC67])|\uD83D\uDC68(?:\u200D(?:(?:\u
D83D[\uDC68\uDC69])\u200D(?:\uD83D[\uDC66\uDC67])|\uD83D[\uDC66\uDC67])|\uD83C[\uDFFB-\uDFFF])|\uD83C\uDFF3\uFE0F\u200D\uD83C\uDF08|\uD83D\uDC69\u200D\uD83D\uDC67|\uD83D\uDC69(?:\uD83C[\uDFFB-\uDFFF])\u200D(?:\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E[\uDDB0-\uDDB3])|\uD83D\uDC69\u200D\uD83D\uDC66|\uD83C\uDDF6\uD83C\uDDE6|\uD83C\uDDFD\uD83C\uDDF0|\uD83C\uDDF4\uD83C\uDDF2|\uD83D\uDC69(?:\uD83C[\uDFFB-\uDFFF])|\uD83C\uDDED(?:\uD83C[\uDDF0\uDDF2\uDDF3\uDDF7\uDDF9\uDDFA])|\uD83C\uDDEC(?:\uD83C[\uDDE6\uDDE7\uDDE9-\uDDEE\uDDF1-\uDDF3\uDDF5-\uDDFA\uDDFC\uDDFE])|\uD83C\uDDEA(?:\uD83C[\uDDE6\uDDE8\uDDEA\uDDEC\uDDED\uDDF7-\uDDFA])|\uD83C\uDDE8(?:\uD83C[\uDDE6\uDDE8\uDDE9\uDDEB-\uDDEE\uDDF0-\uDDF5\uDDF7\uDDFA-\uDDFF])|\uD83C\uDDF2(?:\uD83C[\uDDE6\uDDE8-\uDDED\uDDF0-\uDDFF])|\uD83C\uDDF3(?:\uD83C[\uDDE6\uDDE8\uDDEA-\uDDEC\uDDEE\uDDF1\uDDF4\uDDF5\uDDF7\uDDFA\uDDFF])|\uD83C\uDDFC(?:\uD83C[\uDDEB\uDDF8])|\uD83C\uDDFA(?:\uD83C[\uDDE6\uDDEC\uDDF2\uDDF3\uDDF8\uDDFE\uDDFF])|\uD83C\uDDF0(?:\uD83C[\uDDEA\uDDEC-\uDDEE\uDDF2\uDDF3\uDDF5\uDDF7\uDDFC\uDDFE\uDDFF])|\uD83C\uDDEF(?:\uD83C[\uDDEA\uDDF2\uDDF4\uDDF5])|\uD83C\uDDF8(?:\uD83C[\uDDE6-\uDDEA\uDDEC-\uDDF4\uDDF7-\uDDF9\uDDFB\uDDFD-\uDDFF])|\uD83C\uDDEE(?:\uD83C[\uDDE8-\uDDEA\uDDF1-\uDDF4\uDDF6-\uDDF9])|\uD83C\uDDFF(?:\uD83C[\uDDE6\uDDF2\uDDFC])|\uD83C\uDDEB(?:\uD83C[\uDDEE-\uDDF0\uDDF2\uDDF4\uDDF7])|\uD83C\uDDF5(?:\uD83C[\uDDE6\uDDEA-\uDDED\uDDF0-\uDDF3\uDDF7-\uDDF9\uDDFC\uDDFE])|\uD83C\uDDE9(?:\uD83C[\uDDEA\uDDEC\uDDEF\uDDF0\uDDF2\uDDF4\uDDFF])|\uD83C\uDDF9(?:\uD83C[\uDDE6\uDDE8\uDDE9\uDDEB-\uDDED\uDDEF-\uDDF4\uDDF7\uDDF9\uDDFB\uDDFC\uDDFF])|\uD83C\uDDE7(?:\uD83C[\uDDE6\uDDE7\uDDE9-\uDDEF\uDDF1-\uDDF4\uDDF6-\uDDF9\uDDFB\uDDFC\uDDFE\uDDFF])|[#\*0-9]\uFE0F\u20E3|\uD83C\uDDF1(?:\uD83C[\uDDE6-\uDDE8\uDDEE\uDDF0\uDDF7-\uDDFB\uDDFE])|\uD83C\uDDE6(?:\uD83C[\uDDE8-\uDDEC\uDDEE\uDDF1\uDDF2\uDDF4\uDDF6-\uDDFA\uDDFC\uDDFD\uDDFF])|\uD83C\uDDF7(?:\uD83C[\uDDEA\uDDF4\uDDF8\uDDFA\uDDFC])|\uD
83C\uDDFB(?:\uD83C[\uDDE6\uDDE8\uDDEA\uDDEC\uDDEE\uDDF3\uDDFA])|\uD83C\uDDFE(?:\uD83C[\uDDEA\uDDF9])|(?:\uD83C[\uDFC3\uDFC4\uDFCA]|\uD83D[\uDC6E\uDC71\uDC73\uDC77\uDC81\uDC82\uDC86\uDC87\uDE45-\uDE47\uDE4B\uDE4D\uDE4E\uDEA3\uDEB4-\uDEB6]|\uD83E[\uDD26\uDD37-\uDD39\uDD3D\uDD3E\uDDB8\uDDB9\uDDD6-\uDDDD])(?:\uD83C[\uDFFB-\uDFFF])|(?:\u26F9|\uD83C[\uDFCB\uDFCC]|\uD83D\uDD75)(?:\uD83C[\uDFFB-\uDFFF])|(?:[\u261D\u270A-\u270D]|\uD83C[\uDF85\uDFC2\uDFC7]|\uD83D[\uDC42\uDC43\uDC46-\uDC50\uDC66\uDC67\uDC70\uDC72\uDC74-\uDC76\uDC78\uDC7C\uDC83\uDC85\uDCAA\uDD74\uDD7A\uDD90\uDD95\uDD96\uDE4C\uDE4F\uDEC0\uDECC]|\uD83E[\uDD18-\uDD1C\uDD1E\uDD1F\uDD30-\uDD36\uDDB5\uDDB6\uDDD1-\uDDD5])(?:\uD83C[\uDFFB-\uDFFF])|(?:[\u231A\u231B\u23E9-\u23EC\u23F0\u23F3\u25FD\u25FE\u2614\u2615\u2648-\u2653\u267F\u2693\u26A1\u26AA\u26AB\u26BD\u26BE\u26C4\u26C5\u26CE\u26D4\u26EA\u26F2\u26F3\u26F5\u26FA\u26FD\u2705\u270A\u270B\u2728\u274C\u274E\u2753-\u2755\u2757\u2795-\u2797\u27B0\u27BF\u2B1B\u2B1C\u2B50\u2B55]|\uD83C[\uDC04\uDCCF\uDD8E\uDD91-\uDD9A\uDDE6-\uDDFF\uDE01\uDE1A\uDE2F\uDE32-\uDE36\uDE38-\uDE3A\uDE50\uDE51\uDF00-\uDF20\uDF2D-\uDF35\uDF37-\uDF7C\uDF7E-\uDF93\uDFA0-\uDFCA\uDFCF-\uDFD3\uDFE0-\uDFF0\uDFF4\uDFF8-\uDFFF]|\uD83D[\uDC00-\uDC3E\uDC40\uDC42-\uDCFC\uDCFF-\uDD3D\uDD4B-\uDD4E\uDD50-\uDD67\uDD7A\uDD95\uDD96\uDDA4\uDDFB-\uDE4F\uDE80-\uDEC5\uDECC\uDED0-\uDED2\uDEEB\uDEEC\uDEF4-\uDEF9]|\uD83E[\uDD10-\uDD3A\uDD3C-\uDD3E\uDD40-\uDD45\uDD47-\uDD70\uDD73-\uDD76\uDD7A\uDD7C-\uDDA2\uDDB0-\uDDB9\uDDC0-\uDDC2\uDDD0-\uDDFF])|(?:[#\*0-9\xA9\xAE\u203C\u2049\u2122\u2139\u2194-\u2199\u21A9\u21AA\u231A\u231B\u2328\u23CF\u23E9-\u23F3\u23F8-\u23FA\u24C2\u25AA\u25AB\u25B6\u25C0\u25FB-\u25FE\u2600-\u2604\u260E\u2611\u2614\u2615\u2618\u261D\u2620\u2622\u2623\u2626\u262A\u262E\u262F\u2638-\u263A\u2640\u2642\u2648-\u2653\u265F\u2660\u2663\u2665\u2666\u2668\u267B\u267E\u267F\u2692-\u2697\u2699\u269B\u269C\u26A0\u26A1\u26AA\u26AB\u26B0\u26B1\u26BD\u26BE\u26C4\u26C5\u26C8\u26CE\u26CF\u26D1\u26D3\u26D4\u26E9\u26EA\u
26F0-\u26F5\u26F7-\u26FA\u26FD\u2702\u2705\u2708-\u270D\u270F\u2712\u2714\u2716\u271D\u2721\u2728\u2733\u2734\u2744\u2747\u274C\u274E\u2753-\u2755\u2757\u2763\u2764\u2795-\u2797\u27A1\u27B0\u27BF\u2934\u2935\u2B05-\u2B07\u2B1B\u2B1C\u2B50\u2B55\u3030\u303D\u3297\u3299]|\uD83C[\uDC04\uDCCF\uDD70\uDD71\uDD7E\uDD7F\uDD8E\uDD91-\uDD9A\uDDE6-\uDDFF\uDE01\uDE02\uDE1A\uDE2F\uDE32-\uDE3A\uDE50\uDE51\uDF00-\uDF21\uDF24-\uDF93\uDF96\uDF97\uDF99-\uDF9B\uDF9E-\uDFF0\uDFF3-\uDFF5\uDFF7-\uDFFF]|\uD83D[\uDC00-\uDCFD\uDCFF-\uDD3D\uDD49-\uDD4E\uDD50-\uDD67\uDD6F\uDD70\uDD73-\uDD7A\uDD87\uDD8A-\uDD8D\uDD90\uDD95\uDD96\uDDA4\uDDA5\uDDA8\uDDB1\uDDB2\uDDBC\uDDC2-\uDDC4\uDDD1-\uDDD3\uDDDC-\uDDDE\uDDE1\uDDE3\uDDE8\uDDEF\uDDF3\uDDFA-\uDE4F\uDE80-\uDEC5\uDECB-\uDED2\uDEE0-\uDEE5\uDEE9\uDEEB\uDEEC\uDEF0\uDEF3-\uDEF9]|\uD83E[\uDD10-\uDD3A\uDD3C-\uDD3E\uDD40-\uDD45\uDD47-\uDD70\uDD73-\uDD76\uDD7A\uDD7C-\uDDA2\uDDB0-\uDDB9\uDDC0-\uDDC2\uDDD0-\uDDFF])\uFE0F|(?:[\u261D\u26F9\u270A-\u270D]|\uD83C[\uDF85\uDFC2-\uDFC4\uDFC7\uDFCA-\uDFCC]|\uD83D[\uDC42\uDC43\uDC46-\uDC50\uDC66-\uDC69\uDC6E\uDC70-\uDC78\uDC7C\uDC81-\uDC83\uDC85-\uDC87\uDCAA\uDD74\uDD75\uDD7A\uDD90\uDD95\uDD96\uDE45-\uDE47\uDE4B-\uDE4F\uDEA3\uDEB4-\uDEB6\uDEC0\uDECC]|\uD83E[\uDD18-\uDD1C\uDD1E\uDD1F\uDD26\uDD30-\uDD39\uDD3D\uDD3E\uDDB5\uDDB6\uDDB8\uDDB9\uDDD1-\uDDDD])/g;
console.log(text.replace(emojiRegex, ''));
Boundary answered 30/10, 2018 at 14:26 Comment(2)
You might want to explain what and how the code you submitted works. – Alcoholometer
Please explain this answer as it has been marked for deletion. I would love for you to get credit on it so please post a what is happening in this snippit – Winding
S
0
<!DOCTYPE html>
<html>
<head>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
<script>
function isEmoji(str) {
    var ranges = [       
       '[\uE000-\uF8FF]',
       '\uD83C[\uDC00-\uDFFF]',
       '\uD83D[\uDC00-\uDFFF]',
       '[\u2011-\u26FF]',
       '\uD83E[\uDD10-\uDDFF]'         
    ];
    // Build one alternation from the ranges and report whether any part of str matches.
    return new RegExp(ranges.join('|')).test(str);
}
$(document).ready(function(){
 $('input').on('input',function(){
    var $th = $(this);
    console.log("Value of input: " + $th.val());
    // Clear the whole field as soon as it contains an emoji.
    var emojiInput = isEmoji($th.val());
    if (emojiInput) {
        $th.val("");
    }
});
});
</script>
</head>
<body>
Enter your name: <input type="text">
</body>
</html>
Swim answered 19/8, 2019 at 13:50 Comment(1)
While this code may answer the question, providing additional context regarding why and/or how this code answers the question improves its long-term value. For more help see how to answer. – Libelant
I
0

In detail, this function first uses TextEncoder to convert the content into a UTF-8 byte array and then loops through it. A byte whose top five bits are 11110 (that is, byte & 0xF8 equals 0xF0) starts a four-byte sequence, which is how most emojis are encoded, so that byte and the three continuation bytes after it are dropped. Finally, TextDecoder converts the remaining bytes back to a string. Note that this removes every character outside the Basic Multilingual Plane, and leaves emojis inside it, such as ☯, untouched.

function removeEmoji (content) {
     // TextEncoder always produces UTF-8.
     const bytes = new TextEncoder().encode(content);
     const kept = [];
     for (let i = 0; i < bytes.length; i++) {
        // 11110xxx is the lead byte of a four-byte UTF-8 sequence.
        if ((bytes[i] & 0xF8) === 0xF0) {
            i += 3; // skip the three continuation bytes as well
            continue;
        }
        kept.push(bytes[i]);
     }
     return new TextDecoder("utf-8").decode(new Uint8Array(kept));
}
Isreal answered 16/2, 2023 at 2:30 Comment(2)
Please explain your answer. Don't just dump code and expect everyone to understand it. – Niemeyer
As it's currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center. – Awfully
