Detect if a user has typed an emoji character in UITextView

Asked 15/1, 2013 at 0:10 Answered 27/11, 2019 at 8:30

Solved ios objective-c unicode detect emoji

I have a UITextView and I need to detect if a user enters an emoji character.

I would think that just checking the Unicode value of the newest character would suffice, but with the new emoji 2s, some characters are scattered throughout the Unicode range (e.g. Apple's newly designed copyright and registered-trademark logos).

Perhaps something to do with checking the language of the character with NSLocale or LocalizedString values?

Does anyone know a good solution?

Thanks!

Tullis answered 15/1, 2013 at 0:10 Comment(4)

Out of curiosity, why do you want to detect this? – Live 15/1, 2013 at 0:56

I'm making a text editor that adds text effects via HTML/CSS but the text is entered through a UITextField.... Emojis don't display properly with my CSS effects so I need to not allow users to use them. – Tullis 15/1, 2013 at 1:4

Back after 3 years - it may be possible to add them to a UILabel and see if the font assigned is AppleColorEmoji? You could also snapshot UILabel w/ the character and average the pixels into one and see if it's black, if it's not it's an emoji (with the exception of solid black emojis) – Tullis 22/9, 2016 at 22:46

Actually, don't add them to UILabel. Put them to the NSMutableAttributedString, and then call .fixAttributes on it. Then check what fonts are assigned to them. And you might want to check if it is anything other than Helvetica: there are certain characters that use other fonts. – Plataea 9/1, 2021 at 0:28

Over the years these emoji-detecting solutions keep breaking as Apple adds new emoji built in new ways (like skin-toned emoji, which combine a base character with an additional modifier character).

I finally broke down and just wrote the following method which works for all current emojis and should work for all future emojis.

The solution creates a UILabel with the character and a black background. Core Graphics then takes a snapshot of the label and I scan all pixels in the snapshot for any pixel that is not solid black. The reason I add the black background is to avoid false coloring caused by subpixel rendering.

The solution runs VERY fast on my device, I can check hundreds of characters a second, but it should be noted that this is a CoreGraphics solution and should not be used heavily like you could with a regular text method. Graphics processing is data heavy so checking thousands of characters at once could result in noticeable lag.

-(BOOL)isEmoji:(NSString *)character {
    
    UILabel *characterRender = [[UILabel alloc] initWithFrame:CGRectMake(0, 0, 1, 1)];
    characterRender.text = character;
    characterRender.backgroundColor = [UIColor blackColor];//needed to remove subpixel rendering colors
    [characterRender sizeToFit];
    
    CGRect rect = [characterRender bounds];
    UIGraphicsBeginImageContextWithOptions(rect.size,YES,0.0f);
    CGContextRef contextSnap = UIGraphicsGetCurrentContext();
    [characterRender.layer renderInContext:contextSnap];
    UIImage *capturedImage = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    
    CGImageRef imageRef = [capturedImage CGImage];
    NSUInteger width = CGImageGetWidth(imageRef);
    NSUInteger height = CGImageGetHeight(imageRef);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    unsigned char *rawData = (unsigned char*) calloc(height * width * 4, sizeof(unsigned char));
    NSUInteger bytesPerPixel = 4;
    NSUInteger bytesPerRow = bytesPerPixel * width;
    NSUInteger bitsPerComponent = 8;
    CGContextRef context = CGBitmapContextCreate(rawData, width, height,
                                                 bitsPerComponent, bytesPerRow, colorSpace,
                                                 kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
    CGColorSpaceRelease(colorSpace);
    
    CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
    CGContextRelease(context);
    
    BOOL colorPixelFound = NO;
    
    int x = 0;
    int y = 0;
    while (y < height && !colorPixelFound) {
        while (x < width && !colorPixelFound) {
            
            NSUInteger byteIndex = (bytesPerRow * y) + x * bytesPerPixel;
            
            CGFloat red = (CGFloat)rawData[byteIndex];
            CGFloat green = (CGFloat)rawData[byteIndex+1];
            CGFloat blue = (CGFloat)rawData[byteIndex+2];
            
            CGFloat h, s, b, a;
            UIColor *c = [UIColor colorWithRed:red green:green blue:blue alpha:1.0f];
            [c getHue:&h saturation:&s brightness:&b alpha:&a];
            
            b /= 255.0f;
            
            if (b > 0) {
                colorPixelFound = YES;
            }
            
            x++;
        }
        x=0;
        y++;
    }
    
    free(rawData); // release the pixel buffer allocated with calloc above

    return colorPixelFound;
    
}

*Note: If Apple were to ever roll out a solid black emoji this technique could be improved by running the process twice, once with black font and black background, then again with white font and white background, and OR'ing the results.

Tullis answered 23/1, 2013 at 3:51 Comment(7)

I have never seen a solution where creating a hard-coded array of values was a good idea. This suggestion is exceptionally bad in being rife for error and not future-proof. A better solution would be the combined use of querying the textInputMode of the UITextView in question and seeing if the primaryLanguage is "emoji" – Flavorous 3/9, 2014 at 19:25

@Flavorous Yeah, unless they copy paste an emoji in, or use a custom keyboard (iOS8 App Extension) which has emoji character but has primaryLanguage set to English. – Tullis 4/9, 2014 at 17:14

Your second if block should have an else. Also, FYI, instead of writing if(bool) { return NO; } else { return YES; } you can, and may want to, write return !bool. – Wynn 23/7, 2015 at 22:46

Edited out my old solutions and wrote a CG solution, disregard previous comments – Tullis 23/2, 2017 at 23:20

@AlbertRenshaw Interesting solution, but I'd post it as a completely new answer and rollback this answer to its previous state. – Despinadespise 24/2, 2017 at 0:5

@AlbertRenshaw You may wish to see the community wiki answer I posted. It provides a much more efficient and cleaner implementation of your code (in both Objective-C and Swift). Thanks for your starting point. – Tracheotomy 18/6, 2019 at 8:9

FYI - rawData[byteIndex] returns an unsigned char with a value in the range 0-255. Casting it into a CGFloat makes no sense. You should divide the byte by 255 to convert it to a float in the range 0.0 - 1.0 for the use in the UIColor init. Also, the call to getHue:saturation:brightness:alpha: can fail so you should check the return value. Also, it returns values in the range 0.0 - 1.0 so dividing b by 255 isn't appropriate. And since you don't need the hue, saturation, or alpha, you can pass nil for those values. – Kazoo 21/4, 2023 at 22:7

First let's address your "55357 method" – and why it works for many emoji characters.

In Cocoa, an NSString is a collection of unichars, and unichar is just a typealias for unsigned short which is the same as UInt16. Since the maximum value of UInt16 is 0xffff, this rules out quite a few emoji from being able to fit into one unichar, as only two out of the six main Unicode blocks used for emoji fall under this range:

Miscellaneous Symbols (U+2600–U+26FF)
Dingbats (U+2700–U+27BF)

These blocks contain 113 emoji, and an additional 66 emoji that can be represented as a single unichar can be found spread around various other blocks. However, these 179 characters only represent a fraction of the 1126 emoji base characters, the rest of which must be represented by more than one unichar.

Let's analyse your code:

unichar unicodevalue = [text characterAtIndex:0];

What's happening is that you're simply taking the first unichar of the string, and while this works for the previously mentioned 179 characters, it breaks apart when you encounter a UTF-32 character (a code point above U+FFFF, too large for one unichar), since NSString converts everything into UTF-16 encoding. The conversion works by substituting the UTF-32 value with a surrogate pair, which means that the NSString now contains two unichars.

And now we're getting to why the number 55357, or 0xd83d, appears for many emoji: when you only look at the first UTF-16 value of a UTF-32 character you get the high surrogate, each of which has a span of 1024 low surrogates. The range for the high surrogate 0xd83d is U+1F400–U+1F7FF, which starts in the middle of the largest emoji block, Miscellaneous Symbols and Pictographs (U+1F300–U+1F5FF), and continues all the way up to Geometric Shapes Extended (U+1F780–U+1F7FF), containing a total of 563 emoji and 333 non-emoji characters within this range.

So, an impressive 50% of emoji base characters have the high surrogate 0xd83d, but these deduction methods still leave 384 emoji characters unhandled, along with giving false positives for at least as many.
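The surrogate arithmetic described above can be checked with a few lines of Swift. This is only a sketch: the helper name scalarRange(forHighSurrogate:) is hypothetical and not part of any framework.

```swift
// Hypothetical helper: the range of code points that share one UTF-16 high surrogate.
func scalarRange(forHighSurrogate high: UInt16) -> ClosedRange<UInt32> {
    precondition((0xD800...0xDBFF).contains(high), "not a high surrogate")
    // Each high surrogate covers 1024 (0x400) consecutive code points above U+FFFF.
    let first = ((UInt32(high) - 0xD800) << 10) + 0x10000
    return first...(first + 0x3FF)
}

let emoji = "\u{1F60E}"                 // smiling face with sunglasses
let units = Array(emoji.utf16)          // the two unichars an NSString would hold
let range = scalarRange(forHighSurrogate: units[0])

print(units.map { String($0, radix: 16) })
print(String(range.lowerBound, radix: 16), String(range.upperBound, radix: 16))
```

The first unit is 0xd83d (55357 in decimal), and the computed range for it is U+1F400 to U+1F7FF, matching the figures in the answer.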

So, how can you detect whether a character is an emoji or not?

I recently answered a somewhat related question with a Swift implementation, and if you want to, you can look at how emoji are detected in this framework, which I created for the purpose of replacing standard emoji with custom images.

Anyhow, what you can do is extract the UTF-32 code point from the characters, which we'll do according to the specification:

- (BOOL)textView:(UITextView *)textView shouldChangeTextInRange:(NSRange)range replacementText:(NSString *)text {

    // Get the UTF-16 representation of the text.
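    NSUInteger length = text.length;
    if (length == 0) {
        // Deletions arrive as an empty replacement string; there is nothing to inspect.
        return YES;
    }
    unichar buffer[length];
    [text getCharacters:buffer range:NSMakeRange(0, length)];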

    // Initialize array to hold our UTF-32 values.
    NSMutableArray *array = [[NSMutableArray alloc] init];

    // Temporary stores for the UTF-32 and UTF-16 values.
    UTF32Char utf32 = 0;
    UTF16Char h16 = 0, l16 = 0;

    for (int i = 0; i < length; i++) {
        unichar surrogate = buffer[i];

        // High surrogate (0xd800–0xdbff).
        if (0xd800 <= surrogate && surrogate <= 0xdbff) {
            h16 = surrogate;
            continue;
        }
        // Low surrogate.
        else if (0xdc00 <= surrogate && surrogate <= 0xdfff) {
            l16 = surrogate;

            // Convert surrogate pair to UTF-32 encoding.
            utf32 = ((h16 - 0xd800) << 10) + (l16 - 0xdc00) + 0x10000;
        }
        // Normal UTF-16.
        else {
            utf32 = surrogate;
        }

        // Add UTF-32 value to array.
        [array addObject:[NSNumber numberWithUnsignedInteger:utf32]];
    }

    NSLog(@"%@ contains values:", text);

    for (int i = 0; i < array.count; i++) {
        UTF32Char character = (UTF32Char)[[array objectAtIndex:i] unsignedIntegerValue];
        NSLog(@"\t- U+%x", character);
    }

    return YES;
}

Typing "😎" into the UITextView writes this to console:

😎 contains values:
    - U+1f60e

With that logic, just compare the value of character to your data source of emoji code points, and you'll know exactly if the character is an emoji or not.
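For readers working in Swift, the comparison step might look like the sketch below. Swift exposes code points directly through unicodeScalars, so no manual surrogate decoding is needed. The table sampleEmojiRanges and the helper containsListedCodePoint are hypothetical, and the table is deliberately incomplete: it lists whole blocks, which also contain non-emoji characters, so a real data source would be a precise list of code points.

```swift
// Illustrative, incomplete table (an assumption for this sketch, not a real data source).
let sampleEmojiRanges: [ClosedRange<UInt32>] = [
    0x2600...0x26FF,    // Miscellaneous Symbols
    0x2700...0x27BF,    // Dingbats
    0x1F300...0x1F5FF,  // Miscellaneous Symbols and Pictographs
    0x1F600...0x1F64F,  // Emoticons
]

// Hypothetical helper: true if any code point of `text` falls inside the sample table.
func containsListedCodePoint(_ text: String) -> Bool {
    return text.unicodeScalars.contains { scalar in
        sampleEmojiRanges.contains { $0.contains(scalar.value) }
    }
}

print(containsListedCodePoint("\u{1F60E}"))  // U+1F60E lies in the Emoticons range
print(containsListedCodePoint("hello"))      // plain ASCII matches no range
```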

P.S.

There are a few "invisible" characters, namely Variation Selectors and zero-width joiners, that also should be handled, so I recommend studying those to learn how they behave.
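To see those invisible characters, it can help to list the code points of a composed emoji. The following Swift sketch uses a hypothetical helper, codePoints(of:), and two sample strings chosen for illustration.

```swift
// Hypothetical helper: format each code point of a string as "U+XXXX".
func codePoints(of text: String) -> [String] {
    return text.unicodeScalars.map { "U+" + String($0.value, radix: 16, uppercase: true) }
}

// man + zero-width joiner + woman + zero-width joiner + girl
let family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}"
// heavy black heart + variation selector-16 (requests emoji presentation)
let heart = "\u{2764}\u{FE0F}"

print(codePoints(of: family))
print(codePoints(of: heart))
```

Neither U+200D nor U+FE0F is an emoji on its own, so a per-code-point check has to decide whether to skip them or treat them as part of the neighbouring emoji.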

Despinadespise answered 7/12, 2016 at 3:37 Comment(2)

Thank you for that in depth explanation! I was wondering how it all worked. Surrogate pairs are interesting! It should also be noted that many emoji characters have been added to the unicode standard since my original post in 2013, back then it did account for almost all emojis I believe except maybe a few flags. I'll mark this as the new accepted answer, thanks again! – Tullis 7/12, 2016 at 6:42

No problems! And absolutely, it probably worked for most emoji, but also take into account that it would have dismissed a few hundred non-emoji characters. Yes, flags consist of two combined regional indicator symbols, so all flags would fall outside the range of the 55357 high surrogate. – Despinadespise 7/12, 2016 at 10:13

Another solution: https://github.com/woxtu/NSString-RemoveEmoji

Then, after importing this category, you can use it like this:

- (BOOL)textView:(UITextView *)textView shouldChangeTextInRange:(NSRange)range replacementText:(NSString *)text
{
    // Detect if an Emoji is in the string "text"
    if(text.isIncludingEmoji) {
        // Show an UIAlertView, or whatever you want here
        return NO;
    }

    return YES;
}

Hope that helps ;)

Schrock answered 8/1, 2015 at 12:56 Comment(1)

Please note that iOS 9.1 added more emojis that above mentioned method doesnt recognize (especially these ones:🤐🤑🤒🤓🤔🤕🤖🤗🤘🦀🦁🦂🦃🦄🧀). FIX: replace return (0x1d000 <= codepoint && codepoint <= 0x1f77f); in isEmoji method with return (0x1d000 <= codepoint && codepoint <= 0x1f77f) || (0x1F900 <= codepoint && codepoint <=0x1f9ff); – Porter 4/11, 2015 at 18:11

If you do not want your keyboard to show emoji, you can use YOURTEXTFIELD/YOURTEXTVIEW.keyboardType = .asciiCapable (spelled .ASCIICapable before Swift 3).
This will show a keyboard with no emoji.

Roentgenotherapy answered 2/6, 2016 at 7:30 Comment(1)

Yes but the user can still paste emojis in – Tullis 2/6, 2016 at 18:40

Here is the emoji detection method in Swift. It works fine. Hope it will help others.

func isEmoji(_ character: String?) -> Bool {

    if character == "" || character == "\n" {
        return false
    }
    let characterRender = UILabel(frame: CGRect(x: 0, y: 0, width: 1, height: 1))
    characterRender.text = character
    characterRender.backgroundColor = UIColor.black // needed to remove subpixel rendering colors
    characterRender.sizeToFit()
    let rect: CGRect = characterRender.bounds
    UIGraphicsBeginImageContextWithOptions(rect.size, true, 0.0)

    if let contextSnap: CGContext = UIGraphicsGetCurrentContext() {
        characterRender.layer.render(in: contextSnap)
    }

    let capturedImage: UIImage? = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    var colorPixelFound: Bool = false

    let imageRef = capturedImage?.cgImage
    let width: Int = imageRef!.width
    let height: Int = imageRef!.height

    let colorSpace = CGColorSpaceCreateDeviceRGB()

    let rawData = calloc(width * height * 4, MemoryLayout<CUnsignedChar>.stride).assumingMemoryBound(to: CUnsignedChar.self)

    let bytesPerPixel: Int = 4
    let bytesPerRow: Int = bytesPerPixel * width
    let bitsPerComponent: Int = 8

    let context = CGContext(data: rawData, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Big.rawValue)

    context?.draw(imageRef!, in: CGRect(x: 0, y: 0, width: width, height: height))

    var x: Int = 0
    var y: Int = 0
    while y < height && !colorPixelFound {
        while x < width && !colorPixelFound {
            let byteIndex = (bytesPerRow * y) + x * bytesPerPixel
            let red = CGFloat(rawData[byteIndex])
            let green = CGFloat(rawData[byteIndex + 1])
            let blue = CGFloat(rawData[byteIndex + 2])

            var h: CGFloat = 0.0
            var s: CGFloat = 0.0
            var b: CGFloat = 0.0
            var a: CGFloat = 0.0

            let c = UIColor(red: red, green: green, blue: blue, alpha: 1.0)
            c.getHue(&h, saturation: &s, brightness: &b, alpha: &a)

            b = b / 255.0

            if Double(b) > 0.0 {
                colorPixelFound = true
            }
            x += 1
        }
        x = 0
        y += 1
    }

    // Release the pixel buffer allocated with calloc above.
    free(rawData)

    return colorPixelFound
}

Horticulture answered 30/4, 2019 at 9:43 Comment(2)

Thanks for converting! – Tullis 30/4, 2019 at 22:23

@Horticulture You may wish to see the community wiki answer I posted. It provides a more efficient and cleaner version. There were actually several issues in the original Objective-C code that carry over into your translation. – Tracheotomy 18/6, 2019 at 8:11

The following are cleaner and more efficient implementations of the code that checks to see if the drawn character has any color or not.

These have been written as category/extension methods to make them easier to use.

Objective-C:

NSString+Emoji.h:

#import <Foundation/Foundation.h>

@interface NSString (Emoji)

- (BOOL)hasColor;

@end

NSString+Emoji.m:

#import "NSString+Emoji.h"
#import <UIKit/UIKit.h>

@implementation NSString (Emoji)

- (BOOL)hasColor {
    UILabel *characterRender = [[UILabel alloc] initWithFrame:CGRectZero];
    characterRender.text = self;
    characterRender.textColor = UIColor.blackColor;
    characterRender.backgroundColor = UIColor.blackColor;//needed to remove subpixel rendering colors
    [characterRender sizeToFit];

    CGRect rect = characterRender.bounds;
    UIGraphicsBeginImageContextWithOptions(rect.size, YES, 1);
    CGContextRef contextSnap = UIGraphicsGetCurrentContext();
    [characterRender.layer renderInContext:contextSnap];
    UIImage *capturedImage = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();

    CGImageRef imageRef = capturedImage.CGImage;
    size_t width = CGImageGetWidth(imageRef);
    size_t height = CGImageGetHeight(imageRef);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    size_t bytesPerPixel = 4;
    size_t bitsPerComponent = 8;
    size_t bytesPerRow = bytesPerPixel * width;
    size_t size = height * width * bytesPerPixel;
    unsigned char *rawData = (unsigned char *)calloc(size, sizeof(unsigned char));
    CGContextRef context = CGBitmapContextCreate(rawData, width, height,
                                                 bitsPerComponent, bytesPerRow, colorSpace,
                                                 kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
    CGColorSpaceRelease(colorSpace);

    CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
    CGContextRelease(context);

    BOOL result = NO;
    for (size_t offset = 0; offset < size; offset += bytesPerPixel) {
        unsigned char r = rawData[offset];
        unsigned char g = rawData[offset+1];
        unsigned char b = rawData[offset+2];

        if (r || g || b) {
            result = YES;
            break;
        }
    }

    free(rawData);

    return result;
}

@end

Example usage:

if ([@"😎" hasColor]) {
    // Yes, it does
}
if ([@"@" hasColor]) {
} else {
    // No, it does not
}

Swift:

String+Emoji.swift:

import UIKit

extension String {
    func hasColor() -> Bool {
        let characterRender = UILabel(frame: .zero)
        characterRender.text = self
        characterRender.textColor = .black
        characterRender.backgroundColor = .black
        characterRender.sizeToFit()
        let rect = characterRender.bounds
        UIGraphicsBeginImageContextWithOptions(rect.size, true, 1)

        let contextSnap = UIGraphicsGetCurrentContext()!
        characterRender.layer.render(in: contextSnap)

        let capturedImageTmp = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()
        guard let capturedImage = capturedImageTmp else { return false }

        let imageRef = capturedImage.cgImage!
        let width = imageRef.width
        let height = imageRef.height

        let colorSpace = CGColorSpaceCreateDeviceRGB()

        let bytesPerPixel = 4
        let bytesPerRow = bytesPerPixel * width
        let bitsPerComponent = 8
        let size = width * height * bytesPerPixel
        let rawData = calloc(size, MemoryLayout<CUnsignedChar>.stride).assumingMemoryBound(to: CUnsignedChar.self)

        guard let context = CGContext(data: rawData, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Big.rawValue) else {
            free(rawData) // avoid leaking the buffer on the early return
            return false
        }

        context.draw(imageRef, in: CGRect(x: 0, y: 0, width: width, height: height))

        var result = false
        for offset in stride(from: 0, to: size, by: 4) {
            let r = rawData[offset]
            let g = rawData[offset + 1]
            let b = rawData[offset + 2]

            if (r > 0 || g > 0 || b > 0) {
                result = true
                break
            }
        }

        free(rawData)

        return result
    }
}

Example usage:

if "😎".hasColor() {
    // Yes, it does
}
if "@".hasColor() {
} else {
    // No, it does not
}

Tracheotomy answered 15/1, 2013 at 0:10 Comment(0)

Swift's Unicode.Scalar.Properties type has an isEmoji property, which you can reach through the unicodeScalars of a String.

Best to check the documentation for the isEmojiPresentation caveat.

https://developer.apple.com/documentation/swift/unicode/scalar/properties/3081577-isemoji
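A minimal sketch of how these properties might be combined is shown below. The helper containsEmoji is hypothetical, and the rule it applies (default emoji presentation, or an emoji-capable scalar followed by variation selector-16) is one reasonable reading of the caveat, not an official recipe.

```swift
// Hypothetical helper built on Unicode.Scalar.Properties (Swift 5 and later).
func containsEmoji(_ text: String) -> Bool {
    let scalars = Array(text.unicodeScalars)
    for (index, scalar) in scalars.enumerated() {
        // Shown as an emoji by default (for example U+1F60E).
        if scalar.properties.isEmojiPresentation { return true }
        // Text-style by default, but followed by variation selector-16 (U+FE0F),
        // which requests emoji presentation (for example U+2764 U+FE0F).
        let next = index + 1
        if scalar.properties.isEmoji, next < scalars.count, scalars[next].value == 0xFE0F {
            return true
        }
    }
    return false
}

// The caveat: isEmoji alone is also true for digits, "#" and "*".
print("1".unicodeScalars.first!.properties.isEmoji)

print(containsEmoji("\u{1F60E}"))
print(containsEmoji("call 123 #"))
```

Checking isEmoji by itself would flag ordinary digits, which is why the sketch leans on isEmojiPresentation and the variation selector instead.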

Unless answered 27/11, 2019 at 8:30 Comment(2)

While this is explicitly an objective-c question, this will be useful for many visitors I'm sure, thank you. – Tullis 27/11, 2019 at 9:15

My pleasure, I was amazed to see those snapshot and check color answers also in Swift. In terms of Objc it really might be worth considering restructuring projects to run in Swift and import Objc code via a bridging header to access some of this functionality. – Unless 27/11, 2019 at 9:28

-3

Well, you can detect whether it only has ASCII characters using this:

[myString canBeConvertedToEncoding:NSASCIIStringEncoding];

It will return NO if the conversion fails (for example, when the string contains an emoji). Then you can use an if/else statement that does not allow the user to press Enter, or something similar.

Peeples answered 15/1, 2013 at 2:41 Comment(5)

I wouldn't do that. Basically any non-English user needs diacritics, and these aren't ASCII. That would cause a lot of false positives. – Malposition 15/1, 2013 at 2:45

@Malposition then what type of encoding contains diacritics. In the code, you can change NSASCIIStringEncoding to some other encoding you know. – Peeples 15/1, 2013 at 2:53

only Unicode has emojis, but at the same time, only Unicode has all the characters of all languages. There's no single encoding that has all characters except emojis. That's why I don't like this solution. – Malposition 15/1, 2013 at 3:1

@Malposition Maybe if you do something like an and statement using &&, and maybe have an or statement using ||, it might work! (ex: ([myString canBeConvertedToEncoding:NSASCIIStringEncoding] || [myString canBeConvertedToEncoding:NSUTF8StringEncoding])) – Peeples 15/1, 2013 at 3:9

All Unicode characters (including Emoji) can be converted to UTF-8, so this does not help. – Corridor 15/1, 2013 at 4:54
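The objections in the comments above are easy to reproduce. The sketch below uses Foundation's canBeConverted(to:), the Swift counterpart of canBeConvertedToEncoding:, on a few sample strings chosen for illustration.

```swift
import Foundation

let samples = ["hello", "caf\u{E9}", "\u{65E5}\u{672C}\u{8A9E}", "\u{1F60E}"]

for sample in samples {
    // ASCII accepts only 7-bit characters; UTF-8 accepts every Unicode string.
    let ascii = sample.canBeConverted(to: .ascii)
    let utf8 = sample.canBeConverted(to: .utf8)
    print(sample, "ascii:", ascii, "utf8:", utf8)
}
```

Only the first sample passes the ASCII test, so accented and non-Latin text would be rejected along with emoji, while the UTF-8 test accepts everything and therefore filters nothing.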

-4

An emoji character's length is 2, so check whether the string length is 2 in shouldChangeTextInRange:, which is called after each key press on the keyboard.

- (BOOL)textView:(UITextView *)textView shouldChangeTextInRange:(NSRange)range replacementText:(NSString *)text
{
    // Detect if an Emoji is in the string "text"
    if ([text length] == 2) {
        // Show a UIAlertView, or whatever you want here
        return NO; // reject the change
    }
    else {
        return YES; // allow the change
    }
}

Folacin answered 28/1, 2015 at 15:19 Comment(1)

No, not all emoji characters are length 2, and also there are MANY unicode characters with length 2 that will give false-positives for this. – Tullis 28/1, 2015 at 20:32
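The point made in the comment above can be verified by counting UTF-16 code units, which is what NSString's length reports. This is a sketch in Swift; the helper utf16Length and the sample strings are chosen for illustration only.

```swift
// Hypothetical helper: the value NSString's length would report for this text.
func utf16Length(_ text: String) -> Int {
    return text.utf16.count
}

let samples: [(label: String, text: String)] = [
    ("sunglasses emoji U+1F60E", "\u{1F60E}"),
    ("smiling face U+263A", "\u{263A}"),
    ("flag made of two regional indicators", "\u{1F1FA}\u{1F1F8}"),
    ("mathematical script A U+1D49C (not an emoji)", "\u{1D49C}"),
    ("two ASCII letters", "ok"),
]

for sample in samples {
    print(sample.label, utf16Length(sample.text))
}
```

Some emoji occupy one unit and some four, while a non-emoji symbol and a pasted two-letter string both occupy exactly two, so a length check misfires in both directions.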
