codepoint Questions

3

Solved

What is the difference between String.prototype.codePointAt() and String.prototype.charCodeAt() in JavaScript?
'A'.codePointAt(); // 65
'A'.charCodeAt(); // 65
Interlanguage asked 10/4, 2016 at 8:40

3

Various programming languages use a 2-byte char datatype (not to be confused with C/C++'s char, which is just one byte) out of which strings are constructed. Various utility functions will try to f...
Brandi asked 15/10, 2020 at 18:18

3

Solved

The hex string '\xd3' can also be represented as: Ó. The easiest way I've found to get the character representation of the hex string to the console is: print unichr(ord('\xd3')) Or in English...
Chavira asked 9/8, 2011 at 16:44

4

Solved

Recently I ran into codePointAt method of String in Java. I found also a few other codePoint methods: codePointBefore, codePointCount etc. They definitely have something to do with Unicode but I do...
Liegnitz asked 5/9, 2012 at 11:51

4

Solved

I'm looking for sample 1-byte, 2-byte, 3-byte, 4-byte, 5-byte, and 6-byte unicode characters. Any links to some sort of reference of all the different unicode characters out there and how big they ...
Polemoniaceous asked 19/5, 2011 at 18:23
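As a rough illustration of the question above, here is a minimal Python 3 sketch that measures how many bytes a few sample characters occupy in UTF-8. The sample characters and the helper name `utf8_length` are chosen here for illustration and do not come from the question. Note that UTF-8 as currently standardized (RFC 3629) encodes a code point in at most 4 bytes, so there are no valid 5-byte or 6-byte characters to sample.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))

# Illustrative samples: one character for each valid UTF-8 length.
samples = {
    "A": "U+0041 LATIN CAPITAL LETTER A",
    "\u00e9": "U+00E9 LATIN SMALL LETTER E WITH ACUTE",
    "\u20ac": "U+20AC EURO SIGN",
    "\U0001d11e": "U+1D11E MUSICAL SYMBOL G CLEF",
}

for ch, label in samples.items():
    print(label, "->", utf8_length(ch), "byte(s)")
```

The boundaries fall at U+007F, U+07FF, and U+FFFF: code points up to each boundary fit in 1, 2, and 3 bytes respectively, and everything above U+FFFF needs 4.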

1

Solved

Unicode categorizes characters as belonging to a script, such as the Latin script. How do I test whether a particular character (code point) is in a particular script?
Distiller asked 30/5, 2020 at 23:18

2

Solved

When creating an emoji font, is any sequence of ZERO WIDTH JOINER valid? For instance: can I use 🏳‍★‍🟩 (Waving White Flag + zwj + Black Star + zwj + Green Square) to represent a white flag with...
Gold asked 1/5, 2020 at 8:0

4

Solved

Splitting a JavaScript string into "characters" can be done trivially but there are problems if you care about Unicode (and you should care about Unicode). JavaScript natively treats characters as...
Alleged asked 28/1, 2014 at 5:9

5

Solved

In Python API, is there a way to extract the unicode code point of a single character? Edit: In case it matters, I'm using Python 2.7.
Soluk asked 3/9, 2011 at 4:12
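One common way to do this is the built-in `ord()`. The sketch below is written for Python 3, and the wrapper name `code_point` is hypothetical. In Python 2.7, `ord()` also accepts a one-character `unicode` object, but on narrow builds a character outside the BMP is stored as two surrogate code units, so this simple approach may not apply there.

```python
def code_point(ch: str) -> int:
    """Return the Unicode code point of a one-character string."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return ord(ch)

print(code_point("A"))            # 65
print(hex(code_point("\u00d3")))  # 0xd3
```

The inverse operation is `chr()` in Python 3 (`unichr()` in Python 2).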

1

Solved

I'm just starting Kotlin so I'm sure there is an easy way to do this but I don't see it. I want to split a string into single-length substrings using codepoints. In Java 8, this works: public class UtfS...
Slap asked 16/12, 2018 at 3:12

1

Solved

Why is the maximum Unicode code point restricted to 0x10FFFF? Is it possible to represent Unicode above this code point - for e.g. 0x10FFFF + 0x000001 = 0x110000 - through any encoding schemes like...
Filip asked 6/9, 2018 at 11:43
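The limit follows from UTF-16: a surrogate pair combines one of 1024 high surrogates with one of 1024 low surrogates, which covers exactly the code points from 0x10000 through 0x10FFFF. A minimal Python 3 sketch of that arithmetic follows; the function and constant names are chosen here for illustration.

```python
HIGH_START, HIGH_END = 0xD800, 0xDBFF  # high (leading) surrogate range
LOW_START, LOW_END = 0xDC00, 0xDFFF    # low (trailing) surrogate range

def decode_surrogate_pair(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into a single code point."""
    if not (HIGH_START <= high <= HIGH_END):
        raise ValueError("not a high surrogate")
    if not (LOW_START <= low <= LOW_END):
        raise ValueError("not a low surrogate")
    # Each surrogate contributes 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)

# The largest pair yields the largest code point UTF-16 can express.
print(hex(decode_surrogate_pair(HIGH_END, LOW_END)))  # 0x10ffff
```

Since no surrogate pair maps to 0x110000 or beyond, UTF-16 cannot represent such values, and the other Unicode encoding forms are restricted to the same range.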

3

Solved

I have an application that is supposed to deal with all kinds of characters and at some point display information about them. I use Qt and its inherent Unicode support in QChar, QString etc. Now I...
Element asked 7/8, 2011 at 12:41

2

Solved

It appears that the red heart emoji (❤️) "\u2764\uFE0F" requires two Unicode codepoints, specifically Heavy Black Heart followed by a Variation Selector. However, blue 💙, green 💚, yellow 💛, and ...
Retrenchment asked 8/3, 2017 at 19:0

3

Solved

I have read many articles in order to know what is the maximum number of the Unicode code points, but I did not find a final answer. I understood that the Unicode code points were minimized to mak...
Colorist asked 11/12, 2014 at 5:26

4

Solved

I'm trying to output unicode string into RTF format. (using c# and winforms) From wikipedia: If a Unicode escape is required, the control word \u is used, followed by a 16-bit signed decimal in...
Garfield asked 2/9, 2009 at 14:23

2

Solved

In C++, it's possible to create a UTF-8 string using this kind of notation: "\uD840\uDC50". However this doesn't work in PHP. Is there a similar notation? If not, is there any built-in way to create...
Pluton asked 19/4, 2013 at 6:44

2

Solved

I am trying to compare characters to see if they match. I can't figure out why it doesn't work. I'm expecting true on the output, but I'm getting false. character: "a" word: "aardvark" (first wor...
Pottle asked 31/1, 2014 at 23:15

4

Solved

For example:
my $str = '中國c'; # Chinese language of china
I want to print out the numeric values 20013,22283,99
Fourinhand asked 22/8, 2010 at 17:19

1

Solved

In C++ there is a way to cast a char to int and get the ascii value in return. Is there such a way to do the same with a qchar? Since unicode supports so many characters and some of them are actual...
Albric asked 21/8, 2013 at 17:56

3

Solved

In your experience which Unicode characters, codepoints, ranges outside the BMP (Basic Multilingual Plane) are the most common so far? These are the ones which require 4 bytes in UTF-8 or sur...
Bowery asked 6/4, 2011 at 13:36

2

Solved

The G-Clef (U+1D11E) is not part of the Basic Multilingual Plane (BMP), which means that it requires more than 16 bits. Almost all of Java's read functions return only a char or an int containing als...
Ostracod asked 28/6, 2013 at 9:14

2

Solved

Reading the Wikipedia article on UTF-8, I've been wondering about the term overlong. This term is used various times but the article doesn't provide a definition or reference for its meaning. I wo...
Tabathatabb asked 18/8, 2011 at 19:37
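For context, an overlong encoding is a UTF-8 byte sequence that uses more bytes than the shortest form needed for its code point, for example encoding "/" (U+002F) as the two bytes C0 AF instead of the single byte 2F. Conforming decoders must reject such sequences. The Python 3 sketch below relies on the built-in strict UTF-8 decoder; the helper name `is_valid_utf8` is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict, well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(is_valid_utf8(b"\x2f"))          # True: shortest form of "/"
print(is_valid_utf8(b"\xc0\xaf"))      # False: overlong 2-byte form of "/"
print(is_valid_utf8(b"\xe0\x80\xaf"))  # False: overlong 3-byte form of "/"
```

Rejecting overlong forms matters for security, since otherwise a filter that looks for the byte 2F could be bypassed with an alternative encoding of the same character.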

2

Solved

It seems that SQL Server uses Unicode UCS-2, a 2-byte fixed-length character encoding, for nchar/nvarchar fields. Meanwhile, C# uses Unicode UTF-16 encoding for its strings (note: Some people don't...
Satirical asked 13/4, 2011 at 20:36

2

I need to find out the names for Unicode characters when the user enters the number for it. An example would be to enter 0041 and get given "Latin Capital Letter A" as the result.
Verleneverlie asked 26/9, 2010 at 16:53
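The question does not say which language is in use. As one possible approach, Python's standard `unicodedata` module can look up a character name from a hexadecimal code point. This is a minimal sketch; the function name `name_for_hex` and the fallback text are assumptions made here.

```python
import unicodedata

def name_for_hex(hex_code: str) -> str:
    """Return the Unicode name for a hexadecimal code point such as '0041'."""
    ch = chr(int(hex_code, 16))
    # Unassigned code points and most control characters have no name,
    # so a fallback string is returned instead of raising ValueError.
    return unicodedata.name(ch, "<no name>")

print(name_for_hex("0041"))  # LATIN CAPITAL LETTER A
```

The names come from the Unicode Character Database version bundled with the Python interpreter, so very recently added characters may be missing on older versions.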

2

Solved

I added an answer to this question here: Sorting List<String> in C# which calls for a natural sort order, one that handles embedded numbers. My implementation, however, is naive, and in lieu...
Affront asked 15/9, 2010 at 11:26
