unicode-normalization Questions
3
The Unicode Normalization FAQ includes the following paragraph:
Programs should always compare canonical-equivalent Unicode strings as equal ... The Unicode Standard provides well-defined normal...
Coverlet asked 13/4, 2013 at 8:37
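A minimal sketch of what "compare canonical-equivalent strings as equal" can look like in practice, assuming Python's standard `unicodedata` module. The helper name `canonically_equal` is made up for illustration; it is not part of the FAQ.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Return True if a and b are canonically equivalent.

    Both strings are brought to NFD first. NFC would work equally well,
    as long as the same form is used on both sides.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


precomposed = "\u00e5"  # å as a single code point
decomposed = "a\u030a"  # a followed by COMBINING RING ABOVE

print(precomposed == decomposed)                   # code point comparison
print(canonically_equal(precomposed, decomposed))  # comparison after normalization
```

Normalizing also puts combining marks into canonical order, so sequences that differ only in the order of marks with different combining classes compare as equal too.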
2
If I accept full Unicode for passwords, how should I normalize the string before passing it to the hash function?
Goals
Without normalization, if someone sets their password to "mañana" (ma\u00F1...
Groningen asked 23/4, 2013 at 15:26
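One possible approach, sketched in Python: normalize the password to a single form and encode it as UTF-8 before hashing. The choice of NFKC here is an assumption, not something the question settles, and `prepare_password` and `hash_password` are hypothetical helper names. PBKDF2 stands in for whatever password hash is actually used.

```python
import hashlib
import unicodedata


def prepare_password(password: str) -> bytes:
    # Assumption: NFKC is the chosen form. Any form works as long as
    # the same one is applied at registration and at login.
    return unicodedata.normalize("NFKC", password).encode("utf-8")


def hash_password(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    # PBKDF2-HMAC-SHA256 from the standard library, used here only as
    # an example of a hash function that takes bytes.
    return hashlib.pbkdf2_hmac("sha256", prepare_password(password), salt, iterations)
```

With this in place, "ma\u00f1ana" and "man\u0303ana" produce the same hash for the same salt, because both normalize to the same code point sequence before encoding.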
2
Solved
I am trying to use the normalizer_normalize() function introduced in PHP 5.3 (says the doc), however I can't use it:
$ php -r 'echo normalizer_normalize("tést");'
PHP Fatal error: Call to undefine...
Gama asked 21/1, 2012 at 0:32
5
Solved
I want to compare two strings in JavaScript that are the same, and yet the equality operator == returns false. One string contains a special character (e.g. the Danish å).
JavaScript code:
var fil...
Nomanomad asked 29/5, 2012 at 19:50
5
Solved
I'm trying to download some content from a dictionary site like http://dictionary.reference.com/browse/apple?s=t
The problem I'm having is that the original paragraph has all those squiggly lines,...
Remembrance asked 2/1, 2013 at 7:28
1
This question is related to text editing. Say you have a piece of text in normalization form NFC, and a cursor that points to an extended grapheme cluster boundary within this text. You want to ins...
Udine asked 18/3, 2021 at 14:49
4
Solved
I need to delete accents from characters in Spanish and other languages from different datasets.
I already wrote a function based on the code provided in this post that removes the accents...
Freestyle asked 13/7, 2016 at 18:49
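A common way to do this, sketched in Python with the standard `unicodedata` module: decompose to NFD, drop the nonspacing marks, and recompose. This is not the function from the post the question refers to; `strip_accents` is a name chosen for this example.

```python
import unicodedata


def strip_accents(text: str) -> str:
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (Mark, nonspacing) covers the combining accents.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # Recompose whatever is left so the result is in NFC again.
    return unicodedata.normalize("NFC", stripped)


print(strip_accents("ma\u00f1ana canci\u00f3n"))
```

Note that this also turns ñ into n, which may or may not be wanted for Spanish, and that letters without a canonical decomposition (such as Œ or ø) pass through unchanged.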
4
Solved
E.g., for the character "a", I want to get a string (list of chars) like "aàáâãäåāăą" (not sure if that example list is complete...) (basically all unicode chars with names "Latin Small Letter A wi...
Mantelletta asked 23/7, 2019 at 17:38
1
Solved
Hi, I'm hoping this is a simple issue. I am loading some simple data via an API; however, some users have made their usernames in fancy fonts, as below.
𝓦𝓮𝓫 𝓡𝓮𝓹𝓸𝓼𝓽𝓼
How do I convert thi...
Telemetry asked 9/3, 2019 at 20:11
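For usernames like the one above, compatibility normalization is one option. The sketch below assumes Python and uses NFKC, which maps the mathematical alphanumeric symbols to their plain Latin letters; `plain_text` is a hypothetical helper name.

```python
import unicodedata


def plain_text(name: str) -> str:
    # NFKC applies compatibility mappings, so styled variants such as
    # MATHEMATICAL BOLD SCRIPT letters fold to ordinary letters.
    return unicodedata.normalize("NFKC", name)


fancy = "\U0001D4E6\U0001D4EE\U0001D4EB"  # "Web" in mathematical bold script
print(plain_text(fancy))
```

This only helps for characters that have a compatibility decomposition. Look-alike letters borrowed from other scripts are left as they are.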
1
Solved
Are there JavaScript polyfill implementations of String.toLowerCase() and String.toUpperCase(), or other methods in JavaScript that can work with Unicode characters and are consistent across browse...
Osana asked 26/11, 2018 at 19:48
2
Solved
I'm working on a program that deals with Korean sentences and I need a way to break down a syllable, or block, into its letters. For those who don't know Hangul, a syllable is composed of 2-4 lette...
Dialysis asked 24/12, 2016 at 0:54
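Canonical decomposition already does this for precomposed Hangul syllables. A short Python sketch, assuming the standard `unicodedata` module (`decompose_hangul` is a name made up for this example):

```python
import unicodedata


def decompose_hangul(text: str) -> list[str]:
    # NFD splits each precomposed syllable block into its conjoining jamo.
    return list(unicodedata.normalize("NFD", text))


jamo = decompose_hangul("\ud55c")  # the syllable 한
print([f"U+{ord(j):04X}" for j in jamo])

# NFC puts the jamo back together into the original syllable block.
print(unicodedata.normalize("NFC", "".join(jamo)) == "\ud55c")
```

The result uses conjoining jamo from the U+1100 block, not the compatibility jamo (U+3131 and up) that keyboards usually produce, so a further mapping may be needed depending on the use case.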
1
Solved
Java Normalize already allows me to take accented characters and output non-accented characters. It does not, however, seem to deal with composite characters (Œ, Æ) very well at all.
Is there a wa...
Andean asked 22/1, 2018 at 15:32
3
Solved
In Ruby, JavaScript and Java (I didn't try others), the Cyrillic characters Я̆ Я̄ Я̈ each have length 2. When I check the length of a string containing these characters, I get the wrong value.
"Я̈".mb_chars.leng...
Orison asked 15/1, 2018 at 22:57
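These characters are a base letter plus a combining mark, and there is no precomposed form for them, so normalization alone does not bring the length down to 1. A rough Python sketch that counts only characters with combining class 0 is shown below. It is an approximation, not full grapheme cluster segmentation as defined in UAX #29, and `approx_char_count` is a hypothetical name.

```python
import unicodedata


def approx_char_count(text: str) -> int:
    # Count code points that start a new "character"; combining marks
    # (nonzero canonical combining class) attach to the previous one.
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


ya_diaeresis = "\u042f\u0308"  # Я followed by COMBINING DIAERESIS
print(len(ya_diaeresis), approx_char_count(ya_diaeresis))
```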
4
Solved
I'm under the impression that the JavaScript interpreter assumes that the source code it is interpreting has already been normalized. What, exactly, does the normalizing? It can't be the text editor, ot...
Sextain asked 14/10, 2011 at 19:23
1
Solved
Elsewhere I've seen it told that Swift's comparisons use NFD normalization.
However, running in the iSwift playground I've found that
print("\u{0071}\u{0307}\u{0323}" == "\u{0071}\u{0323}\u{0307}...
Lancelancelet asked 31/1, 2016 at 15:44
7
Solved
Consider this simple code:
echo iconv('UTF-8', 'ASCII//TRANSLIT', 'è');
it prints
`e
instead of just
e
Do you know what I am doing wrong?
Nothing changed after adding setlocale
setlo...
Samy asked 6/2, 2011 at 0:14
1
Solved
I am working with a list of file names in Java.
I observe that some single characters in the file names, like a, ö and ü actually consist of a sequence you could describe as two single ASCII chars fol...
Ergonomics asked 4/11, 2015 at 10:34
3
Solved
When rendering the following Unicode text in HTML, it turns out that the browser (Google Chrome) does some form of Unicode normalization when posting the data back to the server. (Probably in Form C)...
Noontide asked 24/6, 2012 at 10:12
5
Solved
I am wondering how to normalize strings (containing UTF-8/UTF-16) in C/C++.
In .NET there is a function String.Normalize .
I used UTF8-CPP in the past but it does not provide such a function.
ICU a...
Raber asked 3/2, 2011 at 10:18
2
Solved
I want to match the lower case of "I" of English (i) to lower case of "İ" of Turkish (i). They are the same glyph but they don't match. When I do System.out.println("İ".toLowerCase()); the characte...
Lysander asked 9/6, 2015 at 6:45
5
Solved
Are there any standalone-ish solutions for normalizing international Unicode text to safe ids and filenames in Python?
E.g. turn My International Text: åäö to my-international-text-aao
plone.i18n...
Waldron asked 28/1, 2012 at 2:46
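A standard-library-only sketch of one way to get such ids, assuming that dropping everything without an ASCII equivalent is acceptable. `slugify` is a name chosen for this example and is not the plone.i18n API.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    # NFKD separates base letters from accents; encoding to ASCII with
    # errors="ignore" then drops the accents and any other non-ASCII character.
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


print(slugify("My International Text: \u00e5\u00e4\u00f6"))
```

Characters with no decomposition to ASCII (for example ø, ß, or any non-Latin script) are silently removed by this approach, so a transliteration table would be needed for broader coverage.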
1
I have run into what is, to me, some serious weirdness with string behavior in Firefox when using the .normalize() Unicode normalization function.
Here is a demo, view the console in Firefox to se...
Bin asked 19/3, 2015 at 19:39
3
We upgraded our security scanner recently, and it's reporting a new issue.
What's the recommended fix? (We happen to be on ACF9.)
(Also, if you have an example exploit geared to CF, I'd appreciat...
Moye asked 17/6, 2013 at 21:27
2
Solved
In .NET you can normalize (NFC, NFD, NFKC, NFKD) strings with String.Normalize() and there is a Text.NormalizationForm enum.
In .NET for Windows Store Apps, both are not available. I have looked i...
Campman asked 8/2, 2013 at 12:59
1
Solved
I am trying to insert spaces into a string of IPA characters, e.g. to turn ɔ̃wɔ̃tɨ into ɔ̃ w ɔ̃ t ɨ. Using split/join was my first thought:
s = ɔ̃w̃ɔtɨ
s.split('').join(' ') #=> ̃ ɔ w ̃ ɔ p t ɨ...
Puerility asked 26/5, 2014 at 15:51
© 2022 - 2024 — McMap. All rights reserved.