Arabic Problem Replace أً with just ا

How can I replace an alef with tanween (أً) with a plain alef (ا)?

Serendipity answered 13/1, 2011 at 16:7 Comment(2)
Any reason for wanting to use a regex for this? – Pu
You might want to supply some additional contextual information, such as how you're storing the string, etc. – Czarism

Thanks to Bolo's explanation, after a couple of minutes of searching I did it like this:

        // Copy every character except ARABIC FATHATAN (U+064B).
        var sb = new System.Text.StringBuilder();
        foreach (char c in x)
        {
            if (c != '\u064B')
                sb.Append(c);
        }
        string s = sb.ToString();

where x is my string

That way I excluded ARABIC FATHATAN (U+064B) from the string.
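
For comparison, here is a minimal sketch of the same idea using string.Replace, which in .NET does an ordinal comparison when given two strings. The class and method names (AlefFixer, RemoveFathatan, ReplaceAlefHamzaFathatan) are made up for illustration and do not come from the answer. The second method is an assumption about what the question title asks for: it maps the sequence U+0623 U+064B directly to U+0627, whereas the loop above only drops U+064B and leaves ALEF WITH HAMZA ABOVE in place.

```csharp
using System;

public static class AlefFixer
{
    // Same effect as the loop above: drop every ARABIC FATHATAN (U+064B).
    public static string RemoveFathatan(string input)
    {
        return input.Replace("\u064B", string.Empty);
    }

    // Assumption: the goal is to turn "alef with hamza above + fathatan"
    // (U+0623 U+064B) into a plain alef (U+0627), as the title suggests.
    public static string ReplaceAlefHamzaFathatan(string input)
    {
        return input.Replace("\u0623\u064B", "\u0627");
    }
}
```

Calling AlefFixer.RemoveFathatan("\u0623\u064B") leaves "\u0623", while AlefFixer.ReplaceAlefHamzaFathatan("\u0623\u064B") gives "\u0627".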

Serendipity answered 16/1, 2011 at 8:52 Comment(0)

I don't know C#, but this is more of a Unicode question. I would do it by means of Unicode normalization, using this function.

First, normalize to decomposed form. Next, filter out all characters from the "Mark, Nonspacing" category [Mn]. Finally, normalize back to composed form.

If I see correctly, your glyph is represented in Unicode by ARABIC LETTER ALEF WITH HAMZA ABOVE (U+0623, [Lo]) followed by ARABIC FATHATAN (U+064B, [Mn]). The first character decomposes to ARABIC LETTER ALEF (U+0627, [Lo]) + ARABIC HAMZA ABOVE (U+0654, [Mn]).

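Here's the chain of transformations (the first arrow is the decomposition, the second filters out nonspacing marks, the third is the composition). Note that canonical ordering places U+064B (combining class 27) before U+0654 (combining class 230) in the decomposed form:

U+0623 + U+064B → U+0627 + U+064B + U+0654 → U+0627 → U+0627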

After you decompose, remove all the characters from the [Mn] category, and compose back, you're left with ARABIC LETTER ALEF only.
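
Since the question is about C#, the three steps could be sketched roughly as follows. This is only an illustration under some assumptions: the class and method names (DiacriticStripper, RemoveNonSpacingMarks) are invented, and the code uses string.Normalize and CharUnicodeInfo.GetUnicodeCategory from the .NET base library, which may or may not be the function referred to above.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class DiacriticStripper
{
    // Hypothetical helper: decompose, drop [Mn] characters, recompose.
    public static string RemoveNonSpacingMarks(string input)
    {
        // Step 1: normalize to the canonical decomposed form (NFD).
        string decomposed = input.Normalize(NormalizationForm.FormD);

        // Step 2: keep everything that is not in the "Mark, Nonspacing" category.
        // This iterates over UTF-16 code units, so marks outside the Basic
        // Multilingual Plane (encoded as surrogate pairs) are not removed.
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }

        // Step 3: normalize back to the canonical composed form (NFC).
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

With this sketch, DiacriticStripper.RemoveNonSpacingMarks("\u0623\u064B") returns "\u0627". Keep in mind that it strips every nonspacing mark, so other vowel marks and any hamza above are removed as well, not just the fathatan.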

Rann answered 13/1, 2011 at 16:27 Comment(1)
I use this method to remove diacritics from texts written in the Latin alphabet. I then need to handle a couple of exceptions, like Ł, but the described method covers most of the cases. – Rann

Take a look at this project which provides examples of how to replace unicode characters in strings: http://www.codeproject.com/KB/string/FontGlyphSet.aspx

Vinosity answered 13/1, 2011 at 16:15 Comment(0)