ICU Layout sample renders text differently than Microsoft Notepad and Word

I have the following bidirectional text:

1002   -ابو ماجد الانصاري

Most editors (Notepad++, Notepad, etc.) show the text as it appears here. But when I process this text through ICU, the number is shifted to the right, followed by the spaces and the hyphen, and then the Arabic text. ICU's sample application layout.exe also shows the number on the right. I have modified paragraphlayout.cpp and tried every available reordering mode, but the result is still the same:

See Problem Text here

Can someone help me configure ICU to produce the same output as other display engines do?

Bookcase answered 16/11, 2017 at 10:14 Comment(4)
Why both the C and C++ tags? You will get down votes for that. Keep it more specific to one language. - Fink
How are you processing the text? Can you give a code sample of what you're doing? - Luxurious
@Luxurious Layout.exe is a sample application included in ICU's samples. It also shows the number on the right. It can be downloaded from the ICU website. - Bookcase
Mixing Western punctuation and LTR text with Arabic RTL text is never not a problem. You can't expect the text renderer to always get that right, whatever version of "right" you favor. Sometimes "left" is better :) Be sure to include hints so the renderer knows what you want, U+200E and U+200F. - Divulgate

If I understand correctly, your text 'begins' with the number, which is followed by the hyphen and the Arabic text. Notepad and other editors let you choose the 'writing direction'. If you choose right-to-left, you get the same result as in your screenshot:

RTL screenshot

If you want to maintain the left-to-right writing direction, you can set it explicitly:

ubidi_setPara(para, u"1002   -ابو ماجد الانصاري", 25, UBIDI_LTR, NULL, pErrorCode);

or you can put the Unicode control character U+202A (LEFT-TO-RIGHT EMBEDDING) at the start of your string, which enforces this direction. If your code is in C++, you can write something like:

icu::UnicodeString string_to_layout(u"\u202A");  // U+202A LEFT-TO-RIGHT EMBEDDING (UTF-16 literal, C++11)
string_to_layout += icu::UnicodeString(u"1002   -ابو ماجد الانصاري");

and now you can use string_to_layout as the input parameter for renderParagraph() (see http://icu-project.org/apiref/icu4c-latest/ubidi_8h.htm).
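
If the string is later concatenated with other text, it may be worth closing the embedding again with U+202C (POP DIRECTIONAL FORMATTING). The following is only a sketch of that idea using plain std::u16string, without any ICU calls; the helper name wrapInLtrEmbedding is hypothetical and not part of ICU.

```cpp
#include <string>

// Unicode directional formatting characters (code points from the Unicode Standard).
constexpr char16_t kLRE = u'\u202A';  // LEFT-TO-RIGHT EMBEDDING
constexpr char16_t kPDF = u'\u202C';  // POP DIRECTIONAL FORMATTING

// Hypothetical helper (not an ICU function): returns a copy of `text`
// wrapped as LRE + text + PDF, so the embedding ends where the text ends.
std::u16string wrapInLtrEmbedding(const std::u16string& text) {
    std::u16string out;
    out.reserve(text.size() + 2);
    out.push_back(kLRE);
    out += text;
    out.push_back(kPDF);
    return out;
}
```

Assuming an ICU version where UChar is char16_t, the result could then be handed to ICU through the icu::UnicodeString constructor that takes a pointer and a length, e.g. icu::UnicodeString(wrapped.data(), static_cast<int32_t>(wrapped.size())).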

Aorist answered 16/11, 2017 at 11:24 Comment(1)
Thank you @Alex Cohn, this worked. I was using UBIDI_DEFAULT_LTR. - Bookcase
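
For context on why UBIDI_DEFAULT_LTR gave the right-to-left layout: ICU documents that constant as taking the base direction from the first strong directional character, falling back to left-to-right only when there is none. Digits, spaces and the hyphen are not strong characters, so the first strong character in the question's text is the Arabic letter. The sketch below imitates that rule with a deliberately crude classifier; the names strongDirection and firstStrongParagraphLevel are made up for illustration, and the character ranges are a simplifying assumption rather than the real Bidi_Class data that ICU uses.

```cpp
#include <string>

enum class Strong { None, LTR, RTL };

// Crude stand-in for the Unicode Bidi_Class property (assumption: only
// Basic Latin letters, Hebrew letters and Arabic letters count as strong).
Strong strongDirection(char32_t c) {
    if ((c >= U'A' && c <= U'Z') || (c >= U'a' && c <= U'z')) return Strong::LTR;
    if (c >= 0x05D0 && c <= 0x05EA) return Strong::RTL;  // Hebrew letters
    if (c >= 0x0620 && c <= 0x064A) return Strong::RTL;  // Arabic letters
    return Strong::None;  // digits, spaces, punctuation, anything unclassified
}

// Returns 0 (left-to-right) or 1 (right-to-left) based on the first strong
// character in `text`; returns `defaultLevel` if no strong character is found.
int firstStrongParagraphLevel(const std::u32string& text, int defaultLevel) {
    for (char32_t c : text) {
        switch (strongDirection(c)) {
            case Strong::LTR: return 0;
            case Strong::RTL: return 1;
            case Strong::None: break;  // keep scanning
        }
    }
    return defaultLevel;
}
```

Under these assumptions, the text from the question yields level 1 even with a default of 0, because the scan skips "1002", the spaces and the hyphen before it reaches the first Arabic letter. Passing an explicit left-to-right level to ubidi_setPara, as in the answer, bypasses this detection.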
