I have a text box implementation that uses Pango. If I enter a string that starts with a word in a right-to-left script, followed by a space, followed by a word in a left-to-right script, Pango's word wrapping breaks (I am using PANGO_WRAP_WORD_CHAR). For the string العربية ENGLISH I get the following:
If I add the Unicode character U+200F (RIGHT-TO-LEFT MARK) after the space, then I get the expected word wrapping:
Also, if I replace the Arabic script above with Hindi (which is left-to-right, like the English next to it), I still get the problem, so it doesn't seem to be strictly a left-to-right versus right-to-left issue. In the Hindi case, a hack that inserts U+200E (LEFT-TO-RIGHT MARK) after the space resolves the problem.
Is this a bug in Pango? Are there workarounds I can try that are generic enough to fix the problem without breaking other cases? My current workaround inserts either U+200E or U+200F after every space, based on the direction of the previous strongly directional character in the string, but I'm not sure whether there are strings for which this will cause problems.
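The workaround described above can be sketched roughly as follows. This is an illustration in Python, not the actual text box code: the function name insert_direction_marks is hypothetical, and it assumes that "strongly directed" means the Unicode bidi classes L (left-to-right) and R or AL (right-to-left) as reported by the standard unicodedata module. A space that appears before any strong character is left alone.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def insert_direction_marks(text):
    """Hypothetical helper: after each space, insert LRM or RLM to match
    the direction of the most recent strongly directional character."""
    out = []
    last_mark = None  # no strong character seen yet
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            last_mark = LRM
        elif bidi in ("R", "AL"):
            last_mark = RLM
        out.append(ch)
        # Only plain U+0020 spaces are handled; other whitespace is untouched.
        if ch == " " and last_mark is not None:
            out.append(last_mark)
    return "".join(out)
```

The returned string would then be handed to the layout in place of the original text. Note that the inserted marks are real characters in the buffer, so cursor movement, selection, and any text read back from the widget would need to account for them.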
Update: I was able to reproduce this problem on Ubuntu 12.04 with gedit (with the Enable text wrapping and Do not split words over two lines settings enabled). I simply typed Hello world over and over until it wrapped several times, then replaced all instances of world with पहुंचगया, and everything collapsed to a single line.