Converting windows-1255 to UTF-8 in PHP 5

I have a page on my website that gets its main content from an old mainframe. The content encoding from the mainframe is windows-1255 (Hebrew). My website's encoding is UTF-8.

At first I used an iframe to display the response from the mainframe. With that solution I had no problem setting the page encoding and the characters displayed fine, but I had trouble styling the page responsively (my whole website is responsive).

Then I tried fetching the content with file_get_contents and inserting it in the right place, but all the characters looked like this: ����� ��. I then converted the content:

iconv("cp1255","UTF-8",file_get_contents("my_url"));

The result of that was reversed Hebrew. For example the word "nice" appears as "ecin". The content also includes HTML tags, not only Hebrew text, so I can't simply reverse the text with hebrev.

I saw that in PHP 4 the function fribidi_log2vis exists, which seems to solve my problem, but it's not supported in PHP 5 (I'm working with PHP 5.3.3).

Is there a better way to handle this than loading the content into an iframe?

UPDATE

I tried fetching a test file that I created (encoded as windows-1255) and my original code works OK. I suspect that the content I'm getting is not windows-1255, at least not in terms of the order of the Hebrew letters. The conversion on the mainframe might be the cause. I'll have to look into that (I have to wait until Sunday because I don't have direct access to the server).
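One way to check the suspicion in the update is to look at the raw bytes before any conversion. The sketch below is only an illustration: `hex_preview` is a hypothetical helper, not part of the original code, and 'my_url' is the placeholder from the question. In windows-1255 the Hebrew letters alef to tav occupy bytes 0xE0 to 0xFA, so comparing the byte order of a known word against what the mainframe sends should show whether the letters arrive in logical (typing) order or already reversed.

```php
<?php
// Hypothetical helper: returns the first $limit bytes of a string as
// space-separated hex pairs, so the byte order can be inspected by eye.
function hex_preview($bytes, $limit = 32)
{
    $hex = bin2hex(substr($bytes, 0, $limit));
    return trim(chunk_split($hex, 2, ' '));
}

// Usage (assumption: 'my_url' stands for the real mainframe URL):
// $raw = file_get_contents('my_url');
// echo hex_preview($raw), "\n";
```

For example, the word shalom typed in logical order is the byte sequence f9 ec e5 ed in windows-1255; if the same word arrives as ed e5 ec f9, the source is sending the letters already reversed.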

Obliterate answered 3/1, 2014 at 15:44 Comment(7)
Have you tried mb_convert_encoding? - Andromache
@Andromache mb_convert_encoding also results in reversed text. - Obliterate
I know nothing about Hebrew, but it seems you've converted to UTF-8 quite successfully; perhaps you just need to tweak your HTML markup to inform the browser that such text must be displayed as RTL. - Carving
@ÁlvaroG.Vicario I set the page to RTL. The rest of the UTF-8 text in Hebrew, like my menu text, is displayed OK, but the converted text is reversed. - Obliterate
BTW, fribidi_log2vis() is supported in PHP 5; it's just not bundled with PHP any more. See the PECL page for further details and even Windows downloads. - Carving
First, you can cheat and reverse only the Hebrew substrings within the resulting string with some preg_replace_callback. Second, it appears that either the content coming from the mainframe is not cp1255, or the content contains some bidi symbols, which control the text direction. Anyhow, it's hard to tell from here, but if you could upload an example of the file content we might be able to help further. - Teleology
Try this: #20601343 - Woo
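The preg_replace_callback idea from the comments could be sketched roughly as follows. This is a minimal illustration, not a full bidi implementation: `reverse_hebrew_runs` is a hypothetical name, the input is assumed to be UTF-8 already (that is, after iconv), and only the basic letters U+05D0 to U+05EA plus the spaces between them are treated as one run. Punctuation, digits, and vowel points inside Hebrew text are not handled.

```php
<?php
// Hypothetical helper: reverses each run of Hebrew letters (and the
// spaces between Hebrew words) in a UTF-8 string. HTML tags and Latin
// text never match the pattern, so they are left untouched.
function reverse_hebrew_runs($utf8)
{
    // U+05D0..U+05EA are the Hebrew letters alef..tav.
    $pattern = '/[\x{05D0}-\x{05EA}]+(?: +[\x{05D0}-\x{05EA}]+)*/u';

    return preg_replace_callback($pattern, function ($m) {
        // Split into code points rather than bytes, so the two-byte
        // Hebrew letters are not torn apart when reversed.
        $chars = preg_split('//u', $m[0], -1, PREG_SPLIT_NO_EMPTY);
        return implode('', array_reverse($chars));
    }, $utf8);
}

// Usage (assumption: 'my_url' stands for the real mainframe URL):
// $html = iconv('cp1255', 'UTF-8', file_get_contents('my_url'));
// echo reverse_hebrew_runs($html);
```

Including the spaces in the run matters because text stored in visual order has its word order reversed as well as its letters.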

The problem is that file_get_contents is getting the content with ISO 8859-1 as the character encoding. You must create a stream context with stream_context_create that requests the Windows-1255 charset and pass it to file_get_contents:

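// Ask the server for windows-1255 via the Accept-Charset request header.
$opts = array('http' => array('header' => 'Accept-Charset: windows-1255,utf-8;q=0.7,*;q=0.7'));
$context = stream_context_create($opts);

$content = file_get_contents('my_url', false, $context);
// iconv returns the converted string, so the result must be kept.
$content = iconv("cp1255", "UTF-8", $content);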
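To see which charset the server actually declares, one option is to inspect the response headers that PHP's HTTP wrapper stores in `$http_response_header` after file_get_contents returns. The sketch below is an assumption-laden illustration: `charset_from_headers` is a hypothetical helper, and it only reads the Content-Type header, so it tells you nothing about a charset declared in an HTML meta tag.

```php
<?php
// Hypothetical helper: returns the lowercase charset named in a
// Content-Type header line, or null if no header declares one.
function charset_from_headers(array $headers)
{
    foreach ($headers as $line) {
        if (preg_match('/^Content-Type:.*charset=["\']?([\w.:-]+)/i', $line, $m)) {
            return strtolower($m[1]);
        }
    }
    return null;
}

// Usage (assumption: 'my_url' stands for the real mainframe URL):
// $content = file_get_contents('my_url');
// var_dump(charset_from_headers($http_response_header));
```

If the declared charset turns out to differ from windows-1255, the first argument passed to iconv would need to change to match.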
Saltire answered 3/1, 2014 at 16:44 Comment(5)
I don't know if this is the problem, but the solution you suggested is not working. - Obliterate
What is the output after iconv? - Saltire
Still the same result. - Obliterate
Can you give the URL you are getting the content from? - Saltire
I know this is an old post, but I found my answer in a different one: #15593894 - Garonne
