I'm trying to display a data feed on a page, and we're running into encoding issues with a weird character. For some reason, the feed contains the U+FFFD character, and htmlentities() will not escape it, so I need to replace it manually. (I'm using PHP 5.3.)
I've tried the following:
$string = str_replace( "\xFFFD", "_", $string );
$string = str_replace( "\XFFFD", "_", $string );
$string = str_replace( "\uFFFD", "_", $string );
$string = str_replace("\x{FFFD}", "_", $string );
$string = str_replace("\X{FFFD}", "_", $string );
$string = str_replace("\P{FFFD}", "_", $string );
$string = str_replace("\p{FFFD}", "_", $string );
None of the above work.
After reading this page - http://php.net/manual/en/regexp.reference.unicode.php - I'm not sure what I'm doing wrong. Do I need to compile UTF-8 support into PCRE?
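For reference, here is a minimal sketch of two approaches that should work, assuming the feed is UTF-8 encoded. The attempts above fail for reasons unrelated to how PCRE was compiled: str_replace() does a plain byte comparison and does not interpret PCRE syntax such as \x{FFFD} or \p{...}, PHP 5.3 double-quoted strings have no \u escape, and \x consumes at most two hex digits, so "\xFFFD" is the byte 0xFF followed by the literal text "FD". In UTF-8, U+FFFD is the three bytes EF BF BD. The sample input and the variable names below are made up for illustration.

```php
<?php
// U+FFFD (REPLACEMENT CHARACTER) encoded as UTF-8 is the three bytes EF BF BD.
// Assumption: the feed, and therefore $string, is UTF-8 encoded.
$replacementChar = "\xEF\xBF\xBD";

// Hypothetical sample input standing in for the real feed data.
$string = "abc" . $replacementChar . "def";

// Option 1: byte-level replacement with str_replace(); no regex involved.
$viaStrReplace = str_replace($replacementChar, "_", $string);

// Option 2: PCRE with the /u modifier. The pattern is single-quoted so that
// PHP passes \x{FFFD} through to PCRE untouched.
$viaPreg = preg_replace('/\x{FFFD}/u', "_", $string);

echo $viaStrReplace, "\n";
echo $viaPreg, "\n";
```

Option 1 does not depend on PCRE at all, so it sidesteps the question of UTF-8 support in PCRE. Option 2 only works if PCRE was built with UTF-8 support and the subject string is valid UTF-8.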
U+FFFD character for what it's not meant to be. – Spermiogenesis