htmlspecialchars(): Invalid multibyte sequence in argument
I am getting this error on my local site.

Warning (2): htmlspecialchars(): Invalid multibyte sequence in argument in [/var/www/html/cake/basics.php, line 207]

Does anyone know what the problem is, or what the solution might be?

Thanks.

Mccomb answered 27/9, 2010 at 12:52 Comment(0)
F
15

Be sure to specify the encoding as UTF-8 if your files are encoded as such:

htmlspecialchars($str, ENT_COMPAT, 'UTF-8');

Before PHP 5.4 the default charset for htmlspecialchars was ISO-8859-1 (as of PHP 5.4 the default is 'UTF-8'), which might explain why things go haywire when it meets multibyte characters.
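To illustrate the point about the charset argument, here is a minimal sketch that passes the same text through htmlspecialchars with a matching and a mismatched charset. The sample strings and variable names are made up for the example, and the bytes are written as hex escapes so the result does not depend on how the script file itself is saved.

```php
<?php
// Sketch only: the sample strings are assumptions, not data from the question.
$utf8   = "caf\xC3\xA9 <b>"; // "café <b>" encoded as UTF-8
$latin1 = "caf\xE9 <b>";     // "café <b>" encoded as ISO-8859-1

// Charset matches the data: only the HTML special characters are escaped.
$okUtf8   = htmlspecialchars($utf8, ENT_COMPAT, 'UTF-8');
$okLatin1 = htmlspecialchars($latin1, ENT_COMPAT, 'ISO-8859-1');

// Charset does not match: 0xE9 followed by a space is not valid UTF-8,
// so the call returns an empty string.
$broken = htmlspecialchars($latin1, ENT_COMPAT, 'UTF-8');
```

If $broken comes back empty while the input was not, the charset argument and the actual encoding of the data probably disagree.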

Faydra answered 27/9, 2010 at 12:55 Comment(3)
Line 207 is here: $charset = 'UTF-8'; htmlspecialchars($text, ENT_QUOTES, $charset); // Line 207 - Mccomb
For me, this problem ended up being the reverse: my data's character set was actually 'ISO-8859-1' when I was trying to encode it as 'UTF-8' in htmlspecialchars. I switched the charset argument to 'ISO-8859-1' and that resolved the problem. At least, until I can fully update everything to 'UTF-8'. - Giliane
Starting from PHP 5.4.0, the default value of the 3rd parameter of htmlspecialchars() is 'UTF-8' - this answer should be updated. - Massingill
N
5

I ran into this error on production and found this great post about it:

http://insomanic.me.uk/post/191397106/php-htmlspecialchars-htmlentities-invalid

It appears to be a bug in PHP (for CentOS at least) that displays this error only when display_errors is Off!

Nonpartisan answered 1/7, 2012 at 15:17 Comment(0)
W
4

You are feeding corrupted character data into the function, or not specifying the right encoding.

I had this issue a while ago. The old behavior (prior to PHP 5.2.7, I believe) was to return the string despite the corruption, but since that version it raises this warning instead.

My solution involved writing a script to feed my strings through iconv using the //IGNORE modifier to remove corrupted data.

(We had a corrupted database which had some strings in UTF-8, some in latin-1 usually with incorrectly defined character types on the columns).

Looking at the comment on Tatu's answer, I would start by looking at (and playing with) the contents of the $charset variable.
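As a rough sketch of the clean-up step described above: the helper below removes bytes that are not valid UTF-8 before escaping. It uses mb_convert_encoding with the same 'from' and 'to' charset rather than iconv with //IGNORE, because the way iconv treats invalid input can differ between PHP and iconv library versions. The function name strip_invalid_utf8 and the sample string are hypothetical, and the code assumes PHP 7 or later with the mbstring extension loaded.

```php
<?php
// Hypothetical helper (not from the answer): drop bytes that are not valid UTF-8.
function strip_invalid_utf8(string $text): string
{
    $previous = mb_substitute_character();  // remember the current setting
    mb_substitute_character('none');        // drop invalid bytes instead of writing "?"
    $clean = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);     // restore the setting
    return $clean;
}

$dirty = "a\xFFb <i>"; // 0xFF can never appear in valid UTF-8
$safe  = htmlspecialchars(strip_invalid_utf8($dirty), ENT_QUOTES, 'UTF-8');
```

This only removes broken bytes. It does not repair text that is really Latin-1, so for a mixed database it is worth checking which rows are in which encoding first.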

Wherefore answered 27/9, 2010 at 19:30 Comment(2)
I agree. I've passed user data through iconv or mb_convert_encoding(), with the 'from' and 'to' charsets the same. There's usually an option to strip invalid characters. - Cliquish
Corrupted data here as well, mb_convert_encoding($var, 'UTF-8') did the job. - Trost
E
1

To avoid the error, you can use:

htmlentities($string, ENT_IGNORE, 'UTF-8') ;

Besides this, you can also use str_replace to replace bad characters as needed and then call htmlentities.

Have a look at this RSS feed: it replaced the greater-than sign with the &gt; entity, which might not look nice when reading the feed. You can replace it with something like a "-" or ")" sign instead.
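For comparison, here is a small sketch of what the flag changes, using htmlspecialchars (htmlentities accepts the same flags). ENT_SUBSTITUTE is not mentioned in the answer; it is included as an assumption that PHP 5.4 or later is available, since the PHP manual discourages ENT_IGNORE for security reasons. The input string is made up for the example.

```php
<?php
// Sketch only: the input is an invented string with one invalid byte.
$input = "a\xFFb <i>"; // 0xFF can never appear in valid UTF-8

// No error-handling flag: invalid input makes the call return an empty string.
$strict = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// ENT_IGNORE: the invalid byte is silently dropped.
$ignored = htmlspecialchars($input, ENT_QUOTES | ENT_IGNORE, 'UTF-8');

// ENT_SUBSTITUTE: the invalid byte is replaced with U+FFFD (bytes EF BF BD).
$substituted = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
```

With ENT_SUBSTITUTE the damaged spot stays visible in the output, which can make corrupted data easier to track down than when it is dropped without a trace.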

Elata answered 20/5, 2014 at 15:48 Comment(0)
A
1

I had the same problem because I was using substr on a UTF-8 string.
The error was infrequent and seemingly random: it occurred only if the string was cut in the middle of a multibyte character!

mb_substr solved the problem :)
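A minimal sketch of why this happens, assuming the mbstring extension is loaded. The sample word is invented and written with hex escapes so the bytes are explicit: substr counts bytes, so it can stop halfway through a character, while mb_substr counts characters.

```php
<?php
// Sketch only: sample text chosen to contain two-byte characters.
$text = "Gr\xC3\xBC\xC3\x9Fe"; // "Grüße" in UTF-8: 5 characters, 7 bytes

$byteCut = substr($text, 0, 3);             // "Gr" plus only the first byte of "ü"
$charCut = mb_substr($text, 0, 3, 'UTF-8'); // "Grü", cut on a character boundary

$byteCutValid = mb_check_encoding($byteCut, 'UTF-8'); // the byte cut is no longer valid UTF-8
$charCutValid = mb_check_encoding($charCut, 'UTF-8'); // the character cut still is

// The broken cut is what makes htmlspecialchars fail.
$escapedByteCut = htmlspecialchars($byteCut, ENT_QUOTES, 'UTF-8');
$escapedCharCut = htmlspecialchars($charCut, ENT_QUOTES, 'UTF-8');
```

That also explains the randomness: the cut only lands inside a character when a multibyte character happens to sit at the cut position.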

Arianna answered 23/10, 2014 at 13:10 Comment(0)
E
0

That's actually one of the most frequent errors I get.

Sometimes I don't use the __() translation function, just plain German text containing äöü. There it is especially important to mind the encoding of the files.

So make sure you save the files that contain special characters as UTF-8.
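One way to check this, sketched under the assumption that mbstring is available: the hypothetical helper find_invalid_utf8_lines below takes the contents of a file as a string and reports the line numbers that are not valid UTF-8. The sample contents stand in for a file that was saved as Latin-1.

```php
<?php
// Hypothetical helper: return the 1-based numbers of lines that are not valid UTF-8.
function find_invalid_utf8_lines(string $contents): array
{
    $bad = [];
    foreach (explode("\n", $contents) as $index => $line) {
        if (!mb_check_encoding($line, 'UTF-8')) {
            $bad[] = $index + 1;
        }
    }
    return $bad;
}

// Invented sample: the second line contains "ä" as the single Latin-1 byte 0xE4.
$latin1Source = "echo 'ok';\necho 'M\xE4dchen';\n";
$badLines = find_invalid_utf8_lines($latin1Source);
```

For a real file you would pass in file_get_contents($path) and re-save any file that shows up in the result.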

Ex answered 28/9, 2010 at 22:56 Comment(0)
