Bingbot converts Unicode characters into unreadable symbols

I get a lot of errors from my site when Bing tries to index pages whose URLs contain Unicode characters.

For example:

http://www.example.com/kjøp 

Bing instead tries to index

http://www.example.com/kjÃ¸p

Then I get the error "System.NullReferenceException: Object reference not set to an instance of an object." because there is no such controller.

Google handles such links fine. How can I help Bing understand Norwegian letters?

Jemappes answered 30/7, 2014 at 8:21 Comment(6)
Do you explicitly specify the encoding/charset of your pages?Weil
Do you mean this one? <meta http-equiv="content-type" content="text/html;charset=utf-8" /> I have it.Jemappes
Yep, that's what I meant. So if you have this tag and you do have valid UTF-8 content, maybe the issue is on Bing's side? Btw, possible duplicate:Weil
How exactly do you know that Bing tries to index this URL?Juliennejuliet
I get the error "System.NullReferenceException: Object reference not set to an instance of an object." from Elmah, where HTTP_FROM is bingbot(at)microsoft.com and the wrong URL is example.com/kjÃ¸pJemappes
Seems the problem is here: Bing just messed up the UTF-8 encoding in the URL.Exurbanite

You can confirm that Bing does not index these URLs correctly by running an "inurl:" search like this: https://www.bing.com/search?q=inurl%3A%C3%B8

Only 6 pages are indexed, which cannot be correct.

Unfortunately, you won't be able to fix Bing. You may be able to compensate for its shortcoming by making some changes to your site. It is a burden that you shouldn't have to deal with, but the alternative is to do nothing and continue not getting pages properly linked.

Bing will likely have issues with URLs containing the characters in this list: https://www.i18nqa.com/debug/utf8-debug.html

Your web server needs to watch for URL requests containing these garbled character sequences, replace the wrong characters with the correct ones, and issue a 301 redirect to the correct page. The specifics depend on what kind of server and programming language you are using. In your case it is most likely IIS and MVC, so you would probably look into Microsoft's URL Rewrite extension: https://www.iis.net/downloads/microsoft/url-rewrite
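To make the "wrong to right" translation concrete, here is a minimal sketch of the repair step. It is written in Python only to show the logic; it is not IIS- or MVC-specific, and the function name `repair_mojibake_path` is made up for this example. It assumes the garbled URL is the classic case where the UTF-8 bytes of a character were misread as Windows-1252 (or Latin-1) and then re-encoded as UTF-8, which is what turns "ø" into "Ã¸".

```python
from urllib.parse import quote, unquote


def repair_mojibake_path(raw_path):
    """Return the corrected percent-encoded path, or None if no repair applies.

    Hypothetical helper: assumes the path was double-encoded
    (UTF-8 bytes misread as cp1252/Latin-1, then encoded as UTF-8 again).
    """
    try:
        # Decode the percent-escapes; a garbled path yields text like "/kjÃ¸p".
        text = unquote(raw_path, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return None

    for codec in ("cp1252", "latin-1"):
        try:
            # Undo the wrong decoding: back to the original bytes, then read as UTF-8.
            fixed = text.encode(codec).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            # Not double-encoded in this codec (e.g. the path was already correct).
            continue
        if fixed != text:
            return quote(fixed, safe="/")
    return None


if __name__ == "__main__":
    # The garbled request path from the question, percent-encoded.
    print(repair_mojibake_path("/kj%C3%83%C2%B8p"))
```

When the function returns a path, the request handler would answer with a 301 redirect to it; when it returns None, the request is handled normally. A correctly encoded path such as `/kj%C3%B8p` is left alone, because its text does not survive the round trip as valid UTF-8.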

Before doing this, however, I would check what errors Bing's webmaster tools report: https://www.bing.com/toolbox/webmaster

The other option is to not use those characters in your URLs. My recommendation is to take the time to implement the wrong-to-right translation. Bing will eventually fix this, but it could be quite a while.

Haskel answered 5/7, 2018 at 17:57 Comment(1)
"could be quite a while?" :D An eternity... because these live in the stone age!Feaster
