Why does ASP.NET Core convert Persian (or Arabic) text to character references (&#xhhhh;) in a view?

The source code:

@{ ViewBag.Title = "سلام علیک"; }

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>@ViewBag.Title</title>
</head>
<body>

    <div class="container" dir="rtl">
        @RenderBody()
    </div>

</body>
</html>

It renders correctly in the browser, but I want the same text to appear in the HTML source (for some search engine optimization software).

[Screenshot: ViewBag problem in Arabic text]

And the output:

<!DOCTYPE html>
<html>
<head>
    <title>&#x633;&#x644;&#x627;&#x645; &#x639;&#x644;&#x6CC;&#x6A9;</title>
</head>
<body>
...
</body>
</html>
Curculio answered 25/10, 2016 at 6:37 Comment(0)
W
32

Because, by default, the HTML encoder only safelists the Basic Latin block (browsers have bugs, so we're trying to protect against unknown problems). The &#xXXXX; values you see still render correctly, as you can see in your screenshot, so there's no real harm aside from the increased page size.
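The default behaviour can be reproduced outside a Razor view. The following is a minimal console sketch, not code from the question: the Program class is purely illustrative, and it assumes a project that can reference System.Text.Encodings.Web. It passes the title through HtmlEncoder.Default, which safelists only Basic Latin, so the Persian letters should come out as numeric character references like the ones in the page source above.

```csharp
using System;
using System.Text.Encodings.Web;

// Illustrative console program (hypothetical, not part of the question's project).
class Program
{
    static void Main()
    {
        const string title = "سلام علیک";

        // HtmlEncoder.Default only treats Basic Latin as safe, so every
        // character outside that block is written as a &#x....; reference.
        string encoded = HtmlEncoder.Default.Encode(title);

        Console.WriteLine(encoded);
    }
}
```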

If the increased page size bothers you, you can customise the encoder to safelist your own Unicode blocks (not languages; Unicode doesn't think in terms of language).

To widen the set of characters the encoder treats as safe, insert the following line into the ConfigureServices() method in Startup.cs (it needs the System.Text.Encodings.Web and System.Text.Unicode namespaces):

services.AddSingleton<HtmlEncoder>(HtmlEncoder.Create(allowedRanges: new[] { UnicodeRanges.BasicLatin, UnicodeRanges.Arabic }));

Arabic has quite a few blocks in Unicode, so you may need to add more blocks to get the full range you need.
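As a rough sketch of what adding more blocks could look like, the registration can be written out in a Startup class with its using directives. Which blocks are listed here is an assumption chosen for illustration, not a recommendation from the answer; pick the ones your text actually uses. The Startup class shape is the conventional one and the placeholder comment stands in for whatever your project already registers.

```csharp
using System.Text.Encodings.Web;
using System.Text.Unicode;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // ... existing registrations (for example services.AddMvc()) stay as they are ...

        // Safelist Basic Latin plus several Arabic-related Unicode blocks.
        // The selection below is illustrative; trim or extend it as needed.
        services.AddSingleton<HtmlEncoder>(
            HtmlEncoder.Create(allowedRanges: new[]
            {
                UnicodeRanges.BasicLatin,
                UnicodeRanges.Arabic,
                UnicodeRanges.ArabicSupplement,
                UnicodeRanges.ArabicExtendedA,
                UnicodeRanges.ArabicPresentationFormsA,
                UnicodeRanges.ArabicPresentationFormsB
            }));
    }
}
```

Widening the ranges only changes which ordinary characters pass through unencoded. As far as I know, HTML-sensitive characters such as <, >, & and " are still encoded by HtmlEncoder whichever ranges are allowed.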

Willis answered 25/10, 2016 at 17:3 Comment(3)
Awesome, I am very thankful that you are considering my problem. - Frivolous
Very interesting, but is this safe? I'm only using your proposal for SEO reasons, but should I do anything else for safety? Also, if I didn't do this and left the page source encoded as it is, would it still be SEO friendly? - Speechmaking
Love you man... - Aphid
P
0

The same fix works in Blazor WebAssembly. I solved this problem by adding the following line to Program.cs:

builder.Services.AddSingleton<HtmlEncoder>(HtmlEncoder.Create(allowedRanges: new[] { UnicodeRanges.BasicLatin, UnicodeRanges.Arabic }));

If you are working with server projects, or with earlier versions of ASP.NET Core that have a Startup.cs, you can write it like this:

services.AddSingleton<HtmlEncoder>(HtmlEncoder.Create(allowedRanges: new[] { UnicodeRanges.BasicLatin, UnicodeRanges.Arabic }));
Poulenc answered 14/3, 2023 at 22:11 Comment(0)
P
-1

For non-ASCII characters, I recommend using UTF-8 as the charset. You can add this line to your HTML file (the shared layout), inside the <head> tag:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Also set dir="rtl" and lang="ar", like this:

<p dir="rtl" lang="ar">سلام علیک</p>

You can also use ViewData["Title"] instead of ViewBag.Title; it should give the same result.

Character encodings in HTML-wiki

Procaine answered 25/10, 2016 at 7:1 Comment(11)
I did, but it doesn't change the result :| - Frivolous
@Soren Try it with dir and lang added as well. - Nebuchadnezzar
<p dir="rtl" lang="ar">&#x633;&#x644;&#x627;&#x645; &#x639;&#x644;&#x6CC;&#x6A9;</p> - Frivolous
Have you tried setting the title without using ViewBag.Title? Just use <title dir="rtl" lang="ar">سلام علیک</title> - Nebuchadnezzar
Yup, without ViewBag everything is all right, but ViewBag changes it. - Frivolous
Even with @Html.Raw(ViewBag.Title) it is OK, but I don't want to use Html.Raw in this case. - Frivolous
I understand. Can you add <html lang="ar"> under "<!DOCTYPE html>"? - Nebuchadnezzar
I did, same result :| - Frivolous
Use ViewData["Title"] instead of ViewBag.Title. I tested it and it works with Arabic. - Nebuchadnezzar
I tested in two different projects; unfortunately it doesn't work in either of them. - Frivolous
Let us continue this discussion in chat. - Nebuchadnezzar
H
-2

You have to set the character encoding of the response to UTF-8 in order to output non-ASCII characters such as Arabic:

<configuration>
  <system.web>
    <globalization requestEncoding="utf-8" responseEncoding="utf-8" />
  </system.web>
</configuration>
Hirz answered 25/10, 2016 at 9:54 Comment(2)
I added a picture for disambiguation. - Frivolous
ASP.NET Core doesn't use web.config for its settings. - Willis
