Robots.txt file in MVC.NET 4
I have read an article about hiding some URLs from robots in my ASP.NET MVC project. The author says we should add an action to one of our controllers; in his example he adds it to the Home controller:

#region -- Robots() Method --
public ActionResult Robots()
{
    Response.ContentType = "text/plain";
    return View();
}
#endregion

Then we should add a Robots.cshtml view to our project with this body:

@{
    Layout = null;
}
# robots.txt for @this.Request.Url.Host

User-agent: *
Disallow: /Administration/
Disallow: /Account/

and finally we should add this route to Global.asax:

routes.MapRoute("Robots.txt",
                "robots.txt",
                new { controller = "Home", action = "Robots" });
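One practical wrinkle: by default the IIS static file handler answers requests for `*.txt` URLs, so the route above may never be reached. A handler entry in `web.config` such as the following (a sketch; the handler name is arbitrary) hands `robots.txt` to the managed MVC pipeline instead:

```xml
<!-- web.config: let MVC routing, not the static file handler,
     serve requests for /robots.txt (integrated pipeline mode) -->
<system.webServer>
  <handlers>
    <add name="RobotsTxtHandler"
         path="robots.txt"
         verb="GET"
         type="System.Web.Handlers.TransferRequestHandler"
         preCondition="integratedMode,runtimeVersionv4.0" />
  </handlers>
</system.webServer>
```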

My question is: do robots crawl controllers that have the [Authorize] attribute, like Administration?

Rosa answered 1/6, 2015 at 16:29 Comment(1)
For those who try to use the code above: it works, but you have to activate that the path "robots.txt" is handled by MVC routes in your web.config, see: https://mcmap.net/q/1138129/-how-to-add-route-to-dynamic-robots-txt-in-asp-net-mvc – Minutiae
Do robots crawl controllers that have the [Authorize] attribute, like Administration?

If they find a link to it, they are likely to try to crawl it, but they will fail just like anyone with a web browser who does not log in. Robots have no special ability to access your website differently from a standard browser.

Note that robots that conform to the Robots Exclusion Standard crawl the exact URL

http://mydomain/robots.txt

You can create a response for that URL however you like. One approach is certainly to have a controller that handles that request. You can also just add a text file with the same content you would have returned from the controller, e.g.

User-agent: *
Disallow: /Administration/
Disallow: /Account/

to the root folder of your project and make sure it is marked as content so that it is deployed to the website.
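In a classic (pre-SDK-style) .csproj, "marked as content" means the file is listed with a Content build action, e.g. (Visual Studio usually adds this automatically when you add the file):

```xml
<!-- .csproj fragment: deploy robots.txt with the site -->
<ItemGroup>
  <Content Include="robots.txt" />
</ItemGroup>
```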

Adding this robots.txt entry will prevent conforming robots from attempting to browse controllers that require authentication (and lighten the load on your website slightly), but without the robots file they will just try the URL and fail.
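For illustration, a hypothetical Administration controller protected like this rejects every anonymous request, whether it comes from a crawler or a browser:

```csharp
using System.Web.Mvc;

// Hypothetical admin controller: any anonymous request (crawler or
// browser alike) is redirected to the login page by the Authorize
// filter before an action method ever runs.
[Authorize]
public class AdministrationController : Controller
{
    public ActionResult Index()
    {
        return View();
    }
}
```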

Phenology answered 1/6, 2015 at 16:37 Comment(2)
No buddy, in my question I mentioned that some private controllers are disallowed for robots. Can they crawl private controllers which need authorization? – Rosa
No, they cannot — no more than someone with a web browser can crawl any URL that requires authorization. Updated my answer. – Phenology
This simple piece of code worked for my ASP.NET Core 3.1 site:

    [Route("/robots.txt")]
    public ContentResult RobotsTxt()
    {
        var sb = new StringBuilder();
        sb.AppendLine("User-agent: *")
            .AppendLine("Disallow:")
            .Append("sitemap: ")
            .Append(this.Request.Scheme)
            .Append("://")
            .Append(this.Request.Host)
            .AppendLine("/sitemap.xml");

        return this.Content(sb.ToString(), "text/plain", Encoding.UTF8);
    }
Josejosee answered 1/7, 2020 at 14:48 Comment(2)
I think for newer MVC's, you can put the file into the wwwroot? I haven't tested it, but I guess it would work. – Carrol
@Carrol wwwroot is an IIS specific thing. For my case, I'm developing and deploying to Linux using Kestrel. – Pelotas
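For what it's worth, in ASP.NET Core a physical wwwroot/robots.txt can be served by the static file middleware regardless of web server (Kestrel included, not just IIS), as long as the middleware is registered before routing. A minimal sketch of the relevant part of `Startup.Configure`:

```csharp
// Sketch: UseStaticFiles runs before routing, so a physical
// wwwroot/robots.txt is served directly, without a controller.
public void Configure(IApplicationBuilder app)
{
    app.UseStaticFiles();
    app.UseRouting();
    app.UseEndpoints(endpoints => endpoints.MapControllers());
}
```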

© 2022 - 2024 — McMap. All rights reserved.