How can I avoid duplicate content in ASP.NET MVC due to case-insensitive URLs and defaults?
C

7

24

Edit: Now that I need to solve this problem for real, I did a little more investigation and came up with a number of things to reduce duplicate content. I posted detailed code samples on my blog: Reducing Duplicate Content with ASP.NET MVC

First post - go easy if I've marked this up wrong or tagged it badly :P

In Microsoft's new ASP.NET MVC framework it seems there are two things that could cause your content to be served up at multiple URLs (something which Google penalizes and which will cause your PageRank to be split across them):

  • Case-insensitive URLs
  • Default URL

You can set the default controller/action to serve up for requests to the root of your domain. Let's say we choose HomeController/Index. We end up with the following URLs serving up the same content:

  • example.com/
  • example.com/Home/Index

Now if people start linking to both of these then PageRank would be split. Google would also consider it duplicate content and penalize one of them to avoid duplicates in their results.

On top of this, the URLs are not case sensitive, so we actually get the same content for these URLs too:

  • example.com/Home/Index
  • example.com/home/index
  • example.com/Home/index
  • example.com/home/Index
  • (the list goes on)

So, the question... How do I avoid these penalties? I would like:

  • All requests for the default action to be redirected (301 status) to the same URL
  • All URLs to be case sensitive

Possible?

Choosey answered 4/10, 2008 at 19:32 Comment(1)
Make sure you're not redirecting requests for images/style sheets/etc if they're in uppercased folders, as this will create a lot more round-tripping, meaning more latency for your visitors and more CPU/bandwidth for your site.Christoffer
P
3

Bump!

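MVC 5 now supports generating only lowercase URLs and applying a common trailing-slash policy. Note that these settings affect the URLs the framework generates (for example through Html.ActionLink); incoming requests are still matched case-insensitively.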

    public static void RegisterRoutes(RouteCollection routes)
    {
        routes.LowercaseUrls = true;
        routes.AppendTrailingSlash = false;
    }

In my application I also guard against duplicate content being served from different domains, IP addresses, letter casings and so on, for example:

http://yourdomain.example/en

https://yourClientIdAt.YourHostingPacket.example/

I tend to produce Canonical URLs based on a PrimaryDomain - Protocol - Controller - Language - Action

public static String GetCanonicalUrl(RouteData route, String host, string protocol)
{
    // These rely on the convention that all your links will be lowercase!
    string actionName = route.Values["action"].ToString().ToLower();
    string controllerName = route.Values["controller"].ToString().ToLower();
    // This assumes a multilanguage app whose route contains a "language" parameter;
    // lowercase it as well to prevent EN/en etc. If your routes have no such
    // parameter, drop this line and the {2} segment from the format string below.
    string language = route.Values["language"].ToString().ToLower();
    return String.Format("{0}://{1}/{2}/{3}/{4}", protocol, host, language, controllerName, actionName);
}

Then you can use @Gabe Sumner's answer to redirect to your action's canonical URL if the current request URL doesn't match it.
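
The comparison step could look roughly like the sketch below. It is only an illustration: the class and method names (CanonicalRedirect, GetRedirectTarget) are made up for this example, and it assumes the canonical URL was built without a query string, as GetCanonicalUrl does above.

```csharp
using System;

public static class CanonicalRedirect
{
    // Returns the URL to redirect to, or null when the request already uses
    // the canonical URL. The comparison is ordinal (case-sensitive) on purpose,
    // so /Home/Index and /home/index count as different URLs.
    public static string GetRedirectTarget(Uri requestUrl, string canonicalUrl)
    {
        // Compare everything before the query string.
        string requested = requestUrl.GetLeftPart(UriPartial.Path);

        if (string.Equals(requested, canonicalUrl, StringComparison.Ordinal))
        {
            return null;
        }

        // Carry the original query string over unchanged.
        return canonicalUrl + requestUrl.Query;
    }
}
```

A non-null result would be the target of the 301 redirect; a null result means the request can be served as-is.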

Procaine answered 22/12, 2016 at 9:16 Comment(1)
Cool; changed this to the accepted answer since it seems more relevant now :-)Choosey
S
11

I was working on this as well. I will obviously defer to ScottGu on this. I humbly offer my solution to this problem as well though.

Add the following code to global.asax:

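// Requires: using System; using System.Text.RegularExpressions; using System.Web;
protected void Application_BeginRequest(Object sender, EventArgs e)
{
    Uri url = HttpContext.Current.Request.Url;
    // Only inspect the part of the URL before the query string, so that
    // case-sensitive query values are left untouched.
    string urlWithoutQuery = url.GetLeftPart(UriPartial.Path);

    // If upper case letters are found in the URL, redirect to the lower case URL.
    if (Regex.IsMatch(urlWithoutQuery, @"[A-Z]"))
    {
        string lowercaseUrl = urlWithoutQuery.ToLowerInvariant() + url.Query;

        Response.Clear();
        Response.Status = "301 Moved Permanently";
        Response.AddHeader("Location", lowercaseUrl);
        Response.End();
    }
}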

A great question!

Sped answered 4/10, 2008 at 20:57 Comment(1)
This has a potential downside, as far as I can tell. Open Chrome (or another browser that has good debugging capabilities) and notice that all requests for images, stylesheets, javascript etc are redirected (assuming you have them in a folder called 'Content' or whatever.) You don't want the browser to have to double the number of requests for assets like these, so either ensure they're lowercase, or don't send 301s for links that aren't actually routes.Christoffer
C
9

As well as posting here, I emailed ScottGu to see if he had a good response. He gave a sample for adding constraints to routes, so that you only respond to lowercase URLs:

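public class LowercaseConstraint : IRouteConstraint
{
    public bool Match(HttpContextBase httpContext, Route route,
            string parameterName, RouteValueDictionary values,
            RouteDirection routeDirection)
    {
        // A missing value cannot be checked, so treat it as a non-match.
        string value = values[parameterName] as string;
        if (value == null)
        {
            return false;
        }

        // The value matches only if it is already all lowercase.
        return string.Equals(value, value.ToLowerInvariant(), StringComparison.Ordinal);
    }
}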

And in the register routes method:

public static void RegisterRoutes(RouteCollection routes)
{
    routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

    routes.MapRoute(
        "Default",                                              // Route name
        "{controller}/{action}/{id}",                           // URL with parameters
        new { controller = "home", action = "index", id = "" },
        new { controller = new LowercaseConstraint(), action = new LowercaseConstraint() }
    );
}

It's a start, but I'd want to be able to change the generation of links from methods like Html.ActionLink and RedirectToAction to match.
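
One possible way to get matching link generation, offered as a sketch and not something from ScottGu's reply, is to subclass Route, override GetVirtualPath, and pass the generated path through a helper like the one below. The helper name (UrlCasing.LowercasePath) is hypothetical. It lowercases only the path and leaves the query string and fragment alone, since those values may be case-sensitive.

```csharp
using System;

public static class UrlCasing
{
    // Lowercases the path portion of a generated URL and leaves any query
    // string or fragment untouched.
    public static string LowercasePath(string virtualPath)
    {
        if (string.IsNullOrEmpty(virtualPath))
        {
            return virtualPath;
        }

        // Everything from the first '?' or '#' onwards is kept as-is.
        int splitAt = virtualPath.IndexOfAny(new[] { '?', '#' });
        if (splitAt < 0)
        {
            return virtualPath.ToLowerInvariant();
        }

        return virtualPath.Substring(0, splitAt).ToLowerInvariant()
             + virtualPath.Substring(splitAt);
    }
}
```

For example, a generated path of Home/Index?returnUrl=/Account/LogOn would come out as home/index?returnUrl=/Account/LogOn.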

Choosey answered 4/10, 2008 at 20:35 Comment(0)
O
2

I believe there is a better answer to this. If you put a canonical link in your page head like:

<link rel="canonical" href="http://example.com/Home/Index"/>

Then Google only shows the canonical page in their results and more importantly all of the Google goodness goes to that page with no penalty.
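
If you want to generate that tag instead of hard-coding it, a minimal sketch might look like this. The names (CanonicalLink.Build, primaryHost) are invented for the example, and two choices in it are assumptions you may not share: the path is lowercased, and the query string is dropped.

```csharp
using System;
using System.Net;

public static class CanonicalLink
{
    // Builds a <link rel="canonical"> tag that points at the preferred host,
    // with a lowercase path and no query string (both are assumptions of
    // this sketch, not requirements of rel="canonical").
    public static string Build(Uri requestUrl, string primaryHost)
    {
        string path = requestUrl.AbsolutePath.ToLowerInvariant();
        string href = requestUrl.Scheme + "://" + primaryHost + path;

        // Encode the URL so it is safe inside an HTML attribute.
        return "<link rel=\"canonical\" href=\"" + WebUtility.HtmlEncode(href) + "\"/>";
    }
}
```

Pages whose content depends on a query value (paging, for instance) would need that value kept in the canonical URL.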

Outwardly answered 22/6, 2012 at 20:31 Comment(2)
Certainly an option, but only having one possible url in the first place is better, and doesn't rely on search engines being built to support this :-)Choosey
Why is it better? I think there are plenty of good business reasons for having the same page served by multiple URLs. Also don't forget why this post was started: because Google penalizes duplicate content. They support this link exactly because they don't want to penalize people who have duplicated content for good reasons.Outwardly
L
1

Like you, I had the same question; except I was unwilling to settle for an all-lowercase URL limitation, and did not like the canonical approach either (well, it's good but not on its own).

I could not find a solution, so we wrote and open-sourced a redirect class.

Using it is easy enough: each GET method in the controller classes needs to add just this one line at the start:

Seo.SeoRedirect(this);

The SEO rewrite class automatically uses C# 5.0's Caller Info attributes to do the heavy lifting, making the code above strictly copy-and-paste.

As I mention in the linked SO Q&A, I'm working on a way to get this converted to an attribute, but for now, it gets the job done.

The code will force one case for the URL. The case will be the same as the name of the controller's method, so you choose whether you want all caps, all lower, or a mix of both (CamelCase is good for URLs). It'll issue 301 redirects for case-insensitive matches, and it caches the results in memory for best performance. It'll also redirect trailing slashes (enforced for index listings, enforced off otherwise) and remove duplicate content accessed via the default method name (Index in a stock ASP.NET MVC app).
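
To give a feel for the case-matching part, here is a simplified sketch. It is not the code of the open-sourced class; ActionCaseResolver and its members are hypothetical names, and it assumes you already have the list of action names as they are spelled in the controller. The lookup table is built once and kept in memory, so each request costs a single dictionary lookup.

```csharp
using System;
using System.Collections.Generic;

public class ActionCaseResolver
{
    // Maps any spelling of an action name to the spelling used in code.
    private readonly Dictionary<string, string> _canonicalNames =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    public ActionCaseResolver(IEnumerable<string> actionNames)
    {
        foreach (string name in actionNames)
        {
            _canonicalNames[name] = name;
        }
    }

    // Returns the correctly cased name to redirect to, or null when the
    // requested name is already correct or is not a known action.
    public string GetRedirectName(string requested)
    {
        string canonical;
        if (_canonicalNames.TryGetValue(requested, out canonical)
            && !string.Equals(canonical, requested, StringComparison.Ordinal))
        {
            return canonical;
        }

        return null;
    }
}
```

A non-null result is what a 301 redirect would point at; trailing slashes and the default method name would need separate handling.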

Leptophyllous answered 6/11, 2012 at 6:55 Comment(0)
P
0

I don't know how you'll feel about this after 8 years, but ASP.NET MVC 5 now supports attribute routing, which gives you easy-to-remember routes and helps solve duplicate content problems for SEO-friendly sites.

Just add routes.MapMvcAttributeRoutes(); to your RouteConfig and then define exactly one route for each action, like this:

    [Route("~/")]
    public ActionResult Index(int? page)
    {
        var query = from p in db.Posts orderby p.post_date descending select p;
        var pageNumber = page ?? 1;
        ViewData["Posts"] = query.ToPagedList(pageNumber, 7);         
        return View();
    }
    [Route("about")]
    public ActionResult About()
    {
        return View();
    }
    [Route("contact")]
    public ActionResult Contact()
    {
        return View();
    }
    [Route("team")]
    public ActionResult Team()
    {
        return View();
    }
    [Route("services")]
    public ActionResult Services()
    {
        return View();
    }
Puissance answered 26/1, 2016 at 3:11 Comment(1)
And for other routes enhancements , you can also usePuissance
A
0

Based on the answer from Gabe Sumner, but without redirects for JS, images and other static content; it works only on controller actions. The idea is to do the redirect later in the pipeline, once we already know the request maps to a route. For this we can use an ActionFilter.

public class RedirectFilterAttribute : ActionFilterAttribute
{
    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        var url = filterContext.HttpContext.Request.Url;
        var urlWithoutQuery = url.GetLeftPart(UriPartial.Path);
        if (Regex.IsMatch(urlWithoutQuery, @"[A-Z]"))
        {
            string lowercaseURL = urlWithoutQuery.ToLowerInvariant() + url.Query;
            filterContext.Result = new RedirectResult(lowercaseURL, permanent: true);
        }

        base.OnActionExecuting(filterContext);
    }
}

Note that the filter above does not redirect or change the casing for the query string.

Then bind the ActionFilter globally to all actions by adding it to the GlobalFilterCollection.

filters.Add(new RedirectFilterAttribute());

It is a good idea to still set the LowercaseUrls property to true on the RouteCollection.

Acinus answered 3/8, 2018 at 6:26 Comment(0)
