Should sitemap.xml be disallowed in robots.txt? And robots.txt itself? [closed]

This is a very basic question, but I can't find a direct answer anywhere online. When I search for my website on Google, sitemap.xml and robots.txt are returned as search results (among more useful results). To prevent this, should I add the following lines to robots.txt?

Disallow: /sitemap.xml
Disallow: /robots.txt

This won't stop search engines from accessing the sitemap or robots.txt file, will it?

Also, or instead, should I use Google's URL removal tool?

Webbed answered 1/7, 2011 at 18:48 Comment(0)

You won't stop the crawler from reading robots.txt, because it's a chicken-and-egg situation: the crawler has to fetch the file before it can see any rule that disallows it. However, if you aren't pointing Google and other search engines directly at the sitemap by some other means, you could lose some indexing weight by denying access to your sitemap.xml. Is there a particular reason why you don't want users to be able to see the sitemap? This is what I actually use, which is specific to the Google crawler:

 # Assumption: the group targets Google's crawler only
 User-agent: Googlebot
 Allow: /
 # Sitemap
 Sitemap: http://www.mysite.com/sitemap.xml
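
To see how the two rule sets differ, here is a minimal sketch using Python's standard `urllib.robotparser` as a stand-in for a real crawler. It assumes Python 3.8+ (for `site_maps()`), and the `User-agent: *` groups and the `parse_rules` helper are illustrative assumptions, not part of the original post. Real search engines may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

SITEMAP_URL = "http://www.mysite.com/sitemap.xml"


def parse_rules(robots_txt: str) -> RobotFileParser:
    """Hypothetical helper: parse robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Rules proposed in the question: disallow the sitemap and robots.txt.
blocking = parse_rules(
    "User-agent: *\n"
    "Disallow: /sitemap.xml\n"
    "Disallow: /robots.txt\n"
)

# Rules in the spirit of the answer: allow everything, advertise the sitemap.
advertising = parse_rules(
    "User-agent: *\n"
    "Allow: /\n"
    f"Sitemap: {SITEMAP_URL}\n"
)

# A rule-abiding crawler may not fetch the sitemap under the first rule set.
print(blocking.can_fetch("Googlebot", SITEMAP_URL))
# Under the second rule set the sitemap is fetchable and listed for discovery.
print(advertising.can_fetch("Googlebot", SITEMAP_URL))
print(advertising.site_maps())
```

The point of the comparison is that `Disallow` controls crawling, not whether a URL appears in results, so disallowing the sitemap mainly prevents it from being used.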
Bore answered 1/7, 2011 at 18:53 Comment(5)
I don't want to prevent users from seeing the sitemap file; I just don't want it coming up in search results. Is there a way of doing this, and the same for robots.txt? I basically just want "useful" URLs that contain website content coming up in search results. Webbed
Well, the thing is, if your sitemap and robots.txt files are getting more hits than your content, you have to wonder why. Your content should always pull in users more than an XML file does. Again, if you are really concerned, you can do some back-end server kung fu and use a back-end language or the web server to add what's called an X-Robots-Tag header to the server's HTTP response. yoast.com/x-robots-tag-play Bore
We will not index a Sitemap (i.e. return a Sitemap in the results) unless it was linked from a public resource such as an HTML page. If you list it only in the robots.txt file, we won't index it. One thing to note is that if you disallow the crawling of a Sitemap, we won't be able to crawl it and thus won't be able to use it. Slapdash
@Webbed No proper search engine is going to put your sitemap.xml or robots.txt up for grabs; they're specifically machine-read files in machine-read file formats. If your sitemap refers to an HTML file, and your sitemap is deemed to have any worth, then the search engine would present that. Honeysuckle
@Slapdash [citation needed] Aimeeaimil
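
The X-Robots-Tag approach mentioned in the comments can be sketched roughly as follows, using only Python's standard `http.server`. This is an illustration under stated assumptions, not a production setup: the `NOINDEX_PATHS` set, the placeholder bodies, and the `extra_headers` and `serve` helpers are all hypothetical names, and a real site would more likely set the header in its web server configuration. Note that a crawler has to be allowed to fetch a URL in order to see this header, so it should not be combined with a `Disallow` rule for the same paths.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Paths that should stay crawlable but be kept out of search results.
NOINDEX_PATHS = {"/sitemap.xml", "/robots.txt"}

# Placeholder bodies (assumption); a real site would serve its actual files.
BODIES = {
    "/robots.txt": ("text/plain", b"User-agent: *\nAllow: /\n"),
    "/sitemap.xml": ("application/xml", b"<urlset/>"),
}


def extra_headers(path: str) -> dict:
    """Return extra response headers for an exact request path."""
    if path in NOINDEX_PATHS:
        return {"X-Robots-Tag": "noindex"}
    return {}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only exact paths are handled; query strings are not stripped here.
        if self.path not in BODIES:
            self.send_error(404)
            return
        content_type, body = BODIES[self.path]
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        for name, value in extra_headers(self.path).items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the demo server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and requesting `/sitemap.xml` should show the `X-Robots-Tag: noindex` header in the response, while the file itself remains readable by users and crawlers.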
