I have two questions about crawler directives (robots.txt and X-Robots-Tag).
Background info
I want Google and Bing to be exempt from the “disallow” and “noindex” rules; every other search engine should follow them. In addition, I want Google and Bing to apply “nosnippet” (both support it). What code do I need for this, using both robots.txt and X-Robots-Tag?
I want the rules in both the robots.txt file and the .htaccess file (as an X-Robots-Tag header). I understand that robots.txt may be considered “ineffective” and “outdated”, but I would still like to give crawlers clear instructions there, unless you advise otherwise.
Question 1
Is the following code correct for allowing only Google and Bing to index my pages (so that other search engines do not show them in their results) and, furthermore, for preventing Google and Bing from showing snippets in their search results?
X-Robots-Tag code (Is this correct? I don't think I need to add "index" for googlebot and bingbot, since "index" is the default value, but I'm not sure.)
X-Robots-Tag: googlebot: nosnippet
X-Robots-Tag: bingbot: nosnippet
X-Robots-Tag: otherbot: noindex
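To make the intended policy concrete, here is a minimal Python sketch of the decision I am after: Google and Bing get "nosnippet", everyone else gets "noindex". It only models the logic and is not server configuration. The function name x_robots_tag_for, the substring match on the User-Agent, and the sample agent strings (including ExampleBot) are my own assumptions for illustration, not part of any search-engine or Apache API.

```python
# Hypothetical helper: models the intended policy only, not real server config.
SNIPPET_FREE_BOTS = ("googlebot", "bingbot")  # may index, but without snippets


def x_robots_tag_for(user_agent: str) -> str:
    """Return the X-Robots-Tag value the policy intends for this User-Agent."""
    ua = user_agent.lower()
    if any(bot in ua for bot in SNIPPET_FREE_BOTS):
        return "nosnippet"  # indexing is left at its default (allowed)
    return "noindex"  # every other crawler is asked not to index


if __name__ == "__main__":
    # Sample User-Agent strings, chosen for illustration only.
    for ua in ("Googlebot/2.1", "bingbot/2.0", "ExampleBot/1.0"):
        print(ua, "->", x_robots_tag_for(ua))
```

Matching on the User-Agent string is the simplest way to express the exemption, but the string can be spoofed, so this should be read as a statement of intent rather than an enforcement mechanism.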
robots.txt code (Is this correct? I think the first version is, but I'm not sure.)
User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow:

User-agent: *
Disallow: /

or

User-agent: *
Disallow: /

User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow:
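One way to compare the two orderings is to feed both to a robots.txt parser and ask the same questions of each. The sketch below uses Python's standard-library urllib.robotparser. It is only one implementation, so real crawlers may behave differently; the helper name allowed, the test path /page.html, and the agent ExampleBot are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Same rules as the two versions above, differing only in group order.
SPECIFIC_FIRST = """\
User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow:

User-agent: *
Disallow: /
"""

WILDCARD_FIRST = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow:
"""


def allowed(robots_txt: str, agent: str, path: str = "/page.html") -> bool:
    """Parse robots_txt and report whether agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    variants = (("specific first", SPECIFIC_FIRST), ("wildcard first", WILDCARD_FIRST))
    for name, text in variants:
        for agent in ("Googlebot", "Bingbot", "ExampleBot"):
            print(f"{name}: {agent} allowed = {allowed(text, agent)}")
```

With this parser, both orderings give the same answers: Googlebot and Bingbot may fetch the page and the hypothetical ExampleBot may not. That fits the usual rule that a crawler picks the group naming it specifically and falls back to the * group only when no such group exists.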
Question 2: Conflicts between robots.txt and X-Robots-Tag
I anticipate conflicts between robots.txt and the X-Robots-Tag, because the disallow and noindex directives are not supposed to be combined: as I understand it, a crawler that is disallowed from fetching a page never sees that page's noindex header. (Is there any advantage to using X-Robots-Tag instead of robots.txt?) How do I get around this, and what do you recommend?
End goal
As mentioned, the main goal is to explicitly tell all older robots (those still relying on robots.txt) and all newer ones (those honoring X-Robots-Tag), except Google and Bing, not to show any of my pages in their search results (which I assume is what noindex does). I understand that not all of them will comply, but I want every crawler except Google and Bing to be told not to show my pages in search results. To this end, I am looking for robots.txt and X-Robots-Tag code that works together without conflict for the HTML sites I am building.