How to avoid Google indexing my site before development is completed
I have a new site on a new domain. It will take about 2 months to complete development, and only then will it go live. I want Google to start crawling and indexing the site only after that. So the question is: how do I "shut off" Google indexing for these 2 months before going live? Right now I plan to use this index.html:

<html>
<head>
<meta name="googlebot" content="noindex">
</head>
<body>UNDER CONSTRUCTION</body>
</html>

I will start development in index.php; when it is done I will remove index.html, and Googlebot will then start indexing from index.php.

Don't know if this sounds like a good plan.
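One note on the plan: the same noindex signal can also be sent as an HTTP response header (X-Robots-Tag), which is handy if any index.php pages become reachable while development is still in progress. This is not part of the question's setup, just a minimal sketch assuming a standard PHP environment, placed at the very top of such a page:

<?php
// Sketch only: send a noindex signal as an HTTP header while the site is under development.
// header() must run before any output is sent; remove this when the site goes live.
header('X-Robots-Tag: noindex, nofollow');

Like the meta tag, this only works if crawlers are actually allowed to fetch the page.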

Iciness answered 14/9, 2016 at 16:29 Comment(2)
You might want to google robots.txt files – Tsunami
Why not just password-protect your folder so nobody can access the site without the proper credentials? Web crawlers cannot access pw-protected pages either. Contact your hosting provider. Or just google password protection with htaccess. – Croatian
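For the password-protection route mentioned above, a minimal sketch of HTTP Basic Auth in an Apache .htaccess file could look like this; the .htpasswd path and the user name are placeholders, not values from the question:

# Assumed password file location; create it with: htpasswd -c /home/example/.htpasswd devuser
AuthType Basic
AuthName "Site under construction"
AuthUserFile /home/example/.htpasswd
Require valid-user

Crawlers then get a 401 for every page, so nothing can be fetched or indexed; removing these rules at launch opens the site again.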

You can create a robots.txt file in your site's root directory and add the following to it:

User-agent: *
Disallow: /

So when a bot reaches your site it first checks the robots.txt file, and if a Disallow rule is there it will not crawl your pages. Read more about it here

Pectase answered 14/9, 2016 at 16:36 Comment(8)
Why was this downvoted? I use it and it works for me. Randomly downvoting answers without commenting won't help anyone, especially if the answer is correct :/ – Pectase
robots.txt stops crawling, not indexing. The site could still be indexed, so this is not a solution. – Hickok
Ok.. Try adding robots.txt with a Disallow rule and searching site:yourpage. The result will show a message about robots.txt and the pages won't be indexed.. Please get your concepts right :) – Pectase
Also @Hickok, can you please explain how a page would be indexed if it is not crawled? – Pectase
Here is a resource for you all about robots.txt, with a direct link to a relevant FAQ > developers.google.com/webmasters/control-crawl-index/docs/… – Hickok
And here is a video from Google which will help you understand how a page can be indexed, but not crawled: youtube.com/watch?v=KBdEwpRQRD0 – Hickok
And here is Google saying > ..... If the page is blocked by a robots.txt file, the crawler will never see the noindex tag, and the page can still appear in search results, for example if other pages link to it. support.google.com/webmasters/answer/93710?hl=en Blocking crawling does not block indexing.Hickok
Technically, if the site is new and under construction, the answer makes a lot of sense, as there are no external links to the site. On Disallow in robots.txt: blocking a page from being crawled will typically prevent it from being indexed, as search engines can only index pages they know about. While a page may still be indexed because of links pointing to it from other pages, Google will aim to make it less visible in search results. – Biotic
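Following the point made in the comments above, that a page must be crawlable for a noindex directive to be seen, here is a minimal sketch of serving noindex site-wide as an HTTP header from Apache, used instead of a robots.txt Disallow. It assumes mod_headers is enabled and .htaccess overrides are allowed:

# Assumes Apache with mod_headers enabled; remove this rule when the site goes live
Header set X-Robots-Tag "noindex, nofollow"

Crawlers can still fetch the pages, see the noindex, and keep them out of the index; a robots.txt Disallow would hide that signal from them.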
