Google soft 404 error on index page that is working fine

A friend of mine has been having trouble getting her site indexed by Google and asked me to have a look, but this isn't something I really know much about, so I was hoping for some assistance.

Looking at her Search Console, the Google crawl report shows a soft-404 error on the index page. I have marked this as fixed a few times, because the site looks fine to me, but the error keeps coming back.

If I fetch the site as Google, it seems to work fine, although it shows the mobile version instead of the desktop one.

It also keeps giving a recurring 404 for the page http://www.smeyan.com/new-page, which doesn't exist anywhere I can see, including the server files and the sitemaps.

Here is what I know about this site:

It used to be a Wix site and was moved to a HostGator shared server 2-3 months ago.

It's using JavaScript/jQuery .load() to pull page content into the index.html template (see the sketch after this list).

It has two sitemaps, one for the URLs and one for both URLs and images: http://www.smeyan.com/sitemap_url.xml and http://www.smeyan.com/sitemap.xml

It has been about two months since it was submitted for indexing, and Google has not indexed any of the content; searching for site:www.smeyan.com shows some old pages from the Wix server, although Search Console says it has 172 images indexed.

It has www. set as the preferred domain in Search Console.
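
For context on the .load() point above: content pulled in this way is invisible to any crawler that doesn't execute JavaScript, which sees only the empty template and may classify the page as a soft 404. A minimal sketch of the pattern, assuming a hypothetical fragment file name (the real site's structure may differ):

    // index.html ships as a nearly empty shell; jQuery fills it in
    // after load, so a non-JS crawler sees only the shell.
    $(document).ready(function () {
      // 'fragments/home.html' is a placeholder path, not the site's real one.
      $('#content').load('fragments/home.html', function (response, status) {
        if (status === 'error') {
          $('#content').text('Failed to load page content.');
        }
      });
    });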

Has anyone experienced this, and does anyone have a direction for a fix?

Fireproof answered 23/12, 2017 at 23:19 Comment(2)
Just a guess - can it be related to the fact that JS rendering is used? Was it previously rendered via JavaScript too? Related article: elephate.com/blog/javascript-seo-experiment – Lagging
Some quick guesses - this may be a crawl budget issue, as Wix sites can be slow (effectively a 'timeout' when Google tries to crawl the site). Or it could be a redirection issue - what exact HTTP responses come back if you use Fiddler or Postman to request the site? – Ataractic

How long a lifetime is set in this site's Cache-Control header? If it's long, you should use Google Removals for the obsolete snippets and cache. I simulated a Google visit to your webpage: correct 404 return code, correct headers. So: report the "not found" pages via Google Removals, then request a Googlebot visit, keep calm, and wait for a reaction.

BTW: for permanently removed content, use 410 Gone for Google, or report it via Removals: https://support.google.com/webmasters/answer/1663419?hl=en
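
Since the question mentions a HostGator shared server (Apache), here is a sketch of how the phantom /new-page URL could be answered with a 410 via .htaccess; the path comes from the question, the rest is an assumption about the setup:

    # .htaccess - tell crawlers this URL is deliberately, permanently gone
    Redirect gone /new-page

"Redirect gone" is mod_alias shorthand for a 410 response, which is a stronger removal signal to Google than a plain 404.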

Replete answered 1/1, 2018 at 16:19 Comment(2)
Google hasn't indexed any pages to be removed. It has been a little over 3 months since the site was requested for indexing the first time. Search Console crawl rates show a consistent 3-4 pages crawled per day. – Fireproof
If the site has moved from Wix, Google can detect duplicated content, and duplicated content is often filtered. Can you send a link to your content on Wix? Is the Wix version deindexed currently? Also: please read my second answer. It's an SEO issue. – Replete

I checked your site with Tor Browser, which has scripts DISABLED. You should provide the content of your site inside a <noscript> tag as well. It doesn't have to be beautiful, but it should be visible to bots: <a href="..."></a>, <img> etc., and TEXT. Without it, the site is NOT OPTIMIZED for search bots. Read about SEO. Content listed in a sitemap may never be indexed if it is never linked.

Your webpage probably also doesn't meet the requirements for screen readers (for blind people).

Note: the image with the "SMEYAN" caption is visible on the webpage and is indexed.

The second image on the webpage (in the source) is also indexed: <img class="gallery-full-image" src="./galleries/home_gallery/smeyan_home-1.jpg" />

The menu also doesn't work without scripts.

That part, at least, looks well implemented.

Please use the <noscript> element and implement a version for blind people and for no-script browsers (no scripts required, and alt attributes provided for images). You can test it by disabling scripts, or with the NoScript extension for Firefox.
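
A minimal sketch of the <noscript> idea; the links and text are placeholders, and only the image path is taken from the markup quoted above:

    <!-- The scripted version fills #content dynamically; the noscript
         block gives crawlers and script-less browsers real links,
         an image with alt text, and plain text. -->
    <div id="content"></div>
    <noscript>
      <nav>
        <a href="/home.html">Home</a>
        <a href="/galleries.html">Galleries</a>
      </nav>
      <img src="./galleries/home_gallery/smeyan_home-1.jpg"
           alt="SMEYAN home gallery image" />
      <p>A plain-text description of the page content goes here.</p>
    </noscript>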

BTW, you should build with HTML and CSS (including animations) and use JS ONLY where it is needed. Or use the <noscript> method.

Replete answered 3/1, 2018 at 19:14 Comment(1)
Also use the W3C validator - you have errors in your HTML. Check your CSS with the Jigsaw validator too. validator.w3.org/nu/?doc=http%3A%2F%2Fwww.smeyan.com%2F – Replete

The only download error I saw while using Chrome's Inspect function pertains to a SCRIPT tag with a Facebook URL as its source (src) file.

This is the error as reported by Inspect.

This is the SCRIPT tag that caused the error.

I am not sure that this is the cause of the recurring 404 error, but it is an issue that needs attention on this website.

Spermophile answered 3/1, 2018 at 15:57 Comment(1)
It is an issue worth fixing, but many pages with broken links that 404 are indexed properly. The problem is probably duplicated content. Google may not show the content now, but part of Google's servers can still have it saved. – Replete

Googlebot currently uses a web rendering service (WRS) based on the old Chrome 41 (M41), so it may fail where current browsers succeed.
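
For example, a hypothetical snippet like the following runs fine in a current browser but breaks in Chrome 41, so WRS may render the page without its content:

    // fetch() shipped in Chrome 42 and arrow functions in Chrome 45,
    // so Chrome 41 fails here and the content is never inserted.
    fetch('/fragments/home.html')
      .then(res => res.text())
      .then(html => {
        document.getElementById('content').innerHTML = html;
      });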

To learn how Googlebot works, read this.

Add code to the page that catches JavaScript errors and prints them into the page itself, so you can see the real error. You can then view it with the live test of the URL Inspection tool in Google Search Console; it will show in the More info tab.
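
A minimal sketch of one way to do this, assuming a plain window.onerror handler (not necessarily the exact code this answer originally used):

    // Write any uncaught JS error into the page so the rendered
    // snapshot (and the URL Inspection live test) shows what went
    // wrong inside WRS.
    window.onerror = function (message, source, line, column) {
      var log = document.createElement('pre');
      log.textContent = 'JS error: ' + message +
        ' at ' + source + ':' + line + ':' + column;
      (document.body || document.documentElement).appendChild(log);
      return false; // let the default error handling run too
    };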

Note: if the bot gets a 301 code, or if the page has too little significant content, it will return a soft 404 error and won't show a preview or any other error.

Olimpia answered 31/3, 2019 at 23:52 Comment(0)
