Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Soft 404's from pages blocked by robots.txt -- cause for concern?
-
We're seeing soft 404 errors appear in our google webmaster tools section on pages that are blocked by robots.txt (our search result pages).
Should we be concerned? Is there anything we can do about this?
-
Me too. It was that video that helped to clear things up for me. Then I could see when to use robots.txt vs the noindex meta tag. It has made a big difference in how I manage sites that have large amounts of content that can be sorted in a huge number of ways.
-
Good stuff. I was always under the impression they still crawled them (otherwise, how would you know if the block was removed).
-
Take a look at
http://www.youtube.com/watch?v=KBdEwpRQRD0
to see what I am talking about.
Robots.txt does prevent crawling according to Matt Cutts.
-
Robots.txt prevents indexation, not crawling. The good news is that Googlebot stops crawling 404s.
-
Just a couple of under the hood things to check.
-
Are you sure your robots.txt is setup correctly. Check in GWT to see that Google is reading it.
-
This may be a timing issue. Errors take 30-60 days to drop out (as what I have seen) so did they show soft 404 and then you added them to robots.txt?
If that was the case, this may be a sequence issue. If Google finds a soft 404 (or some other error) then it comes back to spider and is not able to crawl the page due to robots.txt - it does not know what the current status of the page is so it may just leave the last status that it found.
-
I tend to see soft 404 for pages that you have a 301 redirect on where you have a many to one association. In other words, you have a bunch of pages that are 301ing to a single page. You may want to consider changing where some of the 301s redirect so that they going to a specific page vs an index page.
-
If you have a page in robots.txt - you do not want them in Google, here is what I would do. Show a 200 on that page but then put in the meta tags a noindex nofollow.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it"
Let Google spider it so that it can see the 200 code - you get rid of the soft 404 errors. Then toss in the noindex nofollow meta tags to have the page removed from the Google index. It sounds backwards that you have to let Google spider to get it to remove stuff, but it works it you walk through the logic.
Good luck!
-
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can subdomains hurt your primary domain's SEO?
Our primary website https://domain.com has a subdomain https://subDomain.domain.com and on that subdomain we have a jive-hosted community, with a few links to and fro. In GA they are set up as different properties but there are many SEO issues in the jive-hosted site, in which many different people can create content, delete content, comment, etc. There are issues related to how jive structures content, broken links, etc. My question is this: Aside from the SEO issues with the subdomain, can the performance of that subdomain negatively impact the SEO performance and rank of the primary domain? I've heard and read conflicting reports about this and it would be nice to hear from the MOZ community about options to resolve such issues if they exist. Thanks.
Intermediate & Advanced SEO | | BHeffernan1 -
Paginated Pages Which Shouldnt' Exist..
Hi I have paginated pages on a crawl which shouldn't be paginated: https://www.key.co.uk/en/key/chairs My crawl shows: <colgroup><col width="377"></colgroup>
Intermediate & Advanced SEO | | BeckyKey
| https://www.key.co.uk/en/key/chairs?page=2 |
| https://www.key.co.uk/en/key/chairs?page=3 |
| https://www.key.co.uk/en/key/chairs?page=4 |
| https://www.key.co.uk/en/key/chairs?page=5 |
| https://www.key.co.uk/en/key/chairs?page=6 |
| https://www.key.co.uk/en/key/chairs?page=7 |
| https://www.key.co.uk/en/key/chairs?page=8 |
| https://www.key.co.uk/en/key/chairs?page=9 |
| https://www.key.co.uk/en/key/chairs?page=10 |
| https://www.key.co.uk/en/key/chairs?page=11 |
| https://www.key.co.uk/en/key/chairs?page=12 |
| https://www.key.co.uk/en/key/chairs?page=13 |
| https://www.key.co.uk/en/key/chairs?page=14 |
| https://www.key.co.uk/en/key/chairs?page=15 |
| https://www.key.co.uk/en/key/chairs?page=16 |
| https://www.key.co.uk/en/key/chairs?page=17 | Where is this coming from? Thank you0 -
Password Protected Page(s) Indexed
Hi, I am wondering if my website can get a penalty if some password protected pages are showing up when I search on google: site:www.example.com/sub-group/pass-word-protected-page That shows that my password protected page was indexed either before or after adding the password protection. I've seen people suggest no indexing the page. Is that the best method to take care of this? What if we are planning on pushing the page live later on? All of these pages have no title tag, meta description, image alt text, etc. Should I add them for each page? I am wondering what is the best step, especially if we are planning on pushing the page(s) live. Thanks for any help!
Intermediate & Advanced SEO | | aua0 -
404's - Do they impact search ranking/how do we get rid of them?
Hi, We recently ran the Moz website crawl report and saw a number of 404 pages from our site come back. These were returned as "high priority" issues to fix. My question is, how do 404's impact search ranking? From what Google support tells me, 404's are "normal" and not a big deal to fix, but if they are "high priority" shouldn't we be doing something to remove them? Also, if I do want to remove the pages, how would I go about doing so? Is it enough to go into Webmaster tools and list it as a link no to crawl anymore or do we need to do work from the website development side as well? Here are a couple of examples that came back..these are articles that were previously posted but we decided to close out: http://loyalty360.org/loyalty-management/september-2011/let-me-guessyour-loyalty-program-isnt-working http://loyalty360.org/resources/article/mark-johnson-speaks-at-motivation-show Thanks!
Intermediate & Advanced SEO | | carlystemmer0 -
Is a 404, then a meta refresh 301 to the home page OK for SEO?
Hi Mozzers I have a client that had a lot of soft 404s that we wanted to tidy up. Basically everything was going to the homepage. I recommended they implement proper 404s with a custom 404 page, and 301 any that really should be redirected to another page. What they have actually done is implemented a 404 (without the custom 404 page) and then after a short delay 301 redirected to the homepage. I understand why they want to do this as they don't want to lose the traffic, but is this a problem with SEO and the index? Or will Google treat as a hard 404 anyway? Many thanks
Intermediate & Advanced SEO | | Chammy0 -
Chinese Sites Linking With Bizarre Keywords Creating 404's
Just ran a link profile, and have noticed for the first time many spammy Chinese sites linking to my site with spammy keywords such as "Buy Nike" or "Get Viagra". Making matters worse, they're linking to pages that are creating 404's. Can anybody explain what's going on, and what I can do?
Intermediate & Advanced SEO | | alrockn0 -
Do Q&A 's work for SEO
If I create a good community in my particular field on my SEO site and have a quality Q&A section like this etc (ripping of MOZ's idea here sorry, I hope it's ok) will the long term returns be worth the effort of creating and man ageing this. Is the user created content of as much use as I think it will be?
Intermediate & Advanced SEO | | mark_baird0 -
A few questions on Google's Structured Data Markup Helper...
I'm trying to go through my site and add microdata with the help of Google's Structured Data Markup Helper. I have a few questions that I have not been able to find an answer for. Here is the URL I am referring to: http://www.howlatthemoon.com/locations/location-chicago My company is a bar/club, with only 4 out of 13 locations serving food. Would you mark this up as a local business or a restaurant? It asks for "URL" above the ratings. Is this supposed to be the URL that ratings are on like Yelp or something? Or is it the URL for the page? Either way, neither of those URLs are on the page so I can't select them. If it is for Yelp should I link to it? How do I add reviews? Do they have to be on the page? If I make a group of days for Day of the Week for Opening hours, such as Mon-Thu, will that work out? I have events on this page. However, when I tried to do the markup for just the event it told me to use itemscope itemtype="http://schema.org/Event" on the body tag of the page. That is just a small part of the page, I'm not sure why I would put the event tag on the whole body? Any other tips would be much appreciated. Thanks!
Intermediate & Advanced SEO | | howlusa0