Website cannot be crawled

threecounties

I have received the following message from Moz for a few of our websites now:

Our crawler was not able to access the robots.txt file on your site. This often occurs because of a server error from the robots.txt. Although this may have been caused by a temporary outage, we recommend making sure your robots.txt file is accessible and that your network and server are working correctly. Typically errors like this should be investigated and fixed by the site webmaster.

I have spoken with our webmaster and they have advised the below:

The robots.txt file is definitely there on all pages, and Google is able to crawl for these files. Moz, however, has difficulty finding the files when a particular redirect is in place.

For example, the page currently redirects from threecounties.co.uk/ to https://www.threecounties.co.uk/. When this happens, the Moz crawler cannot find the robots.txt on the first URL, and this generates the reports you have been receiving. From what I understand, this is a flaw with the Moz software and not something that we could fix from our end.

Going forward, something we could do is remove these rewrite rules to www., but these are useful redirects and removing them would likely have SEO implications.

Has anyone else had this issue, and is there anything we can do to rectify it, or should we leave it as is?
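One way to see what a crawler meets on each variant of the domain is to request robots.txt on every scheme/host combination and follow the redirects by hand. The following is only a rough sketch using the Python standard library: the helper names (`robots_urls`, `fetch_once`, `trace`) and the `robots-check-sketch` User-Agent are made up for illustration, and this is not how the Moz crawler itself works.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def robots_urls(domain):
    """Return the robots.txt URL for each scheme/host variant of a bare domain."""
    hosts = [domain, "www." + domain]
    return [f"{scheme}://{host}/robots.txt"
            for scheme in ("http", "https") for host in hosts]


def fetch_once(url, user_agent="robots-check-sketch", timeout=10):
    """Request a URL without following redirects.

    Returns (status code, Location header or None).
    The User-Agent value is a placeholder, not a real crawler string.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/", headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace(url, fetch=fetch_once, max_hops=5):
    """Follow redirects manually and return the list of (url, status) hops."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in (301, 302, 303, 307, 308) or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling `trace(u)` for each `u` in `robots_urls("threecounties.co.uk")` lists every hop with its status code. A chain that ends in anything other than 200, or a connection error on one of the variants, would be worth raising with the host before assuming the crawler is at fault.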

Roman-Delcarmen

OK, I ran a quick test of your robots.txt file and it looks fine:
https://www.threecounties.co.uk/robots.txt

Then I ran a test with https://httpstatus.io/ to check the status code of your robots.txt file, and it returned a 200 status code (so it's fine).

Also, you need to make sure that your robots.txt file is accessible to Rogerbot (the Moz crawler). These days, hosting providers have become very strict with third-party crawlers, including Moz, Majestic SEO, Semrush and Ahrefs.
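If you want a rough check of whether the server treats crawler user agents differently, you could request the same URL with different User-Agent headers and compare the status codes. This is just a sketch with the Python standard library: the function names are hypothetical, and the strings in `AGENTS` are simplified stand-ins, not the exact headers a browser or Rogerbot sends. A host may also filter by IP address, which this cannot detect.

```python
import urllib.error
import urllib.request

# Simplified stand-in User-Agent values (assumptions, not the real full strings).
AGENTS = {
    "browser": "Mozilla/5.0",
    "rogerbot": "rogerbot",
}


def status_for(url, user_agent, timeout=10):
    """Return the final HTTP status when requesting url with the given User-Agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still available.
        return err.code


def blocked_agents(url, agents, fetch=status_for):
    """Return {agent name: status} for every agent that did not get a 200."""
    blocked = {}
    for name, user_agent in agents.items():
        code = fetch(url, user_agent)
        if code != 200:
            blocked[name] = code
    return blocked
```

Running `blocked_agents("https://www.threecounties.co.uk/robots.txt", AGENTS)` and getting an empty dict back would suggest the server is not filtering on the User-Agent header alone. A 403 or 5xx for the crawler entry only would point at a firewall or hosting rule.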

Here you can find all the possible sources of the problem and the recommended solutions:
https://outdoorsrank.com/help/guides/moz-pro-overview/site-crawl/unable-to-crawl

Regards
