Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Your browser does not seem to support JavaScript. As a result, your viewing experience will be diminished, and you have been placed in read-only mode.

Please download a browser that supports JavaScript, or enable it if it's disabled (i.e. NoScript).

Moz Q&A is closed.

After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

Robots.txt Sitemap with Relative Path

Technical SEO

9186

MRCSearch last edited by

Hi Everyone,

In robots.txt, can the sitemap be indicated with a relative path? I'm trying to roll out a robots file to ~200 websites, and they all have the same relative path for a sitemap but each is hosted on its own domain.

Basically I'm trying to avoid needing to create 200 different robots.txt files just to change the domain. If I do need to do that, though, is there an easier way than just trudging through it?
1 Reply Last reply
Reply Quote 0
ClickConsult last edited by

Hi Nicholas,

Unfortunately not. The sitemap reference has to be absolute. (You can confirm this by using the crawler access tool within WMT's)

I'd suggest that you create a PHP script to create a robots.txt file with the correct domain rather than having to do it manually.

Good luck!
1 Reply Last reply
Reply Quote 1

Browse Questions

View

From

Sorted by

With category

Explore more categories

Related Questions

I have two robots.txt pages for www and non-www version. Will that be a problem?

There are two robots.txt pages. One for www version and another for non-www version though I have moved to the non-www version.
Technical SEO | | ramb

0
Crawl solutions for landing pages that don't contain a robots.txt file?

My site (www.nomader.com) is currently built on Instapage, which does not offer the ability to add a robots.txt file. I plan to migrate to a Shopify site in the coming months, but for now the Instapage site is my primary website. In the interim, would you suggest that I manually request a Google crawl through the search console tool? If so, how often? Any other suggestions for countering this Meta Noindex issue?
Technical SEO | | Nomader

1
Robots.txt Tester - syntax not understood

I've looked in the robots.txt Tester and I can see 3 warnings: There is a 'syntax not understood' warning for each of these. XML Sitemaps:
https://www.pkeducation.co.uk/post-sitemap.xml
https://www.pkeducation.co.uk/sitemap_index.xml How do I fix or reformat these to remove the warnings? Many thanks in advance.
Jim
Technical SEO | | JamesHancocks1

0
Disallow wildcard match in Robots.txt

This is in my robots.txt file, does anyone know what this is supposed to accomplish, it doesn't appear to be blocking URLs with question marks Disallow: /?crawler=1
Disallow: /?mobile=1 Thank you
Technical SEO | | AmandaBridge

0
Image Sitemap

I currently use a program to create our sitemap (xml). It doesn't offer creating an mage sitemaps. Can someone suggest a program that would create an image sitemap? Thanks.
Technical SEO | | Kdruckenbrod

0
Oh no googlebot can not access my robots.txt file

I just receive a n error message from google webmaster Wonder it was something to do with Yoast plugin. Could somebody help me with troubleshooting this? Here's original message Over the last 24 hours, Googlebot encountered 189 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%. Recommended action If the site error rate is 100%: Using a web browser, attempt to access http://www.soobumimphotography.com//robots.txt. If you are able to access it from your browser, then your site may be configured to deny access to googlebot. Check the configuration of your firewall and site to ensure that you are not denying access to googlebot. If your robots.txt is a static page, verify that your web service has proper permissions to access the file. If your robots.txt is dynamically generated, verify that the scripts that generate the robots.txt are properly configured and have permission to run. Check the logs for your website to see if your scripts are failing, and if so attempt to diagnose the cause of the failure. If the site error rate is less than 100%: Using Webmaster Tools, find a day with a high error rate and examine the logs for your web server for that day. Look for errors accessing robots.txt in the logs for that day and fix the causes of those errors. The most likely explanation is that your site is overloaded. Contact your hosting provider and discuss reconfiguring your web server or adding more resources to your website. After you think you've fixed the problem, use Fetch as Google to fetch http://www.soobumimphotography.com//robots.txt to verify that Googlebot can properly access your site.
Technical SEO | | BistosAmerica

0
OK to block /js/ folder using robots.txt?

I know Matt Cutts suggestions we allow bots to crawl css and javascript folders (http://www.youtube.com/watch?v=PNEipHjsEPU) But what if you have lots and lots of JS and you dont want to waste precious crawl resources? Also, as we update and improve the javascript on our site, we iterate the version number ?v=1.1... 1.2... 1.3... etc. And the legacy versions show up in Google Webmaster Tools as 404s. For example: http://www.discoverafrica.com/js/global_functions.js?v=1.1
http://www.discoverafrica.com/js/jquery.cookie.js?v=1.1
http://www.discoverafrica.com/js/global.js?v=1.2
http://www.discoverafrica.com/js/jquery.validate.min.js?v=1.1
http://www.discoverafrica.com/js/json2.js?v=1.1 Wouldn't it just be easier to prevent Googlebot from crawling the js folder altogether? Isn't that what robots.txt was made for? Just to be clear - we are NOT doing any sneaky redirects or other dodgy javascript hacks. We're just trying to power our content and UX elegantly with javascript. What do you guys say: Obey Matt? Or run the javascript gauntlet?
Technical SEO | | AndreVanKets

0
Robots.txt and canonical tag

In the SEOmoz post - http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts, it's being said - If you have a robots.txt disallow in place for a page, the canonical tag will never be seen. Does it so happen that if a page is disallowed by robots.txt, spiders DO NOT read the html code ?
Technical SEO | | seoug_2005

0

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Moz Q&A is closed.

Robots.txt Sitemap with Relative Path

Browse Questions

Explore more categories

Related Questions

I have two robots.txt pages for www and non-www version. Will that be a problem?

Crawl solutions for landing pages that don't contain a robots.txt file?

Robots.txt Tester - syntax not understood

Disallow wildcard match in Robots.txt

Image Sitemap

Oh no googlebot can not access my robots.txt file

OK to block /js/ folder using robots.txt?

Robots.txt and canonical tag

Products

Moz Solutions

Free SEO Tools

Resources

About Moz

Why Moz

Get Involved