Is blocking RSS Feeds with robots.txt necessary?

nicole.healthline

Is it necessary to block an RSS feed with robots.txt?

It seems feeds are automatically left out of the web search results (http://googlewebmastercentral.blogspot.com/2007/12/taking-feeds-out-of-our-web-search.html)

And Google says here that it's important not to block RSS feeds

(http://googlewebmastercentral.blogspot.com/2009/10/using-rssatom-feeds-to-discover-new.html)

I'm just checking!

DaveSottimano

Hi Michelleh,

There's no need to block RSS feeds, as Googlebot uses them for discovery. Here's a quirky fact: RSS feeds actually help combat scraper sites, because they contain absolute URLs that clearly link back to your site. Scrapers are going to take your content anyhow, so let's hope they choose the RSS feed!

How does Google know it's an RSS feed? Let's look at some of the markup on RSS pages:

<rss <span="">version</rss>="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel></channel>

Either this or something similar will appear at the top of the markup that defines an XML/RSS/Atom/XSL document, and it is easily read by Google. I'm not going to get too far into it, but you can start reading more here:

http://en.wikipedia.org/wiki/RSS
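To make the root-element point concrete, here is a minimal Python sketch that classifies a document by its root tag. It is only an illustration of the idea and not a description of how Google actually does it. The function name `detect_feed_type` is made up for this example, and it only recognises RSS 2.0 style `<rss>` roots and Atom `<feed>` roots (RDF-based RSS 1.0 would come back as unknown).

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds (also declared on the <rss> example above).
ATOM_NS = "http://www.w3.org/2005/Atom"


def detect_feed_type(xml_text: str) -> str:
    """Return 'rss', 'atom', or 'unknown' based on the root element.

    Hypothetical helper for illustration only.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed XML, so it cannot be a feed we recognise.
        return "unknown"
    if root.tag == "rss":
        return "rss"
    # ElementTree writes namespaced tags as "{namespace}localname".
    if root.tag == f"{{{ATOM_NS}}}feed":
        return "atom"
    return "unknown"


sample = (
    '<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">'
    "<channel></channel></rss>"
)
print(detect_feed_type(sample))
```

A crawler could run a check like this on the first response body it gets for a URL, which is why a feed does not need any special hint in robots.txt to be recognised as a feed.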

Does Google index the XML file type? **Yes.**

http://www.google.co.uk/search?hl=en&source=hp&biw=1366&bih=667&q=inurl%3Asitemap.xml&aq=f&aqi=&aql=&oq=

Does that help?

nicole.healthline

How do they know it is an RSS feed? Does Google not index the XML file type?

Thos003

If Google says not to block it, then don't block it. They may not index the RSS feed, but they can still crawl it.
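If you want to check whether an existing robots.txt is blocking a feed by accident, a rough sketch using Python's standard `urllib.robotparser` module looks like this. The domain `www.example.com`, the `/feed/` and `/cart/` paths, and the helper name `can_crawl` are all assumptions for the example. The standard-library parser also does not reproduce every detail of Googlebot's own matching, so treat the result as a sanity check only.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rules: the first set blocks the feed, the second leaves it open.
blocking_rules = "User-agent: *\nDisallow: /feed/\n"
open_rules = "User-agent: *\nDisallow: /cart/\n"
feed_url = "https://www.example.com/feed/"

print(can_crawl(blocking_rules, feed_url))  # False: the feed is disallowed
print(can_crawl(open_rules, feed_url))      # True: the feed can be crawled
```

If the first case matches your site, removing the `Disallow` line for the feed path is all it takes to let the feed be crawled again.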
