Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Robots.txt is blocking Wordpress Pages from Googlebot?
-
I have a robots.txt file on my server, which I did not develop, it was done by the web designer at the company before me. Then there is a word press plugin that generates a robots.txt file. How Do I unblock all the wordpress pages from googlebot?
-
Delete everything under the following directives and you should be good.
User-agent: Googlebot
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /?
Disallow: /*?
Disallow: /page/
As a rule of thumb, it's not a good idea to use wild cards in your robots.txt file - you may be excluding an entire folder inadvertently.
-
Here is my robots.txt from google webmaster tools. These are the pages that are being blocked and I am not sure which of these to get rid of in order to unblock blog posts from being searched.
http://ensoplastics.com/theblog/?cat=743
http://ensoplastics.com/theblog/?p=240
These category pages and blog posts are blocked so do I delete the /? ...I am new to SEO and web development so I am not sure why the developer of this robots.txt file would block pages and posts in wordpress. It seems to me like that is the reason why someone has a blog so it can be searched and get more exposure for SEO purposes.
Sitemap: http://www.ensobottles.com/blog/sitemap.xml
User-agent: Googlebot
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /?
Disallow: /*?
Disallow: /page/
User-agent: *Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /trackback
Disallow: /commentsDisallow: /feed
-
I'm not sure to what extent your website is being blocked with the robots.txt file but it's pretty easy to diagnose. You'll first need to identify and confirm that googlebot is being blocked by typing in your web browser ~> www.mywebsite.com/robots.txt
If you see an entry such as "User-agent: *" or "User-agent: googlebot" being used in conjunction with "Disallow" then you know your website is being blocked with the robots.txt file. Given your situation you'll need to go through a two step process.
First, go into your wordpress plugin page and deactivate the plugin which generates your robots.txt file. Second, login to the root folder of your server and look for the robots.txt file. Lastly, change "Disallow" to "Allow" and that should work but you'll need to confirm by typing in the robots URL again.
Given the limited information in your question I hope that helps. If you run into any more issues don't hesitate to post them here.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should I apply Canonical Links from my Landing Pages to Core Website Pages?
I am working on an SEO project for the website: https://wave.com.au/ There are some core website pages, which we want to target for organic traffic, like this one: https://wave.com.au/doctors/medical-specialties/anaesthetist-jobs/ Then we have basically have another version that is set up as a landing page and used for CPC campaigns. https://wave.com.au/anaesthetists/ Essentially, my question is should I apply canonical links from the landing page versions to the core website pages (especially if I know they are only utilising them for CPC campaigns) so as to push link equity/juice across? Here is the GA data from January 1 - April 30, 2019 (Behavior > Site Content > All Pages😞
Intermediate & Advanced SEO | | Wavelength_International0 -
URL structure - Page Path vs No Page Path
We are currently re building our URL structure for eccomerce websites. We have seen a lot of site removing the page path on product pages e.g. https://www.theiconic.co.nz/liberty-beach-blossom-shirt-680193.html versus what would normally be https://www.theiconic.co.nz/womens-clothing-tops/liberty-beach-blossom-shirt-680193.html Should we be removing the site page path for a product page to keep the url shorter or should we keep it? I can see that we would loose the hierarchy juice to a product page but not sure what is the right thing to do.
Intermediate & Advanced SEO | | Ashcastle0 -
What does Disallow: /french-wines/?* actually do - robots.txt
Hello Mozzers - Just wondering what this robots.txt instruction means: Disallow: /french-wines/?* Does it stop Googlebot crawling and indexing URLs in that "French Wines" folder - specifically the URLs that include a question mark? Would it stop the crawling of deeper folders - e.g. /french-wines/rhone-region/ that include a question mark in their URL? I think this has been done to block URLs containing query strings. Thanks, Luke
Intermediate & Advanced SEO | | McTaggart0 -
If Robots.txt have blocked an Image (Image URL) but the other page which can be indexed has this image, how is the image treated?
Hi MOZers, This probably is a dumb question but I have a case where the robots.tags has an image url blocked but this image is used on a page (lets call it Page A) which can be indexed. If the image on Page A has an Alt tags, then how is this information digested by crawlers? A) would Google totally ignore the image and the ALT tags information? OR B) Google would consider the ALT tags information? I am asking this because all the images on the website are blocked by robots.txt at the moment but I would really like website crawlers to crawl the alt tags information. Chances are that I will ask the webmaster to allow indexing of images too but I would like to understand what's happening currently. Looking forward to all your responses 🙂 Malika
Intermediate & Advanced SEO | | Malika11 -
Disallow URLs ENDING with certain values in robots.txt?
Is there any way to disallow URLs ending in a certain value? For example, if I have the following product page URL: http://website.com/category/product1, and I want to disallow /category/product1/review, /category/product2/review, etc. without disallowing the product pages themselves, is there any shortcut to do this, or must I disallow each gallery page individually?
Intermediate & Advanced SEO | | jmorehouse0 -
"noindex, follow" or "robots.txt" for thin content pages
Does anyone have any testing evidence what is better to use for pages with thin content, yet important pages to keep on a website? I am referring to content shared across multiple websites (such as e-commerce, real estate etc). Imagine a website with 300 high quality pages indexed and 5,000 thin product type pages, which are pages that would not generate relevant search traffic. Question goes: Does the interlinking value achieved by "noindex, follow" outweigh the negative of Google having to crawl all those "noindex" pages? With robots.txt one has Google's crawling focus on just the important pages that are indexed and that may give ranking a boost. Any experiments with insight to this would be great. I do get the story about "make the pages unique", "get customer reviews and comments" etc....but the above question is the important question here.
Intermediate & Advanced SEO | | khi50 -
Effect of Removing Footer Links In all Pages Except Home Page
Dear MOZ Community: In an effort to improve the user interface of our business website (a New York CIty commercial real estate agency) my designer eliminated a standardized footer containing links to about 20 pages. The new design maintains this footer on the home page, but all other pages (about 600 eliminate the footer). The new design does a very good job eliminating non essential items. Most of the changes remove or reduce the size of unnecessary design elements. The footer removal is the only change really effect the link structure. The new design is not launched yet. Hoping to receive some good advice from the MOZ community before proceeding My concern is that removing these links could have an adverse or unpredictable effect on ranking. Last Summer we launched a completely redesigned version of the site and our ranking collapsed for 3 months. However unlike the previous upgrade this modifications does not URL names, tags, text or any major element. Only major change is the footer removal. Some of the footer pages provide good (not critical) info for visitors. Note the footer will still appear on the home page but will be removed on the interior pages. Are we risking any detrimental ranking effect by removing this footer? Can we compensate by adding text links to these pages if the links from the footer are removed? Seems irregular to have a home page footer but no footer on the other pages. Are we inviting any downgrade, penalty, adverse SEO effect by implementing this? I very much like the new design but do not want to risk a fall in rank and traffic. Thanks for your input!!!
Intermediate & Advanced SEO | | Kingalan1
Alan0 -
Robots.txt: how to exclude sub-directories correctly?
Hello here, I am trying to figure out the correct way to tell SEs to crawls this: http://www.mysite.com/directory/ But not this: http://www.mysite.com/directory/sub-directory/ or this: http://www.mysite.com/directory/sub-directory2/sub-directory/... But with the fact I have thousands of sub-directories with almost infinite combinations, I can't put the following definitions in a manageable way: disallow: /directory/sub-directory/ disallow: /directory/sub-directory2/ disallow: /directory/sub-directory/sub-directory/ disallow: /directory/sub-directory2/subdirectory/ etc... I would end up having thousands of definitions to disallow all the possible sub-directory combinations. So, is the following way a correct, better and shorter way to define what I want above: allow: /directory/$ disallow: /directory/* Would the above work? Any thoughts are very welcome! Thank you in advance. Best, Fab.
Intermediate & Advanced SEO | | fablau1