Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Block session id URLs with robots.txt
-
Hi,
I would like to block all URLs with the parameter '?filter=' from being crawled by including them in the robots.txt.
Which directive should I use:
User-agent: *
Disallow: ?filter=
or
User-agent: *
Disallow: /?filter=
In other words, is the forward slash in the beginning of the disallow directive necessary?
Thanks!
-
Hi Martijn,
Thanks for the answer. Regarding the forward slash in the beginning, is it necessary to use this?
In Zalando's robots.txt, for example, you can see that they don't use it for a lot of filters.
-
Uhh, that's not what the requester is looking for, and it could actually cause tons of problems if you applied it to a site you're not familiar with. I would always go with the most limiting robots.txt rule that you can, and in this case I would go with: Disallow: /?filter=
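To make the leading-slash question concrete, here is a minimal Python sketch of how a Disallow pattern is compared against a URL path. It assumes Google-style matching as described in Google's robots.txt documentation and RFC 9309: the pattern is compared from the start of the path (query string included), `*` matches any run of characters, and a trailing `$` anchors the end. The function `rule_matches` and the example paths are hypothetical, not part of any library, and other crawlers may be more or less lenient.

```python
import re


def rule_matches(pattern: str, url_path: str) -> bool:
    """Return True if a robots.txt path pattern matches url_path.

    Assumption: Google-style semantics (prefix match from the start of
    the path, '*' = any characters, trailing '$' = end of URL).
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the start of the string, like a prefix rule.
    return re.match(regex, url_path) is not None


# Hypothetical paths, for illustration only.
print(rule_matches("/?filter=", "/?filter=red"))         # True
print(rule_matches("/?filter=", "/shoes/?filter=red"))   # False
print(rule_matches("/*?filter=", "/shoes/?filter=red"))  # True
print(rule_matches("?filter=", "/shoes/?filter=red"))    # False
```

Under these assumptions, a pattern without the leading slash cannot match, because every path starts with "/". `/?filter=` only matches when the parameter sits directly on the root URL, so reaching filters on deeper paths needs a wildcard, as in `/*?filter=`. Note that `?filter=` only appears when the parameter comes first in the query string; a URL such as `/shoes/?sort=price&filter=red` would need an additional `/*&filter=` rule.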
-
Hi,
The following should suffice, as it will block any URL with a "?" in it:
User-agent: *
Disallow: /*?
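As a rough illustration of how broad `/*?` is compared with a rule scoped to the filter parameter, the sketch below lists which paths each pattern would block. It assumes Google-style wildcard matching (`*` matches any characters, rules match from the start of the path, a trailing `$` anchors the end); `to_regex`, `blocked_paths` and the sample paths are hypothetical and only serve the example.

```python
import re


def to_regex(pattern: str) -> "re.Pattern[str]":
    """Compile a robots.txt path pattern, assuming Google-style wildcards."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Literal pieces are escaped; each '*' becomes '.*'.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def blocked_paths(pattern: str, paths: list[str]) -> list[str]:
    """Return the paths that the given Disallow pattern would block."""
    compiled = to_regex(pattern)
    # .match() anchors at the start of the path, like a prefix rule.
    return [p for p in paths if compiled.match(p)]


# Hypothetical paths, for illustration only.
paths = ["/shoes/", "/shoes/?filter=red", "/shoes/?page=2", "/search?q=boots"]

print(blocked_paths("/*?", paths))
# ['/shoes/?filter=red', '/shoes/?page=2', '/search?q=boots']
print(blocked_paths("/*?filter=", paths))
# ['/shoes/?filter=red']
```

In this sketch `/*?` also blocks pagination and search URLs, which is why it can cause problems on a site that relies on other query parameters being crawled.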