Removing Dynamic "noindex" URLs from Index

BeTheBoss

Six months ago my client's site was overhauled, and the user-generated search pages had an index tag on them. I switched that to noindex, but not fast enough to keep hundreds of pages from being indexed in Google.

It's been months since switching to the noindex tag and the pages are still indexed. What would you recommend? Google crawls my site daily - but never the pages that I want removed from the index.

I am trying to avoid submitting hundreds of these dynamic URLs to the removal tool in Webmaster Tools. Suggestions?

Dr-Pete

Hooray! Usually, I just give my advice and then run away, so it's always nice to hear I was actually right about something. Seriously, glad you got it sorted out.

BeTheBoss

Just a follow up to your suggestion.

I created sitemaps for the pages I want removed using the importXML function in Google Spreadsheets, which saved a lot of time.

It took a couple of weeks, but all of the pages, and similar pages, have successfully been removed from the index. That includes the similar pages I didn't get a chance to put in the sitemap yet (importXML limits the results to 100).

Your suggestion worked!

BeTheBoss

I can't 404 dynamic search pages.

BeTheBoss

There is a mix of search pages and old mobile pages.

For the search pages, I've been testing having the canonical point to the default search page. I've seen a slight drop in these pages, but I guess I just have to be more patient.

For the other pages, the path is no longer there, like you were mentioning. I like the idea of setting up the XML sitemap; I never even thought of making a bad/indexed page sitemap. I will give that a shot! Thankfully this will be a quick job with the importXML function in Google Spreadsheets. Great tip, hopefully it'll work.

Dr-Pete

Is there a crawl path to them currently? One issue I see a lot is that a bunch of pages get indexed, the path is found and cut off, NOINDEX (canonical, 301, etc.) is added, but then the pages never get re-crawled. Since they don't get recrawled, the page-level directive never gets honored.

If there's a URL parameter involved, you could use parameter-handling in GWT - it's not a perfect solution, but it sometimes seems to work without a re-crawl.

The other option would be to create a new XML sitemap with all of the bad/indexed URLs. This may push Google to re-crawl them and then see the tags to deindex. It's a bit safer than re-opening the crawl paths.
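As a rough illustration of the sitemap idea, here is a minimal Python sketch that turns a list of bad/indexed URLs into a sitemap file body. The function name `build_sitemap` and the example.com URLs are made up for illustration; the only fixed part is the standard sitemaps.org namespace.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML with one <url><loc> entry per URL (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL text.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical URLs standing in for the indexed pages that need a re-crawl.
stale_urls = [
    "https://www.example.com/search?q=red+widgets",
    "https://www.example.com/m/old-mobile-page",
]
sitemap_xml = build_sitemap(stale_urls)
print(sitemap_xml)
```

The resulting text would be saved as its own file (separate from the main sitemap) and submitted in Webmaster Tools, so it is easy to drop once the pages are gone from the index.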

If they are being crawled and Google is just ignoring the NOINDEX for some reason, I'd try to 301 or canonical those pages to a primary search page, if that's feasible (probably canonical, since you don't want the users to 301). Sometimes, if a signal isn't working for that long, you just have to shake Google and try a different signal. Even following their exact recommendations, it rarely works as planned at large scale.
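To make the canonical option concrete, here is a small Python sketch that builds the tag a dynamic search page could emit. It assumes the primary search page is simply the same path with the query string removed, which may not match your site; `canonical_link_tag` and the example.com URL are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(page_url):
    """Build a canonical <link> tag pointing at page_url minus query and fragment.

    Assumption: the primary search page lives at the same path with no
    query string (for example /search).
    """
    parts = urlsplit(page_url)
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


# Hypothetical user-generated search URL.
tag = canonical_link_tag("https://www.example.com/search?q=red+widgets&page=3")
print(tag)
```

The tag goes in the head of each dynamic search page, so every variation points back at the one page you want to keep.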

MagicDude4Eva

Don't use GWMT's removal tool to remove URLs that should not be in the index (unless they expose sensitive information). Best practice is to exclude them in robots.txt and to also ensure that the pages either 404 or have a noindex,noarchive tag.
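One way to confirm that the tag is really on the pages is to parse the HTML they return. This is only a sketch using Python's standard `html.parser`; the class and function names are invented for the example, and the sample markup is a stand-in for a real page fetched from the site.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex,noarchive".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html_text):
    """Return the set of robots meta directives found in html_text."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


# Stand-in markup; in practice this would be the fetched page source.
sample = '<html><head><meta name="robots" content="noindex,noarchive"></head></html>'
directives = robots_directives(sample)
print("noindex" in directives, "noarchive" in directives)
```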

benjaminspak

Change the site structure and let the pages 404. Google will deindex them if they are not being linked to.

AgentsofValue

You could try adding the pages you want to remove to your robots.txt file. Since you're not linking to them, and it's very unlikely that Googlebot will index those pages naturally now, this might be a better way of telling it which pages to explicitly not index.

I'm not really sure how quickly this will trigger Google to remove those pages from the index, but they do reference robots.txt on the actual "Remove URLs" page of WMT: "Use **robots.txt** to specify how search engines should crawl your site, or request **removal** of URLs from Google's search results ..."

For that technique, you'd want to add something like this for all of the pages you want to remove:

User-agent: *
Disallow: /oldpage1toremove.php

That should work. If it doesn't, then I would probably just submit the requests through the "Remove URLs" tool.
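Before relying on a rule like that, it can be sanity-checked locally. The sketch below uses Python's standard `urllib.robotparser`, whose matching may not be identical to Googlebot's; the example.com host and `index.php` are placeholders, and the `User-agent: *` line is assumed so the rule belongs to a group.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: the Disallow line from above inside a
# catch-all user-agent group.
robots_txt = """\
User-agent: *
Disallow: /oldpage1toremove.php
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# True when the rule stops a crawler from fetching the old page.
blocked = not parser.can_fetch(
    "Googlebot", "https://www.example.com/oldpage1toremove.php"
)
# Other pages should remain fetchable.
allowed = parser.can_fetch("Googlebot", "https://www.example.com/index.php")
print(blocked, allowed)
```

This only tells you whether a crawler is permitted to fetch each URL under those rules, not whether the URL has been dropped from the index.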
