Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be viewable - we have locked both new posts and new replies.
Sitemap indexed pages dropping
-
About a month ago I noticed that the number of pages indexed from my sitemap is dropping. There are 134 pages in my sitemap and only 11 are indexed. It used to be 117 pages and it just died off quickly. I still seem to be getting consistent search traffic, but I'm not sure what's causing this. There are no warnings or manual actions in GWT that I can find.
-
Just wanted to update this. It took a month, but since I decided to completely remove the canonical tags and handle duplicate content with URL rewrites and 301 redirects instead, I now have 114 out of 149 pages indexed from my sitemap, which is much better. It dropped to 5 out of 149 at one point.
-
Hi Stephen,
Great that you've probably found the cause - this will absolutely cause mass de-indexation. I had a client a year ago canonicalise their entire site (two sites, actually) to the home page. All their rankings and indexed pages dropped off over a matter of about six days (we spotted the tag immediately but the fix went into a "queue" - ugh!).
The bad news is that it took them a long time to get properly re-indexed and regain their rankings (I am talking months, not weeks). Having said this, the sites were nearly brand new - they had very few backlinks and were both less than six months old. I do not believe that an older site would have had as much problem regaining rankings, but I can't be sure and I have only seen that situation take place first-hand once.
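For anyone wanting to catch this kind of mistake early, here is a minimal sketch of a canonical check using only the Python standard library. It assumes you already have each page's HTML as a string (fetching is left out), and the names `CanonicalFinder`, `canonical_of` and `points_elsewhere` are made up for illustration. It only flags pages whose canonical tag points at a different URL; it does not decide whether that is intentional.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"])


def canonical_of(html, page_url):
    """Return the absolute canonical URL declared in html, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        return None
    # Resolve relative hrefs against the page's own URL.
    return urljoin(page_url, finder.canonicals[0])


def points_elsewhere(html, page_url):
    """True when the page declares a canonical that is not itself."""
    target = canonical_of(html, page_url)
    # Ignore a trailing slash when comparing the two URLs.
    return target is not None and target.rstrip("/") != page_url.rstrip("/")
```

Running `points_elsewhere` over every URL in the sitemap would show at a glance whether a template is canonicalising many pages to one address, such as the home page.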
-
I may have found the issue today. Most of the articles are pulled from a database and I think I placed a wrong canonical tag on the page which screwed up everything. Does anyone know how long it takes before a fix like this will show?
-
That's a good catch, I fixed that. I do use that in WMT and it has been fine for the longest time. I guess it's not that big of an issue; my main concern was the pages being indexed. I was reading another Q&A thread and used the info: qualifier to check some of the pages, and all the ones I checked are indexed, which is more than the 11. I just don't understand why it dropped all of a sudden, and whether that number really means anything.
-
How are the indexed numbers looking in WMT today? I see 3,370 results for a site: search on the domain, but those can be iffy in terms of up to date accuracy: https://www.google.co.uk/search?q=site%3Agoautohub.com&oq=site%3Agoautohub.com&aqs=chrome..69i57j69i58.798j0j4&sourceid=chrome&es_sm=119&ie=UTF-8
Not that this should matter too much if you are submitting a sitemap through WMT, but your robots.txt file specifies sitemap.xml. There is a duplicate sitemap on that URL (http://goautohub.com/sitemap.xml) - are you using sitemap.php, which you mention here, in WMT? A .php file can be used for a sitemap, but I would update the robots.txt file to reflect the correct URL - http://i.imgur.com/uSB1P1g.png, whichever is meant to be right. I am not aware of problems with having duplicate sitemaps, as long as they are identical, but I'd use just one if it were me.
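If you want to check this sort of mismatch automatically, a rough sketch with Python's standard `urllib.robotparser` could look like the following. It assumes the robots.txt text has already been downloaded into a string and that you are on Python 3.8 or later (for `site_maps()`). The function names `declared_sitemaps` and `sitemap_mismatch` are only illustrative.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt):
    """Return the Sitemap: URLs listed in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []


def sitemap_mismatch(robots_txt, submitted_url):
    """True when robots.txt names sitemaps but not the submitted one."""
    declared = declared_sitemaps(robots_txt)
    return bool(declared) and submitted_url not in declared
```

For example, with a robots.txt body containing `Sitemap: http://goautohub.com/sitemap.xml`, calling `sitemap_mismatch(body, "http://goautohub.com/sitemap.php")` would report a mismatch. The comparison is an exact string match, so differences in protocol or a www prefix would also be flagged.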
-
Thanks for checking, I haven't found anything yet. The site is goautohub.com. It's a custom site and the sitemap file is auto-generated. It's goautohub.com/sitemap.php. I've done it like that for over a year. I did start seeing an error message about high response times and I've been working on improving that. It makes sense, because we have been advertising more to get the site seen. With regard to the rest of William's points, I have checked those but there's no improvement yet. Thank you
-
Hi Stephen,
Checking in to see if you had checked the points William has raised above. Do you see anything that could have resulted in the drop? Also, are you comfortable sharing the site here? We might be able to have a look too (feel free to PM if you are not comfortable sharing publicly).
Cheers,
Jane
-
Try to determine when the drop off started, and try to remember what kinds of changes the website was going through during that time. That could help point to the reason for the drop in indexing.
There are plenty of reasons why Google may choose not to index pages, so this will take some digging. Here are some places to start the search:
-
Check your robots.txt to ensure those pages are still crawlable
-
Check to make sure the content on those pages isn't duplicated somewhere else on the Web.
-
Check to see if there were any changes to canonical tags on the site around when the drop started
-
Check to make sure the sitemap currently on the site matches the one you submitted to Webmaster Tools, and that your CMS didn't auto-generate a new one
-
Make sure the quality of the pages is worth indexing. You said your traffic didn't really take a hit, so it's not de-indexing your quality stuff.
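The first check in the list above (are the sitemap URLs still crawlable?) can be roughed out in a few lines of Python. This is only a sketch under some assumptions: the sitemap XML and robots.txt are already loaded as strings, the sitemap is a plain urlset using the standard sitemaps.org namespace (not a sitemap index), and the helper names `sitemap_urls` and `blocked_urls` are invented for this example. Python's parser also does not necessarily interpret robots.txt exactly as Googlebot does, so treat the result as a hint.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return every <loc> URL listed in a urlset sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]
```

Any URL returned by `blocked_urls` is listed in the sitemap but disallowed in robots.txt, which is worth fixing before looking at the other points.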