Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Canonical URLs and Sitemaps
-
We are using canonical link tags for product pages in a scenario where the URLs on the site contain category names, and the canonical URL points to a URL which does not contain the category names. So, the same product page appears on the site as www.example.com/clothes/skirts/skater-skirt-12345, and also as www.example.com/sale/clearance/skater-skirt-12345 in another category. On both of these pages, the canonical link tag references a 3rd URL, www.example.com/skater-skirt-12345. This 3rd URL is a valid page and displays the same content as the other two versions, but there are no links to this generic version anywhere on the site (nor any external links).
Questions:
1. Does the generic URL referenced in the canonical link also need to be included as on-page links somewhere in the crawled navigation of the site, or is it okay to be just a valid URL not linked anywhere except for the canonical tags?
2. In our sitemap, is it okay to reference the non-canonical URLs, or does the sitemap have to reference only the canonical URL? In our case, the sitemap points to yet a 3rd variation of the URL, like www.example.com/product.jsp?productID=12345. This page retrieves the same content as the others, and includes a canonical link tag back to www.example.com/skater-skirt-12345. Is this a valid approach, or should we revise the sitemap to point to either the category-specific links or the canonical links?
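To make the setup in the question concrete, here is a minimal Python sketch of how the three URL variants could all resolve to one canonical URL. The function names and the `SLUGS` lookup are hypothetical; the thread doesn't say how the site maps product IDs to slugs, so treat this as an illustration under those assumptions rather than the poster's actual implementation.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit

# Hypothetical lookup from product ID to URL slug (assumption for illustration).
SLUGS = {"12345": "skater-skirt-12345"}


def canonical_url(url: str, base: str = "http://www.example.com") -> str:
    """Map a category URL or a product.jsp URL to the category-free URL."""
    # Add a scheme if it is missing so urlsplit can separate host and path.
    parts = urlsplit(url if "://" in url else "http://" + url)
    query = parse_qs(parts.query)
    if "productID" in query:
        # product.jsp?productID=12345 style: look the slug up by ID.
        slug = SLUGS[query["productID"][0]]
    else:
        # /clothes/skirts/skater-skirt-12345 style: keep the last path segment.
        slug = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return f"{base}/{slug}"


def canonical_tag(url: str) -> str:
    """Render the canonical link tag for the page served at `url`."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}" />'
```

With this sketch, the two category URLs and the product.jsp URL from the question all produce the same tag pointing at http://www.example.com/skater-skirt-12345.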
-
Thanks. And since we've now implemented the aforementioned changes, I can give some findings back.
What we did: We changed our sitemap to point to the same canonical URLs as are referenced in the tags on our product pages (only one entry in sitemap per product).
What we didn't do: We didn't change the product pages themselves. They still have a canonical URL link reference, pointing to a URL with no category paths, which does not naturally occur in the navigation of the site (on the site, product pages all have category paths in the URL).
Findings: After submitting the new sitemap, the stats in Google Webmaster Tools indicate that almost all (> 96%) of our product pages are indexed. We believe that the pages were already indexed (for the most part) and now the sitemap is useful for metrics. From the timing, it's unlikely that the sitemap itself caused our index stats to get significantly better in just 1 day. Possible, but unlikely. In either case, since our product pages still reference canonical URLs which don't exist in the site's navigation, the evidence suggests that the canonical link itself is enough, and an actual navigation path to the canonical version of the page is not needed. That's just empirical evidence, we have no inside info on Google's methods, but this is what we believe now after monitoring.
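For anyone wanting to reproduce the sitemap change described above (one entry per product, pointing at the canonical URL), a rough Python sketch follows. The `build_sitemap` name is made up for this example, and the real sitemap was presumably generated by the store platform, so this only shows the shape of the output.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls):
    """Return sitemap XML with exactly one <url> entry per distinct canonical URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # set() drops duplicates, so a product reachable through several
    # categories still ends up with a single entry.
    for url in sorted(set(canonical_urls)):
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")
```

Feeding it the canonical URL computed for every category page means duplicates collapse to one entry per product, which matches what the poster describes.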
-
With the canonical tag in place, I'm guessing that extra link would basically be ignored. It's probably harmless, but I'm not sure it will do anything. You could create an HTML "sitemap" (or even an XML sitemap) with the canonical URLs. It's not my first choice, but it at least would give Google an extra push.
-
We're in the process of updating our canonical tagging and our sitemap, based on the feedback here. I have a question for the group though. Unfortunately we can't follow Andy Smith's suggestion of creating a "By Brand" navigation section on the site, since this web site is all private label (they sell all products under their own brand name).
One possible solution is to create a user-accessible site map page, with an "all products" paginated section, where all these product page URLs would be the canonical version.
But another possible solution, easier to implement, would be to have a user accessible link on each product page to the canonical version of itself. That is, when the user is on www.example.com/clothes/skirts/skater-skirt-12345, there would be a link to www.example.com/skater-skirt-12345, which would also be the URL specified in the canonical tag.
This seems redundant, but our results so far have borne out that the canonical tag pointing to a URL which doesn't really exist anywhere in the navigation doesn't seem to be having the desired effect. So, the thought is that a combination of the canonical tag, plus a "real" link to that same URL referenced in the canonical tag, would better inform the search engine robots. But our hesitation is whether it would work for this link to be on the product page itself (i.e. the non-canonical version).
Any thoughts or feedback on approach?
-
Thanks for the responses. I've been monitoring for the past couple of weeks with the current sitemap and canonical structure, and so far the data seems consistent with the replies to this thread. In GWT, the sitemap stats show less than 1% of the URLs submitted are indexed so far. We have an action plan now to update the canonical structure and the sitemap to point to URLs which will be naturally crawled on the site as well.
-
There's no "have to" in most of these situations, but it boils down to this - the more canonical your canonical URL actually is, the better chance you have of Google honoring it. In other words, if you set a canonical tag but then never use that in internal links or your XML sitemap, odds are pretty good that Google may ignore the tag in some cases. You're basically saying "Hey, this URL is canonical! No, this one is! No, this one!" - it's a mixed message, and they're going to try to interpret it algorithmically.
I definitely think pointing to yet another version in the XML sitemap is a problem. Ideally, it would be great to unify your URLs, but if that's not possible, getting the canonical version in the sitemap would be a big help (and introducing yet another variant isn't good, so you'd kill two birds with one stone). As Andy said, if you could create some kind of internal link to the canonical version, even if it's not the main link, that could also help. I only hesitate on that one, because you don't want to end up with a weird, artificial linking structure (just creating links to have links).
Please note, this isn't necessarily a disaster the way you have it. Google could honor the tags properly and generally rank your site correctly. In my experience, though, it's a recipe for long-term problems, and it's worth fixing.
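One way to spot the "mixed message" described above is to check that every URL listed in the sitemap names itself as canonical. The Python sketch below does that for pages supplied as HTML strings; `CanonicalFinder`, `find_canonical` and `mixed_signals` are names invented for this example, and fetching the pages is deliberately left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def mixed_signals(sitemap_urls, pages):
    """Return sitemap URLs whose page declares a different canonical URL.

    `pages` maps URL -> HTML source. A URL with no HTML or no canonical
    tag is also reported, since it sends no confirming signal.
    """
    return [u for u in sitemap_urls if find_canonical(pages.get(u, "")) != u]
```

Run against the original setup, the product.jsp URL in the sitemap would be reported, because its page points at www.example.com/skater-skirt-12345 instead of itself.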
-
The purpose of the canonical tag is to tell Google which version of a page to index. So, on that note, I usually point the canonical tag at the strongest page in terms of PageRank, as this shows which page is linked to the best.
I'm also guessing you're using a framework/platform like Magento, which can make linking quite difficult. I often suggest creating Brand pages and linking to the product page, the "3rd URL", from there. Brand pages are also great for SEO, as most people search for brands first. It's a great place to get some fat head keywords in.
Also, make sure you include the http:// as well; I think it is good practice to put in the full URL.
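To illustrate the point about the full URL, here is a small Python sketch using the standard library's `urljoin`, which resolves an href against the page it appears on using the standard relative-URL rules. The `absolute_canonical` helper is a made-up name for this example.

```python
from urllib.parse import urljoin


def absolute_canonical(page_url, href):
    """Resolve a canonical href against the URL of the page it appears on."""
    return urljoin(page_url, href)


page = "http://www.example.com/clothes/skirts/skater-skirt-12345"

# A full URL and a root-relative path both resolve to the intended page.
full = absolute_canonical(page, "http://www.example.com/skater-skirt-12345")
rooted = absolute_canonical(page, "/skater-skirt-12345")

# An href without the scheme is treated as a relative path, not a hostname.
no_scheme = absolute_canonical(page, "www.example.com/skater-skirt-12345")
```

Here `full` and `rooted` both come out as http://www.example.com/skater-skirt-12345, while `no_scheme` resolves to a path underneath /clothes/skirts/, which is why leaving off the http:// is risky.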