Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to resolve Duplicate Page Content issue for root domain & index.html?
-
SEOMoz returns a Duplicate Page Content error for a website's index page, with both domain.com and domain.com/index.html isted seperately. We had a rewrite in the htacess file, but for some reason this has not had an impact and we have since removed it. What's the best way (in an HTML website) to ensure all index.html links are automatically redirected to the root domain and these aren't seen as two separate pages?
-
great code Josh...but , after i saved it on .htaccess , a "?" appeared on the link..
http://www.domain.com/?/example/file.html
Is this ok ? pls advice/
Thank you,
-
You touched on a good point here "We set up our site to utilize a index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the url path that you desire. Each sub directory has it's own index which you redirect with a variation of the above code. By doing this you can have nice clean url paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps."
Too often I see sites where they get the home page right but miss the re-write on the directories.
-
Here's the .htaccess rewrite command that you can use for the index.html redirect -
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.amarasoftware.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.amarasoftware.com/$1/ [R=301,L]
We set up our site to utilize a index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the url path that you desire. Each sub directory has it's own index which you redirect with a variation of the above code. By doing this you can have nice clean url paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps.
-
I'd check it with some other software too... i.e. Raven Tools free trial or something, that will tell you if there's canonicalization problems... of course I'm not advocating Raven Tools over SEOmoz tools (I'm a member here and not there for good reasons), I just think best to try a few different tests before deciding if it's a problem. There might just be an issue with the SEOmoz campaign tool for the moment, which I'm sure they'll fix as soon as they realise.
Hey, aren't you the tutor I had in my SEC usability course?
-
Unfortunately I can't speak for how SEOmoz handles rewrites like this if it's already crawled the page.
The rewrite rule you're using looks like it's only rewriting the www portion of the URL, not index.html. So alone it wouldn't do anything to solve dupe content issues. (someone please correct me if I'm misreading the rewrite rule)
Here's a link to what I used to write a redirect for index.html on another site.
http://www.webmasterworld.com/forum92/6375.htm
I think it is a fairly safe assumption to make that SEOmoz is smart enough to realize if you're got a redirect in there (providing that its working). I'd still recommend taking a look to see if Google has cached or indexed an index.html version, though.
Edit: my personal, highly technical, acid-test for an index.html redirect is just going there and manually entering the url with index.html on the end, rather than waiting for a recrawl to see if you're heading in the right direction.
-
RewriteEngine on RewriteCond %{HTTP_HOST} ^([a-z.]+)?amarasoftware.com$ [NC] RewriteCond %{HTTP_HOST} !^www. [NC] RewriteRule .? http://www.%1amarasoftware.com%{REQUEST_URI} [R=301,L] Is what I use. In Seomoz this leads to www.amarasoftware.com and index.html so 2 different URL's, both with different incoming links, and a different authority, which has an impact on my ranking if correct. in SEomoz this a returns a duplicate title and meta tags errors. If SEOmoz finds 2 pages instead of one I may assume that Google agrees with this.
-
As you did, I'd normally handle this with a 301 from index.html to the root domain. When you say that it's "not had an impact" do you mean that the SEOmoz dashboard continues to show an error after it re-crawls, or that the search engines are not picking up the redirect?
SEOmoz dashboard does a great job, but I'd check to see how the search engines are actually indexing yourdomain.com/index.html vs. yourdomain.com also. If the search engines are indexing it as you want them to, then I'd be inclined to ignore the dashboard error.
I apologize if this is a stupid question, but I assume you manually checked that the redirect worked?
-
You wish to canonicalize the pages. That is the SEO word which describes exactly what you are trying to achieve.
Above are 5 URLs which can possibly lead to the exact same page. If you add the following HTML in the code then the pages will be canonicalized.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
No Index thousands of thin content pages?
Hello all! I'm working on a site that features a service marketed to community leaders that allows the citizens of that community log 311 type issues such as potholes, broken streetlights, etc. The "marketing" front of the site is 10-12 pages of content to be optimized for the community leader searchers however, as you can imagine there are thousands and thousands of pages of one or two line complaints such as, "There is a pothole on Main St. and 3rd." These complaint pages are not about the service, and I'm thinking not helpful to my end goal of gaining awareness of the service through search for the community leaders. Community leaders are searching for "311 request service", not "potholes on main street". Should all of these "complaint" pages be NOINDEX'd? What if there are a number of quality links pointing to the complaint pages? Do I have to worry about losing Domain Authority if I do NOINDEX them? Thanks for any input. Ken
Intermediate & Advanced SEO | | KenSchaefer0 -
Country Code Top Level Domains & Duplicate Content
Hi looking to launch in a new market, currently we have a .com.au domain which is geo-targeted to Australia. We want to launch in New Zealand which is ends with .co.nz If i duplicate the Australian based site completely on the new .co.nz domain name, would i face duplicate content issues from a SEO standpoint?
Intermediate & Advanced SEO | | jayoliverwright
Even though it's on a completely separate country code. Or is it still advised tosetup hreflang tag across both of the domains? Cheers.0 -
How can I prevent duplicate pages being indexed because of load balancer (hosting)?
The site that I am optimising has a problem with duplicate pages being indexed as a result of the load balancer (which is required and set up by the hosting company). The load balancer passes the site through to 2 different URLs: www.domain.com www2.domain.com Some how, Google have indexed 2 of the same URLs (which I was obviously hoping they wouldn't) - the first on www and the second on www2. The hosting is a mirror image of each other (www and www2), meaning I can't upload a robots.txt to the root of www2.domain.com disallowing all. Also, I can't add a canonical script into the website header of www2.domain.com pointing the individual URLs through to www.domain.com etc. Any suggestions as to how I can resolve this issue would be greatly appreciated!
Intermediate & Advanced SEO | | iam-sold0 -
Date of page first indexed or age of a page?
Hi does anyone know any ways, tools to find when a page was first indexed/cached by Google? I remember a while back, around 2009 i had a firefox plugin which could check this, and gave you a exact date. Maybe this has changed since. I don't remember the plugin. Or any recommendations on finding the age of a page (not domain) for a website? This is for competitor research not my own website. Cheers, Paul
Intermediate & Advanced SEO | | MBASydney0 -
Links from non-indexed pages
Whilst looking for link opportunities, I have noticed that the website has a few profiles from suppliers or accredited organisations. However, a search form is required to access these pages and when I type cache:"webpage.com" the page is showing up as non-indexed. These are good websites, not spammy directory sites, but is it worth trying to get Google to index the pages? If so, what is the best method to use?
Intermediate & Advanced SEO | | maxweb0 -
[E-commerce] Duplicate content due to color variations (canonical/indexing)
Hello, We currently have a lot of color variations on multiple products with almost the same content. Even with our canonicals being set, Moz's crawling tool seems to flag them as duplicate content. What we have done so far: Choosing the best-selling color variation (our "master product") Adding a rel="canonical" to every variation (with our "master product" as the canonical URL) In my opinion, it should be enough to address this issue. However, being given the fact that it's flagged as duplicate by Moz, I was wondering if there is something else we should do? Should we add a "noindex,follow" to our child products and "index,follow" to our master product? (sounds to me like such a heavy change) Thank you in advance
Intermediate & Advanced SEO | | EasyLounge0 -
Duplicate content on subdomains.
Hi Mozer's, I have a site www.xyz.com and also geo targeted sub domains www.uk.xyz.com, www.india.xyz.com and so on. All the sub domains have the content which is same as the content on the main domain that is www.xyz.com. So, I want to know how can i avoid content duplication. Many Thanks!
Intermediate & Advanced SEO | | HiteshBharucha0 -
301 - should I redirect entire domain or page for page?
Hi, We recently enabled a 301 on our domain from our old website to our new website. On the advice of fellow mozzer's we copied the old site exactly to the new domain, then did the 301 so that the sites are identical. Question is, should we be doing the 301 as a whole domain redirect, i.e. www.oldsite.com is now > www.newsite.com, or individually setting each page, i.e. www.oldsite.com/page1 is now www.newsite.com/page1 etc for each page in our site? Remembering that both old and new sites (for now) are identical copies. Also we set the 301 about 5 days ago and have verified its working but haven't seen a single change in rank either from the old site or new - is this because Google hasn't likely re-indexed yet? Thanks, Anthony
Intermediate & Advanced SEO | | Grenadi0