Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to resolve Duplicate Page Content issue for root domain & index.html?
-
SEOMoz returns a Duplicate Page Content error for a website's index page, with both domain.com and domain.com/index.html isted seperately. We had a rewrite in the htacess file, but for some reason this has not had an impact and we have since removed it. What's the best way (in an HTML website) to ensure all index.html links are automatically redirected to the root domain and these aren't seen as two separate pages?
-
great code Josh...but , after i saved it on .htaccess , a "?" appeared on the link..
http://www.domain.com/?/example/file.html
Is this ok ? pls advice/
Thank you,
-
You touched on a good point here "We set up our site to utilize a index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the url path that you desire. Each sub directory has it's own index which you redirect with a variation of the above code. By doing this you can have nice clean url paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps."
Too often I see sites where they get the home page right but miss the re-write on the directories.
-
Here's the .htaccess rewrite command that you can use for the index.html redirect -
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.amarasoftware.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.amarasoftware.com/$1/ [R=301,L]
We set up our site to utilize a index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the url path that you desire. Each sub directory has it's own index which you redirect with a variation of the above code. By doing this you can have nice clean url paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps.
-
I'd check it with some other software too... i.e. Raven Tools free trial or something, that will tell you if there's canonicalization problems... of course I'm not advocating Raven Tools over SEOmoz tools (I'm a member here and not there for good reasons), I just think best to try a few different tests before deciding if it's a problem. There might just be an issue with the SEOmoz campaign tool for the moment, which I'm sure they'll fix as soon as they realise.
Hey, aren't you the tutor I had in my SEC usability course?
-
Unfortunately I can't speak for how SEOmoz handles rewrites like this if it's already crawled the page.
The rewrite rule you're using looks like it's only rewriting the www portion of the URL, not index.html. So alone it wouldn't do anything to solve dupe content issues. (someone please correct me if I'm misreading the rewrite rule)
Here's a link to what I used to write a redirect for index.html on another site.
http://www.webmasterworld.com/forum92/6375.htm
I think it is a fairly safe assumption to make that SEOmoz is smart enough to realize if you're got a redirect in there (providing that its working). I'd still recommend taking a look to see if Google has cached or indexed an index.html version, though.
Edit: my personal, highly technical, acid-test for an index.html redirect is just going there and manually entering the url with index.html on the end, rather than waiting for a recrawl to see if you're heading in the right direction.
-
RewriteEngine on RewriteCond %{HTTP_HOST} ^([a-z.]+)?amarasoftware.com$ [NC] RewriteCond %{HTTP_HOST} !^www. [NC] RewriteRule .? http://www.%1amarasoftware.com%{REQUEST_URI} [R=301,L] Is what I use. In Seomoz this leads to www.amarasoftware.com and index.html so 2 different URL's, both with different incoming links, and a different authority, which has an impact on my ranking if correct. in SEomoz this a returns a duplicate title and meta tags errors. If SEOmoz finds 2 pages instead of one I may assume that Google agrees with this.
-
As you did, I'd normally handle this with a 301 from index.html to the root domain. When you say that it's "not had an impact" do you mean that the SEOmoz dashboard continues to show an error after it re-crawls, or that the search engines are not picking up the redirect?
SEOmoz dashboard does a great job, but I'd check to see how the search engines are actually indexing yourdomain.com/index.html vs. yourdomain.com also. If the search engines are indexing it as you want them to, then I'd be inclined to ignore the dashboard error.
I apologize if this is a stupid question, but I assume you manually checked that the redirect worked?
-
You wish to canonicalize the pages. That is the SEO word which describes exactly what you are trying to achieve.
Above are 5 URLs which can possibly lead to the exact same page. If you add the following HTML in the code then the pages will be canonicalized.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What are best page titles for sub-domain pages?
Hi Moz communtity, Let's say a website has multiple sub-domains with hundreds and thousands of pages. Generally we will be mentioning "primary keyword & "brand name" on every page of website. Can we do same on all pages of sub-domains to increase the authority of website for this primary keyword in Google? Or it gonna end up as negative impact if Google consider as duplicate content being mentioned same keyword and brand name on every page even on website and all pages of sub domains? Thanks
Intermediate & Advanced SEO | | vtmoz0 -
Directory with Duplicate content? what to do?
Moz keeps finding loads of pages with duplicate content on my website. The problem is its a directory page to different locations. E.g if we were a clothes shop we would be listing our locations: www.sitename.com/locations/london www.sitename.com/locations/rome www.sitename.com/locations/germany The content on these pages is all the same, except for an embedded google map that shows the location of the place. The problem is that google thinks all these pages are duplicated content. Should i set a canonical link on every single page saying that www.sitename.com/locations/london is the main page? I don't know if i can use canonical links because the page content isn't identical because of the embedded map. Help would be appreciated. Thanks.
Intermediate & Advanced SEO | | nchlondon0 -
Trailing Slashes for Magento CMS pages - 2 URLS - Duplicate content
Hello, Can anyone help me find a solution to Fixing and Creating Magento CMS pages to only use one URL and not two URLS? www.domain.com/testpage www.domain.com/testpage/ I found a previous article that applies to my issue, which is using htaccess to redirect request for pages in magento 301 redirect to slash URL from the non-slash URL. I dont understand the syntax fully in htaccess , but I used this code below. This code below fixed the CMS page redirection but caused issues on other pages, like all my categories and products with this error: "This webpage has a redirect loop ERR_TOO_MANY_REDIRECTS" Assuming you're running at domain root. Change to working directory if needed. RewriteBase / # www check If you're running in a subdirectory, then you'll need to add that in to the redirected url (http://www.mydomain.com/subdirectory/$1 RewriteCond %{HTTP_HOST} !^www. [NC]
Intermediate & Advanced SEO | | iamgreenminded
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L] Trailing slash check Don't fix direct file links RewriteCond %{REQUEST_FILENAME} !-f RewriteCond %{REQUEST_URI} !(.)/$
RewriteRule ^(.)$ $1/ [L,R=301] Finally, forward everything to your front-controller (index.php) RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [QSA,L]0 -
Removing duplicate content
Due to URL changes and parameters on our ecommerce sites, we have a massive amount of duplicate pages indexed by google, sometimes up to 5 duplicate pages with different URLs. 1. We've instituted canonical tags site wide. 2. We are using the parameters function in Webmaster Tools. 3. We are using 301 redirects on all of the obsolete URLs 4. I have had many of the pages fetched so that Google can see and index the 301s and canonicals. 5. I created HTML sitemaps with the duplicate URLs, and had Google fetch and index the sitemap so that the dupes would get crawled and deindexed. None of these seems to be terribly effective. Google is indexing pages with parameters in spite of the parameter (clicksource) being called out in GWT. Pages with obsolete URLs are indexed in spite of them having 301 redirects. Google also appears to be ignoring many of our canonical tags as well, despite the pages being identical. Any ideas on how to clean up the mess?
Intermediate & Advanced SEO | | AMHC0 -
How can I prevent duplicate pages being indexed because of load balancer (hosting)?
The site that I am optimising has a problem with duplicate pages being indexed as a result of the load balancer (which is required and set up by the hosting company). The load balancer passes the site through to 2 different URLs: www.domain.com www2.domain.com Some how, Google have indexed 2 of the same URLs (which I was obviously hoping they wouldn't) - the first on www and the second on www2. The hosting is a mirror image of each other (www and www2), meaning I can't upload a robots.txt to the root of www2.domain.com disallowing all. Also, I can't add a canonical script into the website header of www2.domain.com pointing the individual URLs through to www.domain.com etc. Any suggestions as to how I can resolve this issue would be greatly appreciated!
Intermediate & Advanced SEO | | iam-sold0 -
Best practice for duplicate website content: same root domain name but different extension
Hi there I have a new client who has two websites: http://www.bayofislandsteambuilding.co.nz
Intermediate & Advanced SEO | | turnbullholdingsltd
http://www.bayofislandsteambuilding.org.nz They are the same in every regard apart from the domain extension (.co.nz & .org.nz) which is likely to be causing them issues with Google ranking given the huge amount of duplicate content. What is the best practice approach to fixing this? Normally, if I was starting from scratch, I would set one of the extensions as an alias which redirects to the main domain. Thanks in advance. Laurie0 -
How do you de-index and prevent indexation of a whole domain?
I have parts of an online portal displaying in SERPs which it definitely shouldn't be. It's due to thoughtless developers but I need to have the whole portal's domain de-indexed and prevented from future indexing. I'm not too tech savvy but how is this achieved? No index? Robots? thanks
Intermediate & Advanced SEO | | Martin_S0 -
Copying my Facebook content to website considered duplicate content?
I write career advice on Facebook on a daily basis. On my homepage users can see the most recent 4-5 feeds (using FB social media plugin). I am thinking to create a page on my website where visitors can see all my previous FB feeds. Would this be considered duplicate content if I copy paste the info, but if I use a Facebook social media plugin then it is not considered duplicate content? I am working on increasing content on my website and feel incorporating FB feeds would make sense. thank you
Intermediate & Advanced SEO | | knielsen0