Comment on Duplicate Content and Canonical URLs by SEO Dave.
Archive pages use an excerpt rather than the full content of a post. This means that although content from your single blog post pages is used to generate the archive pages (home page archives, categories, tags), those pages are not full copies (just excerpts), so Google does not treat them as duplicate content.
The only possible duplicate content issues with archive pages arise in two scenarios.
You have only one category: the category archive pages will then be exact copies of the home page archives, and possibly of the monthly archive pages.
You use tags/categories, and the content of some tags/categories matches other categories or tags. I had this tag issue on this site: if I tagged all the theme pages with WordPress 2.8, WordPress 2.9 etc., for example, those tag pages would be identical to one another and to the “AdSense WordPress Themes” category. All you can do to avoid this is think through what you are tagging and which single blog posts you put in a particular category. It is easy to go over the top with categories and tags: all my themes, for example, could be tagged under WordPress, SEO, AdSense, Make Money Online… but the archive pages created would be practically identical, so I don’t create that many tags (this site doesn’t really have enough posts to be tagged extensively)!
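The second scenario can be checked mechanically. Below is a minimal Python sketch, not anything built into WordPress or Talian: it assumes you have exported each archive (category or tag) as a list of post IDs, and it groups the archives that contain exactly the same posts. The archive names and post IDs are made up for illustration, and `find_duplicate_archives` is a hypothetical helper name.

```python
from collections import defaultdict


def find_duplicate_archives(archives):
    """Return groups of archive names that list exactly the same posts.

    `archives` maps an archive name to an iterable of post IDs.
    Order of the post IDs does not matter.
    """
    groups = defaultdict(list)
    for name, post_ids in archives.items():
        # frozenset ignores ordering, so [1, 2, 3] and [3, 2, 1] match
        groups[frozenset(post_ids)].append(name)
    # Only groups with more than one archive are duplicates of each other
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical data: post IDs are invented for this example
archives = {
    "category:AdSense WordPress Themes": [1, 2, 3],
    "tag:WordPress 2.8": [1, 2, 3],
    "tag:WordPress 2.9": [3, 2, 1],
    "tag:SEO": [1, 4],
}
duplicates = find_duplicate_archives(archives)
```

This only catches exact matches. Archives that are "practically identical" (say, differing by one post) would need an overlap measure instead of a straight equality check.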
Although not really a duplicate content issue, I never use the monthly archive pages because they add nothing SEO-wise to a site. Your categories hold all archived content, and the posts are dated within the category archives, so monthly archives are not really needed.
In a future version of Talian I’m dealing with potential canonical issues associated with multiple comment pages. This page, for example, has just generated its 4th page of comments, and the main content of each of the 4 pages is the full content of this post (duplicated).
I’ve not noticed duplicate content issues per se, but I’m not finding comment pages 2, 3, 4… ranking particularly well for potential SERPs based on the comment content. It’s quite wasteful from an SEO resources perspective to have all these partially duplicate pages if they don’t generate traffic in their own right, so I’m testing having pages 2, 3, 4… declare the main blog post page as their canonical version. This should result in all the comment pages being spidered, but treated as one page in Google (this will save link benefit).
If you view the source of this page and the other comment pages for this post, you’ll find within the head:
<link rel='canonical' href='http://www.google-adsense-templates.co.uk/wordpress-theme-talian-with-adsense-and-seo-optimisation.html' />
I’m testing this now and so far have not hit any issues; Google appears to be combining the comment pages into one page, as it should.
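To show the idea of pointing every comment page at the main post, here is a rough Python sketch. It is not the Talian code (which would be PHP inside the theme). It assumes the paginated comment URLs end in `/comment-page-N`, which is how WordPress normally builds them with pretty permalinks; `canonical_url` and `canonical_link_tag` are hypothetical names for illustration.

```python
import re
from html import escape

# Assumption: comment pages look like <post-url>/comment-page-2 (optional trailing slash)
COMMENT_PAGE_RE = re.compile(r"/comment-page-\d+/?$")


def canonical_url(url):
    """Map a comment-page URL back to the main post URL."""
    # Drop any #fragment (e.g. #comments) before matching the suffix
    url = url.split("#", 1)[0]
    return COMMENT_PAGE_RE.sub("", url)


def canonical_link_tag(url):
    """Build the <link> tag to place in the page head."""
    href = escape(canonical_url(url), quote=True)
    return "<link rel='canonical' href='%s' />" % href


post = ("http://www.google-adsense-templates.co.uk/"
        "wordpress-theme-talian-with-adsense-and-seo-optimisation.html")
tag_for_page_4 = canonical_link_tag(post + "/comment-page-4")
```

With this, comment page 4 and the main post page both produce the same tag, so a search engine that honours the canonical hint should fold them together.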
I’ve been testing a plugin called SEO Super Comments (a significantly modified version) with the Talian theme. This plugin creates individual pages for comments (like this comment) that are linked from these comment pages. The original plugin turns all comments into pages (so a one-word comment gets a link!). I’ve modified the plugin to only create and link to pages for comments over a certain number of characters, so a one-line comment won’t get its own page.
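The length check is simple enough to sketch. The following Python is only an illustration of the rule, not the plugin's actual code (the plugin is PHP), and the 200-character threshold is a made-up number since the real cut-off is whatever works for your site. `deserves_own_page` is a hypothetical name.

```python
# Hypothetical threshold: the real value is a judgment call per site
MIN_COMMENT_CHARS = 200


def deserves_own_page(comment_text, min_chars=MIN_COMMENT_CHARS):
    """Return True if a comment is long enough to get its own page."""
    # Collapse runs of whitespace so padding and blank lines don't count
    normalized = " ".join(comment_text.split())
    return len(normalized) >= min_chars


short_comment = "Thanks!"
long_comment = "Great theme. " * 20  # 260 characters before normalizing
```

Here `deserves_own_page(short_comment)` is false and `deserves_own_page(long_comment)` is true, so only the longer comment would be linked to its own page.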
I’m working on this plugin because it’s a real shame to have a site with lots of really good comments and not have them increase traffic to the site. It is still at the testing phase, but I’m 99% sure I’ll include the modified plugin with the Talian theme soon. Note: the original SEO Super Comments plugin does not work out of the box with Talian, so it’s probably not a good idea to try the original (I couldn’t get it working). I’ve also made other improvements to this plugin.