There has been some recent discussion in the SEO community about whether Google and Bing have different rules for the use of the rel="canonical" tag. Google has said it is fine to have self-referential canonical tags (i.e., the rel="canonical" tag specifies the same URL as the page you are on), whereas Bing indicates they'd prefer the canonical tag be left blank in that case.
The proper use of rel="canonical" can be confusing at best, and can produce devastating results at worst. So what is an SEO to do?
First of all, realize that using rel="canonical" isn't necessary in many cases of duplicate content. The canonical tag is a great tool for extreme situations and enterprise-level sites, but on small to medium-sized websites there are often other solutions.
Choose Non-WWW to WWW, or Vice Versa
A lot of canonical issues arise because a website is available at both the WWW and non-WWW versions of the domain, and other sites may end up linking to either version. Using your favorite method, redirect the non-WWW to the WWW, or vice versa. If you do this when the site is initially built, you can eliminate most instances of the wrong version being linked to (people tend to just grab whatever URL is in the address bar, anyway). Make the choice early on, and stick with it.
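As a rough illustration of the host redirect described above, here is a minimal Python sketch that works out where a request should be sent. The host names `www.example.com` and `example.com` and the function name `canonical_host_redirect` are placeholders, not part of any particular server or framework; in practice you would return the result as a permanent (301) redirect from your web server or application.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hosts for illustration: pick ONE preferred version and stick with it.
PREFERRED_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com"}


def canonical_host_redirect(url):
    """Return the URL to redirect to, or None if the host is already the preferred one."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALTERNATE_HOSTS:
        return None  # already on the preferred host, or not one of our hosts
    # Swap only the host; path, query string and fragment are carried over unchanged.
    # This sketch assumes default ports, so any explicit port is dropped.
    return urlunsplit(
        (parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )


print(canonical_host_redirect("http://example.com/products?id=3"))
```

Swapping the two constants gives the WWW to non-WWW direction instead; the point is simply that every request for the alternate host lands on one consistent version.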
Don’t Use URL Parameters
If possible, try to avoid using parameters in URLs.
- If you run an e-commerce or community-based site, store all session information in a cookie rather than as parameters. This is a programming best practice to ensure users don't get access to each other's information.
- Avoid specifying sort order or viewing options of a search results or product page in the URL. It is better to display the page with a static URL, and make use of AJAX for sorting and filtering.
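- If you use tracking parameters for referrals, replace the question mark (?) in the URL with a hash mark (#). Search engines generally ignore everything after the hash, so the tracked URL is treated as the same page. Don't forget to adjust your Google Analytics tracking code to read parameters after the hash!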
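To make the last tip concrete, here is a small Python sketch of rewriting a tracked link so its parameters sit after a hash mark instead of a question mark. The function name `tracking_query_to_fragment` and the example URL are made up for illustration, and the sketch assumes the query string holds nothing but tracking parameters.

```python
from urllib.parse import urlsplit, urlunsplit


def tracking_query_to_fragment(url):
    """Move the query string of a tracked link into the fragment (after '#').

    Assumes the query string contains only tracking parameters. URLs that
    have no query string, or that already use a fragment, are returned as-is.
    """
    parts = urlsplit(url)
    if not parts.query or parts.fragment:
        return url
    # Empty query, and the old query string becomes the fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.query))


print(tracking_query_to_fragment("http://www.example.com/widgets?utm_source=newsletter"))
```

You would use something like this when generating the links you hand out to referrers, so that the tracked URL and the plain URL resolve to the same page.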
Don’t Generate URLs on the Fly
Some content management systems generate page URLs on the fly, based on how the user navigated to the page. I suppose the idea behind it is that the URL then becomes sort of a breadcrumb trail so the user can easily figure out how to get to higher levels, but this is just a bad idea on so many levels, not least because the same page ends up living at several different URLs.
Ideally, you want each product, article, post or resource to be available at a single, static URL, regardless of how the user got there.
No matter how hard you try to avoid it, you sometimes end up with pages on your site being linked to in strange ways, potentially causing duplicate content issues. In cases like this, you may want to set up redirect rules so that URLs carrying certain parameters (or any parameters at all, if you want to be extreme) are redirected back to the parameter-free version of the page. If it only occurs on a few pages across your site, you can individually redirect them to the preferred URL (an advantage small sites have over enterprise ones).
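A redirect rule of that kind could look something like the Python sketch below. The parameter names in `STRIP_PARAMS` and the function name `redirect_target` are invented for the example; substitute whatever parameters are actually causing duplicates on your site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that should never appear in an indexed URL.
STRIP_PARAMS = {"sessionid", "sort", "ref"}


def redirect_target(url, strip_all=False):
    """Return the cleaned URL to redirect to, or None if no redirect is needed.

    With strip_all=True every parameter is removed (the "extreme" option);
    otherwise only the parameters listed in STRIP_PARAMS are removed.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if strip_all:
        kept = []
    else:
        kept = [(k, v) for k, v in pairs if k.lower() not in STRIP_PARAMS]
    if len(kept) == len(pairs):
        return None  # nothing was stripped, so serve the page normally
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(redirect_target("http://www.example.com/shoes?sort=price&color=red"))
```

Returning `None` when nothing changes matters: redirecting a URL to itself would create a loop. For a handful of stray URLs on a small site, a simple lookup table of old URL to preferred URL does the same job.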
In general, you should try to prevent duplicate content issues before they happen, or fix them at the source when they do. The rel="canonical" tag is an advanced tool and shouldn't be applied as a blanket solution to all canonicalization problems.
Photo: Jason Tester/Flickr