A canonical URL lets you tell search engines that certain similar URLs are actually the same. Sometimes you have products or content that can be found on multiple URLs, or even on multiple websites. By using canonical URLs (HTML link tags with the attribute rel=canonical), you can have these on your site without harming your rankings.
What is the canonical link element?
The idea is simple: if you have several similar versions of the same content, you pick one “canonical” version and point the search engines at it. This solves the duplicate content problem where search engines don’t know which version of the content to show in their results. This article takes you through how and when to use canonical URLs, and how to avoid common mistakes.
The SEO benefit of rel=canonical
Choosing a proper canonical URL for every set of similar URLs improves the SEO of your site. This is because the search engine knows which version is canonical, so it can count all the links pointing at all the different versions as links to the canonical version. Setting a canonical is similar in concept to a 301 redirect, only without actually redirecting.
The process of canonicalization
When you have several choices for a product’s URL, canonicalization is the process of picking one of them. In many cases, it’ll be obvious: one URL will be a better choice than others. In some cases, it might not be as obvious, but even then it’s still pretty simple: just pick one! Not canonicalizing your URLs is always worse than canonicalizing your URLs.
How to set canonical URLs
Correct example of using rel=canonical
Let’s assume you have two versions of the same page, each with exactly – 100% – the same content. The only difference is that they’re in separate sections of your site and because of that the background color and the active menu item are different – that’s it. Both versions have been linked to from other sites, so the content itself is clearly valuable. So which version should search engines show in results?
For example, these could be their URLs:
- http://example.com/wordpress/seo-plugin/
- http://example.com/wordpress/plugins/seo/
This is what rel=canonical was invented for and, unfortunately, this happens fairly often, especially in a lot of e-commerce systems. A product can have several different URLs depending on how you got there. In this case you would apply rel=canonical as follows:
- Pick one of your two pages as the canonical version. This should be the version you think is the most important. If you don’t care, pick the one with the most links or visitors, and if all else is equal, flip a coin. You just need to choose.
- Add a rel=canonical link from the non-canonical page to the canonical one. So if we picked the shorter URL as our canonical URL, the other URL would link to the shorter URL in the <head> section of the page, like this:
<link rel="canonical" href="http://example.com/wordpress/seo-plugin/" />
That’s it. Nothing more, nothing less.
What this does is “merge” the two pages into one from a search engine’s perspective. It’s a “soft redirect”, without redirecting the user. Links to both URLs now count towards the single, canonical version of the URL.
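If you want to verify that a page really carries the canonical you expect, you can read it back out of the HTML. The following is a minimal sketch using only Python’s standard library; the class name, function name, and sample markup are illustrative assumptions, not part of any plugin or tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before comparing
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html):
    """Return all canonical hrefs found in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical markup for the non-canonical page from the example above
page = """
<html><head>
<title>SEO plugin</title>
<link rel="canonical" href="http://example.com/wordpress/seo-plugin/" />
</head><body>...</body></html>
"""
print(find_canonicals(page))
```

Run against both example URLs, a check like this should return the same single canonical for each page.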
Setting the canonical in Yoast SEO
For posts, pages, and custom post types, you can edit the canonical in the advanced tab of the Yoast SEO metabox.
For categories, tags and other taxonomy terms, you can change it in the same place in the Yoast SEO metabox too. If you have other advanced use cases, you can also use the wpseo_canonical filter to change the Yoast SEO output.
When should you use canonical URLs?
301 redirect or canonical?
If you are unsure whether to do a 301 redirect or set a canonical, what should you do? The answer is simple: you should always do a redirect, unless there are technical reasons not to. If you can’t redirect because that would harm the user experience or be otherwise problematic, then set a canonical URL.
Should a page have a self-referencing canonical URL?
In the example above, we link the non-canonical page to the canonical version. But should a page set a rel=canonical for itself? This question is a much-debated topic amongst SEOs. At Yoast, we strongly recommend having a canonical link element on every page and Google has confirmed that’s best. That’s because most CMSs will allow URL parameters without changing the content. So all of these URLs would show the same content:
- http://example.com/wordpress/seo-plugin/
- http://example.com/wordpress/seo-plugin/?isnt=it-awesome
- http://example.com/wordpress/seo-plugin/?cmpgn=twitter
- http://example.com/wordpress/seo-plugin/?cmpgn=facebook
The issue is that if a page doesn’t have a self-referencing canonical that points to the cleanest version of the URL, search engines may treat each of these variants as a separate page. Even if you never create such URLs yourself, someone else could link to your page with an arbitrary parameter and cause a duplicate content issue, so adding a self-referencing canonical to URLs across your site is a good “defensive” SEO move. Luckily, our Yoast SEO plugin does this for you.
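To see why these variants are duplicates, here is a small sketch that reduces each of the URLs above to its clean form by dropping the query string and fragment. The function name is hypothetical, and the sketch assumes that no parameter changes the content, which won’t hold for things like search or pagination parameters. It’s meant as a check, not as a way to generate canonicals: a real site should take the canonical from the content’s own stored URL rather than from the request.

```python
from urllib.parse import urlsplit, urlunsplit


def clean_url(url):
    """Return the URL without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


variants = [
    "http://example.com/wordpress/seo-plugin/",
    "http://example.com/wordpress/seo-plugin/?isnt=it-awesome",
    "http://example.com/wordpress/seo-plugin/?cmpgn=twitter",
    "http://example.com/wordpress/seo-plugin/?cmpgn=facebook",
]

# All four variants collapse to one clean URL, which is what the
# self-referencing canonical on the page should point at.
print({clean_url(url) for url in variants})
```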
Cross-domain canonical URLs
Perhaps you have the same piece of content on several domains. There are sites or blogs that republish articles from other websites on their own, as they feel the content is relevant for their users. In the past, we had websites republishing articles from Yoast.com as well (with express permission), but if you had looked at the HTML of every one of those articles you’d have found a rel=canonical link pointing right back to our original article. This means all the links pointing to their version of the article count towards the ranking of our canonical version. They get to use our content to please their audience, and we get a clear benefit from it too. Everybody wins.
Faulty canonical URLs: common issues
There are many examples out there of how a wrong rel=canonical implementation can lead to huge issues. I’ve seen several sites where the canonical on their homepage was pointed at an article, only to see their homepage disappear from search results. There are other things you should never do with rel=canonical. Here are the most important:
- Don’t canonicalize a paginated archive to page 1. The rel=canonical on page 2 should point to page 2. If you point it to page 1, search engines will actually not index the links on those deeper archive pages…
- Make them 100% specific. For various reasons, many sites use protocol-relative links, meaning they leave the http / https bit off their URLs. Don’t do this for your canonicals. You have a preference, so show it.
- Don’t base your canonical on the request URL. If you use variables like the domain or request URL used to access the current page while generating your canonical, you’re doing it wrong. Your content should be aware of its own URLs. Otherwise, you could still have the same piece of content on, for instance, example.com and www.example.com, and have each of them canonicalize to itself.
- Multiple rel=canonical links on a page causing havoc. Sometimes a developer of a plugin or extension thinks they are God’s greatest gift to mankind and they know the best way to add a canonical to the page. Sometimes, that developer is right, but since you can’t all be me, they’re inevitably wrong sometimes too. When we encounter this in WordPress plugins, we try to reach out to the developer doing it and teach them not to, but it still happens. And when it does, the results are wholly unpredictable.
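Some of these issues are easy to catch automatically. As a rough sketch, the hypothetical function below takes the list of canonical hrefs found on one page and flags three of the problems from the list: a missing canonical, more than one canonical, and a protocol-relative or otherwise non-absolute canonical. It’s an assumption about how you might audit a page, not a description of what any search engine checks.

```python
from urllib.parse import urlsplit


def canonical_problems(canonical_hrefs):
    """Return a list of human-readable problems for one page's canonicals."""
    problems = []
    if not canonical_hrefs:
        problems.append("no canonical link found")
    if len(canonical_hrefs) > 1:
        problems.append("more than one canonical link on the page")
    for href in canonical_hrefs:
        parts = urlsplit(href)
        if href.startswith("//"):
            # Protocol-relative: the scheme (http / https) is left off
            problems.append(f"protocol-relative canonical: {href}")
        elif not parts.scheme or not parts.netloc:
            problems.append(f"canonical is not an absolute URL: {href}")
    return problems


print(canonical_problems(["http://example.com/wordpress/seo-plugin/"]))
print(canonical_problems(["//example.com/wordpress/seo-plugin/"]))
```

An empty list means none of these three checks found anything; it doesn’t tell you whether the canonical points at the right page.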
rel=canonical and social networks
Facebook and Twitter honor rel=canonical too, and this might lead to weird situations. If you share a URL on Facebook that has a canonical pointing elsewhere, Facebook will share the details from the canonical URL. In fact, if you add a ‘like’ button on a page that has a canonical pointing elsewhere, it will show the like count for the canonical URL, not for the current URL. Twitter works in the same way.
Advanced uses of rel=canonical
Canonical link HTTP header
Google also supports a canonical link HTTP header. The header looks like this:
Link: <http://www.example.com/white-paper.pdf>; rel="canonical"
Canonical link HTTP headers can be very useful when canonicalizing files like PDFs, so it’s good to know that the option exists.
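If you want to check which canonical a response header declares, you can pull it out of the Link header value. This is a simplified sketch with a hypothetical function name: it handles the common `<target>; rel="canonical"` shape shown above, including several comma-separated links, but it isn’t a complete parser for every form the Link header can take.

```python
import re


def canonical_from_link_header(header_value):
    """Return the canonical target from a Link header value, or None."""
    # Each link looks like <target>; param="value"; ... and links are
    # separated by commas, so capture the target and its parameters.
    pattern = r'<([^>]*)>((?:\s*;\s*[^,;]+)*)'
    for match in re.finditer(pattern, header_value):
        target, params = match.group(1), match.group(2)
        rel = re.search(r'\brel\s*=\s*"?([^";]+)"?', params, re.IGNORECASE)
        if rel and "canonical" in rel.group(1).lower().split():
            return target
    return None


header = '<http://www.example.com/white-paper.pdf>; rel="canonical"'
print(canonical_from_link_header(header))
```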
Using rel=canonical on not so similar pages
While I wouldn’t recommend this, you can definitely use rel=canonical very aggressively. Google honors it to an almost ridiculous extent, where you can canonicalize a very different piece of content to another piece of content. However, if Google catches you doing this, it will stop trusting your site’s canonicals and thus cause you more harm…
Using rel=canonical in combination with hreflang
We also talk about canonical in our ultimate guide to hreflang. That’s because it’s very important that when you use hreflang, each language’s canonical points to itself. Make sure that you understand how to use canonical well when you’re implementing hreflang, as otherwise you might kill your entire hreflang implementation.
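The rule that each language’s canonical points to itself is simple to test once you have the URLs side by side. Here’s a minimal sketch, assuming you’ve already collected each language version’s URL and its canonical; the function name, the data layout, and the example URLs are all made up for illustration.

```python
def hreflang_canonical_mismatches(pages):
    """pages maps a language code to (page_url, canonical_url).

    Returns only the entries whose canonical does not point to the page itself.
    """
    return {
        lang: (url, canonical)
        for lang, (url, canonical) in pages.items()
        if url != canonical
    }


# Hypothetical language versions; the German page wrongly canonicalizes
# to the English one.
pages = {
    "en": ("http://example.com/en/seo/", "http://example.com/en/seo/"),
    "de": ("http://example.com/de/seo/", "http://example.com/en/seo/"),
}
print(hreflang_canonical_mismatches(pages))
```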
Conclusion: rel=canonical is a power tool
Rel=canonical is a powerful tool in an SEO’s toolbox, but like any power tool, you should use it wisely as it’s easy to cut yourself. For larger sites, the process of canonicalization can be very important and lead to major SEO improvements.