5 common mistakes with rel=canonical

Monday, April 08, 2013

Including a rel=canonical link in your webpage is a strong hint to search engines about your preferred version to index among duplicate pages on the web. It's supported by several search engines, including Yahoo!, Bing, and Google. The rel=canonical link consolidates indexing properties from the duplicates, like their inbound links, and specifies which URL you'd like displayed in search results. However, rel=canonical can be a bit tricky because a misconfiguration is often not obvious.

Example of a page and its HTML markup for rel-canonical.

While the webmaster sees the "red velvet" page on the left in their browser, search engines notice the webmaster's unintended "blue velvet" rel=canonical on the right. We recommend the following best practices for using rel=canonical:

  • A large portion of the duplicate page's content should be present on the canonical version.
  • Double-check that your rel=canonical target exists (it's not an error or "soft 404").
  • Verify the rel=canonical target doesn't contain a noindex robots meta tag.
  • Make sure you'd prefer the rel=canonical URL to be displayed in search results (rather than the duplicate URL).
  • Include the rel=canonical link in either the <head> of the page or the HTTP header.
  • Specify no more than one rel=canonical for a page. When more than one is specified, all rel=canonical links will be ignored.
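
Some of these checks can be automated. The following is a minimal sketch, assuming you already have a page's HTML as a string; the class name CanonicalAudit and the function audit are hypothetical names chosen for illustration, and only Python's standard html.parser module is used. It flags more than one rel=canonical in the <head>, a rel=canonical placed outside the <head>, and a noindex robots meta tag. It does not inspect HTTP headers, and the noindex check is only meaningful when run on the HTML of the rel=canonical target.

```python
from html.parser import HTMLParser


class CanonicalAudit(HTMLParser):
    """Collects rel=canonical hrefs and records where each one appears."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.head_canonicals = []  # hrefs found inside <head>
        self.body_canonicals = []  # hrefs found anywhere else (misplaced)
        self.noindex = False       # True once a robots meta tag contains "noindex"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            self.in_head = True
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            bucket = self.head_canonicals if self.in_head else self.body_canonicals
            bucket.append(attrs.get("href"))
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            if "noindex" in (attrs.get("content") or "").lower():
                self.noindex = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit(html):
    """Return a list of human-readable problems found in one page's HTML."""
    parser = CanonicalAudit()
    parser.feed(html)
    problems = []
    if len(parser.head_canonicals) > 1:
        problems.append("more than one rel=canonical in <head>")
    if parser.body_canonicals:
        problems.append("rel=canonical outside <head>")
    if parser.noindex:
        problems.append("page has a noindex robots meta tag")
    return problems
```

Calling audit on a page with a single rel=canonical in its <head> and no noindex tag returns an empty list; anything else in the list deserves a manual look.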

Mistake 1: rel=canonical to the first page of a paginated series

Imagine that you have an article that spans several pages:

  • example.com/article?story=cupcake-news&page=1
  • example.com/article?story=cupcake-news&page=2
  • and so on

Specifying a rel=canonical from page 2 (or any later page) to page 1 is not a correct use of rel=canonical, because these are not duplicate pages. Using rel=canonical in this way would result in the content on pages 2 and beyond not being indexed at all.

Example of wrong rel-canonical markups.
Good content (for example, "cookies are superior nutrition" and "to vegetables") is lost when specifying rel=canonical from component pages to the first page of a series.
Example for annotating a page-series with rel-canonical that points to a single page with all the content of the series.
rel=canonical from component pages to the view-all page
Example for annotating pages with rel-canonical and the deprecated rel-prev-next annotations.
If rel=canonical to a view-all page isn't designated, paginated content can use rel="prev" and rel="next" markup.
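
The two options described above can be sketched as a small helper that builds the <link> tags for one component page. This is an illustration only: the names SERIES, page_url, and head_links are hypothetical, the URL reuses the cupcake-news example, and the view-all URL passed in is an assumption about how a site might expose such a page. Note that the figure description above already refers to the rel-prev-next annotations as deprecated, so the second branch reflects the advice in this post rather than a current recommendation.

```python
from html import escape

# Illustrative series URL taken from the example above.
SERIES = "https://example.com/article?story=cupcake-news"


def page_url(n):
    """URL of the n-th component page in the series."""
    return f"{SERIES}&page={n}"


def head_links(page, last_page, view_all_url=None):
    """Return the <link> tags for one component page.

    If a view-all page exists, every component page points its
    rel=canonical at it. Otherwise the page gets rel="prev" and
    rel="next" links, and never a rel=canonical to page 1.
    """
    if view_all_url is not None:
        return [f'<link rel="canonical" href="{escape(view_all_url)}" />']
    links = []
    if page > 1:
        links.append(f'<link rel="prev" href="{escape(page_url(page - 1))}" />')
    if page < last_page:
        links.append(f'<link rel="next" href="{escape(page_url(page + 1))}" />')
    return links
```

The escape call turns the & in the query string into &amp;, which is how an ampersand is written inside an HTML attribute value.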

Mistake 2: Absolute URLs mistakenly written as relative URLs

Example for incorrect rel-canonical markup: wrong relative URLs

The <link> tag, like many HTML tags, accepts both relative and absolute URLs. Relative URLs include a path "relative" to the current page. For example, images/cupcake.png means "from the current directory go to the images subdirectory, then to cupcake.png." Absolute URLs specify the full path, including the scheme, such as https://.

Specifying <link rel=canonical href="example.com/cupcake.html" /> (a relative URL since there's no https://) implies that the desired canonical URL is https://example.com/example.com/cupcake.html even though that is almost certainly not what was intended. In these cases, our algorithms may ignore the specified rel=canonical. Ultimately this means that whatever you had hoped to accomplish with this rel=canonical will not come to fruition.
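
To see how such an href resolves, the sketch below uses urljoin from Python's standard urllib.parse module, which follows the usual relative-URL resolution rules. The page URL https://example.com/red-velvet.html is an assumed location for the page carrying the tag, and looks_like_missing_scheme is a hypothetical, deliberately rough heuristic, not a description of how any search engine detects the problem.

```python
from urllib.parse import urljoin, urlsplit


def resolve_canonical(page_url, href):
    """Resolve a rel=canonical href against the URL of the page it appears on."""
    return urljoin(page_url, href)


def looks_like_missing_scheme(href):
    """Rough heuristic: a relative href whose first path segment resembles a hostname."""
    if urlsplit(href).scheme or href.startswith("/"):
        return False  # absolute, root-relative, or protocol-relative
    first_segment, slash, _rest = href.partition("/")
    return bool(slash) and "." in first_segment and first_segment not in (".", "..")


page = "https://example.com/red-velvet.html"  # assumed page location

print(resolve_canonical(page, "example.com/cupcake.html"))
# prints: https://example.com/example.com/cupcake.html

print(resolve_canonical(page, "https://example.com/cupcake.html"))
# prints: https://example.com/cupcake.html
```

Running a check like this over your templates before publishing is one way to catch a missing https:// early.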

Mistake 3: Unintended or multiple declarations of rel=canonical

Occasionally, we see rel=canonical designations that we believe are unintentional. In very rare circumstances we see simple typos, but more commonly a busy site owner copies a page template without thinking to change the target of the rel=canonical. Now the site owner's pages specify a rel=canonical to the template author's site.

Example for incorrect rel-canonical markup: incorrect URL
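
A copied template that still points at its author's site can be spotted by comparing hostnames. The following is a minimal sketch under the assumption that your canonical targets are normally on the same host as the page itself; the function name canonical_points_off_site and the host template-author.example are made up for illustration. A rel=canonical to another host can be intentional, so treat a True result as a prompt to review, not as proof of a mistake.

```python
from urllib.parse import urljoin, urlsplit


def canonical_points_off_site(page_url, href):
    """Return True if the rel=canonical target is on a different host than the page."""
    target = urljoin(page_url, href)  # resolve relative hrefs first
    return urlsplit(target).hostname != urlsplit(page_url).hostname
```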

If you use a template, check that you didn't also copy the rel=canonical sp