Monday, April 08, 2013
Including a rel=canonical link in your webpage is a strong hint to search engines about your preferred version to index among duplicate pages on the web. It's supported by several search engines, including Yahoo!, Bing, and Google. The rel=canonical link consolidates indexing properties from the duplicates, like their inbound links, and specifies which URL you'd like displayed in search results. However, rel=canonical can be a bit tricky because it's not very obvious when there's a misconfiguration.

While the webmaster sees the "red velvet" page on the left in their browser, search engines notice the webmaster's unintended "blue velvet" rel=canonical on the right.

We recommend the following best practices for using rel=canonical:
- A large portion of the duplicate page's content should be present on the canonical version.
- Double-check that your rel=canonical target exists (it's not an error or "soft 404").
- Verify the rel=canonical target doesn't contain a noindex robots meta tag.
- Make sure you'd prefer the rel=canonical URL to be displayed in search results (rather than the duplicate URL).
- Include the rel=canonical link in either the <head> of the page or the HTTP header.
- Specify no more than one rel=canonical for a page. When more than one is specified, all rel=canonical links will be ignored.
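
Some of the checks above can be automated. The following is a minimal sketch, not an official tool: it uses Python's standard html.parser to flag a page that declares more than one rel=canonical or places it outside the <head>, and to detect a noindex robots meta tag on the target. The names LinkAudit, duplicate_page_problems, and target_has_noindex are hypothetical, and the sketch covers neither the HTTP header variant nor whether the target actually exists.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Records rel=canonical links (and whether each sits in <head>) plus any robots noindex."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []   # (href, was_inside_head) pairs
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            rel_tokens = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_tokens:
                self.canonicals.append((attr.get("href"), self.in_head))
        elif tag == "meta" and (attr.get("name") or "").lower() == "robots":
            if "noindex" in (attr.get("content") or "").lower():
                self.noindex = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit(html_text):
    """Parse the markup and return the populated LinkAudit."""
    parser = LinkAudit()
    parser.feed(html_text)
    parser.close()
    return parser


def duplicate_page_problems(page_html):
    """Check the duplicate page: at most one rel=canonical, and only inside <head>."""
    found = audit(page_html).canonicals
    problems = []
    if len(found) > 1:
        problems.append("more than one rel=canonical")
    if any(not in_head for _, in_head in found):
        problems.append("rel=canonical outside <head>")
    return problems


def target_has_noindex(target_html):
    """Check the canonical target: True if it carries a noindex robots meta tag."""
    return audit(target_html).noindex
```

Run duplicate_page_problems on the duplicate page's markup and target_has_noindex on the markup of the page the rel=canonical points to. An empty list and False mean these particular checks passed.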
Mistake 1: rel=canonical to the first page of a paginated series
Imagine that you have an article that spans several pages:
- example.com/article?story=cupcake-news&page=1
- example.com/article?story=cupcake-news&page=2
- and so on
Specifying a rel=canonical from page 2 (or any later page) to page 1 is not a correct use of rel=canonical, as these are not duplicate pages. Using rel=canonical in this instance would result in the content on pages 2 and beyond not being indexed at all.

Incorrect: rel=canonical from component pages to the first page of a series.

Preferred: rel=canonical from component pages to the view-all page.

If rel=canonical to a view-all page isn't designated, paginated content can use rel="prev" and rel="next" markup.
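
As an illustration of the rel="prev" and rel="next" option, here is a rough sketch that builds the <link> tags for one component page of the cupcake-news series. The function pagination_links, the https:// scheme, and the assumption that the total page count is known are illustrative choices, not something the post specifies.

```python
from html import escape
from urllib.parse import urlencode


def pagination_links(base_url, story, page, last_page):
    """Return the rel="prev"/rel="next" <link> tags for one page of a paginated article."""

    def page_url(n):
        # Absolute URL of component page n, e.g. ...?story=cupcake-news&page=2
        return f"{base_url}?{urlencode({'story': story, 'page': n})}"

    tags = []
    if page > 1:
        # Every page except the first points back to its predecessor.
        tags.append(f'<link rel="prev" href="{escape(page_url(page - 1))}">')
    if page < last_page:
        # Every page except the last points forward to its successor.
        tags.append(f'<link rel="next" href="{escape(page_url(page + 1))}">')
    return tags


for tag in pagination_links("https://example.com/article", "cupcake-news", 2, 3):
    print(tag)
```

Note that no component page emits a rel=canonical pointing at page 1. Each page only links to its immediate neighbours, so the first page gets a "next" link only and the last page a "prev" link only.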
Mistake 2: Absolute URLs mistakenly written as relative URLs

The <link> tag, like many HTML tags, accepts both relative and absolute URLs. Relative URLs include a path "relative" to the current page. For example, images/cupcake.png means "from the current directory go to the images subdirectory, then to cupcake.png." Absolute URLs specify the full path, including the scheme like https://.

Specifying <link rel=canonical href="example.com/cupcake.html" /> (a relative URL since there's no https://) implies that the desired canonical URL is https://example.com/example.com/cupcake.html even though that is almost certainly not what was intended. In these cases, our algorithms may ignore the specified rel=canonical. Ultimately this means that whatever you had hoped to accomplish with this rel=canonical will not come to fruition.
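
The resolution described above can be reproduced with Python's urllib.parse.urljoin, which follows the standard rules for relative references. This is a small sketch: the page URL https://example.com/cupcake.html is an assumed location for the page carrying the tag, and looks_like_missing_scheme is a hypothetical heuristic, not how any search engine detects the mistake.

```python
from urllib.parse import urljoin, urlsplit


def resolve_canonical(page_url, href):
    """Resolve a rel=canonical href against the URL of the page that contains it."""
    return urljoin(page_url, href)


def looks_like_missing_scheme(href):
    """Heuristic: a relative path whose first segment looks like a hostname."""
    parts = urlsplit(href)
    if parts.scheme or parts.netloc:
        return False  # already absolute, or protocol-relative (//host/...)
    first_segment = parts.path.split("/", 1)[0]
    return "/" in parts.path and "." in first_segment


page = "https://example.com/cupcake.html"  # assumed page location

print(resolve_canonical(page, "example.com/cupcake.html"))
# https://example.com/example.com/cupcake.html

print(resolve_canonical(page, "https://example.com/cupcake.html"))
# https://example.com/cupcake.html

print(looks_like_missing_scheme("example.com/cupcake.html"))  # True
print(looks_like_missing_scheme("images/cupcake.png"))        # False
```

Writing the href as a full absolute URL, scheme included, avoids the ambiguity.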
Mistake 3: Unintended or multiple declarations of rel=canonical
Occasionally, we see rel=canonical designations that we believe are unintentional. In very rare circumstances we see simple typos, but more commonly a busy site owner copies a page template without thinking to change the target of the rel=canonical. Now the site owner's pages specify a rel=canonical to the template author's site.
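
One way to catch a leftover template target is to compare hostnames. This is a sketch under assumptions: canonical_points_off_site is a hypothetical helper and the two hostnames are made up. A canonical on a different host is not necessarily wrong, so a True result is a prompt for manual review, not proof of a mistake.

```python
from urllib.parse import urljoin, urlsplit


def canonical_points_off_site(page_url, canonical_href):
    """True if the resolved rel=canonical target is on a different host than the page."""
    page_host = urlsplit(page_url).hostname
    target_host = urlsplit(urljoin(page_url, canonical_href)).hostname
    return page_host != target_host


# Hypothetical hosts: the site owner's page and the template author's demo page.
page = "https://my-bakery.example/cupcakes.html"

print(canonical_points_off_site(page, "https://template-author.example/demo.html"))  # True
print(canonical_points_off_site(page, "https://my-bakery.example/cupcakes.html"))    # False
```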

If you use a template, check that you didn't also copy the rel=canonical
sp