Search engines fight duplication with canonical
Google, Microsoft and Yahoo are now providing a way for web site developers to specify a preferred URL for any piece of content on a web site. The problem has been that sites may, legitimately, serve the same content under different URLs. This causes problems for search engines, which cannot easily tell that these URLs are duplicates of one another. Now, a new type of "canonical" link reference allows a page to state the preferred URL for a search engine to use.
Google explains the process with a step-by-step example. In the example, they show some URLs that point to the same page:
http://www.example.com/product.php?item=swedish-fish (preferred URL)
http://www.example.com/product.php?item=swedish-fish&category=gummy-candy
http://www.example.com/product.php?item=swedish-fish&trackingid=1234&sessionid=5678
The differences in the examples are caused first by a category parameter and secondly by a tracking id and session id. To set the preferred URL, the page maintainer adds a link element, with the rel attribute set to "canonical" and the href attribute set to the preferred URL. The link element goes into the head section of the page. For the examples above this would be:
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
Google's search engine will then understand that the duplicate URLs all refer to http://www.example.com/product.php?item=swedish-fish and, says Google, additional URL properties such as PageRank and related information will be transferred to the preferred URL as well.
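As an illustration of how a site might generate the element for Google's example URLs, here is a minimal Python sketch. It assumes that only the item parameter identifies the content and that category, trackingid and sessionid can be dropped. The allow-list and the function names canonical_url and canonical_link_tag are hypothetical and do not come from any of the search engines' announcements.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only these query parameters identify the content itself.
# Everything else (category, tracking id, session id) is dropped.
CONTENT_PARAMS = {"item"}


def canonical_url(url, content_params=CONTENT_PARAMS):
    """Return the URL with only the content-identifying parameters kept."""
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query)
            if key in content_params]
    # Rebuild the URL without a fragment and with the reduced query string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the link element that goes into the head section of the page."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


if __name__ == "__main__":
    print(canonical_link_tag(
        "http://www.example.com/product.php"
        "?item=swedish-fish&trackingid=1234&sessionid=5678"))
```

A real site would choose its own set of content-identifying parameters. The point of the sketch is only that every variant of the URL ends up with the same href value.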
W3C (the World Wide Web Consortium) specifically provided the rel attribute in the link element for use by web developers to define relationships between pages for consumption by search engines.
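To show what consuming this relationship could look like, the following Python sketch reads the canonical URL out of a page's head section using the standard library's HTML parser. It illustrates the idea only and makes no claim about how any search engine actually processes pages. The names CanonicalFinder and find_canonical are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first rel="canonical" link inside head."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head and self.canonical is None:
            attributes = dict(attrs)
            # rel may hold several space-separated values.
            rel_values = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.canonical = attributes.get("href")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html_text):
    """Return the canonical URL declared in the page, or None."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


if __name__ == "__main__":
    page = ('<html><head><link rel="canonical" '
            'href="http://www.example.com/product.php?item=swedish-fish" />'
            '</head><body></body></html>')
    print(find_canonical(page))
```

The sketch ignores link elements outside the head section, in line with the placement described above.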
More details of how canonical will be used by the search engines can be found in Yahoo's announcement, Microsoft's announcement and, of course, Google's announcement.
Matt Cutts, a Google engineer, also published a presentation on the link element and pointed out that canonical plug-ins have already appeared for WordPress, Magento and Drupal.
(crve)