Page cloaking can broadly be defined as a technique used to deliver different
web pages under different circumstances. There are two primary reasons that
people use page cloaking:

i) It allows them to create a separate optimized page for each search engine and
another page which is aesthetically pleasing and designed for their human
visitors. When a search engine spider visits the site, the page which has been
optimized for that search engine is delivered to it. When a human visits the
site, the page which was designed for human visitors is shown. The primary
benefit is that human visitors never need to see the pages optimized for the
search engines, which may not be aesthetically pleasing and may contain an
over-repetition of keywords.

ii) It allows them to hide the source code of the optimized pages that they have
created, and hence prevents their competitors from copying that source code.

Page cloaking is implemented using specialized cloaking scripts. A cloaking
script is installed on the server and detects whether a search engine or a human
being is requesting a page. If a search engine is requesting the page, the
cloaking script delivers the page which has been optimized for that search
engine. If a human being is requesting the page, the cloaking script delivers
the page which has been designed for humans.

There are two primary ways by which the cloaking script can detect whether a
search engine or a human being is visiting a site:

i) The first and simplest way is by checking the User-Agent variable. Each time
a client (be it a search engine spider or a browser being operated by a human)
requests a page from a site, it reports a User-Agent name to the site in the
User-Agent header of the HTTP request. Generally, if a search engine spider
requests a page, the User-Agent variable contains the name of the search engine.
Hence, if the cloaking script detects that the User-Agent variable contains the
name of a search engine, it delivers the page which has been optimized for that
search engine. If the cloaking script does not detect the name of a search
engine in the User-Agent variable, it assumes that the request has been made by
a human being and delivers the page which was designed for human beings.
However, while this is the simplest way to implement a cloaking script, it is
also the least safe. The User-Agent value is supplied by whoever makes the
request, so it is easy to fake, and hence someone who wants to see the optimized
pages that are being delivered to different search engines can easily do so.

ii) The second and more complicated way is to use I.P. (Internet Protocol) based
cloaking. This involves the use of an I.P. database which contains a list of the
I.P. addresses of all known search engine spiders. When a visitor (a search
engine or a human) requests a page, the cloaking script checks the I.P. address
of the visitor. If the I.P. address is present in the I.P. database, the
cloaking script knows that the visitor is a search engine and delivers the page
optimized for that search engine. If the I.P. address is not present in the I.P.
database, the cloaking script assumes that a human has requested the page, and
delivers the page which is meant for human visitors.
Although more complicated than User-Agent based cloaking, I.P. based cloaking is
more reliable and safe because it is very difficult for a visitor to fake the
I.P. address from which a request arrives.
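
To make the two detection methods described above concrete, here is a minimal
sketch in Python of the checks such a script might perform. It is only an
illustration under stated assumptions, not the code of any actual cloaking
product: the spider names, the address range and the function names are all
hypothetical, and the address range is one reserved for documentation.

```python
import ipaddress

# Hypothetical spider name fragments, invented for this illustration.
SPIDER_NAME_FRAGMENTS = ("examplebot", "samplespider")

# Hypothetical "I.P. database": a single range reserved for documentation.
SPIDER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_spider_by_user_agent(user_agent: str) -> bool:
    """Return True if the User-Agent value contains a listed spider name."""
    lowered = user_agent.lower()
    return any(fragment in lowered for fragment in SPIDER_NAME_FRAGMENTS)


def is_spider_by_ip(remote_addr: str) -> bool:
    """Return True if the visitor's address lies inside a listed network.

    Raises ValueError if remote_addr is not a valid I.P. address.
    """
    address = ipaddress.ip_address(remote_addr)
    return any(address in network for network in SPIDER_NETWORKS)
```

Note that the first check trusts a value which the visitor supplies, while the
second relies on the address the connection came from and is only as good as
the list of addresses behind it.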

Now that you have an idea of what cloaking is all about and how it is
implemented, the question arises as to whether you should use page cloaking. The
one-word answer is "NO". The reason is simple: the search engines don't like it,
and will probably ban your site from their index if they find out that your site
uses cloaking. The search engines don't like page cloaking because it prevents
them from spidering the same page that their users are going to see. If they
cannot do so, they cannot be confident of delivering relevant results to their
users. In the past, many people have created optimized pages for some highly
popular keywords and then used page cloaking to take people to their real sites,
which had nothing to do with those keywords. If the search engines allowed this
to happen, they would suffer because their users would abandon them and go to
another search engine which produced more relevant results.

Of course, a question arises as to how a search engine can detect whether or not
a site uses page cloaking. There are three ways by which it can do so:

i) If the site uses User-Agent cloaking, the search engine can simply send the
site a spider which does not report the name of the search engine in its
User-Agent variable. If the search engine sees that the page delivered to this
spider is different from the page which is delivered to a spider which does
report the name of the search engine in the User-Agent variable, it knows that
the site has used page cloaking.

ii) If the site uses I.P. based cloaking, the search engine can send a spider
from an I.P. address which it has never used before. Since this is a new I.P.
address, the I.P. database that is used for cloaking will not contain it. If the
search engine detects that the page delivered to the spider with the new I.P.
address is different from the page that is delivered to a spider with a known
I.P. address, it knows that the site has used page cloaking.

iii) A human representative from a search engine may visit a site to see whether
it uses cloaking. If she sees that the page which is delivered to her is
different from the one which was delivered to the search engine spider, she
knows that the site uses cloaking.
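
All three checks above come down to the same comparison: obtain the page under
two different identities and see whether the two copies match. The Python
sketch below shows only that comparison step, under some simplifying
assumptions. The function names are hypothetical, fetching the pages is left
out, and it does not claim to be how any search engine actually works. In
particular, a real page may legitimately differ between requests (dates, ads,
session data), so an exact comparison like this one would need to be loosened
in practice.

```python
import hashlib
import re


def normalize(html: str) -> str:
    """Collapse runs of whitespace and ignore case, so that trivial
    formatting differences do not count as a different page."""
    return re.sub(r"\s+", " ", html).strip().lower()


def fingerprint(html: str) -> str:
    """Return a SHA-256 digest of the normalized page."""
    return hashlib.sha256(normalize(html).encode("utf-8")).hexdigest()


def looks_cloaked(page_for_spider: str, page_for_visitor: str) -> bool:
    """Return True if the two copies of the page differ after normalization."""
    return fingerprint(page_for_spider) != fingerprint(page_for_visitor)
```

Here page_for_spider stands for the copy received by the identified spider, and
page_for_visitor for the copy received by the anonymous spider, the spider with
the new I.P. address, or the human reviewer.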

Hence, when it comes to page cloaking, my advice is simple: don't even think
about using it.
Article by Sumantra Roy. Sumantra is one of the
most respected search engine positioning specialists on the Internet. To have
Sumantra's company place your site at the top of the search engines, go to
http://www.1stSearchRanking.com/