Offshore outsourcing, IT services, Software development India

Keeping Content Out Of Google’s Search Results From the Start

Ideally, information that you don’t want in Google’s index won’t end up there at all. The best ways to ensure this are:

* Require a login to access the information

This method, of course, not only keeps Google out, but also ensures that only the people you authorize can view the content. You would use this method, for instance, to keep personal information like credit card and Social Security numbers private and to manage access to premium content.

* Use the Robots Exclusion Protocol to block search engines from crawling and/or indexing the content

You can block content using a robots.txt file, a robots meta tag, or an X-Robots-Tag HTTP response header. A Disallow statement in the robots.txt file keeps Googlebot from crawling the page, although the URL itself may still end up indexed, for example when other pages link to it. A noindex robots meta tag on the page allows Googlebot to crawl the page, but keeps Google from indexing the page contents or displaying the URL in the search results. (Note that although Google has unofficially followed the Noindex directive in robots.txt, it has never stated official support for it, so it’s not a reliable way to keep content out of Google search results.)
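As a rough illustration of the blocking options above, the sketch below shows what each one might look like and uses Python's standard urllib.robotparser module to check a Disallow rule. The /private/ path, the example.com URLs, and the helper name is_crawl_allowed are assumptions made for this example, not part of the original article. Note that the check only answers whether crawling is permitted; it says nothing about whether the URL ends up indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every crawler to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Page-level alternative: a robots meta tag placed in the page's <head>.
NOINDEX_META_TAG = '<meta name="robots" content="noindex">'

# Header-level alternative, sent with the HTTP response
# (handy for non-HTML files such as PDFs).
NOINDEX_HEADER = ("X-Robots-Tag", "noindex")


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Example URLs are placeholders chosen for this sketch.
    for url in (
        "http://example.com/private/report.html",
        "http://example.com/index.html",
    ):
        print(url, is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

The Disallow rule and the noindex signals should generally not be combined on the same page: if Googlebot is not allowed to crawl a page, it cannot see a noindex meta tag or header on it.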

Source URL: http://searchengineland.com/removing-pages-from-google-53086

Offshore outsourcing India

Monday, November 8, 2010

Removing website pages from Google: Keeping some pages confidential
