Robots.txt

Robots.txt is a plain text file placed in the root directory of a website. Search engine spiders and other internet robots (bots) read it as a configuration file that tells them which parts of the site you want crawled and indexed and which parts you don’t. Well-behaved robots read the “/robots.txt” file and follow its instructions, but compliance is voluntary: ‘less compliant’ programs may ignore the file completely.
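
As a rough illustration of where a robot looks for the file, this small Python sketch derives the robots.txt address from any page address on the same site. The function name robots_url and the example.com address are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post.php?id=1"))
# https://example.com/robots.txt
```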

Here are a few examples of robots.txt files (plain text):

Ask all search engines NOT to index or follow links anywhere on the website:

#asks all search engines to NOT index and NOT follow any pages or links on the entire website
User-agent: *
Disallow: /

Allow all search engines to index and follow links on the entire website by disallowing nothing:

#allows search engines to index and follow all pages and links on the entire website by Disallowing nothing
User-agent: *
Disallow:
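
To check how a compliant robot might read the two files above, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name allowed, the bot name ExampleBot and the example.com address are placeholders chosen for this example, and real crawlers may interpret rules somewhat differently.

```python
from urllib.robotparser import RobotFileParser

BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def allowed(rules: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse robots.txt text and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


print(allowed(BLOCK_ALL, "https://example.com/page.php"))  # False
print(allowed(ALLOW_ALL, "https://example.com/page.php"))  # True
```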

Disallow specific folders and files from being indexed and followed:

User-agent: *
Disallow: /uploads/ # this folder may contain secure, private, cached or temporary files, so disallow the entire folder from being indexed
Disallow: /tmp/ # this folder may contain cached or temporary files, so disallow the entire folder from being indexed
Disallow: /page.php
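
The rules above are matched against the start of each requested path. The following Python sketch, again using the standard urllib.robotparser module, shows which paths a compliant robot would skip. The bot name ExampleBot, the example.com address and the sample paths are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /uploads/
Disallow: /tmp/
Disallow: /page.php
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str) -> bool:
    """Report whether the hypothetical ExampleBot may fetch the given path."""
    return parser.can_fetch("ExampleBot", "https://example.com" + path)


print(allowed("/uploads/photo.jpg"))  # False: inside a disallowed folder
print(allowed("/tmp/cache.dat"))      # False: inside a disallowed folder
print(allowed("/page.php"))           # False: the file itself is disallowed
print(allowed("/about.html"))         # True: no rule matches this path
```

Note that a rule such as "Disallow: /uploads/" only covers paths that begin with that exact text, so this parser would still allow a request for "/uploads" without the trailing slash.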


Also see:

  1. Robots Meta Tag
  2. .htaccess and mod_rewrite
