A robots.txt file is useful for any web site. It tells web robots (crawlers, such as search engine spiders) which parts of the site they may visit. The instructions in the robots.txt file are written for the robots, not for human visitors. This convention is called the Robots Exclusion Protocol.
When a robot wants to visit www.abc.com/index.html, it first checks http://www.abc.com/robots.txt. If /index.html is allowed in robots.txt, the robot goes on to fetch that page; otherwise a well-behaved robot does not visit it. The file is only a set of instructions for robots, so people can still open the page in a browser. A robots.txt file looks like this:
User-agent: *
Disallow: /search
Allow: /search/about
Disallow: /sdch
Disallow: /groups
Disallow: /index.html?
Disallow: /?
Allow: /?hl=
Disallow: /?hl=*&
Allow: /?hl=*&gws_rd=ssl$
Disallow: /?hl=*&*&gws_rd=ssl
Allow: /?gws_rd=ssl$
Allow: /?pt1=true$
Disallow: /imgres
Disallow: /u/
Disallow: /preferences
Disallow: /setprefs
Disallow: /default
Disallow: /m?
Disallow: /m/
Allow: /m/finance
Disallow: /wml?
Disallow: /wml/?
Disallow: /wml/search?
Disallow: /xhtml?
Disallow: /xhtml/?
Disallow: /xhtml/search?
Disallow: /xml?
Disallow: /imode?
Disallow: /imode/?
Disallow: /imode/search?
Disallow: /jsky?
Disallow: /jsky/?
Disallow: /jsky/search?
Disallow: /pda?
Disallow: /pda/?
Disallow: /pda/search?
Disallow: /sprint_xhtml
Disallow: /sprint_wml
Disallow: /pqa
Disallow: /palm
Disallow: /gwt/
Disallow: /purchases
Disallow: /local?
Disallow: /local_url
Disallow: /shihui?
Disallow: /shihui/
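As a rough illustration of the check described above, here is a minimal Python sketch that uses the standard library's urllib.robotparser. The rule set, the "MyBot" robot name and the helper name is_allowed are assumptions made for this example, not part of any real site. The rules are kept to plain path prefixes on purpose, because different parsers may resolve overlapping Allow/Disallow lines and wildcard patterns differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical, simplified rule set (plain path prefixes only).
RULES = """\
User-agent: *
Disallow: /groups
Disallow: /imgres
"""


def is_allowed(rules_text, user_agent, url):
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines,
    # so no network request is made here.
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)


# "MyBot" is a made-up robot name used only for illustration.
print(is_allowed(RULES, "MyBot", "http://www.abc.com/index.html"))  # True
print(is_allowed(RULES, "MyBot", "http://www.abc.com/groups"))      # False
```

A real robot would normally download the file first (for example with the parser's set_url() and read() methods) and then ask can_fetch() before each request.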
Create a /robots.txt file on your web site
Where to put it
At the top-level directory of your web server.
For example, for "http://www.abc.com/home/index.html", a robot removes "/home/index.html", replaces it with "/robots.txt", and ends up with "http://www.abc.com/robots.txt".
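The same URL rewriting can be sketched in a few lines of Python. The function name robots_url is an assumption made for this example; it keeps the scheme and host of the page URL and swaps the path for "/robots.txt".

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Build the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, drop the path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.abc.com/home/index.html"))
# http://www.abc.com/robots.txt
```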
What to put in it
The "/robots.txt" file is a text file with one or more records. It usually contains a single record, such as:
User-agent: *
Disallow: /search
Allow: /search/about
Disallow: /sdch
Disallow: /groups
Disallow: /index.html?
Disallow: /?
Allow: /?hl=
Disallow: /?hl=*&
Allow: /?hl=*&gws_rd=ssl$
Disallow: /?hl=*&*&gws_rd=ssl
Allow: /?gws_rd=ssl$
Allow: /?pt1=true$
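To make the idea of a "record" more concrete, here is a small Python sketch that splits robots.txt text into records, each with its User-agent lines and its Allow/Disallow rules. The function name parse_records, the dictionary layout and the "ExampleBot" robot name are assumptions made for this illustration. It only groups lines and does not decide whether a URL is allowed.

```python
def parse_records(text):
    """Split robots.txt text into records of user agents and rules."""
    records = []
    current = None
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip().lower()
        value = value.strip()
        if field == "user-agent":
            # A User-agent line that follows rules starts a new record.
            if current is None or current["rules"]:
                current = {"agents": [], "rules": []}
                records.append(current)
            current["agents"].append(value)
        elif field in ("allow", "disallow") and current is not None:
            current["rules"].append((field, value))
    return records


sample = """\
User-agent: *
Disallow: /search
Allow: /search/about

User-agent: ExampleBot
Disallow: /
"""

for record in parse_records(sample):
    print(record["agents"], record["rules"])
```

With the sample text above, the function returns two records: one for "*" with two rules, and one for the made-up "ExampleBot" with a single rule.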