Robots.txt Optimization
Let me give you an example: you may not want Google to crawl the /images directory of your site, as it’s both meaningless to you and a waste of your site’s bandwidth. “Robots.txt” lets you tell Google just that, using a simple text file.
Let’s start with the optimization process. Create a regular text file called “robots.txt”, and make sure it’s named exactly that. This file must be uploaded to the root accessible directory of your site, not a subdirectory. The format is simple enough for most intents and purposes: a User-agent: line to identify the crawler in question, followed by one or more Disallow: lines to disallow it from crawling certain parts of your site.
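Since crawlers only look for the file at the root of the site, it can help to work out where that is for any given page. Here is a minimal Python sketch of that idea; the function name robots_txt_url and the example.com address are made up for illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page URL, used only to show the result.
print(robots_txt_url("https://www.example.com/tutorials/blank.htm?page=2"))
# prints "https://www.example.com/robots.txt"
```

Whatever page you start from, the answer is always the same file directly under the domain, which is why a "robots.txt" placed in a subdirectory is never read.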
1) Here's a basic "robots.txt":
User-agent: *
Disallow: /
With the above declared, all robots (indicated by "*") are instructed not to crawl any of your pages (indicated by "/").
2) Let's get a little more selective now. While every webmaster loves Google, you may not want Google's Image bot crawling your site's images and making them searchable online, if only to save bandwidth. The declaration below will do the trick:
User-agent: Googlebot-Image
Disallow: /
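If you'd like to check a rule like this before uploading it, one option is Python's standard urllib.robotparser module. This is only a rough sketch: the image path is invented, and Python's parser is a simplified matcher, so real crawlers may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# The same declaration as above, fed to the parser as a list of lines.
rules = """\
User-agent: Googlebot-Image
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "/images/logo.png" is a hypothetical path used for the check.
image_bot_allowed = parser.can_fetch("Googlebot-Image", "/images/logo.png")
web_bot_allowed = parser.can_fetch("Googlebot", "/images/logo.png")

print(image_bot_allowed, web_bot_allowed)  # prints "False True"
```

The image bot is turned away, while the regular Googlebot is unaffected because no group in the file names it.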
3) The following disallows all search engines and robots from crawling selected directories and pages:
User-agent: *
Disallow: /cgi-bin/
Disallow: /privatedir/
Disallow: /tutorials/blank.htm
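To see which paths those three lines actually cover, here is a small sketch using Python's urllib.robotparser. The bot name "ExampleBot" and the test paths other than /tutorials/blank.htm are assumptions made up for the demonstration:

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /privatedir/
Disallow: /tutorials/blank.htm
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(path: str) -> bool:
    """Check a path for a generic crawler (hypothetical name 'ExampleBot')."""
    return parser.can_fetch("ExampleBot", path)


# Disallow lines match by path prefix, so anything under /cgi-bin/ is blocked.
results = {
    "/cgi-bin/search.pl": allowed("/cgi-bin/search.pl"),
    "/tutorials/blank.htm": allowed("/tutorials/blank.htm"),
    "/tutorials/index.htm": allowed("/tutorials/index.htm"),
    "/index.html": allowed("/index.html"),
}
for path, ok in results.items():
    print(path, "allowed" if ok else "blocked")
```

Only the single page /tutorials/blank.htm is blocked inside /tutorials/; the rest of that directory stays crawlable.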
4) You can give different robots different rules in "robots.txt". Take a look at the example below:
User-agent: *
Disallow: /
User-agent: Googlebot
Disallow: /cgi-bin/
Disallow: /privatedir/
This is interesting: here we declare that crawlers in general should not crawl any part of our site, except for Google, which is allowed to crawl the entire site apart from /cgi-bin/ and /privatedir/. So the rules of specificity apply, not inheritance: a crawler follows only the most specific group that matches its name and ignores the others.
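The specificity behaviour is easy to confirm with Python's urllib.robotparser. Treat this as a sketch rather than a guarantee of how every crawler behaves; "ExampleBot" and the test paths are placeholders, and a blank line is added between the two groups purely for readability:

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /cgi-bin/
Disallow: /privatedir/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot matches its own group, so the catch-all "Disallow: /" is ignored.
google_home = parser.can_fetch("Googlebot", "/index.html")
google_cgi = parser.can_fetch("Googlebot", "/cgi-bin/search.pl")

# Any other crawler falls back to the "*" group and is blocked everywhere.
other_home = parser.can_fetch("ExampleBot", "/index.html")

print(google_home, google_cgi, other_home)  # prints "True False False"
```

Notice that Googlebot does not inherit the "Disallow: /" from the "*" group. If it did, the first check would come back False.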