Solving "URL restricted by robots.txt" errors
Our site is brand new, and we have been carefully watching the Googlebot crawl statistics from day one. Lately we have been noticing a lot of crawl errors that say:
"Restricted by robots.txt. Detail: URL restricted by robots.txt"
Strangely, every page listed in the errors is one we deliberately blocked in robots.txt. Here is a sample from our robots.txt file:
Disallow: /signup.aspx
Disallow: /SignUp.aspx
Disallow: /managerecipe.aspx
Disallow: /managerestaurant.aspx
Disallow: /managebrand.aspx
Disallow: /manage.aspx
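As a side note, Disallow rules only take effect when they sit under a User-agent line, which the excerpt above omits. A minimal version of the full file, assuming we want the rules to apply to all crawlers, would look like this:

User-agent: *
Disallow: /signup.aspx
Disallow: /SignUp.aspx
Disallow: /managerecipe.aspx
Disallow: /managerestaurant.aspx
Disallow: /managebrand.aspx
Disallow: /manage.aspx

Note that robots.txt paths are case-sensitive, which is why both /signup.aspx and /SignUp.aspx are listed.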
But Google still discovered those URLs through links on our other content pages, and the attempted crawls showed up as errors.
After reviewing the content pages, we identified that every content page has a link to signup.aspx, and that is why Google ended up trying to crawl it.
Luckily, Google provides a way to tell its crawler not to follow certain links: adding rel="nofollow" to the HTML hyperlinks, in the format shown below.
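The URL and anchor text here are just illustrative; the key part is the rel="nofollow" attribute on the anchor tag:

<a href="/signup.aspx" rel="nofollow">Sign Up</a>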
We added rel="nofollow" to the signup links on all pages, and Google now seems happy; the errors are no longer appearing in the crawl report.
Let us know your comments as well.