Sunday, July 11, 2010

Solve URL restricted by robots.txt errors

Our site is brand new and we have been carefully watching the Googlebot crawl statistics from day one. Lately we have been noticing a lot of crawl errors that say:

"Restricted by Robots.txt Detail URL Restricted by robots.txt"

This is strange, because it is clear that all the pages listed in the errors are deliberately blocked in robots.txt. Here is a sample from our robots.txt file:

Disallow: /signup.aspx
Disallow: /SignUp.aspx

Disallow: /managerecipe.aspx
Disallow: /managerestaurant.aspx
Disallow: /managebrand.aspx
Disallow: /manage.aspx
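The rules above can be sanity-checked with Python's standard-library `urllib.robotparser` module. This is just a sketch: the `User-agent: *` line and the path checks are assumptions for illustration, not copied from our real file.

```python
from urllib import robotparser

# Parse a robots.txt sample similar to the one above and check
# whether Googlebot is allowed to fetch particular paths.
rp = robotparser.RobotFileParser()
rp.parse("""
User-agent: *
Disallow: /signup.aspx
Disallow: /SignUp.aspx
Disallow: /managerecipe.aspx
Disallow: /managerestaurant.aspx
Disallow: /managebrand.aspx
Disallow: /manage.aspx
""".splitlines())

# A disallowed path is reported as not fetchable...
print(rp.can_fetch("Googlebot", "/signup.aspx"))   # False
# ...while a path with no matching Disallow rule is fetchable.
print(rp.can_fetch("Googlebot", "/default.aspx"))  # True
```

So the rules themselves are working as intended; the "errors" come from Google discovering the blocked URLs, not from a broken robots.txt.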

But Google still discovered these URLs through links on other content pages and ended up reporting them as errors. After reviewing the content pages, we identified that every content page has a link to signup.aspx, and that is why Googlebot kept trying to crawl it.

Luckily, there is a way to tell Google not to follow certain links: add rel="nofollow" to the HTML hyperlink.
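For example, a signup link marked so that crawlers do not follow it looks like this (the "Sign Up" anchor text is an assumption for illustration; the href matches the blocked page above):

```html
<!-- rel="nofollow" tells crawlers not to follow this link -->
<a href="/signup.aspx" rel="nofollow">Sign Up</a>
```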

We added rel="nofollow" to the signup links on all pages, and Google now seems happy: we are no longer seeing these errors in the crawl reports.

Let us know your comments as well
