Question : Google URL restricted by robots.txt
Hello, I have URLs restricted by my robots.txt. One of the restricted URLs is
http://www.mysite.com/ignored_by.asp?ding=606
This seems to work, but as you can see it's a dynamic link, and in Google's Webmaster Tools I have requested that it ignore /ignored_by.asp. My question is: how can I have it ignore all of /ignored_by.asp along with the dynamic content after the .asp?
Can I do something like /ignored_by.asp*, or is there something else I could do?
regards
k
Answer : Google URL restricted by robots.txt
Googlebot DOES support wildcards.
So you can do something like this for Googlebot:
Disallow: /cgi-bin/somescript.cgi?*
Make your Disallow rule as specific as possible, because an overly broad rule can block pages you want indexed and damage your rankings. (I'm sure that's a silly thing to mention, but after 15 years in the industry, checking whether my LAN cables are plugged in is sometimes still the last thing I do.)
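For the asker's specific case, a minimal robots.txt sketch might look like this (assuming Googlebot is the crawler you want to target; note that Disallow rules already match by URL prefix, so the plain rule alone would also cover the query-string variants, and the trailing wildcard mostly makes the intent explicit):

```
# Block Googlebot from ignored_by.asp and any query-string variant of it
User-agent: Googlebot
Disallow: /ignored_by.asp*
```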
For more info, go here:
http://groups.google.com/group/Google_Webmaster_Help-Tools/browse_thread/thread/70a9141e647c0131/4b4e021ac7787284?lnk=gst&q=wildcards+googlebot+#4b4e021ac7787284
Please keep in mind that wildcards are not part of the original robots.txt protocol, so this is non-standard. Yahoo's bot (Slurp) supports wildcards too, though, so even though it's nonstandard, the engines behind roughly 90% of web searches have you covered. I'm sure it'll work great.
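If you want to sanity-check a rule before deploying it, here is a small Python sketch of the Googlebot-style matching behavior described above (prefix match by default, `*` matching any run of characters, `$` anchoring the end). The `googlebot_rule_matches` helper is hypothetical, written for illustration only, and is not part of any library:

```python
import re

def googlebot_rule_matches(pattern: str, path: str) -> bool:
    """Approximate Googlebot-style Disallow matching (hypothetical helper).

    '*' matches any sequence of characters, a trailing '$' anchors the
    match to the end of the path, and otherwise the pattern matches as
    a prefix of the URL path.
    """
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"   # restore the end anchor
    return re.match(regex, path) is not None

# The asker's case: the wildcard rule catches the dynamic query string.
print(googlebot_rule_matches("/ignored_by.asp*", "/ignored_by.asp?ding=606"))  # True
print(googlebot_rule_matches("/ignored_by.asp*", "/other.asp"))                # False
# Prefix matching means even the plain rule covers query-string variants.
print(googlebot_rule_matches("/ignored_by.asp", "/ignored_by.asp?ding=606"))   # True
```

Note that Python's standard `urllib.robotparser` follows the original robots.txt spec and does not understand these wildcard extensions, which is why a separate check like this can be useful.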