Question : Google URL restricted by robots.txt
Hello,
I have URLs restricted by my robots.txt. One of the restricted URLs is
http://www.mysite.com/ignored_by.asp?ding=606
This seems to work, but as you can see it is a dynamic link, and in Google's Webmaster Tools I have requested that it ignore /ignored_by.asp.
My question is: how can I have it ignore /ignored_by.asp along with any dynamic content after the .asp?
Can I do something like /ignored_by.asp*, or is there something else I could do?
regards
k
Answer : Google URL restricted by robots.txt
Googlebot DOES support wildcards.
So, you can do something like this for Googlebot:
Disallow: /cgi-bin/somescript.cgi?*
Make sure you make your disallow as specific as possible, because you don't want to damage your rankings. (I'm sure it's a silly thing to mention, but after working for 15 years in the industry, checking whether my LAN cables are plugged in is sometimes the last thing I do.)
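If you want to sanity-check how specific a wildcard rule is before publishing it, here is a minimal Python sketch. It assumes the commonly described Googlebot behaviour: a rule is matched from the start of the URL path, and `*` stands for any run of characters. The function name `rule_matches` is made up for illustration, and this is not Google's actual matcher.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a wildcard Disallow pattern covers the given URL path.

    Assumed semantics (not taken from any official implementation):
    the pattern is matched from the start of the path, and '*' matches
    any sequence of characters, including an empty one.
    """
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # re.match anchors at the start only, which gives prefix behaviour.
    return re.match(regex, path) is not None

# The pattern requires a query string, so the bare script is left alone.
print(rule_matches("/ignored_by.asp?*", "/ignored_by.asp?ding=606"))
print(rule_matches("/ignored_by.asp?*", "/ignored_by.asp"))
```

Under those assumptions the first call prints True and the second prints False, which shows how adding the `?` narrows the rule to dynamic URLs only.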
For more info, go here:
http://groups.google.com/group/Google_Webmaster_Help-Tools/browse_thread/thread/70a9141e647c0131/4b4e021ac7787284?lnk=gst&q=wildcards+googlebot+#4b4e021ac7787284
Please keep in mind that this is not part of the original robots.txt protocol, so it's non-standard. Yahoo's bot (Slurp) supports wildcards too. So even though it's non-standard, you've got your wishes covered in about 90% of web searches. I'm sure it'll work great.
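For bots that only follow the original protocol, a Disallow value is treated as a plain path prefix, so a rule without any wildcard should already cover the query-string variants. Here is a small check using Python's standard `urllib.robotparser`, which does prefix matching and does not interpret `*` inside paths. The robots.txt lines below are a hypothetical example built from the path in the question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, using the path from the question.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /ignored_by.asp",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse in-memory lines instead of fetching a URL

def is_allowed(url: str, agent: str = "Googlebot") -> bool:
    """Ask the standard-library parser whether `agent` may fetch `url`."""
    return parser.can_fetch(agent, url)

print(is_allowed("http://www.mysite.com/ignored_by.asp?ding=606"))  # False
print(is_allowed("http://www.mysite.com/index.asp"))                # True
```

Note that a bare prefix like this would also block any other path that starts with /ignored_by.asp, which is another reason to keep the rule as specific as you can.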