"smart" access for bots and spiders

florin_andrei

Joined: 2004-01-11
Posts: 79
Posted: Thu, 2007-08-09 00:08

I'm seeing a lot of noise from search bots and spiders in the Web logs. I am not opposed to search engines indexing my G2 site, but they seem to spend far too much time on it. Live.com and Yahoo appear to be the worst offenders: they're crawling the slideshow links, PrintPhoto and whatnot.

What I would like to do is tell the bots - "hey, index the main pages for each album, grab the text that's under each image, feel free to grab the image too if you want, but that's it - and by the way, stop being so freakin' obsessive-compulsive".

So, the comments that I put under each image are OK for indexing, and if the search engine can associate the image with the comments, that's fine. But that should be it. It makes no sense for a spider to crawl the Search links on G2.

I guess what I'm looking for is a clever robots.txt file or something like that. I'm kind of thinking that I'm not the first one to have this problem.
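
To make it concrete, here is a rough sketch of the kind of thing I have in mind, not a tested solution. It builds a robots.txt and checks a few URLs against it with Python's standard urllib.robotparser. The /gallery2/main.php path comes from my site URL; the query prefixes in SKIP_PREFIXES are my guesses at how G2 names the slideshow, search and print links, and the Crawl-delay value is arbitrary.

```python
from urllib.robotparser import RobotFileParser

# Path of the Gallery 2 entry point on my site.
BASE = "/gallery2/main.php"

# ASSUMPTION: guessed query-string prefixes for the links bots should skip.
# The real names depend on the installed G2 modules and on URL rewriting.
SKIP_PREFIXES = [
    "g2_view=slideshow.",
    "g2_view=search.",
    "g2_controller=photoaccess.",
]


def build_robots_txt(base=BASE, skip_prefixes=SKIP_PREFIXES, crawl_delay=10):
    """Return robots.txt text that slows bots down and blocks the noisy links."""
    lines = ["User-agent: *", f"Crawl-delay: {crawl_delay}"]
    for prefix in skip_prefixes:
        # Plain prefix rules only; wildcards are not honored by every crawler.
        lines.append(f"Disallow: {base}?{prefix}")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, url, agent="*"):
    """Check a URL against the rules the way a prefix-matching bot would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    rules = build_robots_txt()
    print(rules)
    for url in (
        "http://example.org/gallery2/main.php?g2_itemId=42",
        "http://example.org/gallery2/main.php?g2_view=slideshow.Slideshow&g2_itemId=42",
    ):
        print(url, "->", "allowed" if is_allowed(rules, url) else "blocked")
```

Two caveats I already know about: the file has to be served from the site root (/robots.txt), not from /gallery2/, and prefix rules like these only match when the g2_view parameter comes first in the query string. Crawl-delay is also a non-standard directive, so not every bot will respect it.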

Anybody?

--
Florin Andrei
http://florin.myip.org/gallery2/