> > Naturally, it's not hard to spot the Google spider, and give it
> > access to an abridged PDF (ya don't want the lot ending up in the
> > cache!). AKA 'doorway' pages. Google takes a dim view of this,
> > and banishes those it catches. I'm not sure how it catches them
> > though; a second spider disguised as a browser?
>
> It really must be.
>
> Search for any of my pages, and a decent number will have weird
> crap in the google results from my "put up the webserver's logs"
> background.
>
> For a while I had some code that would detect the google spider
> and simply disable that stuff. But I noticed that every new page I
> put up would work... then about a week or two later that log crap
> would show up again.
>
> Google definitely has second spiders.

Why don't you use robots.txt like you're supposed to (see the
robots.txt sketch below)? That's exactly the sort of thing that gets
you kicked out of Google. Serving up a different result to the Google
spider than what a browser would see means you're trying to rig the
system. Browsers see spam, spider sees keywords. Tsk, naughty!

Anyway, I doubt the spider runs Javascript, so it may not have even
noticed unless you were doing it server-side.

Tony

--
http://www.piclist.com PIC/SX FAQ & list archive
View/change your membership options at
http://mailman.mit.edu/mailman/listinfo/piclist
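
The detection code described above presumably boiled down to a
User-Agent check, something like this sketch (Python, purely
illustrative; the "googlebot" substring and the show_log_background
flag are guesses, not the poster's actual code):

    import os

    def is_google_spider(environ):
        # The advertised Googlebot announces itself in the User-Agent
        # header, so a simple substring check is enough to catch it.
        ua = environ.get("HTTP_USER_AGENT", "").lower()
        return "googlebot" in ua

    # In a CGI-style setup the request headers arrive through the
    # environment; disable the log background when the spider calls.
    show_log_background = not is_google_spider(os.environ)

A second spider that sends a browser User-Agent sails straight past a
check like this, which would explain the log crap reappearing a week
or two after each new page went up.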
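
The sanctioned fix is a couple of lines in robots.txt at the site
root; the /logs/ path here is just a stand-in for wherever the log
stuff actually lives:

    User-agent: Googlebot
    Disallow: /logs/

Or, to keep every well-behaved crawler out of it:

    User-agent: *
    Disallow: /logs/

Unlike User-Agent sniffing, this doesn't serve different content to
different visitors, so there's nothing for a disguised second spider
to catch you at.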