Much Ado About nofollow

Watching the various debates about Google’s nofollow initiative has been enlightening. Ostensibly, it was supposed to be a way to fight comment spam on weblogs, but predictably it took no time at all for people to figure out how to game the system. Also predictably, an anti-nofollow backlash launched just as quickly.

I won’t use it. At all. Why? Mostly because it’s such a non-issue (it won’t do a thing to comment spam), but a large part of the reasoning is that I won’t let any one search engine or technology hold hostage what I can write and link to. Nor am I going to let the ranking algorithm of one search engine make me do its work for it, especially if PageRank is broken, as some people believe.

It’s a misnamed attribute, actually. Google says links with it “won’t get any credit when we rank websites in our search results,” but the “nofollow” label makes it appear that Google won’t actually follow the link itself. Not so. Google will follow the link; it just won’t confer ranking.
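To make the distinction concrete, here’s a minimal Python sketch of how a crawler could treat the attribute. This is my own illustration, not Google’s code: the `LinkClassifier` class, the `classify_links` helper, and the example.com URLs are all made up for the example. Every link stays in the list (so it can still be followed); only the credit flag changes.

```python
from html.parser import HTMLParser


class LinkClassifier(HTMLParser):
    """Collect every link and note whether it carries rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []  # (href, passes_credit) tuples, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel holds space-separated tokens; compare them case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        # Assumption: the link is still recorded (followed) either way,
        # and "nofollow" only switches off the ranking credit.
        self.links.append((href, "nofollow" not in rel_tokens))


def classify_links(html):
    """Hypothetical helper: return (href, passes_credit) for each link."""
    parser = LinkClassifier()
    parser.feed(html)
    parser.close()
    return parser.links


sample = (
    '<p><a href="http://example.com/a">a link I vouch for</a> '
    '<a rel="nofollow" href="http://example.com/b">a comment link</a></p>'
)
links = classify_links(sample)
```

Both URLs end up in `links`; only the second one comes back with its credit flag set to `False`.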

More bothersome is the fact that other search engines (Yahoo and MSN, notably) have signed on to this. Why bothersome? Well, because Google’s PageRank algorithm is supposed to be a Trade Secret, and theoretically other search engines’ technologies are Trade Secrets also, so who knows how the others will actually implement processing of this attribute? Will they choose to actually not follow such links, allowing sites to potentially drop out of their indices? There are no guarantees. But if they’re all similar to PageRank, and PageRank is broken, then they may all be broken and this won’t fix things.

Oh well. My various megalomaniacal rantings won’t change things in the world at large, so I’ll stick to what I can do on my own site. :)

Referrers, search engines, trends

Going through my site’s logfiles, I figured it’s about time for one of those navel-gazing site-analyzing posts. I’ve noticed some trends along the way, I think.

By far, the most search engine hits I get are from Google; over the past 30 days, I clocked 2,617 hits from Google, roughly three and a half times as many as Yahoo at 763 hits. In fact, the top ten search engines are:

1 Google 2,617
2 Yahoo 763
3 MSN 188
4 Altavista 82
5 AskJeeves 61
6 AOL Search 35
7 Netscape 20
8 AllTheWeb 16
9 Mamma 4
10 Lycos 4

I’m a little surprised by the amount of variation there.
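For anyone who wants to check the arithmetic, here’s a quick Python sketch that works out the total and each engine’s share from the counts listed above. The numbers are straight from my logs; the variable names are just for illustration.

```python
# Search engine hits over the past 30 days, as listed above.
hits = {
    "Google": 2617,
    "Yahoo": 763,
    "MSN": 188,
    "Altavista": 82,
    "AskJeeves": 61,
    "AOL Search": 35,
    "Netscape": 20,
    "AllTheWeb": 16,
    "Mamma": 4,
    "Lycos": 4,
}

total = sum(hits.values())

# Each engine's share of all search engine hits, as a fraction of 1.
shares = {engine: count / total for engine, count in hits.items()}

# How many Google hits there are for every Yahoo hit.
google_to_yahoo = hits["Google"] / hits["Yahoo"]

for engine, share in shares.items():
    print(f"{engine:<12}{hits[engine]:>6,}{share:>8.1%}")
print(f"Google/Yahoo ratio: {google_to_yahoo:.2f}")
```

The total comes to 3,790 hits, with Google and Yahoo together accounting for close to nine out of ten of them.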

The trends I’ve noticed are in the breakdown of what people are searching for from each site. Most of the Google searches are for free Palm ebooks, Matrix names, and variations on those themes; it seems that people are using Google to find specific types of information, knowing the parameters of what they’re looking for. Targeted, in other words. The other search engines, on the other hand, seem to better reflect pop-cultural references and more general searching. Among Yahoo searches, for instance, I see such phrases as “boba fett” (number one), “kermit the frog,” “dell dude,” “a-team movie,” and so on. Same for the others.

So I’d guess that when Google users find me, I’m near the top of the results for the specific thing they’re searching for. On Yahoo and the others, though, it looks like people are browsing on vaguer searches and clicking through on links that look interesting but may not be relevant. The conclusion I’d draw from this (not surprisingly) is that Google users are power users: it’s the search engine people go to when they really want to find something and get the job done, whereas Yahoo users are more casual and not so worried about the results, so long as they’ll do in a pinch.

And of course, the best part of this whole entry: listing some of the best/worst search phrases people have actually typed to get here. All verbatim.

  • thongs in public
  • what’s your name
  • purple flowers
  • jones green bean casserole soda
  • van helsing absinthe
  • donner party cannibalism
  • heroin
  • green bean soda
  • white trash sex
  • pong is a violent game
  • twas the night bush
  • green bean casserole soda
  • ugliest picture
  • topless rotten
  • skinsuits
  • donkey brew
  • if you had a male tiger what would you name it
  • snoop dog fir shizzle
  • frog master
  • fett ass
  • cracker ingredients
  • beer mugs carved in pumpkins
  • what is the proper way to charge cell and cordless phones
  • on the sierra nevada summerfest beer label what mountains are featured
  • is there a formula for figuring out when thanksgiving day will be
  • how do i clean vomit from couch
  • check out my wife
  • turkey soda
  • where is it snowing in the united states november 11, 2004
  • donner party beer
  • emachine turns it’s self on
  • halloween hooch drink

Bots and JavaScript

Here’s something to think about: do any search engine bots and crawlers recognize and parse JavaScript? I haven’t heard of any (and I’m really too lazy right now to do any real research :) ), but I got to thinking about this today, and there’s really no reason that they shouldn’t be able to handle it.

Sure, there’s a lot of cruft and dross in JavaScript code that isn’t relevant in a searchable context, but what about something like what I’ve been working on recently: dynamic menus? Each menu item points to a valid page with some contextual link text, but since the menus are generated in JavaScript, the search engine process parsing the content out of the code might easily pass it up and miss the links. Those same links are ultimately being repeated in the actual content of the page, so they’ll be picked up for sure, but what about next time?
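Here’s a small Python sketch of the problem, assuming a bot that only reads the HTML markup and never executes scripts. The `AnchorCollector` class, the page, and the menu entries are all invented for the example; I have no idea how any real crawler is built.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """A markup-only 'bot': records href values from <a> tags it can see."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


# Hypothetical page: the menu links exist only as strings inside the script,
# and only one of them is repeated in the regular page content.
page = """
<html><body>
<div id="menu"></div>
<script type="text/javascript">
  var menu = [["Archives", "/archives/"], ["About", "/about/"]];
  for (var i = 0; i < menu.length; i++) {
    var a = document.createElement("a");
    a.href = menu[i][1];
    a.appendChild(document.createTextNode(menu[i][0]));
    document.getElementById("menu").appendChild(a);
  }
</script>
<p>Read the <a href="/archives/">archives</a>.</p>
</body></html>
"""

bot = AnchorCollector()
bot.feed(page)
bot.close()
found = bot.hrefs
```

The parser treats everything inside the script element as opaque text, so `found` contains only the link that was repeated in the page content. The menu-only link to the about page never shows up.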

Of course, then it would be easy to abuse search engine rankings by stuffing JavaScript full of hidden and obfuscated content. Perfect for the snake oil of Search Engine Optimization. Even so, there might be a lot of content or linkage going unnoticed…