March 31, 2004

April Fool's... Not

No, I don't really care for April Fool's Day on the Web. I find it amusing for exactly 90 seconds when the occasional, well-done, online April Fool's prank shows up, but when sites like Slashdot and others have every other item an April Fool's joke, it gets really old really fast.

Posted by jon at 11:51 PM


Governator No More

Sadly, it appears that The Governator Ale is to be no more. Lawyers for Schwarzenegger basically issued a cease-and-desist. I mean, really. Where's the harm? Arnold should've snatched up a bottle and gotten into the fun: like saying, "Hasta la vista, baby" and then drinking it, or, "I'll be back... for more Governator Ale!"

Posted by jon at 11:43 PM


March 29, 2004

Some nights I just hate computers...

God damn, the computers are pissing me off tonight. All evening our broadband cable connection has been running slower than molasses, so it takes forever to accomplish anything online. And then I'm trying to get my wife's computer fixed up; it's been running really slow lately and locking up a lot. So I rolled back the Windows ME that was installed on it (have I mentioned before how I hate Windows ME??) to Windows 98, which by and large worked well enough, but now I can't get the blasted TCP/IP to work properly.

It tells me it's assigned to some 169.* address, and the DHCP server is "255.255.255.255" (yeah, sure), instead of being sensible and using the perfectly acceptable DHCP server and IP address assignment that has worked with every other computer we've had in this house. And the worst part is, I'm sure I've encountered this same problem at work, and solved it, but I can't remember what the solution was. I've already tried uninstalling and re-installing TCP/IP, so I don't know. Maybe it's just time for the straight low-level format route. Son of a bitch.

Posted by jon at 11:38 PM


March 28, 2004

Deadwood

Only two episodes in and I'm already liking HBO's new series, "Deadwood," quite a lot. It's quite a bit different from any other Western series I've seen on TV; they're certainly pulling no punches.

Posted by jon at 11:25 PM


March 26, 2004

Frontier Doctor

I was browsing at Barnes and Noble this evening and found a book that looks very interesting (so I bought it): Frontier Doctor: Observations on Central Oregon and the Changing West. It's the autobiographical account of a doctor during the formative years of Bend.

Urling Coe came to the new town of Bend, Oregon, in 1905, a young medical student graduate seeking adventure and opportunity in the West. Frontier Doctor, Coe's autobiographical account of his thirteen-year residency, details the extraordinary experiences of a young physician in frontier Oregon and offers a vivid social history of town and ranch life on the Oregon high desert.

Cool! Looks very much like a fun and interesting read.

Posted by jon at 11:12 PM


Bend.com Needs RSS

Okay, Bend.com seriously needs an RSS feed. I'm seriously considering writing a script to scrape their archive page for headlines and producing one myself.

Posted by jon at 11:30 AM


March 25, 2004

Conspiracies in Web Tracking

Despite my headline, I'm not really going to go all Mulder on you and start ranting about Big Brother and privacy issues and all that. Instead it's just some thoughts I've been entertaining lately on technology and tracking people and habits on the Web. Some people may choose to see the things I'm writing about as conspiratorial, and that's fine for them; they may not want to read on, though :) .

More...

Posted by jon at 10:47 PM


March 24, 2004

South Sister Quakes

Sweeping the local news this evening are the South Sister earthquakes: more than 100 shook the area three miles west of the South Sister today, with magnitudes of up to 1.5 on the Richter scale. Bend.com has the best writeup on the story I've seen online.

The quakes were occurring in the northeast part of an area centered three miles west of South Sister, in which the ground has undergone what scientists call "crustal uplift" (but others have called "the bulge") by as much as 25 centimeters (about 10 inches) since late 1997....

The magma appears to be accumulating at a depth about four miles below the ground surface, and measures about 50 million cubic yards in volume.

Interesting stuff; of course the entire Cascade Range is geologically active, so it's not really a surprise, but with the South Sister about, oh, 30 miles away, this news has more than a few people worried, I'm sure.

Personally, I'd expect Mount Hood to be the one to erupt first, of all of them.

Posted by jon at 11:45 PM


Oregon Weblogs

Just wanted to give a few plugs and props to one of the better weblog-related sites out there, Oregon Blogs.

I thought about trying to describe what it does, but the best I could come up with is that it's equal parts RSS aggregator, group blog, and weblog directory rolled into one; really, the way to find out about ORBlogs is to just visit it. It's one of the more solid, useful and innovative blogging apps I've seen (even if it is written in ASP! :) ), and continues to surprise me with new features; for instance, clicking on the "info" link for a blog reveals a detail-rich page devoted to that site, including a snapshot of what the site looks like, state and city-scale maps showing where the blog lives, metatag data gleaned from the HTML source (much of it clickable), and the most recent blog posts.

Definitely a great site. Worth looking at even if you're not from Oregon.

Posted by jon at 9:47 PM


Button Trendsetter

After I last posted about the buttons (see "Button Sites" from March 14) on Web sites, and the button maker, I've noticed that others are taking notice, and it's popping up elsewhere:

Damn, I'm ahead of the curve! I must be some kind of trendsetter!

... and if you believe that, I have some lovely oceanfront property in South Dakota you might be interested in...

Posted by jon at 12:06 AM


March 22, 2004

WTF is up with Amazon?

Just WTF is up with Amazon.com this evening? I'm trying to place an order and I keep getting a "We're Sorry!" message every other page load:

An error occurred when we tried to process your request. Rest assured, we're working to resolve the problem as soon as possible. If you were trying to make a purchase, please check Your Account to confirm that the order was placed. We apologize for the inconvenience.

It's totally unusable and just completely pissing me off.

Heh. You can tell I have a lot of patience for this sort of thing, eh?

Posted by jon at 11:18 PM


American Gods

Didn't post anything yesterday because I was wrapped up finishing American Gods, by Neil Gaiman. Amazing, fascinating book. Absolutely worth it.

Posted by jon at 11:08 AM


March 20, 2004

And another Bend blogger

Add to that ever-so-slightly growing list of Bend (Oregon) webloggers: Brainside Out. Excellent.

Posted by jon at 11:22 PM


March 19, 2004

Search Patch

While waiting to find out if my hosting provider will change the minimum fulltext word length for MySQL, here's what I've done in the meantime to deal with viable three-character search terms.

First, I split the search string into its component words (an array). I subtract any stopwords (I've got a big list), and any remaining words that are under four characters long get added to the SQL query I'm running.
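
As a rough sketch of that splitting step (in Python rather than the PHP the site runs on, and with a made-up stopword list standing in for the real "big list"), it might look something like this. The function name and the exact tokenizing rule are assumptions for illustration only.

```python
import re

# Hypothetical stopword list; the real one is only described as "a big list".
STOPWORDS = {"the", "and", "so", "a", "an", "of", "to", "in"}

# MySQL's default minimum fulltext word length is four characters.
MIN_FULLTEXT_LEN = 4


def split_search_terms(search_string):
    """Split a search string into (fulltext_words, short_words)."""
    # Lowercase, then pull out runs of letters and digits as words.
    words = re.findall(r"[A-Za-z0-9]+", search_string.lower())
    # Throw out stopwords before deciding how each word gets searched.
    kept = [w for w in words if w not in STOPWORDS]
    fulltext_words = [w for w in kept if len(w) >= MIN_FULLTEXT_LEN]
    short_words = [w for w in kept if len(w) < MIN_FULLTEXT_LEN]
    return fulltext_words, short_words


print(split_search_terms("the porter and PHP"))
```

The first list would feed the MATCH ... AGAINST part of the query, and the second list holds the short words that need the extra handling.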

Here's the basic form of the query that I'm running, say searching for "porter":

SELECT *,
MATCH(body) AGAINST('porter') AS relevance
FROM content
WHERE MATCH(body) AGAINST('porter')
AND [additional conditions]
ORDER BY relevance DESC
LIMIT 10

This uses fulltext indexing to search for "porter" with weighted relevance, and returns the appropriate content and its relevance score. Pretty straightforward, and it works really well.

Here's what the modified query looks like if there are short words present, for the search "porter php":

SELECT *,
MATCH(body) AGAINST('porter') +
  (1 / INSTR(body, 'php') + 1 / 2)  -- 2 = position of 'php' in the search string
AS relevance
FROM content
WHERE ( MATCH(body) AGAINST('porter')
  OR body REGEXP '[^a-zA-Z]php[^a-zA-Z]'
  )
AND [additional conditions]
ORDER BY relevance DESC
LIMIT 10

Two new things are happening. First, in the WHERE clause, I'm using both the fulltext system to find "porter" and using a regular expression search for "php." Why REGEXP and not LIKE? Because if I write LIKE '%cow%' for instance, I'll not only get "cow" but also "coworker" and other wrong matches. A regular expression lets me filter those scenarios out.
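
To make the LIKE versus REGEXP difference concrete, here is a small Python sketch. Python's re module is not MySQL's regex engine, so treat this as an approximation of the behavior rather than a test of the query itself; the three function names are made up for the example. It also shows one edge case worth knowing about: a pattern that demands a non-letter on both sides will not match a word sitting at the very start or very end of the text, so the last function is one possible variant that allows for that.

```python
import re


def like_match(body, word):
    # Rough stand-in for LIKE '%word%': a plain substring test.
    return word.lower() in body.lower()


def regexp_match(body, word):
    # Same shape as the REGEXP in the query: a non-letter on each side.
    pattern = "[^a-zA-Z]" + re.escape(word) + "[^a-zA-Z]"
    return re.search(pattern, body, re.IGNORECASE) is not None


def whole_word_match(body, word):
    # Variant that also accepts the word at the start or end of the text.
    pattern = "(^|[^a-zA-Z])" + re.escape(word) + "([^a-zA-Z]|$)"
    return re.search(pattern, body, re.IGNORECASE) is not None


for text in ["my coworker left", "a cow here", "cow"]:
    print(text, like_match(text, "cow"), regexp_match(text, "cow"),
          whole_word_match(text, "cow"))
```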

That takes care of finding the words, but I also wanted to tie them into relevance, somehow. The solution I hit upon in the above SQL is relatively simple, and does the trick well enough for my tastes. Basically, the sooner the word appears in the content, the higher its relevance, which is reflected in the inverse of the number of characters "deep" in the content it appears. And I wanted to fudge the number a bit more by weighting the position of the keyword in the search string; the sooner the keyword appears, the higher the relative score it gets.
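
Here is that relevance fudge worked through in Python, just to show the arithmetic. The function name is invented for the example, and returning 0 when the word is missing is an assumption of this sketch (in the SQL, INSTR returns 0 for a missing word, so the division would come out as NULL in MySQL).

```python
def short_word_bonus(body, word, position_in_search):
    """Extra relevance contributed by one short search word.

    body: the content text being scored
    word: the short search term
    position_in_search: 1-based position of the word in the search string
    """
    index = body.lower().find(word.lower())  # 0-based, -1 if absent
    if index == -1:
        return 0.0  # assumption: a missing word adds nothing
    depth = index + 1  # INSTR() counts characters from 1
    # Earlier in the content and earlier in the search string both score higher.
    return 1.0 / depth + 1.0 / position_in_search


# "php" as the second search word, appearing at the very start of the content.
print(short_word_bonus("php is fun", "php", 2))
# Same word, eight characters deep, as the first search word.
print(short_word_bonus("I like php", "php", 1))
```

The bonus gets added on top of whatever score the fulltext MATCH produced for the longer words.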

It's not perfect, and I definitely wouldn't recommend using this method on a sufficiently large dataset, but for my short-term needs it works just fine. The only thing really missing in the relevance factoring is how many times the keyword appeared in the content, but I can live without that for now.

Posted by jon at 10:49 PM


March 18, 2004

Searching and Minimum Word Length

Mike Boone, in the comments section of yesterday's entry on searching ("Updated Search"), correctly points out that searching my site for a word that is less than four characters in length (like "php" or "cow") does not work—no results are returned. Obviously, since I write about PHP on occasion, this is untenable.

The problem is that MySQL's fulltext indexing, by default, only indexes words greater than three characters long, and I don't think I have any way to change this, despite my initial reply to Mike's comment. This site is running on a shared server setup on pair.com, and I have absolutely zero control over the MySQL server configuration. I might post a question to their tech support, but I'm not overly optimistic about the response. So, what to do?

Short term, here's my solution (though it's not implemented yet): examine each word in the search string, throwing out stopwords (like "the," "and," "so," etc.), and for any word shorter than four characters long, do a LIKE search against the content for them. No, it's not ideal, but it's a patch. Comments?

Posted by jon at 10:47 PM


March 17, 2004

Updated Search

I've been vastly updating the search functionality on my site. I'm still using MySQL's built-in FULLTEXT indexing to perform searches, but I've made the results page look a lot more like (okay, almost exactly like) Google's. The main differences are that I'm not paginating search results yet (all searches limit to 10 results) and that I'm showing a relevance percentage, with the first result arbitrarily set to 100% relevant.

To determine relevance, I'm relying on MySQL: a fulltext MATCH(field) AGAINST('search string') directive will return the relevance number that MySQL computes when used in the SELECT part of a query. (See MySQL Full-text Search in the online manual for detailed info on this.)
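
One simple way to turn those raw relevance numbers into percentages, with the top result pinned at 100%, is to divide everything by the first score. The post doesn't spell out the exact formula, so the linear scaling and the function name below are assumptions; this is a minimal Python sketch, not the site's actual code.

```python
def relevance_percentages(scores):
    """Scale raw relevance scores so the first (highest) result is 100%.

    Expects scores already sorted in descending order, as the query returns them.
    """
    if not scores or scores[0] <= 0:
        return [0 for _ in scores]
    top = scores[0]
    return [round(100 * score / top) for score in scores]


print(relevance_percentages([4.0, 2.0, 1.0]))
```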

Further plans for searching that I haven't implemented yet: utilizing MySQL's IN BOOLEAN MODE parameter to allow advanced things like phrase searches (with quotes), required word matching (using the plus sign), and subexpressions using parentheses. It's pretty cool stuff. Oh, and I want to be smarter about presenting excerpts: Google tries to show you content excerpts with your search terms in them, and I want to be able to do the same; currently I'm just showing the first 250 or so characters of the text with HTML stripped out of it.
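
A possible shape for the smarter excerpt idea, sketched in Python: strip the HTML, find the earliest search term, and cut a window of about 250 characters around it, falling back to the first 250 characters when no term is found. The function names, the crude tag stripping, and the centering rule are all assumptions for illustration.

```python
import re


def strip_html(html):
    # Crude tag stripper for illustration only; not a real HTML parser.
    return re.sub(r"<[^>]+>", " ", html)


def excerpt(html, terms, length=250):
    """Return roughly `length` characters of text around the first term found."""
    # Collapse whitespace left behind by the removed tags.
    text = " ".join(strip_html(html).split())
    # Assumes lowercasing keeps character positions, which holds for ASCII text.
    lowered = text.lower()
    positions = [lowered.find(term.lower()) for term in terms]
    positions = [p for p in positions if p != -1]
    if not positions:
        return text[:length]  # fallback: the first `length` characters
    start = max(0, min(positions) - length // 2)
    end = start + length
    snippet = text[start:end]
    if start > 0:
        snippet = "..." + snippet
    if end < len(text):
        snippet = snippet + "..."
    return snippet


print(excerpt("<p>Hello <b>world</b>, have a porter.</p>", ["porter"]))
```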

And since I'm developing my whole Personal Publishing System in an open process, I'll write up a detailed technical article soon on how to effectively use MySQL fulltext searching and show Google-like results. All real-world; the code will be cribbed right out of my search.php file.

Posted by jon at 11:47 PM


St. Patrick's Day

Just a quick note to wish everyone (Irish or not) a happy St. Patrick's Day today.

Enjoy some Guinness!

Posted by jon at 10:15 AM


March 16, 2004

PHP Development Hint

Here's a general hint for PHP development: A quick and easy way to check for syntax or compile errors without uploading the PHP script to the Web server and testing online through a browser is via the command line. It's obvious, and I don't know why I didn't think of this sooner, but I've been doing more and more of it lately.

I develop primarily under Windows (with PHP installed) and upload to a Unix-variant server, and this is what I've been doing to run a PHP script on the command line on my Windows system:

php-cli -l filename.php

You could omit the -l option (it's a syntax check option only) to parse and run the code, if you like. Either way, it's an easy way to check your code without uploading it and potentially breaking your site.
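
If you wanted to run that check over a whole directory before uploading, a small wrapper script could do it. This is a hedged sketch in Python: the function names are invented, the binary name is a parameter (it defaults to "php" here, but could be "php-cli" as above), and it assumes the PHP binary exits with a non-zero status when the syntax check fails.

```python
import os
import subprocess


def find_php_files(root):
    """Return sorted paths of all .php files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".php"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def lint_command(path, php_binary="php"):
    # -l is the syntax-check-only option used in the command above.
    return [php_binary, "-l", path]


def lint_all(root, php_binary="php"):
    """Syntax-check every .php file under root; return the paths that failed."""
    failed = []
    for path in find_php_files(root):
        result = subprocess.run(lint_command(path, php_binary),
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
        # Assumption: a non-zero exit status means the check failed.
        if result.returncode != 0:
            failed.append(path)
    return failed
```

Calling lint_all(".") from the project directory would then give back the list of files worth a second look.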

Posted by jon at 3:35 PM


March 15, 2004

New (Old) Design

Just flipped the switch on the site design I wrote about (see "Everything Old is New Again"). So far things are looking good, but there might be some bugs still lurking. And right now the changes only apply to the blog pages; I haven't reworked the ebooks page or others, yet.

And there's two new pages available: What is Syndication? and My Projects. The Syndication page is a sort-of FAQ on syndicating a site and RSS—a helper page, or primer page, as it were, to anyone who sees my RSS link and wonders what the hell that is. Consider it a draft, but I will be updating and maintaining that page, and aim to make it a good landing page for syndication/RSS questions.

Posted by jon at 4:43 PM


March 14, 2004

Button Sites

My post about the buttons from the other day ("Those small web buttons...") yielded up some excellent links:

Taylor McKnight has an amazing archive of all the known buttons, 2025 at current count. Nice. I'm not even bothering to "collect" any more.

Kalsey Button Maker is an online app that automagically creates the buttons for you. Though with all the buttons available at the other site, this is kind of redundant—but it's fun to fool around with.

Posted by jon at 11:18 PM


Random Law and Order Plot Generator

Here's a funny link for tonight: the Random Law and Order Plot Generator. Enjoy!

Posted by jon at 12:19 AM


March 12, 2004

Google Image Search

Playing around with Google's image search, I've thought of some advanced search features they need to implement. Hopefully someone at Google is reading this and will get right on it ;)

You need to be able to search by specific image dimensions (in pixels); for example, I'd like to be able to type "width:80 height:15" or maybe "dimensions:80x15" and have Google return all the images that are 80 by 15 pixels (yes, this idea is directly related to my last post on the 80x15 images). This can't be hard; Google's already caching the size of the image and displaying that on the search results pages, so why not be able to search them?

Posted by jon at 11:21 PM


Those small web buttons...

I've been noticing recently the proliferation (mostly on blog sites) of those small image files that are 80 pixels wide by 15 pixels high, are generally two-tone in color and use a simple old-school-looking font. Like these:

RSS 2.0 button ORBlogs.com button

And I'm wondering, what's the story? What are they called, exactly? (I'm thinking either "buttons" or "badges.") Who's making them? I think they're pretty cool, actually; clever, simple, and elegant, and a damn good graphical meme that's working its way around my brain. I just haven't been able to find out anything about them online, and I'm getting really curious.

So I've started "collecting" them, saving any new ones I come across into a "badges" directory on my computer. I've got 42 already in two days. (Hmmm, 42. Coincidence?)

So, what's the scoop? Anyone know?

Posted by jon at 11:07 PM


Old Farm District

So I was driving home from work today and as I crossed Third Street onto Brosterhous I noticed a new sign proclaiming the area I was entering the "Old Farm District." What was interesting about this is that the sign is in the same style as those of The Old Mill District, so I thought perhaps the city of Bend was giving the area a facelift in the same way the Old Mill District had been done. Which would be kind of cool; it's a neat area where the old farmland acreages and farmhouses are side-by-side with the more modern housing and commercial developments. Historically, this district used to be the outer frontier of Bend, which is hard to believe these days when it's a ten-minute drive from downtown.

I do a quick search and find the Bend Neighborhood Associations Web site, which contains details about the Old Farm District and the other official neighborhood associations. No Old Mill-style plans for the area, simply prettying up the place by planting these gilded signs everywhere. The Bend Neighborhood Map is interesting, presenting a territorial view of Bend that I hadn't seen before. Although I'd be inclined to point out that the real old farm district of Bend should really be extended to include the big white area between the "official" area and the Orchard and Mountain View neighborhoods. As it stands, I wonder what that neighborhood will end up being called?

Amusingly, it didn't take me long to notice that the site was developed by my old employer, Alpine Internet Solutions. One thing they need to do is make that map a clickable image map, where the user can click on the neighborhood and be taken to the appropriate page.

Posted by jon at 10:42 PM


March 11, 2004

Free? Palm Reader software

Is it just me, or did Palm Digital Media make it a whole lot harder to get the free version of their Palm Reader software? From their front page, there's no mention of the free version anywhere, and I finally found it when clicking through the ad for the Pro version.

...Oh. I just found this on the free download page:

The free and Pro versions of Palm Reader are now one application. You can try out the Pro features for up to 15 days. After the 15 day trial period, the Pro features will be disabled, but you can continue to use Palm Reader freely.

Well, that seems rather dumb. I mean, it's still good that it's free, but not advertising that there's a free version available is definitely going to turn away a good number of users.

So remember this link: Free Palm Reader download page.

Posted by jon at 11:25 PM


Computer Languages History Timeline

From the Computer Languages History site comes an impressive computer languages timeline chart. It's as much a language family tree as it is a timeline. Very nice, though a little hard to read.

Posted by jon at 11:10 PM


Shakespeare Social Networks

This is an amazing link: Shakespeare Social Networks.

PieSpy is a tool designed to infer and visualize social networks on Internet Relay Chat (IRC). It works by applying simple heuristics to work out who is talking to whom. This information can be used to produce a visualization of the social network, essentially showing which users are connected and how strong those connections are.

As PieSpy matured, it became obvious that IRC was not the only suitable testing ground. By feeding PieSpy with the entire texts of Shakespeare plays, it became possible to produce drawings of the social networks present in his plays - it is now possible to visualize the relationships between the characters in his works.

So it treats a Shakespeare play as an extended IRC session. Brilliant. I love thinking outside the box!

Of course, it doesn't have to be limited to Shakespeare. You could feed the program any play, script, or written work that looks enough like dialogue from a chat session. Jeez, or law enforcement agencies could use it to draw social network diagrams of people based on wiretaps...
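
To get a feel for the idea, here is a toy version in Python. To be clear, this is not PieSpy's algorithm (its actual heuristics aren't described here); it's a guess at the simplest possible rule, namely that two speakers with back-to-back lines are "talking to" each other. The function name and the "NAME: text" line format are assumptions of the sketch.

```python
import re
from collections import Counter


def dialogue_network(script):
    """Count how often two speakers have consecutive lines.

    Expects lines of the form 'NAME: text'; other lines are ignored.
    Returns a Counter keyed by alphabetically sorted speaker pairs.
    """
    edges = Counter()
    previous = None
    for line in script.splitlines():
        match = re.match(r"\s*([A-Za-z ]+):", line)
        if not match:
            continue  # stage directions, blank lines, and so on
        speaker = match.group(1).strip().upper()
        if previous and previous != speaker:
            # Heavier edge weight means more back-and-forth between the pair.
            edges[tuple(sorted((previous, speaker)))] += 1
        previous = speaker
    return edges


sample = """HAMLET: Who's there?
HORATIO: Friends to this ground.
HAMLET: Stand and unfold yourself.
GHOST: Mark me."""
print(dialogue_network(sample))
```

The edge counts could then be handed to any graph-drawing tool to get the kind of diagram the site shows.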

Posted by jon at 10:57 PM


Violent Pong

Here's a link I found from Scoble, which was too good not to post: violent pong. No, it's not a game (how many of you even remember pong?), which is what I thought at first; it's a Flash movie. Watch it. It's crazy and philosophical!

Posted by jon at 12:00 AM


March 10, 2004

Everything Old is New Again

I've started tinkering with the design of my site here, changing things around, making the blog pages more blog-centric, and in doing so I realize that this "redesign" is basically the same design I was using up through July of last year. How quickly we forget.

As to what I'm changing, I'm simplifying the table layout and applying more style sheet rules to clean up the underlying HTML, and I'm moving back to a two-column format, with the blog content in the left column (wider) and all the rest in the right column (narrower). After staring at the three column layout for over half a year, I've finally decided it's just too busy, and going with a more readable format is better. Hey, the two column layout with content on the left is almost a blog standard, if there is such a thing. Damn Movable Type for destroying the curve :)

I'm also restructuring the overall site architecture a bit, moving some clutter off the front page to inside pages, consolidating some stuff, adding some new pages to (hopefully) enhance overall usability. Maybe someday I'll even tinker around with an all-stylesheet layout approach; I know HTML table-based layouts are anathema to some folks out there. But right now my general philosophy is, if it ain't broke, don't fix it—but simplifying it is okay.

Posted by jon at 11:29 PM


March 8, 2004

At least one new ebook

Lovecraft notwithstanding, I did finally get around to adding a new ebook, Anne's House of Dreams (240KB .PDB file). It had been sitting in the queue for quite a while now. The conversion went quick; I'd forgotten how quick, so that's encouraging.

Posted by jon at 11:26 PM


Lovecraft Copyright

Hmmm... I was all set to post up the H.P. Lovecraft ebooks tonight that were sent to me, but when double-checking the publication dates online I found this:

Please note that Lovecraft's fiction is still considered to be under copyright by Arkham House, and any texts presently available on the web without their consent are in violation of that copyright.

Found this on The H.P. Lovecraft Archive. It's better to be safe than sorry, so I'm holding off on posting the Lovecraft ebooks for now.

Posted by jon at 11:17 PM


March 7, 2004

Beautiful Day

It was an utterly beautiful day today here in Central Oregon, right around 70 degrees and sunny all day. Raked some leaves, played with the kids outside, just gorgeous. And the best part is, I didn't have to be stuck at work on the first nice day of the year :)

Posted by jon at 10:29 PM


Lovecraft Ebooks

Coming up in the next day or two, some H.P. Lovecraft Palm Reader ebooks, four of his short stories. The cool part of it is they were created by someone else who wanted to donate them for hosting, Leandro Liñares. Once I've verified that the stories in question are in the public domain, I'll have them posted. Stay tuned.

Posted by jon at 12:04 AM


March 5, 2004

SharpReader Gone

SharpReader is outta here. Last night's crash wasn't a fluke; after I downloaded the latest version and restored my feeds, I went back and added two Amazon feeds and sure enough, it crashed again. Turns out the URLs for Amazon's feeds are too long for SharpReader to handle.

The worst part is, after last night's second crash, SharpReader wouldn't even start back up at all—not last night, not today, nothing. So, it's gone and I'm done with it. Won't be going back.

Right now I'm playing with FeedDemon. Seems pretty nice so far.

Posted by jon at 11:23 PM


March 4, 2004

SharpReader Crashed

Grrr... SharpReader, the news reader I use to read RSS feeds, just crashed on me, and lost all my feeds—data and URLs. After I'd added four of the new Amazon feeds. Shit.

Oh, well. Fortunately, I had a recent backup of the OPML for my feeds, so I was able to get them back quickly.

Posted by jon at 11:36 PM


Disposable Paperboard Computer

Routed via Slashdot comes the story of the disposable paperboard computer, which "can collect, process, and exchange several pages of encrypted data." It even has a generous 32KB of memory.

After reading about this, I couldn't help thinking that we've already had disposable, paper-based computers around since, well, forever. It's called pen and paper.

And hey, if you throw in one of those sweet old-school PeeChee folders (why the hell can't I find a web page for those things?? Other than online school supplies lists, I mean), you've instantly upgraded: not only your storage capacity, but processing power because you've got all those conversion and multiplication tables and various references at your fingertips!

Posted by jon at 11:17 PM


Amazon RSS

Another piece of news everyone pointed to yesterday: Amazon is now offering RSS feeds. A list of all their feeds can be found at the Amazon.com Syndicated Content page. Looks like they're offering feeds for each top-level category in their hierarchy. The next logical step, of course, would be to offer a personalized RSS feed of your recommendations...

Posted by jon at 1:25 PM


Water on Mars

Forgot to point to this the other day: Opportunity finds evidence of water in Mars' past. Probably you've all heard this by now, but it's still incredible.

"Liquid water once flowed through these rocks. It changed their texture, and it changed their chemistry," said Dr. Steve Squyres of Cornell University, Ithaca, N.Y., principal investigator for the science instruments on Opportunity and its twin, Spirit. "We've been able to read the tell-tale clues the water left behind, giving us confidence in that conclusion," he said.

Posted by jon at 1:09 PM


Rasmus is the Man

... Rasmus Lerdorf, that is, the creator and godfather of PHP. He's got an article on the Oracle Technology Network titled "Do You PHP?" that's definitely worth a read. Here's a sample:

What it all boils down to is that PHP was never meant to win any beauty contests. It wasn't designed to introduce any new revolutionary programming paradigms. It was designed to solve a single problem: the Web problem. That problem can get quite ugly, and sometimes you need an ugly tool to solve your ugly problem. Although a pretty tool may, in fact, be able to solve the problem as well, chances are that an ugly PHP solution can be implemented much quicker and with many fewer resources. That generally sums up PHP's stubborn function-over-form approach throughout the years....

Despite what the future may hold for PHP, one thing will remain constant. We will continue to fight the complexity to which so many people seem to be addicted. The most complex solution is rarely the right one. Our single-minded direct approach to solving the Web problem is what has set PHP apart from the start, and while other solutions around us seem to get bigger and more complex, we are striving to simplify and streamline PHP and its approach to solving the Web problem.

The guy just oozes common sense. Here's another bit about PHP that he wrote on the PHP-DEV mailing list about two years ago, one of my favorites that just sums up beautifully the philosophy of PHP:

The golden rules of PHP are to keep the WTF(*) factor low and the POTFP(**) factor high.

(*) What The Fuck
(**) Piss Off The Fewest People

No two ways about it: he's one of my heroes.

Posted by jon at 12:01 AM


March 2, 2004

Catching up on email

I've been terribly lax lately in responding to my emails that are ebook requests. I'm awfully sorry about that; I'm responding to some tonight, but if you sent me a request for an ebook and haven't heard back from me, I apologize.

Posted by jon at 11:56 PM


March 1, 2004

Advanced PHP Programming

The book Advanced PHP Programming is out, by George Schlossnagle. Looks like it might be pretty interesting; there's certainly a scarcity of good PHP books that cover advanced topics—most of them are targeted at the beginner and the basics, and don't have anything to offer me.

(Quick disclaimer: some of the Wrox books actually look like they might be decent, but I haven't had my hands on a Wrox PHP book since the first couple they published.)

There was a time when I wanted to write a PHP book. It was going to be an advanced book, called "PHP Secrets" and cover all sorts of topics. I never really pursued it, though, largely because of a general disillusionment in the computer book industry: you spend a year or more writing a book on a subject, and by the time it gets published it's obsolete.

Thinking about it now, though, maybe a better venue for such a thing would be online, like what Mark Pilgrim did with his Dive Into Python book. That might be kind of cool; a live work-in-progress that I could (theoretically) keep up-to-date. Hmmm.

Posted by jon at 11:49 PM