09/12/2002 Entry: weblog analysis review
For years I've used analog to analyze my webserver logfiles. It's robust and fast, but one thing sets it apart from any other logfile analyzer I've used: analog's author, Stephen Turner, has a Ph.D. in statistics and it *shows*.

People keep talking to me about webalizer, though. I could tell it was easy to use, because I kept having conversations that went like:

"I'm having trouble with my CGI, can you help me?"
"Are you seeing errors in your server logs?"
"What are server logs? Do you mean these?"

...and they'd proudly display their webalizer reports.

Okay, so this is one of those things that doesn't require you to understand anything about logfiles to use. Between that, the wealth of information I saw on the reports, and the pretty eye-candy chart graphics, I wondered whether it was time for me to make the switch.

I've been spending a little time polishing and updating my site, and the statistics help me focus more energy on pages people actually visit. Webalizer is pretty and accessible, but analog is flexible and powerful and gives me data I just can't see how to expose in webalizer. Most likely, I will go back to what I've been doing over the past few years: run a report once or twice a year, learn a thing or two about how people use my site, and then go back to not caring. But if you care to read my observations about each program anyway, read on...

So, I took a look at the webalizer report. The reports are well-designed. The superimposed pages/files/hits and visits/sites bar charts group together data that is well served by being grouped together. By contrast, the default analog reports, even the relatively new pie charts, are clunky looking. The default per-month reports are a good, intuitive separation for most applications.

The "visit" model is exciting, though it's misleading. The web is stateless, and "visits" are only the software's best guess, not data.
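To see why a "visit" is a guess rather than a measurement, here is a minimal sketch of one common heuristic: count a new visit whenever a host shows up after a long enough silence. This is not webalizer's actual code; the function name `count_visits`, the 30-minute cutoff, and the sample hits are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta

# Assumed cutoff: a gap longer than this starts a new "visit".
VISIT_TIMEOUT = timedelta(minutes=30)

def count_visits(hits, timeout=VISIT_TIMEOUT):
    """Estimate visits from an iterable of (host, timestamp) pairs."""
    last_seen = {}  # host -> timestamp of that host's most recent hit
    visits = 0
    for host, when in sorted(hits, key=lambda hit: hit[1]):
        previous = last_seen.get(host)
        # First hit from this host, or the host was quiet for too long.
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[host] = when
    return visits

# Hypothetical sample data: three hits from one host, one from another.
hits = [
    ("10.0.0.1", datetime(2002, 9, 12, 10, 0)),
    ("10.0.0.1", datetime(2002, 9, 12, 10, 10)),
    ("10.0.0.1", datetime(2002, 9, 12, 11, 0)),
    ("10.0.0.2", datetime(2002, 9, 12, 10, 5)),
]

print(count_visits(hits))                      # 30-minute cutoff
print(count_visits(hits, timedelta(hours=2)))  # 2-hour cutoff
```

The same four hits count as three visits with a 30-minute cutoff and two with a 2-hour cutoff. Nothing in the log says which is right, and a shared proxy or a changing IP address would throw the count off further.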
I like that analog's author understands that "visits" are a guess, and continues to work to keep his software statistically accurate.

I was impressed with the "incremental" functionality of webalizer, where I could feed it another week of stats and it would update the existing reports. I was less impressed, though, when I noticed that I could only feed it one file at a time, and that those files *had* to be in chronological order. This was disappointing.
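One possible workaround for the one-file, chronological-order restriction is to merge several logs into a single time-ordered stream before handing it to the analyzer. The sketch below assumes Common Log Format timestamps in square brackets and assumes each input is already sorted internally; `log_time`, `merge_logs`, and the sample lines are hypothetical names and data, and I haven't verified that the merged result satisfies webalizer's incremental mode.

```python
import heapq
import re
from datetime import datetime

# Matches the bracketed timestamp, e.g. [12/Sep/2002:10:42:00 -0700].
TIMESTAMP = re.compile(r"\[([^\]]+)\]")

def log_time(line):
    """Parse the timestamp out of one log line.

    Lines without a bracketed timestamp are not handled here.
    """
    match = TIMESTAMP.search(line)
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")

def merge_logs(*logs):
    """Lazily merge already-sorted iterables of log lines by timestamp."""
    return heapq.merge(*logs, key=log_time)

# Hypothetical sample lines from two separate logfiles.
log_a = [
    '10.0.0.1 - - [01/Sep/2002:08:00:00 -0700] "GET /a.html HTTP/1.0" 200 512',
    '10.0.0.2 - - [03/Sep/2002:09:30:00 -0700] "GET /b.html HTTP/1.0" 200 128',
]
log_b = [
    '10.0.0.3 - - [02/Sep/2002:12:00:00 -0700] "GET /c.html HTTP/1.0" 200 256',
]

merged = list(merge_logs(log_a, log_b))
for line in merged:
    print(line)
```

Because `heapq.merge` is lazy, open file objects could be passed in place of the lists without loading whole logs into memory.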
At least one item in the sample webalizer configuration file that came with the program is just plain wrong. They suggest:

More troubling is that I can't seem to tell webalizer that for my /cgi-bin/thumbframes2.cgi script, the information past the ? in the URL *is* in fact relevant. Analog tells me that my Burning Man 1999 pictures are marginally more popular than my Burning Man 2000 pictures because "/cgi-bin/thumbframes2.cgi?section=burningman_1999&pg=1" gets more hits than "/cgi-bin/thumbframes2.cgi?section=burningman_2000-new&pg=1". No such data from webalizer, which only reports that /cgi-bin/thumbframes2.cgi gets a lot of hits.

Maybe I'll set something up where I get the requests-for-which-URL reports from analog, and the pretty usage charts from webalizer. Most of the rest of the functionality I use is comparable between the two programs (especially now that analog calls out search terms; I used to do that by hand).

Posted by sev @ 10:42 AM PST