I just pulled a log file for the last week and cranked up Webalizer. It took me 30 minutes or so to screen out all the spiders I could find and all IP addresses that might belong to someone associated with this site, but after finally getting some clean stats, the following two bits of information struck me as pretty cool:
- Of all hits to a “front page,” 55% are to the HTML version, 45% to the RSS feed.
- The Googlebot visits this site, on average, 180 times a day. AllTheWeb’s crawler, 150 times a day. Inktomi’s crawler (used by several companies), 105 times a day.
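For anyone who wants to try the same kind of filtering by hand, here is a minimal Python sketch of the idea, not what Webalizer does internally. It assumes the server writes Apache "combined" format logs. The crawler substrings, the excluded IP address, and the two front-page paths (`/` and `/index.xml`) are made-up placeholders, so swap in whatever applies to your own site.

```python
import re
from collections import Counter

# Apache "combined" log format; an assumption about how the server logs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical values for illustration only; adjust for your own site.
CRAWLER_HINTS = ("googlebot", "slurp", "fast-webcrawler")
EXCLUDED_IPS = {"192.0.2.10"}
FRONT_PAGES = {"/": "html", "/index.xml": "rss"}


def front_page_split(lines):
    """Return the percentage of front-page hits per format, after filtering."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # not a combined-format line
        if m["ip"] in EXCLUDED_IPS:
            continue  # someone associated with the site
        agent = m["agent"].lower()
        if any(hint in agent for hint in CRAWLER_HINTS):
            continue  # known spider
        kind = FRONT_PAGES.get(m["path"])
        if kind:
            counts[kind] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: 100.0 * n / total for kind, n in counts.items()}
```

Feeding it the lines of a log file (for example `front_page_split(open("access.log"))`, where the filename is again just a placeholder) gives back a dictionary of percentages keyed by `"html"` and `"rss"`. Matching spiders by substring is crude, and it only catches the crawlers you already know about.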
I push about 22MB per day out of this site. That’s just text; the site has no graphics to speak of. I found UserAgents for the following aggregators in order of frequency:
Those were only the ones I recognized. There are some goofy-looking UserAgents in there, so I’m sure there are more. Note that Bloglines includes the number of people subscribed to your site in the UserAgent. Handy, but I got another Bloglines subscriber during the week, so the UserAgent changed from one day to the next.
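One way around the changing UserAgent is to strip the subscriber count before counting, and keep the number on the side. The Python sketch below assumes the count shows up as a trailing note shaped like `; 12 subscribers)` inside the parentheses. That shape is an assumption for illustration, so check it against what actually appears in your own logs.

```python
import re
from collections import Counter

# Assumed shape of the subscriber note, e.g. "; 12 subscribers)".
SUBSCRIBER_NOTE = re.compile(r";\s*(\d+)\s+subscribers?\)", re.IGNORECASE)


def normalize_agent(agent):
    """Return (agent_without_count, subscriber_count_or_None)."""
    m = SUBSCRIBER_NOTE.search(agent)
    if m is None:
        return agent, None
    # Drop the note but keep the closing parenthesis.
    cleaned = agent[:m.start()] + ")" + agent[m.end():]
    return cleaned, int(m.group(1))


def tally_agents(agents):
    """Count hits per normalized UserAgent and remember the last count seen."""
    counts = Counter()
    latest_subscribers = {}
    for agent in agents:
        cleaned, subs = normalize_agent(agent)
        counts[cleaned] += 1
        if subs is not None:
            latest_subscribers[cleaned] = subs
    return counts, latest_subscribers
```

With this, two requests whose UserAgents differ only in the subscriber number land in the same bucket, and `latest_subscribers` holds the most recent count seen for that aggregator, assuming the agents are fed in log order.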