Stats! [5-Jan-00]

OK, since I wrote this initially, I've become somewhat more advanced in my handling of things...or maybe just lazy would be a better term. At any rate, my log files are now automatically run through the log parser and the reports generated. This is done by a ksh script run out of cron (Unix-speak). About a year ago, I decided to get serious about development work and realized I needed a platform here at home to work on. I installed FreeBSD, and liked it so much that I switched over completely. It took a long time before I got back to my "automation development project", but here it is:

Site Stats

Logs! [2-June-97]

Well, I have been working site-wise, but not in a way that is visible to you visitors. My latest passion has been data processing of the logs. Briefly, here is a screen shot of what I have been up to. It shows all of the *NON-AT&T* visitors who hit my site today. DataBase Query [32k .gif]. I don't have anything against my fine compadres at AT&T, but over the last two days there have been over 1000 of you per day! (Brief explanation...the PWP service just went public and I'm still on the top 10 list.)

If you are interested in the story, read on...

I got access to a site which has a nice log feature. Too nice, really. They give a record of everything pulled off the site. This includes each and every little .gif and therefore makes the logs HUGE! Eventually a light went on in my head and I set things up so that one of the .gifs on my front page comes from that machine. This gives me a record of every hit on my AT&T site.

OK, so now I have a file in Common Log Format with a *bunch* of data, but it is hard to read, such is its magnitude. What now?

What I did was import the raw data into Excel and devise a few routines to determine what type of hit it was (e.g., new hit on my site index.html, old hit on a document NOT the index.html of the other site, etc.). I also did a little routine to extract the extension from the address (e.g., .net, .com, .za, etc.).
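For anyone who would rather not do this in a spreadsheet, the same two routines can be roughly sketched in Python. This is only an illustration, not the Excel routines themselves: the names `parse_line`, `host_extension`, and `hit_type` are made up here, and the "index" versus "other" rule is an assumed stand-in for the new/old classification, whose exact logic is not spelled out above.

```python
import re

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)$'
)


def parse_line(line):
    """Split one Common Log Format line into named fields (None if malformed)."""
    match = CLF_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def host_extension(host):
    """Return the last label of a machine name, e.g. '.net' or '.za'.

    Bare IP addresses and names without a dot have no extension,
    so they come back as an empty string.
    """
    if '.' not in host or re.fullmatch(r'[\d.]+', host):
        return ''
    return '.' + host.rsplit('.', 1)[1].lower()


def hit_type(request, index_path='/index.html'):
    """Assumed rule: a request for the index page is 'index', anything else 'other'."""
    parts = request.split()
    path = parts[1] if len(parts) >= 2 else ''
    return 'index' if path in ('/', index_path) else 'other'


# Hypothetical log line, made up for illustration only.
sample = ('ppp12.example.net - - [02/Jun/1997:10:15:32 -0400] '
          '"GET /index.html HTTP/1.0" 200 2048')
record = parse_line(sample)
print(record['host'], host_extension(record['host']), hit_type(record['request']))
```

The regular expression covers only the seven standard Common Log Format fields; a log with extra columns (referrer, user agent) would need a longer pattern.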

Next I imported this info into Access and used a query to extract only the data I want. For example, the screen shot above represents new and old hits from visitors that do not have a .net extension.
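The query itself boils down to a filter on two columns. Here is a minimal Python sketch of that idea, assuming each hit has already been reduced to a record with `extension` and `hit_type` fields; those field names, the `select_hits` function, and the sample records are all hypothetical, not the actual Access table or query.

```python
def select_hits(records, excluded_extension='.net', wanted_types=('new', 'old')):
    """Keep hits of the wanted types whose host does not end in the excluded extension."""
    return [
        r for r in records
        if r['extension'] != excluded_extension and r['hit_type'] in wanted_types
    ]


# Hypothetical records, made up for illustration only.
records = [
    {'host': 'dialup1.example.net', 'extension': '.net', 'hit_type': 'new'},
    {'host': 'www.example.com', 'extension': '.com', 'hit_type': 'new'},
    {'host': 'gate.example.za', 'extension': '.za', 'hit_type': 'old'},
    {'host': 'lab.example.edu', 'extension': '.edu', 'hit_type': 'image'},
]

for hit in select_hits(records):
    print(hit['host'], hit['hit_type'])
```

In database terms this is just a WHERE clause with two conditions; the list comprehension plays the same role here.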

Hey...There are some old friends...Hi Garth, Hi Lu, Hi Leo ;-)

In case you are concerned, no...I can't get (and don't care to get) anyone's e-mail address. I simply recognize things about the machine name.

I sure am having fun!!!

BTW, yes, I could have just gotten a log analyzer, but I need the spreadsheet/database experience. I have only scratched the surface of what is possible!

