OK, since I first wrote this, I've become somewhat more advanced
in my handling of things... or maybe just lazy would be a better
term. At any rate, my log files are now automatically run
through the log parser and the reports are generated for me.
This is accomplished by a ksh script run out of cron (Unix-speak
for a shell script run on a schedule). About a year ago, I
decided to get serious about development work and that I needed
a platform here at home to work on. I installed FreeBSD, and
liked it so much that I switched over completely. It took a long
time before I got back to my "automation development project",
but here it is:
------ for ref-------
Well, I have been working site-wise, but not in a way that is visible to you visitors. My latest passion has been data processing of the logs. Briefly, here is a screen shot of what I have been up to. It shows all of the *NON-AT&T* visitors who hit my site today. DataBase Query [32k .gif]. I don't have anything against my fine compadres at AT&T, but over the last two days there have been over 1000 of you per day! (Brief explanation: the PWP service just went public and I'm still on the top 10 list.)
If you are interested in the story, read on...
I got access to a site which has a nice log feature. Too nice really. They give a record of everything pulled off the site. This includes each and every little .gif and therefore makes the logs HUGE! Eventually a light went on in my head and I set things up so that one of the .gifs on my front page is served from that machine. Every time someone loads my AT&T front page, their browser fetches that .gif, and this gives me a record of every hit on my AT&T site.
OK, so now I have a file in Common Log Format with a *bunch* of data, but it is hard to read, such is its magnitude. What now?
What I did was import the raw data into Excel and devise a few routines to determine what type of a hit it was (e.g., new hit on my site index.html, old hit on a document NOT the index.html of the other site, etc.). I also wrote a little routine to extract the extension from the address (e.g., .net, .com, .za, etc.).
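For anyone who would rather script this step than use a spreadsheet, here is a rough Python sketch of the same idea. It is not what I actually ran (that was all Excel). The names `parse_clf_line` and `host_extension` are made up for illustration, and the sketch assumes each log line follows the standard Common Log Format field order. It does not attempt the hit-type routines.

```python
import re
from typing import Optional

# Common Log Format field order:
#   host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)$'
)


def parse_clf_line(line: str) -> Optional[dict]:
    """Split one log line into named fields; return None if it does not match."""
    match = CLF_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def host_extension(host: str) -> str:
    """Return the last label of a machine name, e.g. '.net' or '.za'.

    Returns '' when the host has no dot or is a bare numeric IP address,
    since neither carries an extension worth sorting on.
    """
    if "." not in host:
        return ""
    last_label = host.rsplit(".", 1)[1].lower()
    if last_label.isdigit():
        return ""
    return "." + last_label
```

Each parsed line comes back as a dictionary of strings, so the extension can be added as one more column before the data goes on to a database.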
Next I imported this info into Access and used a query to extract only the data I wanted. For example, the screen shot above represents new and old hits from visitors that do not have a .net extension.
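The query itself amounts to a simple filter. A minimal Python sketch of the same selection follows, assuming each record already carries an `extension` and a `hit_type` field. Those field names, the `select_hits` function, the "image" hit type, and the sample hosts are all hypothetical stand-ins, not my real data or my real Access query.

```python
def select_hits(records, excluded_extension=".net", hit_types=("new", "old")):
    """Keep records of the wanted hit types whose extension is not excluded."""
    return [
        r for r in records
        if r["hit_type"] in hit_types
        and r["extension"].lower() != excluded_extension
    ]


# Made-up sample records, for illustration only.
sample = [
    {"host": "dialup1.example.net", "extension": ".net", "hit_type": "new"},
    {"host": "www.example.com", "extension": ".com", "hit_type": "new"},
    {"host": "gate.example.za", "extension": ".za", "hit_type": "old"},
    {"host": "proxy.example.com", "extension": ".com", "hit_type": "image"},
]

for record in select_hits(sample):
    print(record["host"], record["hit_type"])
# prints:
# www.example.com new
# gate.example.za old
```

The .net visitor is dropped by the extension test, and the last record is dropped because its hit type is neither new nor old.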
Hey...There are some old friends...Hi Garth, Hi Lu, Hi Leo ;-)
In case you are concerned, no...I can't get (and don't care to get) anyone's e-mail address. I simply recognize things about the machine name.
I sure am having fun!!!
BTW, yes, I could have just gotten a log analyzer, but I need the spreadsheet/database experience. I have only scratched the surface of what is possible!
Feel free to contact me. If you do so through My Contact Page, then you can read my policy on e-mail and privacy. Else, tomwp@_my_last_name_.com. Also, you may use the Site Feedback Form.