I <3 Steve McConnell*
Coding Horror
programming and human factors
by Jeff Atwood

August 23, 2005

Microsoft LogParser

Ask yourself this question: what if everything could be queried with SQL? Microsoft's LogParser does just that. It lets you slice and dice a variety of log file types using a common SQL-like syntax. It's an incredibly powerful concept, and the LogParser implementation doesn't disappoint. This architecture diagram from the LogParser documentation explains it better than I could:

logparser_architecture.gif

The excellent forensic IIS log exploration with LogParser article is a good starting point for sample LogParser IIS log queries. Note that I am summarizing just the SQL clauses; I typically output to the console, so the actual, complete commandline would be

logparser "(sql clause)" -rtp:-1

Top 10 items retrieved:

SELECT TOP 10 cs-uri-stem as Url, COUNT(cs-uri-stem) AS Hits
FROM ex*.log 
GROUP BY cs-uri-stem 
ORDER BY Hits DESC

Top 10 slowest items:

SELECT TOP 10 cs-uri-stem AS Url, MIN(time-taken) as [Min], 
AVG(time-taken) AS [Avg], max(time-taken) AS [Max], 
count(time-taken) AS Hits 
FROM ex*.log 
WHERE time-taken < 120000 
GROUP BY Url 
ORDER BY [Avg] DESC

All Unique Urls retrieved:

SELECT DISTINCT TO_LOWERCASE(cs-uri-stem) AS Url, Count(*) AS Hits 
FROM ex*.log 
WHERE sc-status=200 
GROUP BY Url 
ORDER BY Url

HTTP errors per hour:

SELECT date, QUANTIZE(time, 3600) AS Hour, 
sc-status AS Status, COUNT(*) AS Errors 
FROM ex*.log 
WHERE (sc-status >= 400) 
GROUP BY date, hour, sc-status 
HAVING (Errors > 25) 
ORDER BY Errors DESC

HTTP errors ordered by Url and Status:

SELECT cs-uri-stem AS Url, sc-status AS Status, COUNT(*) AS Errors
FROM ex*.log 
WHERE (sc-status >= 400) 
GROUP BY Url, Status 
ORDER BY Errors DESC

Win32 error codes by total and page:

SELECT cs-uri-stem AS Url, 
WIN32_ERROR_DESCRIPTION(sc-win32-status) AS Error, Count(*) AS Total
FROM ex*.log 
WHERE (sc-win32-status > 0) 
GROUP BY Url, Error 
ORDER BY Total DESC

HTTP methods (GET, POST, etc) used per Url:

SELECT cs-uri-stem AS Url, cs-method AS Method, 
Count(*) AS Total 
FROM ex*.log 
WHERE (sc-status < 400 or sc-status >= 500)
GROUP BY Url, Method
ORDER BY Url, Method

Bytes sent from the server:

SELECT cs-uri-stem AS Url, Count(*) AS Hits, 
AVG(sc-bytes) AS Avg, Max(sc-bytes) AS Max, 
Min(sc-bytes) AS Min, Sum(sc-bytes) AS TotalBytes 
FROM ex*.log 
GROUP BY cs-uri-stem 
HAVING (Hits > 100) ORDER BY [Avg] DESC

Bytes sent from the client:

SELECT cs-uri-stem AS Url, Count(*) AS Hits, 
AVG(cs-bytes) AS Avg, Max(cs-bytes) AS Max, 
Min(cs-bytes) AS Min, Sum(cs-bytes) AS TotalBytes 
FROM ex*.log 
GROUP BY Url 
HAVING (Hits > 100) 
ORDER BY [Avg] DESC

There's an entire book about LogParser, and Mike Gunderloy even started an unofficial LogParser fansite.

Here are a few other articles I found that touch on different aspects of LogParser:

Although LogParser is 96.44% awesome, there are a few things that I didn't like about it:

  1. I really, really need a standard deviation function. Min, Max, and Avg are nice but totally inadequate for determining how variable something is.
  2. The graphing output is cool-- but it's also a MS Office dependency. If you try to graph something on a machine without Office installed, you'll get an error.
  3. The automatic detection of column types in CSV files isn't always reliable. This meant I couldn't graph some numeric values in my PerfMon dumps because LogParser decided they were strings. I couldn't find any way to force a column to be detected as a certain type, either.

Of course, the idea of SQL being used to query a bunch of stuff isn't exactly a new one; Microsoft's WQL (WMI Query Language) is similar but more annoying and less powerful. And you'll get tons of hits if you logically extend this concept to querying HTML, too. Just try searching Google for Web Query Language.

Posted by Jeff Atwood    View blog reactions
« Is DoEvents Evil, Revisited
The User Interface Is The Application »
Comments

About logs, I often find sad that few .NET developers know about the wonderful Log4Net Framework (http://logging.apache.org/log4net)

One cool example is the ability to receive your logs (error level form example) by mail.
You instantly know when there is a production problem!

There are many other ways to configure log4net, one can do almost everything!

Just look at the features page:
http://logging.apache.org/log4net/release/features.html

benjamin Francisoud on August 25, 2005 3:49 AM

And Log4Net is free!

benjamin Francisoud on August 25, 2005 3:50 AM

Yes, but MS LogParser has nothing to do with writing logs; it's all about processing and/or converting them.

Jeff Atwood on August 25, 2005 12:46 PM

gee, finally, the stake in the heart of XML/XQuery/XFoolishness. and it's from Uncle Bill. ah the irony, the sweet irony.

robert on August 25, 2005 10:25 PM

A bit off topic, but I agree: the more I work with XML the less I like it.

It's like someone took a text file, glorified it, then sainted and canonized it as the One True Data Format.

Jeff Atwood on August 25, 2005 10:30 PM

I've been playing with this tool again recently to analyse IIS log files.

Something that I miss is the ability to do sub-queries. The work-around is to just run another query on the output of the first query, but it would be much nicer to have this feature supported.

Maybe I should write up a blog post myself and share my queries with everyone else.

Cheers,
Stuart.

Stuart on January 19, 2006 7:14 PM

Yes but Log4Net is over kill , like most of the Enterprise Library .

I like things simple.

Jackal on May 4, 2006 7:05 AM

The post was really helpful. Thank you

Pradeep on April 19, 2007 6:20 AM

Hi All,

LogParser is a good tool. It comes with a straight-forward dll that you can easily use with your custom .net (or other) applications.

I only have 1 problem. I use it to parse windows media streaming logs. It sees the sc-bytes as integer. But it happens that I parse more data that an integer can hold. The LogParser parses it as a negatif number :(

grtz

Koen on August 2, 2007 2:07 AM

Thanks Jeff, great article. I digging into log files now and this was a pleasant find.

Steve Trefethen on August 7, 2007 11:23 PM

Sorry for chiming in so late but I think I might have a solution for one of the problems you raised with LogParser, personally I've only used it a few times and liked it very much.

You raise the issue of automated detection of column type, I encountered that in the past and I believe it is caused by the MS Text Driver implementation, the same one that does the job for Excel or SQL server when reading from a file. The solution I used back then was using the schema.ini file (read more here: http://msdn2.microsoft.com/en-us/library/ms709353.aspx) it allows you to define exactly what you want the driver to do explictly - and leave the implicit stuff well alone)

Hope this helps
Moshe

Moshe Eshel on September 11, 2007 4:27 AM

I am trying to set up LogParser to output Netmon data (.cap file) to a SQL DB on a daily basis. In order to do this, I need to be able to set up the .sql file to read the .cap file using a wildcard.

So far, I have tried using all the variants that I could think of, find or otherwise attempt to use in my testbed.

Basically, my task will be to do:

Select frame,dateTime,SrcIp,SrcPort,DstIp,DstPort,PayloadBytes,Connection from '<filename>.cap' to myTable
where DstPort = x
or DstPort = y

The difficulty I am having is finding the 'right' <wildcard> to replace <filename>, since the file will be changing daily.

Suggestion(s)???

Thanks in Advance!

Pete on December 12, 2007 2:50 PM

Pete, one idea (haven't tested) is to use SQL to dynamically generate the command to execute. There's probably a better way to do it from a batch file or whatnot, but this should work. Mind the syntax.

declare @sqlstatement varchar(500)
select @sqlstatement = 'Select frame,dateTime,SrcIp,SrcPort,DstIp,DstPort,PayloadBytes,Connection from ''' + convert(char(8),getdate(),112) + '.cap'' to myTable where DstPort = x or DstPort = y'
exec master..xp_cmdshell (@sqlstatement)

mbourgon on January 7, 2008 9:33 AM

nice GUI for using LogParser:

http://www.codeplex.com/visuallogparser

Jeff Atwood on May 5, 2008 8:49 PM

Regarding Standard Deviation in LogParser - I found this LogParser query a while back (forgot who the original author was, sorry). I checked it against SQL's built in STDEV function and returned the same data so I am confident it is accurate - enough so for my needs anyway.

SELECT DIV ( MUL(1.0, SUM(time-taken)), Hits ) as AvgTime, SQRROOT ( SUB ( DIV ( MUL(1.0, SUM(SQR(time-taken)) ), Hits ) , SQR(AvgTime) ) ) AS StDev

deathwagon on September 24, 2008 10:50 AM

I for one would love to find the source code to microsoft log parser some place, and then give it to some kid to do a nice C# version using interfaces and AppDomain.CurrentDomain.CreateInstanceAndUnwrap instead of COM.

Sure it has got its limitations, but it has proven to be just good enough to do the job and a huge time saver when compared with other more "enterprise" solutions.

On the other hand I wonder if Google Mapreduce might be a better
way to go, to avoid the confusion that people can have with
limited SQL subsets.

Or is LOG PARSE to go the way of DEC DATATRIEVE product?
Another cute query system that worked over all sorts of file formats.

George Carrette on January 23, 2009 5:36 AM

Log Parser GUI, Log Parser Lizard (www.lizardl.com)

lizard on January 23, 2009 1:07 PM
Content (c) 2009 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.