I <3 Steve McConnell*
Coding Horror
programming and human factors
by Jeff Atwood

December 3, 2008

The Problem With Logging

A recent Stack Overflow post described one programmer's logging style. Here's what he logs:

INFO Level

  • The start and end of the method
  • The start and end of any major loops
  • The start of any major case/switch statements

DEBUG Level

  • Any parameters passed into the method
  • Any row counts from result sets I retrieve
  • Any datarows that may contain suspicious data when being passed down to the method
  • Any "generated" file paths, connection strings, or other values that could get mungled up when being "pieced together" by the environment.

ERROR Level

  • Handled exceptions
  • Invalid login attempts (if security is an issue)
  • Bad data that I have intercepted forreporting

FATAL Level

  • Unhandled exceptions.

I don't mean to single out the author here, but this strikes me as a bit .. excessive.

Although I've never been a particularly big logger, myself, one of my teammates on Stack Overflow is. So when building Stack Overflow, we included log4net, and logged a bunch of information at the various levels. I wasn't necessarily a big fan of the approach, but I figured what's the harm.

Logging does have a certain seductive charm. Why not log as much as you can whenever you can? Even if you're not planning to use it today, who knows, it might be useful for troubleshooting tomorrow. Heck, just log everything! What could it possibly hurt?

Oh, sure, logging seems harmless enough, but let me tell you, it can deal some serious hurt. We ran into a particularly nasty recursive logging bug:

  • On thread #1, our code was doing Log (lock) / DB stuff (lock)
  • On thread #2, our code was doing DB stuff (lock) / log stuff (lock)

If these things happened close together enough under heavy load, this resulted in -- you guessed it -- a classic out-of-order deadlock scenario. I'm not sure you'd ever see it on a lightly loaded app, but on our website it happened about once a day on average.

I don't blame log4net for this, I blame our crappy code. We spent days troubleshooting these deadlocks by .. wait for it .. adding more logging! Which naturally made the problem worse and even harder to figure out. We eventually were forced to take memory dumps and use dump analysis tools. With the generous assistance of Greg Varveris, we were finally able to identify the culprit: our logging strategy. How ironic. And I mean real irony, not the fake Alanis Morrissette kind.

Although I am a strong believer in logging exceptions, I've never been a particularly big fan of logging in the general "let's log everything we possibly can" sense:

  1. Logging means more code. If you're using a traditional logging framework like log4net, every logged event is at least one additional line of code. The more you log, the larger your code grows. This is a serious problem, because code is the enemy. Visible logging code is clutter -- like excessive comments, it actively obscures the code that's doing the real work in the application.

  2. Logging isn't free. Most logging frameworks are fairly efficient, but they aren't infinitely fast. Every log row you write to disk has an overall performance cost on your application. This can also be tricky if you're dissecting complex objects to place them in the log; that takes additional time.

  3. If it's worth saving to a logfile, it's worth showing in the user interface. This is the paradox: if the information you're logging is at all valuable, it deserves to be surfaced in the application itself, not buried in an anonymous logfile somewhere. Even if it's just for administrators. Logfiles are all too often where useful data goes to die, alone, unloved and ignored.

  4. The more you log, the less you can find. Log enough things and eventually your logs are so noisy nobody can find anything. It's all too easy to bury yourself in an avalanche of log data. Heck, that's the default: any given computer is perfectly capable of generating more log data than any of us could possibly deal with in our lifetime. The hidden expense here isn't the logging, it's the brainpower needed to make sense of these giant logs. I don't care how awesome your log parsing tools are, nobody looks forward to mining a gigabyte of log files for useful diagnostic information.

  5. The logfile that cried Wolf. Good luck getting everyone on your team to agree on the exact definitions of FATAL, ERROR, DEBUG, INFO, and whatever other logging levels you have defined. If you decide to log only the most heinous serial-killer mass-murderer type problems, evil has a lot less room to lurk in your logfiles -- and it'll be a heck of a lot less boring when you do look.

So is logging a giant waste of time? I'm sure some people will read about this far and draw that conclusion, no matter what else I write. I am not anti-logging. I am anti-abusive-logging. Like any other tool in your toolkit, when used properly and appropriately, it can help you create better programs. The problem with logging isn't the logging, per se -- it's the seductive OCD "just one more bit of data in the log" trap that programmers fall into when implementing logging. Logging gets a bad name because it's so often abused. It's a shame to end up with all this extra code generating volumes and volumes of logs that aren't helping anyone.

We've since removed all logging from Stack Overflow, relying exclusively on exception logging. Honestly, I don't miss it at all. I can't even think of a single time since then that I'd wished I'd had a giant verbose logfile to help me diagnose a problem.

When it comes to logging, the right answer is not "yes, always, and as much as possible." Resist the tendency to log everything. Start small and simple, logging only the most obvious and critical of errors. Add (or ideally, inject) more logging only as demonstrated by specific, verifiable needs.

If you aren't careful, those individual log entries, as wafer thin as they might be, have a disturbing tendency to make your logs end up like the unfortunate Mr. Creosote.

Posted by Jeff Atwood    View blog reactions
« Tending Your Software Garden
Blu-Ray: Is It Time? »
Comments

I'm with you on logging exceptions. But I am definitely not a fan of logging *everything*. Of course, I do write smaller apps, but when it comes to debugging I rely on my debugger, not logging.

I disagree with everything that should be logged should be in the UI, but that might because I only think exceptions should be logged.

Anyway, good post. It is always interesting to see things that are VERY very different than what I normally do.

Sam McDonald on December 4, 2008 1:22 AM

If you have to perform any post-mortem debugging on remote systems, you better be logging as much as you need to clearly recreate the sequence of events that led up to the crash.

Also, log4net is already thread safe due to its own internal locking. Even from your description, I can't imagine a scenario where those internal locks could affect your app. Why on earth would you add your own nested log locking mechanisms on top of that? You're asking for obscure deadlocks...

Loren on December 4, 2008 1:29 AM

> Why on earth would you add your own nested log locking mechanisms on top of that?

We didn't. Referencing one object caused that object (hey, database object abstraction magic!) to reference another object which wrote to the logs. It's extremely timing sensitive, would only happen about once a day on a fairly heavily loaded public website.

Jeff Atwood on December 4, 2008 1:32 AM

I agree on most things, especially that too much logging counters productivity.

In the example, I think a major flaw is that INFO seems to be in the wrong position, according to most ligging tools it should be between DEBUG and ERROR/WARNING. The things that are logged with the INFO level now should normally be logged at the TRACE level, which generally is at the position that INFO holds now.

Naturally no items that would only interest developers should be logged at the INFO, WARN or ERROR levels.

Haf on December 4, 2008 1:32 AM

Unhanded exceptions are called bug reports in my book though, so I don't log them but rather send them to the bug repository.

Goran on December 4, 2008 1:34 AM

I have also my own rules:

FATAL Level
Unrecoverable Errors (the application will end after this log)

ERROR Level
Recoverable Errors (the application is not going to stop)

NORMAL Level
Here is the main trick. I like to show what is logged at this level to the end user in a way he can read what the application is doing. Is just like if the application could tell to the user what is doing.

Example:
Reading application file ...
... done.
Parsing the application data ...
... done.
Found 3 items to process.
Processing item 1.

You can not put here a lot of information, just the main things the application is doing, like if you write your own diary.

HIGHER LEVELS
Here I put as much info as i want. It can be excesive if you want. 99% of time the information will be filtered out because at run-time you have to work in NORMAL debug level.

This is how I see logging, completely agree to the point that you should show to the user what you're logging.

Nacho Soler on December 4, 2008 1:36 AM

If you've ever supported someone else's code, you begin to appreciate logging. What's probably more important is good exception handling, which I think you're kinda saying.

Julian Simpson on December 4, 2008 1:41 AM

I love the example you gave. Shows how bad we are at *really* understanding the complexity of the program.

Although I occasionally log, I try to be flexible about it. There are scenarios when I feel that debugging would be more effective than logging. There are opposite scenarios as well.

There is, I think, an even deeper issue here: being excessive about a practice may get you into trouble. It seems that it happens all over the place: static type safety, monkey patching, factories, etc. Here's one example WRT compactness http://javadots.blogspot.com/2008/11/shorter-is-better.html.


Itay Maman on December 4, 2008 1:43 AM

Sometimes logging can be useful as a tool to prove to the customer that he's doing something wrong.

Our programs interface to large PBX's and a host of other telephone-related systems. Looking into the log, peering over the different binary exchanges between the systems, and being able to show to the customer that it is, in fact, his PBX that is doing something wrong, has saved us hours and days of unbillable support time.

I'd say that in any situation where you're communication between different systems, log what's happening. If not, you'll have no clue which system that actually malfunctioned.

Mats Gefvert on December 4, 2008 1:47 AM

Yeah, I have to agree with you Jeff. That is just way too much information. Your system could crash under the weight of your logging infrastructure. I don't see how this would scale.

Charles on December 4, 2008 1:48 AM

Help me out here. I don't understand how a logging system can cause a deadlock.
I assume any logging framework would perform the following steps:

1. Check log level
2. Format log message
3. Acquire lock
4. Output log message
5. Release lock

I don't know log4net, but from your description, I gather that it performs step 2 after step 3. Is that what happens? It sounds weird. Why would it do that?

itsadok on December 4, 2008 1:53 AM

>If it's worth saving to a logfile, it's worth showing in the user interface
Here I have a few problems, A few years back (around 2006 or so), I worked with Pharmaceutical software (21CFR-Part11 anyone?); and we used to bubble major errors to the user. Typically we'd get a call from a customer saying they had a weird bug(bear in mind this is software they'd paid millions for their couple thousand seats). Asking what it said on the screen usually turned up blanks (and often non-repeatable-yet-significant bugs turned out this way).

I'm all for bubbling up to the interface so that it can be read and noted by an intelligent user, but so far I've had no luck. Until that unhappy state of affairs changes, logs will have to do.

Don on December 4, 2008 1:55 AM

The problem I've seen is that people are off creating new logging frameworks. Things way more complex than they need to be, with lots of little moving parts, but not a lot of sense...and there's so many of them, none of them get tested.

It's yet another recreation of the wheel. On Unix-like systems, we've got syslog. Three calls: openlog, syslog, closelog. One call to open early on, one call to close nicely, and just syslog() whenever you want. The manpage, while pithy, is usually enough to figure out the levels.

LOG_EMERG: system is unusable
LOG_ALERT: action must be taken immediately.
LOG_CRIT: critical conditions.
LOG_ERROR: error conditions.
LOG_WARNING: warning conditions.
LOG_NOTICE: normal, but significant, conditions.
LOG_INFO: informational message.
LOG_DEBUG: debug-level message.

And you can suppress these messages system-wide by configuring syslogd, or if you want to do it yourself, you can do it with setlogmask().

So you say "But that isn't portable!", to which I respond, "Neither is 'log4net'"

syslog is specified in POSIX.1-2001. log4net, log4java, log4cpp, .NET, and much of this week's Windows API isn't really specified anywhere.

transiit on December 4, 2008 1:59 AM

You know, if you have multi-threaded execution, you eventually have to make sure that your logging is lock-free, even if you don't think it's possible it could deadlock.

Our application basically dumps each log message to a threadsafe but lock-free linked list, then flushes the list to file every x ms.

A scheme like that can make quite a difference in a multi-threaded app that *needs* to log frequently, compared to a naive scheme that synchronizes on each log call.

Christoffer Lern on December 4, 2008 2:07 AM

There is nothing worse than poorly implemented logging code! We log alot as we are doing data transporation and ETL. We treat logging as the cross-cuting concern it is an apply it transparently using AOP. Logging code does NOT belong in your core logic and only poor or lazy developers haven't sorted themselves out

Chris Canal on December 4, 2008 2:11 AM

Argle! The add I got at the bottom of this post was for Lighthouse.
To my understanding, this is a competitor to FogBugz... -_-'

gabouel on December 4, 2008 2:12 AM

Another aspect of logging levels are how/where it is configured and how easy it is to change. If your configuration management process means it takes 2 weeks of paperwork and the CIO to sign in blood to make a change from "Error" level to "Info" level to get some more info about a problem then those other levels may as well not exist. Being able to dynamically change these levels at run-time without recompilation or application restarts are key.

JosephCooney on December 4, 2008 2:14 AM

The code footprint of logging can be greatly reduced by using AOP techniques. It's pretty much the canonical example of aspect oriented programming. A simple method boundary aspect (straight from the postsharp site) looks something like:

public class SimplestTraceAttribute : OnMethodBoundaryAspect
{
public override void OnEntry( MethodExecutionEventArgs eventArgs)
{
Trace.TraceInformation("Entering {0}.", eventArgs.Method);
Trace.Indent();
}
public override void OnExit( MethodExecutionEventArgs eventArgs)
{
Trace.Unindent();
Trace.TraceInformation("Leaving {0}.", eventArgs.Method);
}
}

In postsharp these aspects can be projected onto your types conditionaly at compilation time using either method/class level attributes or assembly level multicast.

[assembly:SimplestTraceAttribute (AttributeTargetTypes="TypesThatIWantToLog;AndSomeOtherTypes")]

Having these as aspects also allows you to produce versions on the fly with higher levels of instrumentation.

There are also dynamic proxy based techniques that work well in combination with dependency injection to achieve the same result.

I agree with your overall point though,knowing *what* to log is a fine art; one that I am more than a little amateurish at myself.

Ryan Roberts on December 4, 2008 2:15 AM

It all depends on why you have logs in the first place. Is it for debugging, troubleshooting occasional errors / exceptions, helping to analyze the program flow, give extra technical information to your customers, etc ...

In general the more complex the system gets, the more you will need logs to help see what happens and where, but that mainly interests developers.

Users would be more interested in why their instance is not getting or sending emails properly, in general communication with other systems that are configured by the user.

jwickers on December 4, 2008 2:17 AM

Also, filesystem based logging is not a good idea in systems you need to scale heavily, asynchronous stuff like UDP or message queuing are a better bet. Look ma, no locks.

Ryan Roberts on December 4, 2008 2:18 AM

First of all, there are two activities: logging and tracing.

Logging is for end user, it should be written in a language that the user understands and it should be informative for the user.

Tracing is for developers and one can put any internal debug informating in the file. I agree that too much information here can render tracing useless.

As for perfomance issues, the best solution is to delegate actual logging to an external application or service (syslog on Unixes is a good example).

Alexey Bobyakov on December 4, 2008 2:20 AM

Professionally, I develop technical software, i.e. software that controls a piece of hardware, which gets shipped to a customer. In our case, when a customer submits a problem report, logging is the only mechanism we have of getting some amount of technical information to tell us what the user was doing, and what went wrong.

Jeroen on December 4, 2008 2:29 AM

Nice post

I totally agree with your opinion, for myself I don't use log4net, because I found it to complicated. I wrote my own logging web service that I have implemented in all my webapps. It can log 2 types of logs, exception and action logs. After every request to the backoffice I do a total action log (contains input, ouput and authentication) or an exception log contains (input, exception and authentication) I also write the loggedby identifier and the duration time.

So basically I always do a total log write, but beside that I wrote some SQL jobs that handles the log entries daily, weekly, monthly and yearly for statistics or cleanup, based on the logged by identifier.

So the log will be (not) inserted into the statics based on the identifier and every log is deleted periodically.

For my opinion I think this is the best approach, log as much as possible, but do an automatic cleanup periodically.

Freek on December 4, 2008 2:31 AM

Can you elaborate on your example problem? I too am struggling to understand what causes a thread to acquire a log lock and then do anything that causes it to wait on another lock before releasing the log lock. Is it writing the log into a database?

Weeble on December 4, 2008 2:34 AM

God, how many times I was saved by logs. Then again, I'm a trouble shooter. My speciality is discovering what is wrong, whether it's a design issue or a bug, and determining what the parameters for fixing it are. Not that anyway pays me to do that specifically, of course -- they just ask me to do it beyond my normal day to day tasks.

There is a certain method to my madness. At the DEBUG level I want to know the I/O. A surprising amount of bugs result from one not sending what he thought he was sending, or receiving what he thought he was receiving. At the INFO level -- if I used such a distinction from DEBUG -- I put flow control. Every method being called, and every flow control fork being taken (including loops, of course). I think ERROR and FATAL are self-evidend, and yes, they should be communicated through the interface too -- if the program has an interface at all.

Now, the thing Jeff seems to be forgetting is that this error logging framework is intended for debugging problems when you have access to two things: 1) the problem report and; 2) the error log.

When you are tending your own web site this does not apply, and if you happen to use open source and be willing to discover why Apache is behaving erraticaly, then you might not ever be on the user side either. But when you are tending a commercial application and CommercialApplication (tm) keeps acting in unexpected ways, and, specially, when the support is not allowed access to your installation, then The Log is all there is, really.

Daniel on December 4, 2008 2:38 AM

I think you're exaggerating. I use logging a lot, and it's never caused me any deadlocks.

I do agree with points 1 and 2. However,
> if the information you're logging is at all valuable, it deserves to be surfaced in the application itself,
Of course not! If you use logs for debugging, it's no reason to build a special page just for showing them. And even if you do, this doesn't make them non-logs.

> I don't care how awesome your log parsing tools are
Ctrl-F in my editor works just fine

> Good luck getting everyone on your team to agree on the exact definitions of FATAL, ERROR, DEBUG, INFO
Easy! An exception is ERROR, everything else is DEBUG. I'm not a big supporter of log levels myself, but that's not an argument against logging.

Lev on December 4, 2008 2:58 AM

My app keeps a length-limited in-memory log of recent events, at quite a high level of detail. It attaches the contents of that log to crash dumps. The extra context that the log provides over and above the instantaneous information provided by the crash dump is really useful - well worth a few extra lines of code and a few extra microseconds.

Richie Hindle on December 4, 2008 3:20 AM

Why is your logging system acquiring a lock? What about simple logging, like to a file, a socket or a fifo/pipe, needs to be locked? Are you not opening the files with O_APPEND (or whatever the windows equivalent is)? Are entire log entries not being written with a single write call?

Andy on December 4, 2008 3:20 AM

After looking at the original post on stackoverflow, it appears that this comment was made with an additional caveat that you have failed to mention in your post. In fact a lot of people posting comments seem to be looking at this problem from various different perspectives.

From the original comment, "I tend to be a little verbose on logging because when you have to change to debug mode, the more info you have, the better."

This poster doesn't seem to be considering applying this level of active logging to the live system, but instead when it comes to him attempting to debug the solution. As some have commented, the log becomes a diary of what was occurring at any particular point before an error was encountered.

Maybe the question we need to ask is, "The Problem With Logging As A Method Of Debugging". Since it seems that there is a common attitude amongst some developers to just as easily replace one with the other.

Simon on December 4, 2008 3:22 AM

Yes, there is a balance. I'm trying to maintain some code, and it feels like 50% of the source lines are involved with writing to the log. This makes the code unreadable.

There's a bug somewhere, and there's so much gunf in the log already, that I feel the need to create a separate log to write stuff I actually want to know to help diagnose this problem. I'll take my own logging system out again when the problem's resolved.

Scott Langham on December 4, 2008 3:23 AM

I can depend on the type of app. In your case as a website the desire to log is low. However I work in transport. We have literally tens of thousands of machines world wide. If a bug report comes in we have very little chance of getting a good bug report from the original driver.

A good log is the difference between having a chance of identifying the issue and having no hope in hell.
And yes we should catch as many of these issues at development and we try to but one can never be sure if the hardware that works well in Europe will work in the same way in say: Kuwait. So logging and even a level of comprehensive logging can be v.useful to some.

Jax on December 4, 2008 3:24 AM

Agree, but I think something is missing here: _why_ logging.
In most cases, logging is made with debugging purposes.
If a log suits that goal, then it is OK. Otherwise, it is an excessive logging.

Of course, this is not something easy at all to determine - no one knows of bugs in advance :-)

Roberto Liffredo on December 4, 2008 3:28 AM

I'm with andy on this one, why are you keeping the file open for extended periods of time? and why use a database connection to log things? Hey, why not use a deamon (or service in the windows world) to handle the logging, that way you will just have to pass the error to it and it handles the rest. This would give you a minimal load on your application too. Hey, you could even push it to it's own box. How is that for scalability!

I'm for logging as much as possible. And the only problem I see with logging is I/O and hard drive space.

Thomas Winsnes on December 4, 2008 3:34 AM

For me, simple runtime control of loggers is the killer feature in frameworks like log4j/log4net. Changing log levels becomes more like opening/closing a tap.

I use debug level log entries during development to verify process flow. In production our log entries are turse but the ops guys can drop to debug for a given package and get a clear rundown of the order of events.


Dave Patterson on December 4, 2008 3:36 AM

Assertions would serve the same purpose as logging but with the added benefit of helping to document the assumptions you've made rather than obscuring them. Heavy logging makes the original intent of the code very difficult to detect. I have yet to see a code base where it worked well.

Jack Hughes on December 4, 2008 3:40 AM

Im with Jeff on this, sure logging CAN help, but how often do you Really open up a log file and look through it?

When i was working on a site where we had a lot of users doing some i.e. transactions between accounts etc. we needed good logging, however, the logs were easily over-logged and didn't help at all, we had to try forcing the site into the same scenario to get the same log-output. Even though the logs were pretty clear and logged all data possible to log.

So, logging can be good, but often is excessive! I'd go with the "Exception only"-logging, and maybe store a trace.

Filip Ekberg on December 4, 2008 3:50 AM

> this resulted in -- you guessed it -- a classic out-of-order deadlock scenario

Wait -- does this log component stupidly acquire a long-lived lock, or did you use it in a stupid way?

If your logging is deadlocking you, there is stupid going on. A logger only needs a lock to serialise information going into the log, so ...

Option 1: use one log file per thread. No locking required.

Option 2: dedicate one thread to logging. Other threads enter a critical section just long enough to write to a buffer; the log thread commits the buffer to disk when it has the time.

Option 3: sacrifice performance and enter+leave the critical section on every log call.

Option 4: do something stupid and get deadlocked.

> Logging means more code

Then use a language or addon (like AOP) that can inject logging automatically.

> it actively obscures the code that's doing the real work

You're assuming logging doesn't do real work. 8 years of maintaining applications that run at customer sites in other cities suggest otherwise.

> If it's worth saving to a logfile, it's worth showing in the user interface.

No. The user interface is there to interact with the user. Event logging is there for the administrator to get a summary of what's going on (alerts, errors, warnings, changes in configuration, access to sensitive data, etc). These all relate to the operational requirements of the application.

What you are describing - primarily - is debug logging: dumping enough stuff in enough detail to help a developer figure out what went wrong, and why.

> Logging isn't free.

No, it's not. Which is why most sensible logging subsystems will have a runtime option to change the log detail level, and will start with a quick check to determine if the log statement in question needs processing or not. In C/C++ one would use macros for this, to avoid a function call if the particular log statement isn't going to be executed.

> The more you log, the less you can find.

YMMV. This depends substantially on how your log is presented (does it have a way to show stack depth, for example), what you are logging, how consistently you are logging, and whether you have tools to parse the logs.

> it's the brainpower needed to make sense of these giant logs.

Or the brainpower needed to log sensibly. Here are some approaches:

1. Don't allow a single concern or transaction to cross a thread boundary. Use one log file per thread, or include the thread id in the log, and it is easy to reconstruct the transaction.

2. Tag each transaction with a system-wide unique ID, and make sure it gets logged. It is now easy to reconstruct the transaction.

> Good luck getting everyone on your team to agree on the exact definitions of FATAL, ERROR, DEBUG, INFO, and whatever other logging levels you have defined.

It doesn't matter. So long as there is agreement on what goes to the event log (read by the administrator, not the developers) and what goes to the development log (development only) ... and that's a customer requirements problem.

The whole problem here is that you see logging as some optional extra. That's kindof like a civil engineer dismissing drainage as an optional extra - sure it's not the core of the building, but you're going to end up knee-deep in shit if you leave it out! Design to accomodate it from the outset and it will perform as expected.

Twylite on December 4, 2008 4:28 AM

I was also not a big fan of logging, but two years of work in the company that produces software which works with unstable custom hardware proved me wrong. It's better to log everything I can than waste time on some obscure bug.

I also support the guy who recommended to use AOP for decreasing amount of code written: method entrance, exit and exception logging could be easily done this way.

Alex on December 4, 2008 5:03 AM

On the one hand:
Logging is great for multithreaded scenarios, where debugging can't help you because of the context switching.

On the other hand:
Seeing a method that looks like this:
public void DoSomething() {
m_Logger.Trace( "Entering DoSomething()" );
object result = CallDatabase();
if( object == null )
{
m_Logger.Trace( "result is null" );
}
else
{
m_Logger.Trace( "result is not null" );
}
m_Logger.Trace( "Exiting DoSomething()" );
}

This is just awful, and for so many reasons, and this doesn't include exception handling.

Regarding sending messages to the GUI, this only is practical if there is also a way to save that data out to a file or to print it. Asking a user who has called support to start reading error messages, no matter how well formatted, is a recipe for missing something important.

Schmuli on December 4, 2008 5:18 AM

Anyone work with Sharepoint and try to dig through the logs to find out what went wrong? It is an absolute nightmare. Even when you look at the screen to limit the event categories, you're presented with a choice of about 150 items...
Anyways...
I've used logging in applications where I am not in control of the infrastructure, yet still need to debug problems. My favorite approach is to have logging configurable so that if you're encountering a problem, turn logging on, reproduce the problem, find out what went wrong, and then turn it back off.

J on December 4, 2008 5:26 AM

Log the exception stack trace.
Log any critical database changes according to the business requirements.
Log access to protected or critical systems.

Mac on December 4, 2008 5:36 AM

I generally only log information that would help me reproduce user actions (such as button clicks and information entered). I think logging the start and end of every logical block of code is extremely excessive. Now granted you may need to do that in some rare debugging instances, but in general you shouldn't have to do something like that.

Billkamm on December 4, 2008 5:38 AM

We have been using for a long time a combination of logs and traces.

1) Logs are those very important exception events that you want/need to keep. They are stored locally on an embedded device and they might as well be small and clear. The conventions for ERROR, WARNING and INFO must be crystal clear and in some cases it takes an approbation or a review to have the "right" to log something. The other problem with logs is recurring conditions. You do not want to end up with your whole log filled with "timeout exception, could not connect to xxx". Thus, it takes extra attention to only log when the condition occurs, not when it repeats, and also when it clears.

2) The second level are the traces. They behave similarly than the "OutputDebugString" function. Those are run-time trace information about the internal workings of the system. If you are not there to see them, you loose them. Each component that traces register itself with the system and can creates as many level of traces he wish. We end up with a tree of checkboxes for all the tracers in the system and you check only those you wish to see. Because we know that executing all that code takes extra time and mangles a bit with the multi-threading timing, trace code is conditionally called "only" if someone has checked the relevant checkbox in the tree. Thus, no CPU or very very little is spent executing those. We have some nice mechanisms of doing this for all kind of situations (even decoding complex objects can be conditionally run) without obfuscating the relevant source code.

We found that this second level of traces was VERY helpful on deployed systems to identify problems and fix issues that sometimes were configuration related or inter-compatibility issues with the external systems. It was also essential in finding complex bugs/problems that can be reproduced on live systems (but may be difficult to reproduce in house) while having a minimal impact with the live system (read attaching a debugger and tracing step-by-step).

philibert on December 4, 2008 5:46 AM

I'm into embedded and real-time things and into applications that need to run for very long times without being interrupted. I usually reserve a buffer somewhere in memory and log copious details to it. When the buffer gets full, I write out anything above a certain level to permanent storage (if available) and purge the rest. If the application crashes, I can dump out everything that's in the buffer and get all the gory details.

Logging to permanent storage (note I'm not using the word "disk") or through syslog tends to be unacceptably slow on the systems I work on while more and more memory tends to become available.

Philip Paeps on December 4, 2008 5:50 AM

Logging is important for the following reasons:

1) Users lie. They don't mean to, but they report what they "think" happened and what steps they remember taking. Logs are unambiguous and don't distort reality to match their preconceptions.

2) Some applications need to log events for regulatory reasons. Application logs serve as proof that certain events happened.

3) Applications don't *always* behave the same in the debugger as they do in a production-like environment. The additional overhead of a debugger can change the behavior and/or timing of control and data flow in a way that makes the problem "disappear".

My own informal approach to logging is to map the following to the log4net levels:

FATAL - An unrecoverable error just occurred and this is the last-gasp attempt to document what happened before everything went boom.

ERROR - A recoverable hard error (network loss, database constraint violated, etc.) just occurred and this is an attempt to identify where it happened and what caused it for later investigation.

WARNING - A recoverable semantic error (invalid login credentials, invalid application operation attempted, etc.) occurred and was corrected.

INFO - Something interesting happened and it needs to be documented. Interesting depends on the type of application and the customer requirements.

DEBUG - This is the data I need when I get the call at 2 a.m. and need to diagnose the problem immediately. It's typically state information that will allow me to identify exactly where the data/flow stopped being good and transitioned into crappy.

I also completely misuse the NDC stack in log4net by pushing the routine name onto it when I enter the routine and popping it off just before I exit. Sue me. I know that's not what it was intended for, but it allows me to see at a glance where I am in the call stack and has occasionally identified events recursively calling each other.

The whole point of non-regulatory logging is to capture enough information to shorten the time and mental distance between a problem appearing and knowing enough about it to start resolving it. It's just as much of an art as debugging is and is learned largely by trial and error (anybody else ever run out of disk space because their log files got to large, or am I the only one?) and resistant to fully automated approaches. For me, it all goes back to predicting what information I will need when I need to resolve an issue rapidly and can't depend on eyewitness accounts of what happened.

If a tree falls in the forest and it wasn't logged, can you prove it?

Gilbert on December 4, 2008 6:02 AM

I have a logging class that is called as if I'm going to do far too much logging, but only actually writes entries above a level set as a constant. Most of the time it logs only fatal errors, but at the flip of a constant it becomes really anal and logs everything it gets.

There's obviously some wasted processing going on most of the time, and I can't get any great detail until after the problem has been reported, but I find the pros easily outweigh the cons. Most of the benefits of light/no logging, and a simple tweak of a constant to get all the benefits of heavy logging.

Schmoo on December 4, 2008 6:05 AM

Our software doesn't use logging to log errors - if there is an error, the user wants to know without looking into the log file, so it will display the error to the user. Our apps log if a user comes and says "I did this and that and then bang, error" and we try and it works flawlessly. Then we need the log. We don't need it to see what error the user got, it is on the user screen in bold letters, we know the error. We also don't need it to find out what the user did, he just told us. But since we don't get the same error doing the same thing we need to find out what circumstances led to this error. And if we don't log everything that goes on in the system, we will never have a chance to find the circumstances. Exactly the one little piece of information we may not log is exactly the rare, obscure situation that caused the error for the user. So log as much as possible.

Regarding the log becoming too big to find something, just write a tool that filters it. Just because everything is there in the log doesn't mean your log viewer needs to display everything! Make it filter by log level, make it filter by which part logged the info, make it filter by time stamp, by string, whatever. With a good set of filters you can extract exactly the info you need to know from the log, hiding all the stuff you don't care for at the moment. But what is not in the log is not there, it is information you don't have. And we can't always modify hundreds of code lines and release an update for our software products whenever users get a certain error and we are not logging enough to find out what's going on.

Mecki on December 4, 2008 6:06 AM

Good stuff.

Logging can be addictive at first, but as you point out there is no free lunch. With all our fancy abstractions in code, eventually it still usually boils down to a *physical* database read/write.

Having the ability to turn logging on/off at different levels is of course highly valuable.

Steve on December 4, 2008 6:07 AM

Regarding deadlock, yes it can happen with Log4Net and I ran into the exact same problem Jeff described, but almost 3 years ago.

You can argue back and forth about what the real problem is, but Log4Net treats the "logger" as a singleton and serializes (locks) access the the output file such that each record is written atomically. Otherwise you could get half a log statement written and then another one written in the middle of it.

If your DB layer also has serialized access to a shared resource (i.e. a connection) then the scenario Jeff discussed will cause a deadlock. But not in the DB where it would be detected and interrupted, in your code where it will sit idly doing nothing.

I'd say the problem is not understanding what's going on in the details of the APIs you are using (which is why it happened in my case).

David Avraamides on December 4, 2008 6:09 AM

Let me put on my SysAdmin hat..

Code [for logging] is the enemy? Wow does that sound like developer speak.

While sorting through a gig of log data on an application you wrote sounds like a heinous job to you, not having any log data other than exceptions sounds even worse to me. While I dread the task of being handed a volume full of Websphere application logs and being told "Find the place where the hacker broke in", I dread not having those logs even more.

As a former boss of mine, one of the few who did his job very well, once said "Sometimes the message is unimportant, but how many times it appears is very important."

How hard is it to write a decent logging object so your code doesn't get crowded with catch bracketology? [Answer: not hard]

Ryan C. Moon on December 4, 2008 6:16 AM

Sounds like a poor design in the logging software. I have written logging code for decades and never had to hold a lock to get it done. With routines that can be intrusive in other code you have to work hard to keep things local.

AL on December 4, 2008 6:28 AM

I love the way people are comparing debugging to logging...when the example that Jeff points out would never be caught while debugging. It's these high multi-user systems where sometimes the only way to figure out what's wrong is to have "good" logging. Excessive logging is bad...just as bad as no logging...but I've got to believe there's a happy medium in there somewhere. The world isn't black and white.

fishstick_kitty on December 4, 2008 6:35 AM

It's already been mentioned but its work repeating.. You usually don't need to log synchronously. As long as they are still logged in order and you can handle the possible memory load introduced by a potential backup on the queue, why not log asynchronously?

log4j has a rather handy AsynAppender which you can wire up multiple appenders for it to dispatch to:

http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/AsyncAppender.html

I'm sure log4net has something similar. If not, it should be fairly simple to roll your own.

Mike on December 4, 2008 6:36 AM

i don't understand how you could get a deadlock. if you are only holding the log-lock temporarily then you can't get deadlock. maybe dotnet is doing some unsafe lock coarsening. ie converting:

lock log
unlock log
lock db
lock log
unlock log
unlock db

to

lock log
lock db
unlock log
unlock db

or more likely log4net code has buggy internal locking and is deadlocking by itself without the assistance of the db locks.

hast on December 4, 2008 6:39 AM

I log info, warnings, and exceptions.

Info is just for various 'break points' in a program.

Warnings is when something happened that is accounted for, but that users should be doing (like typing in random query strings)

Exceptions are, exceptions.

Mal on December 4, 2008 6:42 AM

I agree with most of the article, however there are critical parts of the code (i.e.: banking transactions or a part of the code known for its instability when dealing with external systems, like CORBA interfaces) that must be logged to the finest detail to allow later debugging.

Something never mentioned is the fact that some of the debugging can be disabled (the all-too-famous isDebugEnabled) to prevent cluttered code.

However, at least some of us use log4j, log4net and such logging frameworks: I've witnessed some of the worst no-nos in the industry: using System.out.println() to debug Java code. Of course, it means such a burden to the disk (especially for high-traffic applications) that just by changing those lines by a logger.debug() we can cut the app response times by half (sometimes even more).

everac99 on December 4, 2008 6:47 AM

Where the system is deployed makes a difference. A server in your data center can be subjected to probing and analysis to track down a problem. An embedded system deployed halfway across the world cannot be.

Denton Gentry on December 4, 2008 6:49 AM

It all depends on the application and scenarios to me. If I have a middleware app like some kind of rules engine as opposed to a GUI frontend, logging becomes critical (we found MQSeries logging to be cheap and flexible). Even in a GUI situation, some use cases and problems can be extremely tough to reproduce and logging can be a real lifesaver as to what really happened when user X produced strange behavior Y. Everything in moderation and the right tool for the right job.

Geoff on December 4, 2008 6:53 AM

I use tracepoints instead of Trace statements in my massive Windows client project. Tracepoints do not affect your code and can be enabled or disabled all at once. Mostly I use them to trace the program execution using the message: Function: $FUNCTION, Caller: $CALLER. But I still use logging for remote web applications.

Robert S. Robbins on December 4, 2008 6:57 AM

Logs are not necessarily useful to show in the UI. That depends on who your users are. If Stack Overflow has an error parsing a config file, not a single one of the users will care about that. If Stack Overflow has a problem applying reputation because something is inconsistent, no user wants to see it.

The admins, however, do want to know this stuff, and they have access to the logs. That IS the UI for the logs. Also, because you have only one consumer of your program, you can get away with minimal logging because you can access the production servers and get memory dumps and whatnot. But when you have dozens of clients, and you're working with a system that requires a certain amount of reliability, logging lots of stuff could be invaluable. For example, we log responses from our ecom providers. This proved to be critical when something went wrong with our database and the responses weren't recorded properly. We were able to reconstruct the responses from the logs and fix the data. Saved our butts.

Also, it can be extremely helpful to log certain kinds of things under certain circumstances. Our framework allows us to log SQL statements or Command invocations; we turn this off most of the time but it is extremely useful for development. So what if your logging can sometimes deadlock? Turn off that log level, or stop logging for that class (in log4j this is easy) and you're good to go... developers can keep their logs and production still runs.

Mr. Shiny & New on December 4, 2008 6:59 AM

To your first point, use AOP to reduce to code clutter, like so: http://ayende.com/Blog/archive/2008/07/31/Logging--the-AOP-way.aspx

zac on December 4, 2008 7:04 AM

Clearly there's no shortage of opinions on this topic, but I'd like to chime in.

First, what's your batch size set at? If you set it to a reasonable level, then it shouldn't be hitting the database that often. Of course the higher your batch size, the higher your risk of losing log statements (at those would be the ones closest to the error) but I still think they're valuable.

As for the "extra lines of code" you can lean on IoC containers to weave in logging aspects to your interface based components... you get that for free. Seriously... there's no code to log entry/exit of methods. Of course the value of these statements isn't very high, I'd prefer more granular "what is the user doing" messages than line level ones. But this is why you have log levels.

Ben Scheirman on December 4, 2008 7:06 AM

You can't always show a meaningful error to the user - like if your program is a server, with no UI.

Having spent many hours dealing with issues of multi-server, large-scale systems, I can state that unless you are the sole developer, you will need to have debugging traces in your code to determine what happened at the customer site, because, as others have noted, the customer will not be able to tell you what was going on, either because of lack of knowledge (server:server issues) or lack of desire (network group has turf wars with ops and changes routing without prior notice)

Pete on December 4, 2008 7:10 AM

Unusually, I disagree. When you're dealing with complex software that parses a huge amount of data, your logs can be the major indicator of problems- sure, you log the exceptions, but if you aren't logging a fair amount of everything else then you don't know the precursor to the exception so you don't get much handle on where the problem comes from and if the software takes hours to run, you really need to find whereabouts the problem comes from.

A little logging overhead and a lot of unnecessary data when it does work is a very low price to pay for the benefits of being able to identify the whereabouts of a problem on a slow run.

Breakfast on December 4, 2008 7:15 AM

I still don't understand how the logging caused a deadlock here, unless log4net was configured to use a special database appender that in turn used locks that other parts of the application used. In that scenario, just changing the configuration of the logging should have resolved the problem.

I believe that a logging framework that provides the ability to turn turn logging on and off at runtime with some kind of fine granularity (like log4net allows), you can make your log show the information you need without all of the fluff and without impacting performace all of the time. Additionally using AOP for the mundane entry/exit/parameter logging (which I rarely need) that can also be configured at runtime makes for a great performing application that allows thorough troubleshooting if necessary.

ZaDDaZ on December 4, 2008 7:16 AM

Exception logging is all I ever use. Really, do you care about all the successful opperations on your site? Or do you care more about the one or two that fail? Ussually an exception is more than enough information for me to recreate the error and fix it.

Kris on December 4, 2008 7:25 AM

Greets Jeff,

I appreciate the nod/reference (ahh shaft, my web page is all weird with the CSS template now...). Here I was getting ready to call in to the SO podcast to get your opinion on just this matter!!! 8^D

I don't know if it helps or hinders the discussion, but the only thing about the initial post was that I failed to complete the explanation of things. Somebody else had followed up on the issue (http://stackoverflow.com/questions/231903/how-much-to-log-within-an-application-how-much-is-too-much) and I had added the following comment

***

In the sample I provided, that method logs on a daily basis in WARN mode. Which means that the only thing that gets logged "by default" is if an exception occurs. If I get a call from one of my clients about having an error in the application, they don't have to read me some cryptic message on the screen, I jump in the log and can see what's going on. Most of the time, the answer is right there.

What happens if the answer isn't readily available? Log4net allows me to update my configuration file (no re-compilation necessary, no need to get access to some special system file on the web server with the sysadmin's approval) and go into INFO mode. Now you start seeing a second layer of logging. Maybe the code never made it to a certain loop. Maybe the data retrieval had an empty record set. This second level of debugging is helpful, and the log only gets slightly larger. Once this is done, I can change the config again and go back to the light logging.

Naturally if things are REALLY crazy, then I go to the full debug level, and I want to know what each variable is reporting, what DataRows I'm dealing with, and what is going on in the application. At my current place of work, we don't have the ability to do remote debugging into our web applications, and we can't always tap into the production database without potentially augmenting data, so having this full debug is the next best thing.

I agree with the majority of folks out there that excessive logging can really bring down an application and cause more problems than it is worth. If wouldn't recommend this kind of verbose logging in an application either unless the application warranted it for security reasons. However, being able to leverage verbose logging when needed and without having to recompile my code IS a HUGE benefit in my opinion and if you have a framework that can allow for it easily (such as log4net), then I say get nice and verbose and it is easy enough to mentally filter out the log code references if you're having to go back into the code itself.

***

I'm not trying to back peddle or anything, I guess my point was that if the logging architecture can leverage things properly so that you can get everything or nothing at the flick of a switch (or update of a config file without a recompile), then it would be beneficial to do so.

Sean Patterson on December 4, 2008 7:26 AM

I'm going to pile on log4net here: if your logging system uses locks in your threads and has the potential to help you deadlock your application like you described, it's a piece of crap. Staying out of the way of your application is one of the most important things a logging framework can do. Multiple producers / one consumer is a textbook multiprocessing problem (literally!), it has lockless solutions, and failing to do this in a way that it *cannot* lead to deadlocks is unforgivable. Isn't the point of a third party package for this sort of thing, that you expect them to get just this sort of detail right?

Yes, you should blame log4net!

Thomas on December 4, 2008 7:28 AM

While I agree that logging can be abused, I think your description of what is necessary has swung the pendulum too far the other direction and is too absolute. Saying that you are not anti-logging, just anti-abusive logging is kind of a no sh** statement.

The real problem here, one you did state, is in determining exactly what should be logged and that is subjective and squishy. Your ultimate advice to start small and with the most heinous of problems is good, but a bit abstract.

"3. If it's worth saving to a logfile, it's worth showing in the user interface".

I firmly disagree with this. I can think of numerous examples where this isn't desirable.

- It is possible that issues are occurring that are undesirable but recoverable that you may want to log so that your development staff can investigate but are not issues the end user needs to be concerned about. For example, if you are creating SQL deadlocks that are causing some of your queries to be killed by the SQL engine and you are able to recover by re-committing the transaction, then you may want to log the issue for developers but the end user doesn't need to know.

- Another even bigger issue is that the level of technical granularity that is appropriate at the user-interface level is often insufficient for developers/admins. As you are well aware (at least your book list suggests) you should provide meaningful errors messages to the user in terms that are relevant to their problem domain and that they can understand. However, abstracting the issue to their level can often strip out the information necessary or at least useful to the developers/admins.

"Honestly, I don't miss it at all. I can't even think of a single time since then that I'd wished I'd had a giant verbose logfile to help me diagnose a problem."

This statement provides a false sense of security to the reader (and even you). StackOverflow is a fresh codebase that is fairly small, written by a very small team of highly-skilled developers and most if not all of the original developers are still around. This means that most of your team knows the intricate details of most of the codebase. If you play with any of those five variables - a larger team, a larger codebase, an older codebase, a loss of original team members/addition of new ones, or less-skilled developers - then you may quickly find that the logs are indispensable. This is especially true in software that is deployed on the desktop or in the customer environment where you can't just access the running installation. Logging is like a fire extinguisher - you don't need one until you need one and obtaining one during the problem takes too long. Sure you can add logging in as you need it but as your code base grows and ages and your team changes, this becomes more difficult and time consuming to do. It also can lead to false positives as you see things being logged for the first time that are issues but have really always been there, don't have to do with the bug at hand, and you are just now noticing.

My $0.02

Zach on December 4, 2008 7:29 AM

Another very important thing to remember with logging is to always CHECK THE LOGGING LEVEL YOURSELF before you do any work.

Let's say you log some message for debug mode. Doing this may seem ok:

log.debug("Some message " + someThing.toString());

Sure, if the environment isn't configured for debug logging, this won't be logged.

However, all the work of converting the object to a string and combining it with your message and passing it along happens BEFORE the logging function is called! This means that even in your production environment,

You can incur serious performance penalties through excessive logging this way.

The fix is pretty simple:

if (log.isDebugEnabled()) {
log.debug("Some message " + someThing.toString());
}

Now, all you are doing is checking a boolean each time, not building a (potentially very large & complex) string from an object and THEN checking a boolean.

TM on December 4, 2008 7:31 AM

Quick note TM....the beauty about log4net is that it does the level checking for you. No need to add to extra code 8^D

Sean Patterson on December 4, 2008 7:39 AM

Everything's already been said, really -- you might not have access to the machine where the code runs, you might not have the sources for the stuff that broke. The stuff that broke might not even know it's broken... I had to troubleshoot a SAML (single sign on) problem once where the users where getting sporadic errors while trying to SSO to our partner site. Everything looked fine on our end, and only by carefully comparing our logs with the partner's we were able to figure out that the server's time was out of sync far enough for some of the assertions to start becoming invalid. This troubleshooting would not be possible if the application did not generate couple gigs worth of really detailed logs every day.

Basically, I agree with people who are saying that too much logging is better then too little. (And yes, you can obviously overdo it and kill your performance, or write tons of useless stuff, or screw up and introduce deadlocks, etc.)

max on December 4, 2008 7:43 AM

Logging is hard, let's go shopping!

Rick Branson on December 4, 2008 7:50 AM

<<If it's worth saving to a logfile, it's worth showing in the user interface>>

I'm not sure you want to expose your mistakes or possible lacks of security to an experimented user (stackoverflow has a lot of them) or show technical information for common users, it's always better to show a "friendly" message.

Logfile has a purpouse, to help you clean your systems of undesireble bugs and/or potencial ones...what do you think?

Rulas on December 4, 2008 7:53 AM

Logging is used for so much different things that it has lost it's meaning over time.

First of all, don't use log levels. Murphy's law states that you always need that DEBUG output when it's disabled and they're always wrong anyways. The best is to use different methods for different purposes : trace, should_not_happen, user_notify are all different things.

Also, don't marshal your data to an unstructured text representation like syslog. You're never sure if you missed something when you have like 3 lines of regexps.

zimbatm on December 4, 2008 7:56 AM

Logging is not mainly for tracing where your process goes, but for where it was when something went wrong.

Trace begginig
Trace exceptions
Trace the first time a process begun
Trace when it failed
Trace if it got recovered or just when you know it will not be recovered
Trace end

Oso on December 4, 2008 7:58 AM

Treat logging like this familiar phrase: "Take only what you need to survive."

Don't be the Druish princess that says "It's my industrial-strength hair dryer... and I can't live without it." :)

Scott B on December 4, 2008 8:07 AM

There is a third way, as well. It is possible to set up your logging system so that you can turn off certain log levels through config. Thus, you can log as much at the debug level as much as you like if you can simply flip a switch and turn it off in production. If you want to diagnose a specific prod error, try flipping it back on for a time and searching for statements that you don't need.

To avoid code clutter, you can build a declarative logging model for method entrance / exits / params / return values using annotation / attributes (depending on your language), coupled with the previous ability to turn those messages on and off via config.

Jonathan on December 4, 2008 8:10 AM

I do a ton of logging during development phases, and I use a variable to determine what levels of logging I want to accomplish. When I am done developing an app, I just kill the extensive logging by changing the value of the variable.

Scot McPherson on December 4, 2008 8:11 AM

Having developed many an app for custom hardware, often that's fielded in industrial applications with no UI to speak of, logging is the *only* way to get state information for problems durign an actual production run.

That said, I agree that developers usually get way too deep into logging for their own good. Spartan logging of critical pieces is useful, but you don't need to log everything. Typically we add it for exceptions and when there is a problem we add more incrementally as we search for the issue, then remove it once the problem is identified and resolved.

ctacke on December 4, 2008 8:37 AM

@Sean Patterson

It does the logging checking for you... after the method is invoked. All the parameters to the method are built BEFORE that check.

So the string building is still done.

TM on December 4, 2008 8:41 AM

I agree with ctacke, and building in logging as needed is precisely what I do, however I use a variable to turn it off rather than having to hunt down each logging statement. I am not writing software that millions of people use, most of the software I write is to reduce manhours, so that a job previously took 100 hours now only takes a few minutes. The reduced of efficiency due to condition checking for logging really is miniscule considering the performance of the task has been time efficiency improved by a factor of over 100x in some cases.

Scot McPherson on December 4, 2008 8:45 AM

@TM

True true, it does construct the string, so I guess that ultimately depends on how efficient you need the entire process to run. From the log4net FAQ:

***
What is the fastest way of (not) logging?

For some logger log, writing,

log.Debug("Entry number: " + i + " is " + entry[i]);

incurs the cost of constructing the message parameter, that is converting both integer i and entry[i] to a string, and concatenating intermediate strings. This, regardless of whether the message will be logged or not.

If you are worried about speed, then write

if(log.IsDebugEnabled)
{
log.Debug("Entry number: " + i + " is " + entry[i]);
}

This way you will not incur the cost of parameter construction if debugging is disabled for logger log. On the other hand, if the logger is debug enabled, you will incur the cost of evaluating whether the logger is enabled or not, twice: once in IsDebugEnabled and once in Debug. This is an insignificant overhead since evaluating a logger takes less than 1% of the time it takes to actually log a statement.
***

Sean Patterson on December 4, 2008 8:46 AM

Logging isn't free, but it's pretty cheap if designed correctly.

I have a hard time believing you'd actually write a diagnostic logging program that would hit the disk for EVERY line. There's this new concept called "caching to memory" you ought to try.

I write all my diagnostic logs to memory and only dump them to disk when I explicitly call the "diagnostics.logdump([FileName])" function All lines are counted, of course, so I can also force a dump at X number of lines or on certain fail conditions.

Jeff, you might actually want to consider *designing* your event logs before implementing them. Diagnostics are no more an ad-hoc programming problem than anything else.

ThatGuyInTheBack on December 4, 2008 9:22 AM

I think you only need exception plus a stack trace to solve problems, so just log those as a default.

To me, if you have logging and tracing all over the place, the application design and code is probably in rough shape, thus the need for logging.

Selective logging may come in handy in certain situations and developers should put in logging when necessary.

For example, say you have an import process where it was doing 10 major tasks, you would probably want to log what is was doing (report back to user) and you could monitor each task within the import. In cases like that it would come in useful if there was a problem.

But overall it adds clutter to the code. The performance cost is small unless you are tracing every other line of code.

I always found that using sample Debug.Write (or Trace.Write in release mode) with a debug listeners installed should be sufficient. In some cases that is not feasible due to client a restraint so a file is required. .Net comes System.Diagnotics built in, why go log crazy?

A lot of people bash log4net, but it does have the ability to define what and who gets logged, which if you have 100s of classes and lots of assemblies can help out with large scale applications.

Jon Raynor on December 4, 2008 9:22 AM

I think you only need exception plus a stack trace to solve problems, so just log those as a default.

To me, if you have logging and tracing all over the place, the application design and code is probably in rough shape, thus the need for logging.

Selective logging may come in handy in certain situations and developers should put in logging when necessary.

For example, say you have an import process where it was doing 10 major tasks, you would probably want to log what is was doing (report back to user) and you could monitor each task within the import. In cases like that it would come in useful if there was a problem.

But overall it adds clutter to the code. The performance cost is small unless you are tracing every other line of code.

I always found that using sample Debug.Write (or Trace.Write in release mode) with a debug listeners installed should be sufficient. In some cases that is not feasible due to client a restraint so a file is required. .Net comes System.Diagnotics built in, why go log crazy?

A lot of people bash log4net, but it does have the ability to define what and who gets logged, which if you have 100s of classes and lots of assemblies can help out with large scale applications.

Jon Raynor on December 4, 2008 9:23 AM

I used to log a lot, but have since moved to using Elmah (thanks for turning me on to it, Jeff). Since then, I get all the info I needed, and none that I don't. Unhandled exceptions are logged straight to a SQL Server table, and can be viewed in a nicely formatted page which includes stack trace info.

Mark Struzinski on December 4, 2008 9:28 AM

Thank you very much for that posting. It was the right thing at the right time. It's not, that logging really was my problem, but this article put me on the right lead. Thanks!!!

Thomas on December 4, 2008 9:32 AM

Not to be too nitpicky, but your irony is *exactly* the fake Alanis Morissette kind. To quote Bender: "The use of words expressing something other than their literal intention. Now that IS irony!"

ftyuv on December 4, 2008 9:46 AM

Bottom line is it's not that simple, Jeff. Write in C and use macros if you can't find a good logging class :-)

As others have said there's a difference between trace logging for development (which is what you have with INFO / DEBUG) and the sorts of errors you may want on customer systems such as your ERROR / FATAL. Trace logging really shouldn't be exposed via the UI, for example.

Writing logging code is even easy and quick. I don't know any good programmers who are limited by their typing speed rather than their brain speed in producing good code -- if you have rules for when INFO / DEBUG logging needs to be done then it happens as second nature when your code calls for it.

I'd be interested to hear Joel's view on the next podcast...

Adam on December 4, 2008 9:52 AM

What about applications with no user interface?

I write a lot of Windows Services. Logging is critical. You don't have to fill up gigabytes of storage. We have a config setting that allows you to set the level (debug, info, warning, error, fatal). Normally only "Error" and "Fatal" are logged. However, for particularly nasty bugs you can turn on "Debug" and "see" everything.

Also, there are numerous strategies for controlling the size of log files: create new daily, create new hourly, create new by size, etc.

MattH on December 4, 2008 9:53 AM

Excessive logging is really just a symptom of not doing TDD.

http://carltonnettleton.com/blog/?p=260

Carlton Nettleton on December 4, 2008 10:20 AM

If writing a log row creates a potentially contentious lock, your logging API sucks.

Noah Yetter on December 4, 2008 10:40 AM

The problem with logging isn't that there's too much of it, but that our tools for searching logs are so primitive.

I've taking to using "retroactive logging", whereby I keep small, debug-focused logs in memory, and only spit one out if it's triggered by an error condition.

I've also taken to storing structured data in these retroactive log entries, in particular snapshots of variables that are likely to cause trouble.

The goal is to provide much of the functionality of a retroactive debugger. Those things, which never escaped Computer Science research labs, let you step both forward and backward in your code.

David Leppik on December 4, 2008 11:04 AM

I'm still rubbing my chin, digesting everything you've said.

You raise excellent points. Even if we change nothing at my site, it will cause discussion and review. Good post.

Charles on December 4, 2008 11:19 AM

Logs are there for two reasons they are separate and distinct

Debugging/Trace - log everything and use instead of the debugger because the program runs differently in the debugger ....

Normal Logging : Record what happened and what was done so you have a record when either a) someone asks (sometimes for legal reasons) b) you need to know what made something brake so you can duplicate it

This last one happens more often then pure software developers think, when you interact with hardware it sometimes does not respond "normally" and you have to cope... you can test 'til the cows come home but never spot some of these, and the log file (if you use it correctly) will tell you what happened ...

Jaster on December 4, 2008 11:47 AM

If your logging can cause deadlocks, your design/use of logging facilities is seriously flawed. Logging must not have any side-effects other than the time penalty, this even includes parameter evaluation for logging values of certain variables. As an example, suppose you want to log the current speed of something, you should absolutely make sure that your distance/time calculation doesn't involve any time near zero.

All other points you made are surely valid in some ways, but apart from that logging can be a great tool, *if* used correctly. Problem is: It often turns out, that the developer usually logs the wrong things, simply because he logs things he thinks are important, automatically taking more care of the code which is about to be logged. Usually the flaws are then in the parts that don't get logged, because nobody thought of them possibly going wrong and needing to be logged. BTDT.

Vinzent Hoefler on December 4, 2008 12:26 PM

I did work on a data processing system, with a fairly high volume of records going through on a nightly basis. Efficiency and accuracy were both extremely important. Naturally, (?) I logged everything that happened, and then some. It was a good lesson to learn, as I went back through the code and wheedled out log code that.

I also found great jumps in efficiency for the overall system... and guess what? Not once in six or so months have I said to myself "Man... I wish I still had been running those logs..."

Chris on December 4, 2008 12:31 PM

Logging is essential in server software. Doubly so in software that runs on someone else's servers so you can't drop all the source code on it to run a proper debug session. Post-mortem debugging with core dumps is useful, but often you need to know what happened before the program got into the state where it crashed.

Logging should never be done to a database -- WTF were they thinking?? -- if only because one of the things you want to be able to log is database failures.

Logging should be done either straight to a file, or through a simple IPC call to a logging daemon, either syslogd or something like that. One IPC call, one disk write call, done. No locks, no DB server overhead.

As for the question of excessive logging, any decent multi-level logging system quietly eats messages below the current logging threshold. You normally run in the middle of the logging range, turning up the level only briefly, to debug a severe problem. 99% of the time, the default log level should be sufficient. Also, that log level should be of a low enough volume that a person can reasonably read it in real time. If it's going by so fast that you can't read it, it won't be read. The exception is higher debugging levels, where you can study a snapshot of the logs off-line, but again, this should be a 1% or less case.

As for the contention that messages should go to the user interface, what if the program doesn't have a user interface? (Server software again.) I'm not talking about a user interface that may be bolted on top of the server software. That has its own concerns, and so can have its own messaging architecture. Anything the server wants surfaced to the user level, it can send up to the UI, sure, but that's no reason to not do logging. There's plenty that a program wants to say in a log that users shouldn't see.

As for logging being like excessive comments, I find that log lines are effective substitutes for comments. Anytime the programmer wants to explain what's happening to both future programmers and people running the software, it goes in a log message, not a comment.

Warren Young on December 4, 2008 12:40 PM

***
What is the fastest way of (not) logging?

For some logger log, writing,

log.Debug("Entry number: " + i + " is " + entry[i]);

incurs the cost of constructing the message parameter, [...]

If you are worried about speed, then write

if(log.IsDebugEnabled)
{
log.Debug("Entry number: " + i + " is " + entry[i]);
}
***

Speed vs. safety:

Even this supposedly simple example may trigger something like an "array out of bounds" exception, making the application crash, once you enable debug logging. :D

Vinzent Hoefler on December 4, 2008 12:55 PM

Jeff,

In your example, were you logging to a database? If you were logging to a database, were you using the same connection for the logging and the transactional work?

Using the same connection for both could lead to potential problems - the least of which would be deadlocks. What happens to any log messages when the transactional work get rolled back?

Steve on December 4, 2008 1:07 PM

Regarding how logging might cause deadlocks, I'm not familiar with Log4Net, but the scenario I envisage would be that some complex object has been passed to the logger, the logger has attempted to convert it to a string, and the object's method for obtaining a string representation is doing complex stuff - maybe causing object state to be lazy-loaded from the database, that kind of thing.

Simon on December 4, 2008 1:10 PM

I still haven't seen a clear root cause of the deadlock error.

Please tell me how your log4net can deadlock if it is only acquiring one lock? There is no chance for deadlock in this case.

In the case where multiple locks are used, the simple solution might be just to disable the logger lock if you can. The logger lock should only be to serialize the output to a non-threadsafe resource.

The real problem here is likely poor documentation in log4net on how to setup logging to a device that implements its own thread-safe locking and/or a discussion on this exact problem which should have been realized before log4net was ever released.

Casey on December 4, 2008 1:12 PM

Be sure to log the deadlock!

Steve on December 4, 2008 1:55 PM

I agree with you Jeff.

I think it's safe to add that logging can be viewed like a set of requirements for a software project: start simple and increase complexity if need be, rather then anticipating "this info would be useful in a log!" (similarly, "the user would probably want these additional features!").

coonj on December 4, 2008 1:58 PM

On my project we basically log anytime something "important" happens, how's that for arbitrary and non-descript =)

We log everything deep down in the datalayer, and all unhandled get logged and use is redirected to an error page.

I think just like anything, moderation works best.

ManiacPsycho on December 4, 2008 2:17 PM

Sounds like a mess there. Syslogd can log messages from hundreds of running processes running hundreds of threads on a busy busy server and not spend more than a running glance at the overhead it creates. Why oh why can't all logging mechanisms be that efficient?

KevC on December 4, 2008 2:32 PM

I log only when debugging. I prefer it to a debugger, in fact.

If my loops are running fine, why the hell would I want my code outputting every time a loop iteration begins?

My lord.

Bill on December 4, 2008 2:54 PM

Perhaps your goal was more profiling. There are a lot of reasons to log and it's part of a complete application. IBM and Tandem does a fabulous job, and I really miss it over Linux. Some of the reasons to log:

Compliance and Audit
System troubleshooting
Proactive security
Application errors
Application information

Here's a nice article:
http://www.ibm.com/developerworks/java/library/j-logging/

dj on December 4, 2008 3:30 PM

Cue NP rant: "Anyway, we only say a program is "deadlocked" because we can't prove it isn't. Nobody really understands deadlock, and sometimes it's just unavoidable."

Jeff, YOU ARE DOIN IT WRONG.

Some people *do* understand the basic CS-undergrad topics well enough to avoid coding up deadlocks, or livelocks, or exponential algorithms, or XSS exploits, or faulty regexes to weed out &lt;script&gt; tags, or whatever newbie mistake you're whining about today. If you don't understand something, *say so*; don't try to hold down the other newbies by feeding them your misinformation! (Didn't somebody nearby blog about CrabMentality a while back?)

Anonymous Cowherd on December 4, 2008 3:55 PM

Any major project without a Logging system is infinitely more complex to debug.

Saying or teaching anything else is complete bullshit.

Many projects would not have shipped without the logging support.


Perhaps the logging system has bugs, and if it changes behavior, your doing it wrong.

Roscoe on December 4, 2008 7:04 PM

I'm not sure that constantly logging everything is a good idea for the reasons you mentioned. However, let's ignore the code bloat for a second and focus on performance. In many cases you need the detailed log only once you already have a recurring hard to place problem when you would be willing to inspect through all of it.

So it makes sense in those situations to have the code in place, and just have the logging mechanism ignore the calls, and give you the option of throwing a switch to turn it on.

Existing log calls are way better than having to rely on ad-hoc printfs that people would add on a per-need basis.

Uri on December 4, 2008 7:08 PM


> We spent days troubleshooting these deadlocks by .. wait for it ..
> adding more logging! Which naturally made the problem worse and even
> harder to figure out. We eventually were forced to take memory dumps
> and use dump analysis tools. With the generous assistance of Greg
> Varveris, we were finally able to identify the culprit: our logging
> strategy. How ironic. And I mean real irony, not the fake Alanis
> Morrissette kind.

GAAAAAAAAAAAAAAAAAAAGH!

For the love of god... THAT IS NOT IRONIC! Don't you even watch Futurama?!? That is a COINCIDENCE, not IRONY.

What would be ironic is if adding more logging FIXED your problem, not CAUSED your problem.

bex on December 4, 2008 8:59 PM

I completely agree with the confusion of excessive logging. I have walked through log files so huge that, I had to use file splitter to load it in editor. Also programmers following different definitions of logging levels is havoc.

But the amount of logging at every level and specifically in log files, depends on the kind of application. I have written an data migration tools transferring 70 million records. In such scenarios, its extremely important to log files indicating which data was used and at what level it failed. Without these details in logs, we would have to spend lot of time rerunning the scenarios.

Also, though I agree with "logging polluting business code", AOP eases up decoupling most of the logging code from business code though not all.

Aakash on December 4, 2008 10:52 PM

Hey Jeff, enough phoning it in, get serious. This is a post with exactly zero content. How do I qualify that? The snippet below would be equally correct for just about anything:

So is [insert feature here] a giant waste of time? I'm sure some people will read about this far and draw that conclusion, no matter what else I write. I am not anti-[insert feature here]. I am anti-abusive-[insert feature here]. Like any other tool in your toolkit, when used properly and appropriately, it can help you create better programs. The problem with [insert feature here] isn't [insert feature here], per se -- it's the seductive OCD "[insert what feature does]" trap that programmers fall into when implementing [insert feature here]. [insert feature here] gets a bad name because it's so often abused.

addys on December 4, 2008 10:57 PM

If it's worth saving to a logfile, it's worth showing in the user interface.

NO! Didn't you read the books you yourself recommend? The Inmates are Running the Asylum, Design of Everyday Things..

coy on December 5, 2008 12:22 AM

I agree with Gilbert.
In my (SHORT, of course) experience, I see that it's better to have more logs than less logs.
The web-application runs - there's an exception - if I don't log the values of variables on begin and on end of methods I spent more time on debug for studyng if it's a problem of bad coding or uncorrect behaviour of customer or other problems.
I think that the safe time on debug gives back the time on coding the log.debug/info/fatal/etcetera in the methods.

alepuzio on December 5, 2008 12:54 AM

As an administrator of a small LAN (my family network), I like network daemons when they've high levels of logging.
My favorite daemon: The exim MTA.
Kilobytes of logging are output but I quickly find WHY my e-mail was rejected or wasn't properly transmitted.
As I'm not necessarily familiar with the software, logging errors is not enough, I need to have some level of tracing to see what's going on.

I agree that very high levels of logging aren't necessary... When I want to get every single system call, I use strace (acutally, this happens quite often).
However, if logging is properly hierarchized so that DEBUG info that is only used to diagnose program's internal errors aren't shown at normal log levels.
Logging exceptions only isn't enough.

Another thing: Good logging isn't so much defined by the quantity than by the quality!
That's obvious but.
"foo-bar-app: Main configuration file /etc/foobar.conf read access forbidden for daemon user foor-bar-user."
is much better than:
"Error: EACCESS"
As the former contains info about:
1) The exact application or daemon which met the error. Depending on the logging facility, this may be automatic.
2) The file name.
3) The type of access that was attempted (read access).
4) The effective user id of the daemon.
With all that info, fixing the issue should be TRIVIAL!

When strace is better than your log files, then, there's something wrong with your logging policy.

SuperKoko on December 5, 2008 1:45 AM

> If it's worth saving to a logfile, it's worth showing in the user interface.

But the logfile *is* part of the user interface: it is the user interface for system administrators and/or developers rather than end users. Each user category needs to see different information.

A common generic use case in most web applications I've worked on is:

1) The application bombs with a simple error message for end users ("retry in a few minutes", "don't bother retrying", "contact support", etc.); hiding exception stack traces etc. is a priority.
2) Someone is alerted; the log file is downloaded from production servers, compressed, and emailed it to the relevant product or account manager.
3) The relevant developer reads the log and figures out what happened. More often than not the problems are corrupted and invalid data, server overload, network failures etc. rather than application bugs.

In this kind of scenario excessively verbose logs are a minor problem; wasting days of testing to reproduce a rare condition because the log is unclear is much worse than wasting a bit of performance or filling disk quotas.

Lorenzo Gatti on December 5, 2008 2:46 AM

shakes his head

fred on December 5, 2008 3:12 AM

Logging tool "Chainsaw"
Chainsaw is a GUI log viewer and filter for the log4j package.

Amit Jain on December 5, 2008 4:01 AM

Yes, I agree that logging is a good point to start a discussion. Personally I think that there are no fixed rules when writing logging for an application as there are almost no fixed rules in general when writing code because everything depends on the context of your application. For example, you can't simple follow the rules of the "Effective Java" but you must think about what are you doing, you must think about the meaning of the objects, services ecc that you are writing. If there is a rule i think it is: "Think about it before writing down (logging, exception ecc)!". I agree, that this don't prevent to have errors because I think writing absolute bug-less code is almost impossible; but if the reasons for writing that code (if there was a correct logic behind it)were correct your life will be much easier.
I think that AOP is a good instrument to get a cleaner code (I hate mixing logging with business code) but I don't know if it is enough.
A "wrong" logging can produce errors that seems strange, for example when I changed the logging level from info (stage) to warn(prod) my application started to behave differently because log.info incorrectly helped to initialize some hibernate cglib proxies and because in stage everything worked well nobody was aware about it until we went in production!

Indrit on December 5, 2008 5:37 AM

You are absolutely right. We've an app where the log file is an absolute mess. Nothing can be found in there (at least not in the first couple of minutes) and it's nothing more than a waste of time.

Nicols Miyasato (miya) on December 5, 2008 7:26 AM

I think very detailed logging like that has its place in traditional software systems which are installed on hundreds of different servers at customers' sites. If you have a detailed log file at the customer's server then you can many times look at the log file and look at your code and figure out what's happening. I've seen this with an application that we're a reseller for many times.

I think that's what Brian W. Kernigan and Rob Pike are talking about in their book "The Practice of Programming":
"As personal choice, we tend not to use debuggers beyond getting a stack trace or the value of a variable or two. One reason is that it is easy to get lost in details of complicated data structures and control flow; we find stepping through a program less productive than thinking harder and adding output statements and self-checking code at critical places. Clicking over statements takes longer than scanning the output of judiciously-placed displays. It takes less time to decide where to put print statements than to single-step to the critical section of code, even assuming we know where that is. More important, debugging statements stay with the program; debugging sessions are transient."

Getting all the necessary data from the customer onto a test system so you can do your debugging can be very time-consuming.

In today's environment with hosted systems such as StackOverflow the data is much more accessible so you can quickly get it to a test environment, then debug the problem and get an answer, so having detailed logging is not as necessary as with an On-Premise system.

Sounds to me like that logging framework is not totally thread-safe, which would be a bug in that code. I wouldn't call all detailed logging evil because of it.

Sarel

Sarel Botha on December 5, 2008 8:06 AM

"I don't mean to single out the author here"

Who do you think you're kidding?

mbhunter on December 5, 2008 9:33 AM

I agree with lots of the folks who have disagreed with Jeff. Logging can be essential in apps that sold to run on many computers, and it's trivial to cope with multithreading and alter the level of logging through a config file. Make a team decision about how much logging is applicable and stick to it. If you only deploy on one server next to your desk, none might be fine for you.

Is it just me, or do a number of Jeff's recent posts over-generalise his experience one particular day with one project into a general rule for all development? And in this particular case, why isn't the deduction "understand deadlock potential throughout your code or don't use multi-threading" rather than "don't log"!

MarkJ on December 5, 2008 9:52 AM

This story would be a lot easier to understand if you mentioned the log messages were (apparently) going to the database ( http://logging.apache.org/log4net/release/config-examples.html has examples of how to log to the database). The locking in question was apparently a combination of internal database logic and internal log4net locking ( http://ddj.com/architect/202802983 ).

You may also want to consider logging to a file or the Windows Event Log instead of to the database. It's relatively easy to use the .Net API to copy your messages out of the Event Log and into a Linq-compatible container, so your queries become very similar to dealing with log messages in a database.

Max Lybbert on December 5, 2008 2:01 PM

You know what? Futurama irony (rhetorical irony) isn't the only kind of irony. And being a coincidence doesn't disqualify a thing from being irony.

In this case, logging is expected to aid in solving a problem. But it in fact exacerbates the problem. This is, legitimately, irony. Specifically, it's situational irony.

http://dictionary.reference.com/browse/situational+irony

"an outcome that turns out to be very different from what was expected..."

You could even argue that all uses of rhetorical irony are also uses of situational irony, because the expectation is normally a fairly literal use of words.

Ens on December 5, 2008 3:39 PM

I unintentionally portmanteaued "Stack Overflow post" as "Stack Overfloast". Thankfully I didn't portmanteau "Stack Overflow logging"!

i feel that this is worthwhile information

Kelsey on December 5, 2008 4:09 PM

"If it's worth saving to a logfile, it's worth showing in the user interface."

I have to disagree with this one. "Hallo? How I are translate string? I not find nowhere after look long time. This look like error message but what is errors being about? What is 'pointer'?"

There's a lot of stuff in our application that falls into this category. I'm the guy who has to deal with our translators, and this kind of thing drives me up the wall. I canned all the visible dialogs and pulled it all out of the GUI. The rare useful error reports about such and such dialog popping up are not worth the headache of dealing with explaining obscure development-related concepts to translators.

Another approach is to just pop up dialogs in English, but every time I do that, some helpful person bitches and moans that I forgot to include such and such string in the translation, and "fixes" it for me.

I can't win.

The best solution in this case is to say the hell with it, and hide everything from the surface interface.

Michael on December 5, 2008 7:23 PM

I work in the tax and accounting industry and, in our production environment, we have near zero visibility into what our apps are doing (beyond what a customer sees). We have some high level monitoring tools, but if not for detailed application-level logging, we'd have zero visibility.

I can't remote desktop or otherwise log into the boxes, either. We have a proprietary viewer we use (which is only accessible through a single machine, I might add) to view the log messages even. having the log entries is a life and time saver. Given the constraints of our environment/industry, I can't imagine NOT having mucho logging. More the better, I say.

In fact, it really bothers me when I take on someone else's code and it doesn't have logging. The first thing it gets is, of course, logging.

Scott Marlowe on December 6, 2008 9:04 AM

Logging costs performance!

Sounds silly? Untill it really happens that your log database has gotten so incredibly huge that adding records to the log-db takes milliseconds.. The DBA forgot to move+empty the logdb for a while :)

Melle Koning on December 6, 2008 11:54 AM

Jesus freaking Christ, if I hear someone talk about logging ONE MORE TIME, I'm going to puke. It's vi vs. EMACS, tabs vs spaces, K&R vs Allman. There can be NO outcome to this discussion, no matter how much you argue.

Don't you guys ever code anything complicated? I mean "complicated" as in "can't figure this out by just staring at it" or "holy shit this stuff is insane and I have to read whitepapers for a week just to understand what it does". That's a rhetorical question, BTW, because if you did logging would be on the very bottom of your priority list.

Dmitriy on December 6, 2008 7:32 PM

There's no right answer to logging - it depends entirely on the context your software is operating in.

The software we build for clients logs information that's appropriate to the environment and context - sometimes that's nothing more than a "Started" and "Stopped" event message. Other times it's doing full logging of all messages from third party systems (eg: payment gateways).

What's excessive for one environment might be nowhere near enough information for another.

What might be worth noting, however, is that you can still build in very high level bits of logging - but wrap compiler instructions around it.

eg:
#if DEBUG
Logging.Log("..whatever");
#endif

...if there's some weird issues we can deploy the debug DLLs temporarily and see if we can find out more info. Once that's fixed, we re-deploy Release Mode DLLs and we're back to minimal logging. No unnecessary performance hits.

Will Hughes on December 7, 2008 7:13 AM

I would more prefer use higher log level is most case / use other filter to filter out logs, because, if you don't log it, you lose the detail.

Even if the application build work is great that you can release to add logging very quick, it still take time to release. So all to have more logging by config is a good idea in most cases.

Carfield Yim on December 7, 2008 6:38 PM

The funniest thing about this post, imo, is that you will see the opposite advice quite frequently on highscalability.com. Some of the biggest sites out there log incredible amounts of data for debugging purposes.

They don't do sill things like sending the logs directly to their main database on the same connections/txns that work their main data, though. :)

Jess Sightler on December 8, 2008 1:27 PM

> if you mentioned the log messages were (apparently) going to the database

The log messages were not going to the database. An action was taken which resulted in a) logging of the action and b) writing actual real data to the database.

> and the object's method for obtaining a string representation is doing complex stuff - maybe causing object state to be lazy-loaded from the database, that kind of thing.

Bingo.

Jeff Atwood on December 8, 2008 2:15 PM

>Don't you guys ever code anything complicated? I mean "complicated" >as in "can't figure this out by just staring at it" or "holy shit >this stuff is insane and I have to read whitepapers for a week just >to understand what it does". That's a rhetorical question, BTW, >because if you did logging would be on the very bottom of your >priority list.

I agree. Most difficult scenarios that need "logging" lie in the application's problem domain, that's where the real fun is. Like discovering that the cherished and patented algorithm to do XYZ simply does not work in the real environment. After that the petty issues with logging become like the warm childhood memories.

But software engineers need to prove their right to exist, thus these discussions always pop up.

Max on December 8, 2008 4:49 PM

Thank you for correcting me. I couldn't see where the "shared state" was which would lead to a deadlock (short explanation: you only get deadlocks when there is some kind of locking -- although it may not be operating system-level mutex locking (TM) -- and you don't need locks unless there is some sort of shared state).

Max Lybbert on December 8, 2008 5:38 PM

I agree fully, MOSS is a prime culprit for logging too much trash, making it hard to get at the errors when they occur, and MOSS also logs the crucial errors, and displays pretty errors to the screen.

Michael on December 9, 2008 3:50 AM

I think for web-based applications, it's always a good idea to log how long it takes to service each request and also how long every database query takes. The data gives you a good picture of the frequency of each request/query and which request/query is the worst performing and may need tuning/caching.

Alex on December 9, 2008 7:58 AM

You can log all your method calls etc using interceptor (castle) so that your code isn't filled with logger code.

Dan on December 9, 2008 8:14 AM

Now, that makes sense Jeff, but in this case, logging still isn't at fault. The action of logging is creating side effects in the code, which it shouldn't be doing.

(Hindsight, 20/20, et cetera.)

Matt Green on December 9, 2008 8:27 AM

Use ETW on Windows, DTrace on Unix. Just because you were doing it wrong doesn't mean it's a bad idea. The guy you're quoting in the beginning has a pretty good strategy.

Dude on December 9, 2008 11:29 AM

>> and the object's method for obtaining a string representation is doing complex stuff - maybe causing object state to be lazy-loaded from the database, that kind of thing.

>Bingo.

If a read is causing a lock, this blog post should be titled "The Problem With Database Software".

Or maybe "The Problem With ToString() Referencing Lazy Loaded Data".

It would be an interesting excerise to try the same code on a database backend where readers don't block writers. (Postgres/Oracle).

Steve on December 9, 2008 1:22 PM

Jeff discusses some interesting points, but clearly from the mind set of a developer. Supporting applications where transactions span multiple servers is not a trivial effort and root cause analysis of problems is time consuming and difficult at the best of times. I would like to address several of his points in detail.

1) So what? Better than not having the information necessary to perform root cause analysis of a problem.
2) Right. It's always a trade off between application speed and ignorance of what the application is doing.
3) Doesn't make any sense. There is information necessary to audit a transaction (e.g. which server processed a transaction) which the user has no need to know.
4) True. But I'd much rather use something like Microsoft log parser and mine for data than lamenting over data which is not available.
5) Sounds like bad design. Be sure to involve operational support so logs designs reflect real world experiences. Logs are great during the development process, but their real work begins supporting the application.

Bob Guzik on December 9, 2008 2:11 PM

Good logging messages act as documentation


$self->log->info("Doing daily expire run on outdated foos");
for my $foo ($self->get_expired_foos){
$foo->set_expired;
}

pawingCat on December 9, 2008 6:30 PM

Huh? You can't set the log level on startup? How about ERROR for production and DEBUG or INFO for development/test? Isn't that why there are different levels?

Ian on December 12, 2008 9:13 AM

U need to go to hell.

Sorry on December 14, 2008 12:37 AM

What happened to coding horror. I can read this crap any where. This is the day the earth stood still. Is there a reason to live if you are a programmer?

Sorry on December 14, 2008 12:40 AM

At least as far as my experience goes, with embedded real time coding logging should be minimized as possible. On real time application the logging itself change the outcome. with multi threaded scenarios locking the log mechanism is a suicide. At least from my limited prospective logging other than exceptions and basic keep alive info is must NOT.

gal on December 14, 2008 12:45 AM

I usually use a standard logging tool (log4j for instance, yeah i'm a java guy :) mostly), and have something like that for the debug messages.

http://195.251.251.19/~bkarak/weblog/2007/01022007.html

I think that something similar might apply to C# world.

bkarak on December 15, 2008 3:00 AM

There can definitely be a performance hit to overlogging, however, the decision is not a "yes" or "no" answer. When all is well, an ERROR level makes sense. Sometimes however, you may want to open the gates to DEBUG or INFO, ex. an alpha/beta release of a new feature or architecture change.

I believe the key here is flexibility, you want to have the ability to flip the switch for better forensics, but don't want to become a slave to the data.

Interesting note, where I work, there have been more than a few ocassions where full logging would have helped isolate and track down intermittent "nasties" that we often see in prod but not in dev.

HC on December 18, 2008 7:18 AM

I'm with you, Jeff, on this one about logging. I program for embedded systems and most embedded systems software are timing sensitive, so the more traces/logs, the more the logs may skew the software performance. I have many colleagues who do the info level type of trace at the start and end of a method which generates a lot of noise. I think it's a just a lazy way of bypassing the debugger and debugging software directly from the log. Otherwise they are just copying the previous guy's code and modifying it. Anyway I have always made it a point to question this practice when I am in the position, and the answer was always "Eh, it is useful to have the trace there for debugging.". Oh well, they are going to have to deal with the performance problems and the unreadability of logfile.

Whui Mei on December 18, 2008 8:09 AM

The thing about logging is there is no right answer that is applicable for every problem.

When it comes to logging those logging hooks in your code still take time to run even when you turn that particular log off. If that's a huge impact on your program then you are better off having debug builds and use one of those if you have a problem (or ship one to the customer when they have a problem).

When it comes to production code. I've got a couple of rules of thumb.

1. Log when it's cheap compared to the operation.
2. Log when an exception occurs.
3. Log when it's really important and you need to pay the price.

If those logs aren't enough then I use an alternative, such as conditional logs , debug builds or a debugger. Which I choose depends upon the systems constraints.

Ze on December 20, 2008 1:50 AM

Hello,

I believe the main problem with logging is that we cannot know at coding time what is going to go wrong at runtime, so we cannot know what to log ... That's why some of us end up logging everything. If you have an easy access to the running software, you can little by little log your way out of the outages, you should end up with meaningfull logs on critical parts.

DTrace and runtime weaving AOP techniques provide another way. Define what you want to log at runtime when facing the bug.

Philippe on January 7, 2009 3:43 AM

Now, that makes sense Jeff, but in this case, logging still isn't at fault. The action of logging is creating side effects in the code, which it shouldn't be doing.
http://novhouse.ru/

Olof on January 29, 2009 1:52 PM

Good morning. Every day you may make progress. Every step may be fruitful. Yet there will stretch out before you an ever-lengthening, ever-ascending, ever-improving path. You know you will never get to the end of the journey. But this, so far from discouraging, only adds to the joy and glory of the climb.
I am from Bhutan and learning to speak English, give true I wrote the following sentence: "Product information for solar wall clock from shenzhen zy pv science tech."

With love :o, Gordy.

Gordy on April 22, 2009 11:55 PM

Good morning. There are a terrible lot of lies going around the world, and the worst of it is half of them are true.
I am from Zambia and also am speaking English, please tell me right I wrote the following sentence: "See our large assortment of wall sports clocks."

THX ;), Quinby.

Quinby on April 24, 2009 11:22 AM

At least as far as my experience goes, with embedded real time coding logging should be minimized as possible. On real time application the logging itself change the outcome. with multi threaded scenarios locking the log mechanism is a suicide. At least from my limited prospective logging other than exceptions and basic keep alive info is must NOT.
http://manytour.ru

Kedrik on May 13, 2009 1:03 PM
Content (c) 2009 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.