Coding Horror
programming and human factors
by Jeff Atwood

June 28, 2004

Hungarian Wars

I've found a number of blog posts about the pros and cons of Simonyi's Hungarian Notation, most notably, this blog post commenting on the extreme polarity of the reprinted MSDN article rating:

[Image: MSDN article score graph]

This single image really cuts to the heart of the debate, pointedly illustrating what a religious war this topic is.

Coming from a traditional VB background, with our txts, our frms, and our strs and ints, I was befuddled when presented with .NET-- what naming scheme do you use for a fully OO language where.. everything is an object? objEverything isn't very satisfying. So, you start to question whether the naming scheme ever made any sense at all.

After a lot of thought, and a lot of hand-wringing, here are the conventions I ultimately settled on for my .NET development. I'm not proposing these as a standard, merely documenting the thought process that goes into coherent variable naming:

  1. Most functions should be short enough that you won't have a zillion variables. If you have that many variables to tell apart, you have bigger fish to fry.
  2. I want to be able to tell "simple" intrinsic types from full-blown objects at a glance*. This distinction is important to me. Yeah, they're all still objects, but there are the common simple variable types we use 99% of the time (e.g., String and Integer), and then there's everything else.
  3. I want to be able to tell class level variables from local variables at a glance*. How far up do I need to scroll?
  4. The variable names should be descriptive, readable and succinct.
  5. I do not believe every single object needs a unique prefix. This is insane, and as the VB6 document illustrates, this way lies madness.

* At a glance means without having to mouse or cursor over the variable name, eg, it should work even in the high tech Notepad IDE.

If there is a theme here, it is simplicity and readability. The other theme is that Hungarian Notation seems to have somehow evolved into a catch-all term for "Here's the variable naming convention we use on our team." It's like Linux: there are umpteen zillion "distros" out there, all slightly different flavors of the same basic theme. Here's what my flavor looks like:

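Imports System
Imports System.Collections

Public Class Class1

    ' Class-level variable: underscore prefix, plus a type prefix for the simple type
    Public _strCustomerName As String

    Public Function GetCustomerFields(ByVal intCustomerID As Integer) As Specialized.NameValueCollection
        Dim nvc As New Specialized.NameValueCollection
        Dim ds As New Data.DataSet   ' filling ds for intCustomerID is omitted here
        Dim dr As Data.DataRow

        ' Copy each name/value row into the collection
        For Each dr In ds.Tables(0).Rows
            nvc.Add(CStr(dr.Item("name")), CStr(dr.Item("value")))
        Next

        Return nvc
    End Function

End Class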

The numbered list above documents the rationale (or lack thereof) behind this. You can see where I totally punted on the concept of object prefixes in a fully object oriented language. So many of the objects I create are "one off", with such a limited lifetime and such an obvious, scoped usage that I don't feel the need to give them unique names. Does it really help to call the dataset dsCustomers in this case? I don't think so. Keep it short and sweet.

Ultimately, as in the MSDN rating, naming conventions are kind of personal. Pointing out how stupid someone's variable names are is like telling them how stupid they are for naming their first born child "Melvin."

On the other hand, I do think it is rude to enter a development team and arbitrarily decide to settle on "the best" conventions; deciding what conventions to adopt is certainly a topic worth broaching in a team developer meeting, but it's also just plain good manners: when in Rome, do as the Romans do. In the end, it's more important to be internally consistent with a naming standard than it is to spend a lot of time sussing out some kind of perfect, interplanetary naming standard that will never be definitively decided to anyone's satisfaction anyway. Pick a reasonable, basic set of standards that most can agree on, but leave room for personal interpretations, too. There's nothing quite as soul crushing as over-standardizing in a religious area where there really isn't a "right" answer.

In closing, it is evident that the conventions participated in making the code more correct, easier to write, and easier to read. Naming conventions cannot guarantee good code, however; only the skill of the programmer can.
-- Charles Simonyi

Posted by Jeff Atwood

 


 

Comments

Well, this may strike some as excessive typing, but I just add some descriptive text to the end:

MainForm
SettingsForm
AddressNotFoundException

etc...

I do tend to be a tad anal about self-documenting code, though I don't go overboard (IMO).

Also, I use the prepended-underscore notation to denote private variables, a habit I picked up in the beautiful Python language.

Dave on August 7, 2004 11:52 PM

I'm definitely open to suggestion w/r/t naming of general objects. objEverything is out of the question.

Ditto on the underscore, it's an incredibly effective and very simple convention. The best kind!

The other thing nobody does any more: declare constants in UPPERCASE. Remember that?

Jeff Atwood on August 7, 2004 11:53 PM

I have an alternate approach to the one that you showed. I agree wholeheartedly that dsCustomers is overkill. You chose to emphasize the type, but for everything other than the most trivial code (and I would suggest that even trivial code such as the example tends to become less trivial over time), I would go the other way 'round. Don't trim to "ds", trim to "customers". Yes, the type becomes hidden if you use Notepad and scroll, but it's generally more obvious what the type is by context than it is what the data is. At least, that's my experience.

In the truly trivial case I would agree - a tight loop on an Iterator (yeah, I'm a Java guy) would use the name "iter" for the iterator - but once you get into nested loops or anything else, having them be "customers" and "addresses" is much nicer than "iter" and "iter2" or, more aggravatingly, some bizarre hybrid like "iter" and "addr" added by someone trying to change as few lines of code as possible (more applicable to large shops).

Richard on March 23, 2005 06:59 PM

> You chose to emphasize the type, but for everything other than the most trivial code (and I would suggest that even trivial code such as the example tends to become less trivial over time), I would go the other way 'round

Definitely, if there's more than 10-12 lines of code. I tend to write fairly small, focused functions 80 percent of the time. In the 20 percent where I can't, I definitely deviate from the "very short variable name based on type" rule.

Nested loops would be another valid reason, but for some reason, I rarely need to nest loops.. I think because I tend to break that into two functions: a plural one for operating on "a bunch of" and a singular one for operating on "one of". This way the plural function calls the singular one, and nested loops aren't present.
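A minimal sketch of that plural/singular split, assuming a made-up example of summing line prices per order. The module, function, and variable names here are hypothetical and do not come from the original post.

```vbnet
Imports System
Imports System.Collections.Generic

Public Module OrderTotals

    ' Singular: operates on "one of" (the line prices of a single order).
    Public Function GetOrderTotal(ByVal prices As List(Of Decimal)) As Decimal
        Dim total As Decimal = 0D
        For Each price As Decimal In prices
            total += price
        Next
        Return total
    End Function

    ' Plural: operates on "a bunch of" by calling the singular function,
    ' so neither function contains a nested loop.
    Public Function GetOrderTotals(ByVal orders As List(Of List(Of Decimal))) As List(Of Decimal)
        Dim totals As New List(Of Decimal)
        For Each prices As List(Of Decimal) In orders
            totals.Add(GetOrderTotal(prices))
        Next
        Return totals
    End Function

End Module
```

Each function holds exactly one loop, so a single short loop variable name per function stays unambiguous.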

Anyway, as with all guidelines, YMMV. I think the golden rule is to always try to keep simplicity as your overarching goal, whatever you're writing.

Jeff Atwood on March 24, 2005 03:33 PM

I use almost the exact same notation, but I only use simplified variable names (int i, DataSet ds etc...) if the variable is confined to a loop. Otherwise, I do think it's important to use a descriptor as part of the variable name even if it's just generic (cmdCommand). This gives another level of differentiation IMHO and keeps your functions from being full of "ads = dbr.Property;" which makes things difficult sometimes.

Either way, as long as SOME sort of standard is adhered to, it makes code re-use and refactoring much easier.

russ on September 7, 2005 02:48 PM

My approach has changed a bit since I wrote this. I use the "add the type to the end of the variable" style most often now:

CancelButton
ClickEvent

I think this is a lot more .NET friendly than the "classic" 3-character prefix eg

btnCancel
evtClick

I've also stopped trying to distinguish strings and integers. In the above example,

_strCustomerName -> _CustomerName
intCustomerID -> CustomerID
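
Applied to the class from the post, those two renames might look like the sketch below. Only _CustomerName and CustomerID come from the comment itself. Leaving nvc and ds unchanged, the CStr conversions, and the LoadCustomerData helper (a hypothetical stand-in for a real database query) are assumptions made so the snippet compiles and runs on its own.

```vbnet
Imports System
Imports System.Collections
Imports System.Data

Public Class Class1

    ' Class-level variable: the underscore stays, the "str" type prefix goes.
    Public _CustomerName As String

    ' Parameter: the "int" type prefix is dropped.
    Public Function GetCustomerFields(ByVal CustomerID As Integer) As Specialized.NameValueCollection
        Dim nvc As New Specialized.NameValueCollection
        Dim ds As DataSet = LoadCustomerData(CustomerID)

        ' Copy each name/value row into the collection
        For Each dr As DataRow In ds.Tables(0).Rows
            nvc.Add(CStr(dr.Item("name")), CStr(dr.Item("value")))
        Next

        Return nvc
    End Function

    ' Hypothetical helper: builds a one-row table in memory instead of
    ' querying a database.
    Private Function LoadCustomerData(ByVal CustomerID As Integer) As DataSet
        Dim ds As New DataSet
        Dim dt As DataTable = ds.Tables.Add("fields")
        dt.Columns.Add("name", GetType(String))
        dt.Columns.Add("value", GetType(String))
        dt.Rows.Add("id", CustomerID.ToString())
        Return ds
    End Function

End Class
```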

Jeff Atwood on September 7, 2005 04:33 PM

Pretty much every single article on MSDN shows the polarity that the Hungarian notation has. I figure there is some bot that visits MSDN and marks every article with a 1. I don't think there is any inference to be drawn from the polarity of opinions on MSDN: it's the same for every article.

Hugh Brown on April 13, 2007 01:50 PM











Content (c) 2008 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.