Coding Horror
programming and human factors
by Jeff Atwood

June 28, 2004

Hungarian Wars

I've found a number of blog posts about the pros and cons of Simonyi's Hungarian Notation, most notably this blog post commenting on the extreme polarity of the ratings on the reprinted MSDN article:

[Image: MSDN article score graph]

This single image really cuts to the heart of the debate, pointedly illustrating what a religious war this topic is.

Coming from a traditional VB background, with our txts, our frms, and our strs and ints, I was befuddled when presented with .NET: what naming scheme do you use for a fully OO language where... everything is an object? objEverything isn't very satisfying. So, you start to question whether the naming scheme ever made any sense at all.

After a lot of thought, and a lot of hand-wringing, here are the conventions I ultimately settled on for my .NET development. I'm not proposing these as a standard, merely documenting the thought process that goes into coherent variable naming:

  1. Most functions should be short enough that you won't have a zillion variables. If you have that many variables to tell apart, you have bigger fish to fry.
  2. I want to be able to tell "simple" intrinsic types from full-blown objects at a glance*. This distinction is important to me. Yeah, they're all still objects, but there are the common simple variable types we use 99% of the time (e.g., String and Integer), and then there's everything else.
  3. I want to be able to tell class level variables from local variables at a glance*. How far up do I need to scroll?
  4. The variable names should be descriptive, readable and succinct.
  5. I do not believe every single object needs a unique prefix. This is insane, and as the VB6 document illustrates, this way lies madness.

* At a glance means without having to mouse or cursor over the variable name, e.g., it should work even in the high tech Notepad IDE.

If there is a theme here, it is simplicity and readability. The other theme is that Hungarian Notation seems to have somehow evolved into a catch-all term for "Here's the variable naming convention we use on our team." It's like Linux: there are umpteen zillion "distros" out there, all slightly different flavors of the same basic theme. Here's what my flavor looks like:

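Imports System
Imports System.Collections

Public Class Class1

    ' Class-level variable: leading underscore, plus a type prefix for a simple type
    Public _strCustomerName As String

    Public Function GetCustomerFields(ByVal intCustomerID As Integer) As Specialized.NameValueCollection
        ' Short-lived, "one off" objects get short names with no type prefix
        Dim nvc As New Specialized.NameValueCollection
        Dim ds As New Data.DataSet
        Dim dr As Data.DataRow

        ' Filling ds for intCustomerID is omitted here; as written, ds has no tables yet
        For Each dr In ds.Tables(0).Rows
            ' CStr makes the Object-to-String conversion explicit (needed under Option Strict On)
            nvc.Add(CStr(dr.Item("name")), CStr(dr.Item("value")))
        Next

        Return nvc
    End Function

End Class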

The numbered list above documents the rationale (or lack thereof) behind this. You can see where I totally punted on the concept of object prefixes in a fully object oriented language. So many of the objects I create are "one off", with such a limited lifetime and such an obvious, scoped usage that I don't feel the need to give them unique names. Does it really help to call the dataset dsCustomers in this case? I don't think so. Keep it short and sweet.

Ultimately, as in the MSDN rating, naming conventions are kind of personal. Pointing out how stupid someone's variable names are is like telling them how stupid they are for naming their firstborn child "Melvin."

On the other hand, I do think it is rude to enter a development team and arbitrarily decide to settle on "the best" conventions; deciding what conventions to adopt is certainly a topic worth broaching in a team developer meeting, but it's also just plain good manners: when in Rome, do as the Romans do. In the end, it's more important to be internally consistent with a naming standard than it is to spend a lot of time sussing out some kind of perfect, interplanetary naming standard that will never be definitively decided to anyone's satisfaction anyway. Pick a reasonable, basic set of standards that most can agree on, but leave room for personal interpretations, too. There's nothing quite as soul crushing as over-standardizing in a religious area where there really isn't a "right" answer.

In closing, it is evident that the conventions participated in making the code more correct, easier to write, and easier to read. Naming conventions cannot guarantee good code, however; only the skill of the programmer can.
-- Charles Simonyi

Comments

Well, this may strike some as excessive typing, but I just add some descriptive text to the end:

MainForm
SettingsForm
AddressNotFoundException

etc...

I do tend to be a tad anal about self-documenting code, though I don't go overboard (IMO).

Also, I use the prepended-underscore notation to denote private variables, a habit I picked up in the beautiful Python language.

Dave on August 7, 2004 11:52 PM

I'm definitely open to suggestion w/r/t naming of general objects. objEverything is out of the question.

Ditto on the underscore, it's an incredibly effective and very simple convention. The best kind!

The other thing nobody does any more: declare constants in UPPERCASE. Remember that?

Jeff Atwood on August 7, 2004 11:53 PM

I have an alternate approach to the one that you showed. I agree wholeheartedly that dsCustomers is overkill. You chose to emphasize the type, but for everything other than the most trivial code (and I would suggest that even trivial code such as the example tends to become less trivial over time), I would go the other way 'round. Don't trim to "ds", trim to "customers". Yes, the type becomes hidden if you use Notepad and scroll, but it's generally more obvious what the type is by context than it is what the data is. At least, that's my experience.

In the truly trivial case I would agree - a tight loop on an Iterator (yeah, I'm a Java guy) would use the name "iter" for the iterator - but once you get into nested loops or anything else, having them be "customers" and "addresses" is much nicer than "iter" and "iter2" or, more aggravatingly, some bizarre hybrid like "iter" and "addr" added by someone trying to change as few lines of code as possible (more applicable to large shops).

Richard on March 23, 2005 6:59 PM

> You chose to emphasize the type, but for everything other than the most trivial code (and I would suggest that even trivial code such as the example tends to become less trivial over time), I would go the other way 'round

Definitely, if there's more than 10-12 lines of code. I tend to write fairly small, focused functions 80 percent of the time. In the 20 percent where I can't, I definitely deviate from the "very short variable name based on type" rule.

Nested loops would be another valid reason, but for some reason, I rarely need to nest loops. I think that's because I tend to break them into two functions: a plural one for operating on "a bunch of" and a singular one for operating on "one of". This way the plural function calls the singular one, and nested loops aren't present.
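
A minimal sketch of that plural/singular split, purely for illustration. The module and function names (OrderTotals, TotalOrder, TotalOrders) and the order data shape are hypothetical, not taken from any real project:

```vbnet
Imports System
Imports System.Collections.Generic

Module OrderTotals

    ' Singular: handles "one of". Its only loop is over one order's line amounts.
    Function TotalOrder(ByVal amounts As List(Of Decimal)) As Decimal
        Dim total As Decimal = 0D
        For Each amount As Decimal In amounts
            total += amount
        Next
        Return total
    End Function

    ' Plural: handles "a bunch of". Its only loop calls the singular function.
    Function TotalOrders(ByVal orders As List(Of List(Of Decimal))) As Decimal
        Dim total As Decimal = 0D
        For Each amounts As List(Of Decimal) In orders
            total += TotalOrder(amounts)
        Next
        Return total
    End Function

End Module
```

Each function contains exactly one loop, so no single function needs a second loop variable to name.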

Anyway, as with all guidelines, YMMV. I think the golden rule is to always try to keep simplicity as your overarching goal, whatever you're writing.

Jeff Atwood on March 24, 2005 3:33 PM

I use almost the exact same notation, but I only use simplified variable names (int i, DataSet ds etc...) if the variable is confined to a loop. Otherwise, I do think it's important to use a descriptor as part of the variable name even if it's just generic (cmdCommand). This gives another level of differentiation IMHO and keeps your functions from being full of "ads = dbr.Property;" which makes things difficult sometimes.

Either way, as long as SOME sort of standard is adhered to, it makes code re-use and refactoring much easier.

russ on September 7, 2005 2:48 PM

My approach has changed a bit since I wrote this. I use the "add the type to the end of the variable" style most often now:

CancelButton
ClickEvent

I think this is a lot more .NET friendly than the "classic" 3-character prefix, e.g.

btnCancel
evtClick

I've also stopped trying to distinguish strings and integers. In the above example,

_strCustomerName -> _CustomerName
intCustomerID -> CustomerID
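
Applied to the class from the post, those two renames would look roughly like the sketch below. It assumes everything else stays as it was; the data-loading step is left out and the local variable names are simply carried over from the original example:

```vbnet
Imports System
Imports System.Collections

Public Class Class1

    ' was _strCustomerName: the underscore still marks class level, the type prefix is gone
    Public _CustomerName As String

    ' was intCustomerID: the parameter no longer carries a type prefix
    Public Function GetCustomerFields(ByVal CustomerID As Integer) As Specialized.NameValueCollection
        Dim nvc As New Specialized.NameValueCollection
        ' (loading the fields for CustomerID is omitted in this sketch)
        Return nvc
    End Function

End Class
```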

Jeff Atwood on September 7, 2005 4:33 PM

Pretty much every single article on MSDN shows the same polarity that the Hungarian notation article has. I figure there is some bot that visits MSDN and marks every article with a 1. I don't think there is any inference to be drawn from the polarity of opinions on MSDN: it's the same for every article.

Hugh Brown on April 13, 2007 1:50 PM

Traipsing through old posts on a slow Friday afternoon...

The idea of using extremely short variable names for tightly-scoped variables triggered some really old synaptic paths in my brain. In most of my university classes (circa 1986), the world was just coming out of the era where the languages themselves restricted the length of variable names (Basic only "saw" 2 significant characters, FORTRAN 6 .OR. 8, etc.), and most instructors were encouraging longer variable names as a good practice.

Then I took a course based on Modula-2, a language designed by Niklaus Wirth, of Pascal fame. *Every* single code sample in the reference manual (written by Wirth) contained *only* single-character variable names. I found that strangely discordant with what I was being taught in class, given Wirth's reputation.

But the code examples were all less than ten lines or so, making the scope/lifetime of those variables very short. I remember finding it very easy to follow, because the variable names didn't get in the way of focusing on the language feature being described.

Using today's OO languages and feature-rich IDEs, I have dispensed with Hungarian prefixes and the numerous bastardizations thereof. I name a variable what it means in the application's domain, like "checkAmount" for a check amount, for example. I let the compiler catch typos (or have IntelliSense prevent them in the first place), and use the "Jump to definition/Jump back" keystrokes if I ever find myself questioning a variable's type. It's such a quick trip there and back that it's not worth junking up the variable names just to avoid the quick F12 and Ctrl-dash. (VS2008 mappings)

Probably not quite worth $0.02, but since nobody's paying me anyway...

Jeff on June 13, 2008 1:47 PM

I agree, I've switched to extremely short name / short scope local variables, too.

Jeff Atwood on June 13, 2008 8:22 PM

Ouch ... Tons of people have COMPLETELY missed the point of Hungarian notation. Types are checked by the compiler. YOU DON'T NEED TO CHECK VISUALLY, THE COMPILER DOES IT FOR YOU. What Hungarian is for is to embed SEMANTIC information in a name, not SYNTACTIC. In the original, i was a prefix indicating an index into an array, r was a prefix indicating a row, c indicating a column ... things a compiler can't check for you. It is unfortunate that some idiot mistook the original intentions and started using i to mean an integer, c to mean char and so on. It really is a neat system ... in its original context.

Check out http://www.joelonsoftware.com/articles/Wrong.html for a great discussion on this. Also the "More Reading" section has some great stuff.
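
A small sketch of the semantic idea, assuming the r (row) and c (column) prefixes described above. The module, function, and variable names are made up for illustration:

```vbnet
Imports System

Module GridDemo

    ' Prefixes carry meaning, not type: r = row index, c = column index.
    ' Both are Integers, so the compiler cannot tell them apart.
    Function CellAt(ByVal grid As Integer(,), ByVal rCurrent As Integer, ByVal cCurrent As Integer) As Integer
        Return grid(rCurrent, cCurrent)
    End Function

    Sub Main()
        Dim grid As Integer(,) = {{1, 2, 3}, {4, 5, 6}}
        Dim rLast As Integer = grid.GetUpperBound(0) ' last row index
        Dim cLast As Integer = grid.GetUpperBound(1) ' last column index

        Console.WriteLine(CellAt(grid, rLast, cLast))

        ' CellAt(grid, cLast, rLast) would also compile, because both arguments are Integers.
        ' The swapped prefixes are what make the mistake visible to a human reader.
    End Sub

End Module
```

The type checker is satisfied either way; only the naming convention flags a column index passed where a row index belongs.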

greenyballz on August 22, 2008 6:38 AM

> I agree, I've switched to extremely short name / short scope local variables, too.

Dear Jeff,
Please write an update to this article, with your current variable naming schema. I'd be very interested to learn the opinions of your fanclub.

Cheers

[d3m0n] on January 28, 2009 6:06 AM

Content (c) 2009 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.