The real cost of performance

January 11, 2005

I don't usually get territorial about modifications to "my" code. First of all, it's our code. And if you want to change something, be my guest; that's why God invented source control. But, for the love of all that's holy, don't take working code and break it in the name of highly questionable "performance" gains. I was forced to have a sit-down discussion yesterday with another programmer about some of my NameValueCode code he, uh, refactored:

        If Value <> "" Then
            If nvc.Item(name) = "" Then
                nvc.Add(name, Value)
            End If
        End If

Isn't it absolutely fabulous? This is the kind of master coding wizardry that Donald Knuth can only dream about, friends. But it's not good enough for some programmers, who just can't help themselves:

        If Value <> String.Empty Then
            If nvc.Item(name).Equals(String.Empty) Then
                nvc.Add(name, Value)
            End If
        End If

Do you see the bug?* Imagine my surprise when I got an exception in a previously working section of very straightforward code. I approached the developer and quizzed him about this change. Here's what I got:

  • Calling the .Equals method performs better than straight equality

    I researched this, and he's right:

    When the C# compiler encounters a == operator on strings it generates a call to the op_Equality method of the string class. When the VB.NET compiler encounters the seemingly corresponding = operator it rather generates a call to Microsoft.VisualBasic.CompilerServices.StringType.StrCmp, this is to maintain a functional compatibility with VB6 code.

    However, the performance gain is marginal at best. And if we wanted the most performant string comparison, oddly enough, it wouldn't be StringVariable.Equals-- it would be the shared/static String.Equals method:

    I want to point out that the results contained in here are relative and would hardly alter the overall performance of your application except is some very rare and extreme cases.. I would definitely bear these results in mind when performing an operation which has high volumes and is string comparison intensive in which case I would consider using the String.Equals as an alternative.
  • We can convert between C# and VB.NET more easily if we use .Equals

    Since when did converting our code between .NET languages become a decision point in writing our code? That's news to me. And I never see C# coders using the .Equals operator for trivial equality tests anyway. They probably think it's even dumber than I do.

  • It's more object oriented this way

    And that's adequate justification for throwing out one of the most basic operators in any language with a harder-to-read and more obscure variant?

I do try to be tolerant of other people's crazy practices. I really do. But this is insane. Can you imagine reading code littered with tons of .Equals calls instead of the simple, straightforward = operator? I tried my best to gently convice this guy that perhaps, just perhaps, the readability of using simple s = "string" tests might outweigh any of the incredible speed benefits he's adding to our database driven ASP.NET web application. When it isn't.. querying the database.

I've linked to this Eric Lippert post before, but it's so apropos that it deserves a second link:

Write the code to be extremely straightforward. Code that makes sense is code which can be analyzed and maintained, and that makes it performant. Consider our "unused Dim" example -- the fact that an unused Dim has a 50 ns cost is irrelevant. It's an unused variable. It's worthless code. It's a distraction to maintenance programmers. That's the real performance cost.

Always, always write for simplicity and readability first. That should be your main goal when writing code: to express yourself as simply and clearly as you possibly can. If you must optimize, use real optimizations based on actual metrics in a functional application. Not hypothetical theory based on some article you read on some website.

* Nothing = "" evaluates to True. If the item isn't found in the NameValueCollection, the object will be Nothing. Good luck calling the .Equals method on Nothing. I have no real opinion on the merits of String.Empty versus "" other than noting that "" is shorter and can be used as an assignment in constant and optional declarations.

Posted by Jeff Atwood
23 Comments

This is kinda similar, lately I prefer

string.Concat("String 1: ", strOne)

over

"String 1: " + strOne

Something about seeing those plus signs all over the place looks ugly to me. With string.Concat, it's a little more explicit.

daveg on January 12, 2005 1:47 AM

Ever since I found out that doing:

if( string1 != "" )...

causes another string to be created(the empty string) I've been checking for length instead.

I think FxCop explains the reason as well, but a quick poke with ILDASM shows the other string being created if you wanna see it.
Now i just do:

if( string1 != null string1.Length 0 )...

Chris Carter on January 12, 2005 2:52 AM

Interestingly FxCop suggests you use the Length property of strings to test if they are zero length or not, rather that direct = or comparisons.

If Value.Length 0 Then
nbsp;nbsp;nbsp;nbsp;If nvc.Item(name).Length = 0 Then
nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nvc.Add(name, Value)
nbsp;nbsp;nbsp;nbsp;End If
End If

The Equals method is certainly overkill and not warranted in this instance though.

Tom McDonnell on January 12, 2005 4:27 AM

"Premature optimisation is the root of all evil" -- Donald Knuth

Gustaf Erikson on January 12, 2005 5:33 AM

Any time anyone's "optimising" your code, I'd ask to see the output of the profiler first. What's that? They didn't run a profiler? The change is so obviously right that they didn't need a profiler? Then they need to go back to the two laws described by Michael Jackson (the data structures guy, not the freak):

* Rule 1. Don't do it.
* Rule 2. (for experts only) Don't do it yet.

Eric TF Bat on January 12, 2005 7:33 AM

Oh I so hate it when "performance" developers hack your well written, maintainable code into something unreadable forsake of a few clock cycles.

I worked on a project a few months ago, which involved about 50 developers, and I came on board towards the end. I asked the architect what was going to be done about performance, because the software had not been designed with speed in mind. He told me that a crack team of developers would performance tune the app. What actually happened is that the performance team broke most of the software. Some modules wouldn't build and some no longer ran without crashing.

Concerning your article, yes, who in God's name uses the .Equals method when there is a perfectly good operator that we've been using for years.

BTW, as good practice I would check for null/nothing and not rely on the language evaluating (null == "") or (Nothing = "") as true. In the C++ days this expression would cause a crash.

Rob Garrett on January 12, 2005 7:57 AM

I always use .Length == 0 to check for the empty string. The meaning of that seems clear and unambiguous and is an integer compare instead of a string compare.

It sounds like the "=" in VB.NET is overloaded to also manage handling Nothing. That is completely non-obvious to me and that's why your code broke, because it was non-obvious. I think .Equals is much more obvious. Personally, maybe it's just my time with Java, but I always use .Equals for (non-null) string compare in C# and VB.NET. My (not necessarily correct) understanding is that is not necessary to use .Equals unless you want to support string compare in multiple languages (like English, French, etc.), but I do it anyway because I think it is more clear. Maybe it's my C++ background, but putting a single "=" sign in an "if" statement makes me extremely uncomfortable and I avoid it whenever possible (probably another reason I use .Equals).

This is really a very personal preference thing and I don't think you are necessarily justified in arguing with team members over style if there aren't coding standards covering this issue. If he broke the code, he needs to be talked to. If he made a change that added no value, he needs to be talked to. He did screw up here, but I think not knowing that VB.NET had Nothing handling with the equal sign is non-obvious and an easily forgivable mistake/learning experience.

Were there unit tests that he could have run before his check-in to make sure he did not break anything?

Michael Maddox on January 12, 2005 9:33 AM

2-cents
I think you're over reacting.

I agree that the cost of changing the code does not justify the performance gain especially since it was broken, but to use the words "crazy" and "insane" do describe a fellow co-workers actions can rub a person the wrong way.

IMO, you should've made sure the person understands the error s/he introduced and philosophize that making this change is not likely a valuable use of thier time and leave it at that.
/2-cents

Kris on January 12, 2005 9:48 AM

Yeah your 2.6 ghz machine with 1.5gbs or ram may not be able to handle that sring comparrison, I'm glad your friend straightened you out;)

Aeros on January 12, 2005 9:53 AM

I agree with both of you. His changes make sense, and he is correct on most of his statements. However, that doesn't necessarily merit the code change.

I also get annoyed when a developer adds a variable or function in the middle of a class or project using a completely different set of coding standards than the rest of the project. I found a few functions in a C# project the other day that were named with the classic Java camelCase() style. Ick.

Unless you work a some type of ultra-strict software development company that demands that your tabs be saved as spaces, and that a tab is 2 spaces (so more code can fit on the screen), I would stick to the overall rule that the original code designer picks the coding guidelines for that file or project. Everyone coming in after that must follow (to the best of his/her ability) the same guidelines. The coding standards can be changed only if the code is "given" to another developer due to the original developer leaving the company or the team.

Or, you can just refuse to work with anyone else :)

- Joshua

Joshua Bair on January 12, 2005 9:55 AM

Yeah, I probably should have done the nothing comparison:

If nvc.Item(name) Is Nothing Then
nvc.Add(name, Value)
End If

Rather than testing Item(name) = ""; the additional 'free' comparsion doesn't help me anyway. I just want to know if the key exists in the collection or not.

Of course that's not really material to the issue at hand (why use .Equals() when you have a perfectly good intrinsic comparison?), but you guys are absolutely right ;)

Jeff Atwood on January 12, 2005 10:16 AM

There was nothing wrong with your code and it was certainly needless liability to make a change to working code over what really boils down to personal preference.

But the whole latter part of your post was spent on a defensive nit-pick of the readability of using a method vs. an operator? Come on.. that's as silly as the justifications homeboy made for changing your stuff in the first place.

You chose to code against some of the less-straight forward properties of a collection and a string. You knew that the string comparison class treats Nothings and Emptys the same. You knew that a collection, void of a key, would return Nothing (as opposed to a more java-ish no-such-item exception). Because you knew this your code, like all your sexy, sexy code, worked and worked well.

Homeboy might prefer to leverage the knowledge that marking classes with that "Implements" thing implies. Instead of looking at the (type-unsafe) collection as the sum of all its component's implementations... maybe it was simpler to him to consider it an IComparable object in an ICollection collection. Instead of using known side-effects of their implementation... he might just excercise the contract laid out in Comparable (.Equals for equality) and in ICollection (.Contains for existance).

It doesn't matter to me which way someone swings with regard to their coding style. What's important is that you are able to identify what is really bad and unreadable and not just pick out dogmatic, elitist, personal preference and insignificant performance gains.

sam on January 12, 2005 11:35 AM

Considering that str1 and str2 has some value..

Has anyone evaulated string.Compare(str1,str2) vis-a-vis str1.Equals(str2)
(Since already know that str1==str2 is a bad idea)

Binoj Antony on December 18, 2006 5:22 AM

If you want to know performance you need to profile. Some graphs help too.
http://www.tbiro.com/Check-empty-string-performance.htm

Oli

Olivia Vasquez on July 19, 2007 9:17 AM

Run this code and let vb speak for itself.
There is a bug in this which i don't understand
see comments in code
steve

Public Class Form1
Dim empty As String = ""
Dim uninit As String
Dim t As New System.Diagnostics.Stopwatch
Dim x As Integer = 5
Dim y As Integer = 5
Dim i As Integer
Dim count As Integer
Const n As Integer = 100000000 ' 10 million iterations


Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load

count = 0
t.Reset()
t.Start()
For i = 1 To n
'no if then
count += x
Next
t.Stop() ' 303 ms = result on celeron 2.4GHz machine
Console.WriteLine()
Console.WriteLine(" Control. " t.ElapsedMilliseconds)

count = 0
t.Reset()
t.Start()
For i = 1 To n
If String.IsNullOrEmpty(uninit) Then
count += x
End If
Next
t.Stop() ' 955, 961, 967 ms = 3 runs results
Console.WriteLine()
Console.WriteLine(" If String.IsNullOrEmpty(uninit)" t.ElapsedMilliseconds)

count = 0
t.Reset()
t.Start()
For i = 1 To n
If Len(uninit) = 0 Then
count += x
End If
Next
t.Stop() ' 1021, 1068, 1030
Console.WriteLine()
Console.WriteLine(" Len(uninit) = 0. " t.ElapsedMilliseconds)


count = 0
t.Reset()
t.Start()
For i = 1 To n
If uninit = empty Then
count += x
End If
Next
t.Stop() ' 1199, 1196, 1194
Console.WriteLine()
Console.WriteLine(" If uninit = empty " t.ElapsedMilliseconds)

count = 0
t.Reset()
t.Start()
For i = 1 To n
If empty = uninit Then
count += x
End If
Next
t.Stop() ' 1207, 1198, 1194
Console.WriteLine()
Console.WriteLine(" If empty = uninit " t.ElapsedMilliseconds)

count = 0
t.Reset()
t.Start()
For i = 1 To n
If uninit = "" Then
count += x
End If
Next
t.Stop() ' 1225, 1233, 1222
Console.WriteLine()
Console.WriteLine(" If uninit = '' " t.ElapsedMilliseconds)
' this one above is 200msec slower after 10 million iterations

' so choose what is easiest to understand
' personally i have always used the slowest one
' If uninit = "" Then...

' but I now know that
' If String.IsNullOrEmpty(uninit)...
' is fastest and I can save
' one msec for every 50,000 iterations

' interestingly the following integer test is 5 times slower ???
' it reports 5500 ms but the output appears in less than a sec
' so what's happening here?

' If x = 5 Then
count = 0
t.Reset()
t.Start()
For i = 1 To n
If x = 5 Then
count += x
End If
Next
t.Stop() ' 5501, 5504, 5496
Console.WriteLine()
Console.WriteLine(" If x = 5" t.ElapsedMilliseconds)

count = 0
t.Reset()
t.Start()
For i = 1 To n
If x = y Then
count += x
End If
Next
t.Stop() '
Console.WriteLine()
Console.WriteLine(" If x = 5" t.ElapsedMilliseconds)


'If uninit.Length = 0 Then ' throws exception
End Sub
End Class

SABTrader on August 12, 2007 4:17 AM

I agree that you are seriously overreacting. If I were the optimizing employee I would shun you for life and pray I never had to work with you again.

I have performed similar premature optimization and, yes, I have broken working code. What you should have yelled at your coworker for is not for his optimization, but his lack of testing before and after the change. (It is also very easy to call someone up and ask a question, or it should be)

Now, if your code does not distinguish between Nothing and String.Empty, then you are to blame for having unclear code. If other programmers have to make assumptions like that, than your code is the true horror.

In C#, an equivelent idea would be:

string o = GetNewThingy(); // could be null
if (o == String.Empty)
// give every employee $300 raise

Now, if I were to come along this code I would be instantly aware that o could be null. If I planned on optimizing it (people do it out of habit) then I would rewrite it like so:

string o = GetNewThingy();
if (String.Empty.Equal(o)) // it is okay to compare against null
// give every employee $300 raise

I agree that this is harder to read. However, a large number of programmers come from a world where optimizations were once beneficial. I come from a group of C/C++ programmers who write algorithms and data structures for fun. Some optimizations are so force of habit and so simple that they kind of spill from the finger tips.

So, blah!

Jehu Galeahsa on October 12, 2007 11:59 AM

What about

StringBuilder vs String.Concat

...

Nikes on May 29, 2008 10:55 AM

Or + operator

Nikes on May 30, 2008 4:10 AM

I agree clarity of your intention is the #1 objective. And in your case you may be better off using the static String.IsNullOrEmpty method. With this, your intentions would be even clearer since the = operator in VB has a somewhat strange convention of returning true for comparisons between a null and an empty string.

James on October 23, 2008 12:21 PM

Hello everyone. For me, it's that I contributed, ... That I'm on this planet doing some good and making people happy. That's to me the most important thing, that my hour of television is positive and upbeat and an antidote for all the negative stuff going on in life.
I am from Madagascar and learning to speak English, give please true I wrote the following sentence: Multimedia tutorials enable you to learn software and business skills without downloads or delays.

Thank you very much :P. Hadar.

Hadar on March 9, 2009 4:58 AM

I agree with both of you. His changes make sense, and he is correct on most of his statements. However, that doesn't necessarily merit the code change.

I also get annoyed when a developer adds a variable or function in the middle of a class or project using a completely different set of coding standards than the rest of the project. I found a few functions in a C# project the other day that were named with the classic Java camelCase() style. Ick.

Unless you work a some type of ultra-strict software development company that demands that your tabs be saved as spaces, and that a tab is 2 spaces (so more code can fit on the screen), I would stick to the overall rule that the original code designer picks the coding guidelines for that file or project. Everyone coming in after that must follow (to the best of his/her ability) the same guidelines. The coding standards can be changed only if the code is "given" to another developer due to the original developer leaving the company or the team.

Or, you can just refuse to work with anyone else :)

- Mr Jackson

michael jackson on February 6, 2010 9:29 PM

I recently came across your post and have been reading along. I thought I would leave my first comment. I wonder how this pertains to Business Loans? I don't know what to say except that it caught my interest and you've provided informative points. I will visit this blog often.
Thank you,
Mazi

Businessloans2010 on April 5, 2010 5:57 PM

How is it that just anybody can write a blog and get as popular as this? Its not like youve said anything incredibly impressive –more like youve painted a pretty picture over an issue that you know nothing about! I dont want to sound mean, here. But do you really think that you can get away with adding some pretty pictures and not really say anything?
Business Loans

Hassan Imtiaz on February 7, 2011 5:58 AM

The comments to this entry are closed.