The Sad Tragedy of Micro-Optimization Theater

January 29, 2009

I'll just come right out and say it: I love strings. As far as I'm concerned, there isn't a problem that I can't solve with a string and perhaps a regular expression or two. But maybe that's just my lack of math skills talking.

In all seriousness, though, the type of programming we do on Stack Overflow is intimately tied to strings. We're constantly building them, merging them, processing them, or dumping them out to a HTTP stream. Sometimes I even give them relaxing massages. Now, if you've worked with strings at all, you know that this is code you desperately want to avoid writing:

static string Shlemiel()
{
    string result = "";
    for (int i = 0; i < 314159; i++)
    {
        result += getStringData(i);
    }
    return result;
}

In most garbage collected languages, strings are immutable: when you add two strings, the contents of both are copied. As you keep adding to result in this loop, more and more memory is allocated each time. This leads directly to awful quadradic n2 performance, or as Joel likes to call it, Shlemiel the painter performance.

Who is Shlemiel? He's the guy in this joke:

Shlemiel gets a job as a street painter, painting the dotted lines down the middle of the road. On the first day he takes a can of paint out to the road and finishes 300 yards of the road. "That's pretty good!" says his boss, "you're a fast worker!" and pays him a kopeck.

The next day Shlemiel only gets 150 yards done. "Well, that's not nearly as good as yesterday, but you're still a fast worker. 150 yards is respectable," and pays him a kopeck.

The next day Shlemiel paints 30 yards of the road. "Only 30!" shouts his boss. "That's unacceptable! On the first day you did ten times that much work! What's going on?"

"I can't help it," says Shlemiel. "Every day I get farther and farther away from the paint can!"

This is a softball question. You all knew that. Every decent programmer knows that string concatenation, while fine in small doses, is deadly poison in loops.

But what if you're doing nothing but small bits of string concatenation, dozens to hundreds of times -- as in most web apps? Then you might develop a nagging doubt, as I did, that lots of little Shlemiels could possibly be as bad as one giant Shlemiel.

Let's say we wanted to build this HTML fragment:

<div class="user-action-time">stuff</div>
<div class="user-gravatar32">stuff</div>
<div class="user-details">stuff<br/>stuff</div>

Which might appear on a given Stack Overflow page anywhere from one to sixty times. And we're serving up hundreds of thousands of these pages per day.

Not so clear-cut, now, is it?

So, which of these methods of forming the above string do you think is fastest over a hundred thousand iterations?

1: Simple Concatenation

string s = 
@"<div class=""user-action-time"">" + st() + st() + @"</div>
<div class=""user-gravatar32"">" + st() + @"</div>
<div class=""user-details"">" + st() + "<br/>" + st() + "</div>";
return s;

2: String.Format

string s = 
@"<div class=""user-action-time"">{0}{1}</div>
<div class=""user-gravatar32"">{2}</div>
<div class=""user-details"">{3}<br/>{4}</div>";
return String.Format(s, st(), st(), st(), st(), st());

3: string.Concat

string s = 
string.Concat(@"<div class=""user-action-time"">", st(), st(),
    @"</div><div class=""user-gravatar32"">", st(), 
    @"</div><div class=""user-details"">", st(), "<br/>",
    st(), "</div>");
return s;

4: String.Replace

string s =
@"<div class=""user-action-time"">{s1}{s2}</div>
<div class=""user-gravatar32"">{s3}</div>
<div class=""user-details"">{s4}<br/>{s5}</div>";
s = s.Replace("{s1}", st()).Replace("{s2}", st()).
    Replace("{s3}", st()).Replace("{s4}", st()).
    Replace("{s5}", st());
return s;

5: StringBuilder

var sb = new StringBuilder(256);
sb.Append(@"<div class=""user-action-time"">");
sb.Append(st());
sb.Append(st());
sb.Append(@"</div><div class=""user-gravatar32"">");
sb.Append(st());
sb.Append(@"</div><div class=""user-details"">");
sb.Append(st());
sb.Append("<br/>");
sb.Append(st());
sb.Append("</div>");
return sb.ToString();

Take your itchy little trigger finger off that compile key and think about this for a minute. Which one of these methods will be faster?

Got an answer? Great!

And.. drumroll please.. the correct answer:

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

It. Just. Doesn't. Matter!

We already know none of these operations will be performed in a loop, so we can rule out brutally poor performance characteristics of naive string concatenation. All that's left is micro-optimization, and the minute you begin worrying about tiny little optimizations, you've already gone down the wrong path.

Oh, you don't believe me? Sadly, I didn't believe it myself, which is why I got drawn into this in the first place. Here are my results -- for 100,000 iterations, on a dual core 3.5 GHz Core 2 Duo.

1: Simple Concatenation606 ms
2: String.Format665 ms
3: string.Concat587 ms
4: String.Replace979 ms
5: StringBuilder588 ms

Even if we went from the worst performing technique to the best one, we would have saved a lousy 391 milliseconds over a hundred thousand iterations. Not the sort of thing that I'd throw a victory party over. I guess I figured out that using .Replace is best avoided, but even that has some readability benefits that might outweigh the miniscule cost.

Now, you might very well ask which of these techniques has the lowest memory usage, as Rico Mariani did. I didn't get a chance to run these against CLRProfiler to see if there was a clear winner in that regard. It's a valid point, but I doubt the results would change much. In my experience, techniques that abuse memory also tend to take a lot of clock time. Memory allocations are fast on modern PCs, but they're far from free.

Opinions vary on just how many strings you have to concatenate before you should start worrying about performance. The general consensus is around 10. But you'll also read crazy stuff, like this:

Don't use += concatenating ever. Too many changes are taking place behind the scene, which aren't obvious from my code in the first place. I advise you to use String.Concat() explicitly with any overload (2 strings, 3 strings, string array). This will clearly show what your code does without any surprises, while allowing yourself to keep a check on the efficiency.

Never? Ever? Never ever ever? Not even once? Not even if it doesn't matter? Any time you see "don't ever do X", alarm bells should be going off. Like they hopefully are right now.

Yes, you should avoid the obvious beginner mistakes of string concatenation, the stuff every programmer learns their first year on the job. But after that, you should be more worried about the maintainability and readability of your code than its performance. And that is perhaps the most tragic thing about letting yourself get sucked into micro-optimization theater -- it distracts you from your real goal: writing better code.

Posted by Jeff Atwood
156 Comments

'It's certainly reasonable to avoid preoptimization, but it's still no excuse to consciously decide to use a potentially very slow algorithm simply for the sake of laziness or readability when a better option is only a few extra minutes of work.'

Actually, the slower methods are _more_ work, because either:

1. you have to write and run, for each such piece of code, a system performance regression test to prove they make no real difference.

2. without testing, at least one time in a hundred your assumption they will be more or less ok will be wrong, or become wrong due to other changes. Then you have to read a customer bug report of 'system running incredibly slow', reproduce, isolate, fix and regression test the problem.

Now, maybe you hate your customers enough that you are prepared to do all that extra work just to slow them down by a few milliseconds. But, even if so, wouldn't adding a few _sleep_ calls be easier?

It would be an interesting study to see type of bug was more common, in typical string code:

1. avoidable O(N**2) code, for meaningful values of N
2. C/C++-style memory overwrite, in memory that gets used for something meaningful.

Either way, the easiest way to deal with things is to use a coding style where that bug simply can't happen.

soru on January 31, 2009 5:47 AM

I am not sure what the topic is anymore!

Jeff,

Sometimes as a senior engineer/architect, you have to provide guidance such as “never use += string concatenation” – depends on the makeup of the development team. The more senior engineers understand when “never”, which would probably not be the chosen word, applies. (more likely the phrase would be something like avoid string concatenation with +). Many software engineers may not even have a clue as to where this “rule” came from. For them, never is the best advice.

That, my friend, is not pre-mature optimization. It is simply providing a framework in which a large group of software engineers with disperate skills and experience levels can produce adequate results.

- Bill

Bill on January 31, 2009 7:51 AM

As with others, I really question the conclusion you're reaching here. Real life example: I was writing a plugin for the Neverwinter Nights 2 editor. the purpose of this plugin was to travel across all open forms and report each and every single control, what it's name was, and its type. Now I was in a bit of hurry, and I just wanted the information quickly, so I did it naively. It looked something like this:

(warning, horrifically abused C# pseudo-code)

foreach control in existence:
output += control.name + : + control.type + newline
next control

I knew it would be slow, but not 2 minutes slow to go over several hundred controls. Changed it to use StringBuilder, and it took about 2-4 seconds to run :p I've since trained myself to always use String.Format or StringBuilder, for various reasons. Does this mean I'm micro-optimizing? No, it means I'm taking Joel's suggestion of 'making wrong code look wrong'. I see a += or a + in a string operation, and a little red flag goes up in my mind. Very helpful for when I'm working with students and trying to help them figure out why their game isn't running very fast.

So I suppose you change the point of the article from a blanket 'It doesn't matter' to 'It doesn't matter in this situation', but as others have said, we don't have any GC figures, so I have to reserve judgment on that as well.

OK, ramble over.

Rick on January 31, 2009 8:57 AM

I wonder if most religious-type wars are fought over what just does not matter?

JamesR on January 31, 2009 9:18 AM

I am astounded by the number of people who have posted who obviously did not bother actually *reading* before posting. There's been a bunch of postings that ignore one of two things that Jeff gave us all right up front:

1) It. Just. Doesn't. Matter! We already know none of these operations will be performed in a loop,

YOU ARE OPTIMIZING THE WRONG BLEEPING THING. It's *not* in a tight loop, it's *not* 4 million byte strings, it's *not* any of the other 3,482 oh, but in *this* case... postings people made. You're optimizing an assignment of 131 or so characters.

2) And then to give it the benefit of the doubt, he *still* put it inside a loop - and we would have saved a lousy 391 milliseconds over a hundred thousand iterations. Ponder that. If it hits 60 times per page, and 100K pages a day, the difference is a whole 20 seconds of CPU *per day*. And almost *NOBODY* has performance constraints *that* tight - if the 20 seconds per day is a show-stopper, you got *bigger* problems - like what happens the day you get 101K hits.

Figure out how to make one SQL call less per page hit - that will buy you some 47 times as much bang for the buck.

And for the person who said If we optimized 500 points in the Linux kernel - Linus has *often* rejected micro-optimizations that result in less readable code. Unless it's in the 5% or so of hot paths like the core scheduler code or some of the interrupt handlers, the micro-optimizations *don't matter*. The code doesn't execute enough to get the savings out of the statistical noise if you try to benchmark it. You're *way* better off writing *clear*, *correct* code and looking for algorithmic wins instead - doing it 0.3% faster isn't as good as doing it 3% less.

Read that last sentence again. And repeat until it sticks.

Valdis Kletnieks on January 31, 2009 11:46 AM

@dennis the first solution is the slowest. That is why microsoft is pointing out that you should use the stringbuilder and no lame += or + to build strings (+ and += may compile to the same, but they are lame compared to strinbgbuilder).

And the use of a stringbuilder doesn't cost time to consider.

You should try out FXCop and StyleCop and think about the rules that are predefined.


No wonder that so much bad code is out in the world.

Artor on January 31, 2009 11:48 AM

Did you do the memory consumption tests over that 100k iteration of yours? Well try it.

Anonymous on January 31, 2009 12:41 PM

I agree with Steven and Donal above. This is a situation where using a templating language can be really helpful. Right off the bad you can make better decisions about allocation because you will know in advance how much space you need *for a large chunk of your page*. As such, for your example:

@div class=user-action-time + st() + st() + @/div
div class=user-gravatar32 + st() + @/div
div class=user-details + st() + br/ + st() + /div;

space can be allocated for all of the static strings at once. From there, all that the templating code needs to do is make a reasonable guess about the size of the strings that you are inserting. For this small example, you might save 3 or 4 allocations, but this will quickly add up.

Of course, there is overhead to adding an extra block of execution to your code, but in my experience, this is partly offset since you can move a lot of your View in to the template (when using MVC). Also, templates can be precompiled and cached fairly effectively, further improving their performance.

Dana The Sane on February 1, 2009 1:07 AM

Whoops, quick clarification. I notice that the compiler may do this already for the one string in the example. The important thing though, is that the space for *All* of the static strings in the page can be allocated in one operation.

Dana The Sane on February 1, 2009 1:10 AM

@Valdis Kletnieks,

Sometimes it DOES matter even if We already know none of these operations will be performed in a loop. I am talking about when you know that you are working with a string of about the length of this blog post for instance (not 4 million characters, just this blog post).

It takes almost no time and no thinking, to use stringbuilder instead of += and the performance difference will be of 1/1000 or 1/10000 or whatever.

So OK, don't spend even 3 seconds of your time micro optimizing, but at least know the facts.

Santiago on February 1, 2009 3:03 AM

TOTALLY agree ! I remember once working with a chap who cared so little about the clumsy program structure or flawed business logic... but cared so much about ^^*#^*@! string concatenations (pardon my language, oh how I hate them ;-)!

Preeti Edul on February 1, 2009 4:03 AM

This is rather language-dependent. Python 2.5 optimized this very common idiom (before, it was in the terrible N^2 realm), leading to linear performance when concatenating strings. Same for 's = s + x' and 's += x', too, so let's not have that little debate.

$ cat bensmark.py
#!/usr/bin/python

def main(reps):
s =
for i in range(reps): s = s + x

if __name__ == '__main__':
from sys import argv
reps = int(argv[1])
main(reps)
$ for i in 10000 100000 1000000 10000000; do printf %10d $i; /usr/lib/python2.5/timeit.py -c import bensmark; bensmark.main($i); done
10000 100 loops, best of 3: 2.2 msec per loop
100000 10 loops, best of 3: 26 msec per loop
1000000 10 loops, best of 3: 257 msec per loop
10000000 10 loops, best of 3: 2.66 sec per loop

Jeff Epler on February 1, 2009 4:42 AM

Actually, this is something the programming language or String class should take care of.

Coding-wise, it takes the least time to write += to concat strings together, so the base class (which is shipping in approximately 2.3 gazillion boxes) really should have a better performance profile than that.

I understand *why* it is in Java -- Strings are immutable, you have to use StringBuffer if you ever want to change something -- but I actually consider that a failure of the language and core API. You never saw quadratic performance when appending strings in C.

Bill on February 1, 2009 5:53 AM

post the whole source code of your benchmark!
if you can't see any difference your benchmark must be flawed.

fgthf on February 1, 2009 6:19 AM

Jeff, micro-optimizations can work for the critical parts of the app. Things like using stringbuilders, and for loops instead of foreach can speed things up massively. But you have to measure first! Find the bottlenecks!

Have you used a profiler Jeff? If the page request is waiting for the DB result 99% of the time, you are right that using stringbuilder is useless.

Micro-optimizations are not a tragedy. Premature optimizations are. Optimizing only works if you know what to optimize.

And what is the weird st() method for? Is it expensive? Maybe you have to optimize the st() method instead? Measure that!

alwin on February 1, 2009 6:20 AM

Hi Jeff

On the previous podcast I thought you did something where Joel accused you of just this - micro-optimizing. Maybe it was the spriting of images with CSS? Something where you were pursuing an optimization and Joel basically asked you, does it really matter? and you responded with something about being a purist around this stuff.

Does this contradict your post?

Thanks and regards

Matt on February 1, 2009 8:07 AM

I've somtimes wondered about otimisation.... In a loop things are clear but outside a loop does it still impact on productivity?

For example:

If I have a string concatenation that improves the speed of a program by 100 ms and it ISN'T in a loop - should I bother?

What if my program is used by approximately 100,000 people around the world?
What if my program is used most of the day - let's say 1,000 times per user.

Does that actually mean I am reducing the productivity of the world by 10000000 seconds (nearly 115 working days)??? Or, would utilising the faster string code generate no noticable difference in productivity? That is - the 100,000 people each save about 1 1/2 minutes a day - is this saving never realised/noticed?

Okay - those stats are high... but for something like an operating system that might be valid.

My personal thoughts are that the less optimal everyone's code is, the more fractions of a second we loose to slightly slow load times, the less productive the world is. That is - the lost fractions of a second DO matter. Optimised code DOES matter. But that's just a gut feeling. Am I right?

Philip on February 1, 2009 11:15 AM

Of course it doesn't matter. Only complete retards do concatenation of unescaped strings with HTML. This is just begging to be exploited.

DMB on February 1, 2009 12:09 PM

Well it's also coding for style and intention. Just because people can use + and += they go nuts, and it gets damn difficult to see what they're doing because they are doing a hell of a lot in such an ugly and unmaintainable way.

I see sh1t like this all the time. I'm making this up because I can't normally write such bad code. In reality it's worse.

s = div + a.ToString(d) + /divdiv + something + else;
s += (b 2) ? 3 : b.ToString() + , + c.ToString();
s += /div;
...
if( r == 3 )
s += more crap;

So I think when people complain about using the concatenation operator all the time, it's also an implicit cry for readable code. You can't always raise a stink about readability, but you can raise one about performance.

The only time I don't care about concatenation is when I'm throwing an exception because I really don't care about performance of an exception, and I don't care if the runtime optimizes it, because I want it to be a short 1-liner and not take up space in code.

if( x 3 )
throw new ArgumentException(x,The value x = + x + is too low and should match the other parameter p = + p);


PS I always hate finding these string benchmarking where some dolt takes a + b + c + d and determines that the compiler is smart and this is the fastest method possible. If you benchmark something, benchmark it properly. Do you write 1 function web pages? If you don't then the compiler/runtime probably isn't going to be smart enough to optimize those concatenations. Do a real test already.

new sb on February 2, 2009 3:00 AM

You can almost always tell a rookie from how much emphasis they place on micro-optimization. This is understandable considering the kind of projects you tackle in college. The inputs are tightly controlled, there's plenty of time for implementation and the goal doesn't change after the assignment has been posted.

Contrast this with the real world where the inputs are often widely varying, deadlines are tight and the goal is almost certainly guaranteed to change (assuming the goal was specific enough to warrant updating!).

Raymond Chen has an excellent series of articles he did with Rico Mariani going through the steps to optimizing a chinese-english dictionary. The post-mortem is relevant here; the optimization bonanza introduced several hard-to-detect bugs.

I try to live by the maxim Sourcecode is for humans!

arnshea on February 2, 2009 3:45 AM

After reading first 20 pages of Jeff Atwood for president, I also want my code just to be readable, I gave up. This is exactly what Joel wrote 7 years ago in his original Shlemiel article:

... Generations of graduates are descending on us and creating Shlemiel The Painter algorithms right and left and they don't even realize it, since they fundamentally have no idea that strings are, at a very deep level, difficult, even if you can't quite see that in your perl script...

While premature optimization is evil, premature optimization for readability is _way_ worse.

Great Wall of China was built from bricks. There were always 3D demos that ran on 12MHz machines and text editors that are slow on modern dual cores. You are either on one side or the other.

Gary Schubert on February 2, 2009 4:02 AM

Jeff and 3/4 of his readers are completely missing the point: That being the dramatic difference between concatenating piecewise:

a = ; a += b; a += c; a += d;

as opposed to concatenating in one swell foop:

a = b + c + d;

The first is criminally stupid, while the second is practically unavoidable if you want to get the work done. Completely different scenarios. Jeff explored (unnecessary) variations of the first scenario while neglecting the real problem posed by the first.

Understand what concatenation is all about and what it does, THEN consider optimizing!

Carl Smotricz on February 2, 2009 4:09 AM

I can highly recommend Rico Mariani's Performance Tidbits
(a href=http://blogs.msdn.com/ricom/default.aspxhttp://blogs.msdn.com/ricom/default.aspx/a)">http://blogs.msdn.com/ricom/default.aspx/a)">http://blogs.msdn.com/ricom/default.aspxhttp://blogs.msdn.com/ricom/default.aspx/a) to all folks taking part here.

It would be really helpful if his work on Perfomance Signatures: (a href=http://blogs.msdn.com/ricom/archive/2007/02/07/performance-signatures-cmg-2006-paper.aspxhttp://blogs.msdn.com/ricom/archive/2007/02/07/performance-signatures-cmg-2006-paper.aspx/a)">http://blogs.msdn.com/ricom/archive/2007/02/07/performance-signatures-cmg-2006-paper.aspx/a)">http://blogs.msdn.com/ricom/archive/2007/02/07/performance-signatures-cmg-2006-paper.aspxhttp://blogs.msdn.com/ricom/archive/2007/02/07/performance-signatures-cmg-2006-paper.aspx/a) could finally put an end to discussions like this one.

Yet another thought: after so many years in the field, it's simply astounding how many simple tasks still don't have a agreed and simple solution... come on guys, this is about concating strings! How comes this is not covered once and for all by the framework (be it java or .net)? If you are the framework/runtime designer and give them several ways to solve a standard task, they will choose the wrong one, they will not notice until your framework/runtime gets blamed for not being scalable and having bad performance signatures.

And that's happening, because despite Google, blogs and pages like Coding Horror and Stackoverflow there is a majority of developers/programmers who simply want to get their jobs done. Their solutions to certain problems are not meaningful and visible until they get problematic. And by then it costs an aggregated fortune to fix them.

pointernil on February 2, 2009 4:10 AM

To Carl Smotricz:
Your first example is faster than second.

Gary Schubert on February 2, 2009 5:26 AM

Carl
Jeff and 3/4 of his readers are completely missing the point: That being the dramatic difference between concatenating piecewise:
a = ; a += b; a += c; a += d;
as opposed to concatenating in one swell foop:
a = b + c + d;

Gary
To Carl Smotricz:
Your first example is faster than second.

I get the feeling there's something fundamental here I should be understanding. Why is example #1 faster? Is this some compiler trick when the string values are known in advance, or something else?

Matt on February 2, 2009 9:28 AM

My take on this is use a string builder because its a method already optimized for building strings. Turns out thats mostly right. Your point is well taken though. Along those same lines don't optimize until you have a performance problem.

razmaspaz on February 2, 2009 10:43 AM

Jeff -
Since so much computing is actually string manipulation these days, it might behoove someone (like you) to encourage the hardware manufacturers to include string-manipulation functions at the machine level. I have a vague memory of IBM adding an instruction to the 360 (maybe 370) to efficiently perform the Snobol break function (the Granddady of regex for you whippersnappers). DEC added hardware square root to the EV6 at the request of customers. It can be done.

- Lepto

Lepto Spirosis on February 2, 2009 11:05 AM

Wow; each of those timings is more than 20x the budget for my own application's entire main loop.

Alan on February 2, 2009 12:13 PM

Yes but ...
1. You can't tell it's a micro optimisation until you try to optimise it. (assuming you measured it first)
2. the 'no .. premature optimisation' Knuth quote is still no excuse for inefficient code.

Readable and good performance coding guidelines is all anyone asks.

And you're right, in a sense you didn't intend, if you've properly coded everything and your bottleneck is a few string concatenation optimisations, you're probably wasting your time.

One thing that never seemes to be measured in all these 'tests' that live in a GC'd memory system, is while you might not shave the milliseconds off the request when you create 100's of new objects in a loop, how does it affect performance of the GC itself? Does it run more often and for longer? Does your app pause every so often because there are millions of strings (or other transient objects) that were totally un-necessarily created?

noptimizer on February 3, 2009 2:59 AM

Well it doesn't matter, because all the solutions tested here are intrinsically inefficient.
It's bit like debating if you would get faster from London to Los Angeles wearing baskets to run faster or wearing fins to swim faster. Doh, take an airplane.

John on February 3, 2009 6:12 AM

Tragedy is suppsed to be sad. 'Sad tragedy' is a tautology.

purist on February 3, 2009 8:03 AM

Phillip: If I have a string concatenation that improves the speed of a program by 100 ms and it ISN'T in a loop - should I bother?

If you have a single string concatenation so big that it takes 100ms to execute on modern hardware, you have bigger problems.

If you have a single string concatenation that the *difference* between two micro-optimized methods is 100ms (which, if worst-to-best saves you 10%, means that single concatenation is taking about a full
second to execute), you have problems well into the vastness where that concatenation doesn't even matter. Think - if that one concatenation takes a second, how long does the entire program take? How many hits/day can you do on a sanely sized server farm (say, less than 3 or 4 42U racks full of servers)?

Valdis Kletnieks on February 3, 2009 8:55 AM

As micro-optimization does, it matters when you develop a framework to reuse. So you much do things that do not add much weight to the apps or things can go bad.

[On the test]
Get it on a decent compiled language and string concats performance will not be a worry.
Given your example:
stuff
stuff
stuffstuff

I substituted stuff with a conversion from date to string, and done a 1 million iterations (10x your example) in Delphi. The algorithm is translation of string concatenating example you've given.
Result? 300ms.
And the machine I ran it is modest compared the one you ran your test (is a Pentium 4Ht, AFAIR 1,5Ghz). And yes, the conversion from date to string was executed 3 million times (3 times per iteration).

Fabricio on February 3, 2009 10:32 AM

Ah, your blog engine eat the html example, but is a copy from the one in your post.

Fabricio on February 3, 2009 10:34 AM

Oops, there is an error on my code. I corrected the code, but my point is still the same.
Corrected results:
93 ms (100k iterations)
1265 ms (1 M iterations)

If you can trust that your compiler/framework is fast enough, you don't have to bother.. ;-)

Fabricio on February 3, 2009 10:55 AM

Since the html/xml/etc explosion I've been waiting to see the
emergence of a string-based language, such as a modernized Snobol.
- Lepto

...and your answer is Unicon ....

Jaster on February 3, 2009 12:43 PM

Jeff (and others),

For your information, the fastest way for simple concatenations (loops excluded, of course) is to use
s = My value + value.ToString() + some other text + otherValue.ToString();

This marginally beats using s = text + value + other text + othervalue; version. Why? Both get optimized by the compiler into a call to String.Concat. But the first one goes to the String.Concat(params string[] values), while the second goes to String.Concat(params object[] values) overload. And the first overload is marginally faster.

Coming back to the subject, I don't think it's about micro-optimization. I'd call it micro-optimization if one version, while being marginally faster, it's less readable or requires extra thought. Writing s = s1 + s2 + s3 instead of String.Format({0}{1}{2}, s1, s2, s3); is both faster - to run and to type - and more readable. It's zero cost optimization, so why not do it?

To summarize: s = s1 + s2 + s3 is equal to s = String.Concat(s1, s2, s3) - the IL is just the same. This is the fastest (and most readable, but this is subjective) method. StringBuilder comes the second, some 10% slower in my tests. String.Format is even slower - some 70% or so, and somewhat prone to errors (too easy to forget a parameter and get a runtime error). It's very useful for localization, though.

Having said that, people coming from a C++ background will still argue that String.Format is nicer. I know I did, for a while :)

Dan C. on February 4, 2009 5:07 AM

I assumed that the performance would be dominated by the repeated calls to st(), but you didn't comment on the efficiency of that routine (or the various functions for which it is a placeholder in the example).

James on February 6, 2009 2:13 AM

Don't underestimate the power of the virtual machine. There are a lot optimizations when it comes to String concatenation.

What you see in your code is probably not what is beeing executed by the VM. All it cares about is that end result should be the same. It might use the stack instead of allocating on the heap, use a StringBuilder-class if you don't use one etc.

When you run your code in the debugger it's not the same function you see as the one that is used when it runs with full speed.

John on February 6, 2009 3:55 AM

The concatenation test is not really exercising string concatentation.

Actually the above tests are really too simple, perhaps tests involves losts and lots of Strings of varying length would be a better more well rounded test.

No webpage is just 5 lines of text simplifying a test and calling it equivalent is nonsense.

mP on February 7, 2009 6:59 AM

See Marco Cantu's response in his post: Delphi Super-Duper Strings:
http://blog.marcocantu.com/blog/delphi_super_duper_strings.html

Louis Kessler on February 8, 2009 7:28 AM

There was (is?) an old adage about 10% of the code using 90% of the available CPU cycles.

dbasnett on February 16, 2009 12:34 PM

In C, i'd thought that strcat() should return a pointer to the null at the end of the concatenated string. The caller could remember that, and use it for the next concatenation. And so i wrote such a beast (though with a new name). And just using it made some apps noticeably faster. Though, not now that computers are 10,000+ times faster.

Doing the micro optimization as a habit is not a bad thing.

Here's another example.

In 1982, there was a Unix program (written in C) called spell. It would look up all the words in your document in a dictionary, and spit out a list of those words that weren't in the dictionary. It did it this way:
o It ran your input file through deroff, which got rid of some non-word noise and put each word on it's own line.
o It ran that through sort -u, which sorted the words, and eliminated all but one copy of each word (-u means unique).
o It ran a program which looked up the sorted word list in a sorted dictionary. To do this, it read the dictionary once, and did many, many character compares. It did this by converting both strings to lower case, and comparing that.

It turned out that in the inner loop, there was code like this:

if (isupper(c)) {
c = tolower(c);
}

Now, these functions were, at the time, functions in the C library. They were also macros in ctype.h. The functions did the exact same thing as the macros - it was the same code. But, function call overhead, which we mostly ignore as trivial, was eliminated in the macros. Macros were expanded inline. So if you include ctype.h then you get macros. And spell didn't do it. But adding the include doubled the speed of spell.

The last time i ran my benchmarks, Java came out as only 3 times slower than C. Java has come a long way. But a factor of 3 is two generations of hardware. That's at least 3 years. More, now that we're going with multiple cores rather than significantly faster CPUs. But next time you're in the market for a new laptop, are you going to buy a 3 year old model?

Stephen on February 18, 2009 10:18 AM

I hope my boss is reading this.

Brian on February 23, 2009 12:53 PM

@Matt:

I get the feeling there's something fundamental here I should be understanding.
Why is example #1 faster? Is this some compiler trick when the string values
are known in advance, or something else?

It's not faster. Gary is mistaken.

MonkeeSage on March 4, 2009 4:23 AM

Thanks!
It doesn't matter!!!
:P

It doesn't matter on May 27, 2009 10:25 AM

i was in a similar situation, i guess the only way to resolve a problem is to test as you each bit as you go towards the finish. again sorry to hear your story.

Luke on July 22, 2009 5:05 AM

sad indeed, hope you came across this problem

Luke on July 22, 2009 5:07 AM

Hooray, we've been seeing a lot of mirco optimization and micro performance testing articles lately (yes Jon Skeet, sadly I'm looking at you as well). Let's hope we'll see no more because the only way to optimize is to benchmark.

On a side note, I'd pass a StringBuilder to the method from outside, thus ensuring I didn't have to construct a single String in the method itself. I always do this when building a string-heavy view and it's annoying, but much faster than any other method I know of.

Ran Biron on February 6, 2010 11:13 PM

As Dennis pointed out, the original concatenation-versus-StringBuilder (or StringBuffer if you live in the Java world) debate was all about the immutability of strings. Programmers from C/C++ backgrounds would concatenate over and over again to the same variable reference assuming that it would work like strcat/strncat, which actually DID operate on a mutable buffer, and VB programmers would do it because, if I recall correctly, there wasn't any other option to begin with. And of course, with the immutable string implementations in Java and .NET, the memory and CPU performance of these algorithms would be horrendous.

Using a StringBuilder in place of ordinary + operators instead of the += statement is not what I'd call an example of micro-optimization, but rather an example of cargo-cult programming. In order to believe that there's actually any difference, you'd need to have totally misunderstood the actual implementation issue, namely the __constant allocation of new strings that take place when concatenating in a loop__, and instead have picked up some little tidbit from someone somewhere about how you should always, like, use a StringBuilder or something.

A trivial example illustrating this point would be the following (stupid) implementation:

string s = string.Empty;
for (int i = 0; i 100000; i++)
{
string st = abcdefghijklmnopqrstuvwxyz1234567890;
StringBuilder sb = new StringBuilder(s, s.Length + st.Length);
sb.Append(st);
s = sb.ToString();
}

At no point does this use ordinary string concatenation, but it's going to have the same perf characteristics as Shlemiel the Painter. The problem has nothing to do with the concatenation itself and everything to do with the fact that each and every assignment to s must allocate a new string.

Unfortunately, the majority of readers probably aren't going to make it to this comment and will look at Jeff's insane example as further evidence of the widely-misunderstood premature optimization is the root of all evil principle, which most of them use as a lame defense for the fact that they don't know how to write efficient code.

Aaron G on February 6, 2010 11:13 PM

I've gotten into the habit of using stringbuilder. Honestly its not a sad tragedy of micro-optimization theater. A little dramatic?

I think it takes more time to consciously think... Ok so should I use += or stringbuilder? I mean, jeff said typing stringbuilder would take more time....like 2 seconds more time

Anyway, just use stringbuilder and get on with life. As you proved += is slower.

Brian Bolton on February 6, 2010 11:13 PM

I was actually looking into this subject in Java a month or so ago, following some debate at work over people making recommendations based on stuff they'd heard that if ever valid, hadn't been for several years.

For Java, the conclusion I came to was that since the compiler uses StringBuffer behind the scenes anyway, simple string concatenation was almost always the right way to go for readability reasons. The exception being cases where large strings are being built up as a result of looping logic - in this case the concatenation version suffered somewhat from unnecessarily StringBuffer creation and conversion inside the loop.

Simon on February 6, 2010 11:13 PM

I don't think there is anything wrong with doing:

string a = Hi;

a+= There;

Provided it's not in some kind of loop.

Creating a stringbuilder class puts a stringbuilder object in memory. Certainly we don't need another object to concat two string together.

Now with many string (such as putting togther an HTML) page StringBuilder would be the choice to use as you are building a fairly large string.

I think bigger gains can be had in the data layer or queries before looking at string optimization in an application.

Jon Raynor on February 6, 2010 11:13 PM

This site is hilarious for the way in which the comments are always filled with people obsessing over the example used to demonstrate the point, rather than the actual point itself.

Rock Hudson on February 6, 2010 11:13 PM

Early in my programming career I just wanted to make things work. Once I had that down I started worrying--a lot--about performance. I wanted to be the super cool guy with the fast code, so I optimized everything. I even optimized things that didn't make it faster if it made the code consume fewer bytes: efficiency! I felt so awesome.

And then I had to maintain my code. In short order I grew to loath optimizations. After a very brief time I took an about-face and adopted a new programming mantra which I still employ to this day: Screw performance. Your time, attention and brain cells are the biggest bottleneck you will have. If there's an actual runtime performance issue there's plenty of time to identify it and optimize it once the program is pretty well finished.

Sorpigal on April 5, 2010 10:51 AM

«Back

The comments to this entry are closed.