Coding Horror
programming and human factors
by Jeff Atwood

Sep 1, 2007

Choosing Dual or Quad Core

I'm a big fan of dual-core systems. I think there's a clear and substantial benefit for all computer users when there are two CPUs waiting to service requests, instead of just one. If nothing else, it lets you gracefully terminate an application that has gone haywire, consuming all available CPU time. It's like having a backup CPU in reserve, waiting to jump in and assist as necessary. But for most software, you hit a point of diminishing returns very rapidly after two cores. In Quad-Core Desktops and Diminishing Returns, I questioned how effectively today's software can really use even four CPU cores, much less the inevitable eight and sixteen CPU cores we'll see a few years from now.

To get a sense of what kind of performance improvement we can expect going from 2 to 4 CPU cores, let's focus on the Core 2 Duo E6600 and Core 2 Quad Q6600 processors. These 2.4 GHz CPUs are identical in every respect, except for the number of cores they bring to the table. In a recent review, Scott Wasson at the always-thorough Tech Report presented a slew of benchmarks that included both of these processors. Here's a quick visual summary of how much you can expect performance to improve when upgrading from 2 to 4 CPU cores:

[Task Manager CPU graph images not shown]

Benchmark                                 Improvement, 2 to 4 cores
The Elder Scrolls IV: Oblivion            none
Rainbow 6: Vegas                          none
Supreme Commander                         none
Valve Source engine particle simulation   1.8x
Valve VRAD map compilation                1.9x
3DMark06: Return to Proxycon              none
3DMark06: Firefly Forest                  none
3DMark06: Canyon Flight                   none
3DMark06: Deep Freeze                     none
3DMark06: CPU test 1                      1.7x
3DMark06: CPU test 2                      1.6x
The Panorama Factory                      1.6x
picCOLOR                                  1.4x
Windows Media Encoder x64                 1.6x
Lame MT MP3 encoder                       none
Cinebench                                 1.7x
POV-Ray                                   2.0x
Myrimatch                                 1.8x
STARS Euler3D                             1.5x
SiSoft Sandra Mandelbrot                  2.0x

The results seem encouraging, until you take a look at the applications that benefit from quad-core-- the ones that aren't purely synthetic benchmarks are rendering, encoding, or scientific applications. It's the same old story. Beyond encoding and rendering tasks, which are naturally amenable to parallelization, the task manager CPU graphs tell the sad tale of software that simply isn't written to exploit more than two CPUs.

Unfortunately, CPU parallelism is inevitable. Clock speed can't increase forever; the physics don't work. Mindlessly ramping clock speed to 10 GHz isn't an option. CPU vendors are forced to deliver more CPU cores running at nearly the same clock speed, or at very small speed bumps. Increasing the number of CPU cores on a die should defeat raw clock speed increases, at least in theory. In the short term, we have to choose between faster dual-core systems, or slower quad-core systems. Today, a quad-core 2.4 GHz CPU costs about the same as a dual-core 3.0 GHz CPU. But which one will provide superior performance? A recent Xbit Labs review performed exactly this comparison:

Benchmark                             3.0 GHz Dual Core   2.4 GHz Quad Core   Improvement, 2 to 4 cores
PCMark05                              9091                8853                -3%
SysMark 2007, E-Learning              167                 140                 -16%
SysMark 2007, Video Creation          131                 151                 15%
SysMark 2007, Productivity            152                 138                 -9%
SysMark 2007, 3D                      160                 148                 -8%
Quake 4                               136                 117                 -15%
F.E.A.R.                              123                 110                 -10%
Company of Heroes                     173                 161                 -7%
Lost Planet                           62                  54                  -12%
Lost Planet "Concurrent Operations"   62                  81                  30%
DivX 6.6                              65                  64                  0%
Xvid 1.2                              43                  45                  5%
H.264 QuickTime Pro 7.2               189                 188                 0%
iTunes 7.3 MP3 encoding               110                 131                 -16%
3ds Max 9 SP2                         4.95                6.61                33%
Cinebench 10                          5861                8744                49%
Excel 2007                            39.9                24.4                63%
WinRAR 3.7                            188                 180                 5%
Photoshop CS3                         70                  73                  -4%
Microsoft Movie Maker 6.0             73                  80                  -9%

It's mostly what I would expect-- only rendering and encoding tasks exploit parallelism enough to overcome the dual-core's 25% clock speed advantage (3.0 GHz versus 2.4 GHz). Outside of that specific niche, performance will actually suffer for most general purpose software if you choose a slower quad-core over a faster dual-core.

However, there were some surprises in here, such as Excel 2007, and the Lost Planet "concurrent operations" setting. It's possible software engineering will eventually advance to the point that clock speed matters less than parallelism. Or eventually it might be irrelevant, if we don't get to make the choice between faster clock speeds and more CPU cores. But in the meantime, clock speed wins most of the time. More CPU cores isn't automatically better. Typical users will be better off with the fastest possible dual-core CPU they can afford.
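
The trade-off between a 3.0 GHz dual-core and a 2.4 GHz quad-core can be roughed out with Amdahl's law. The sketch below is only a back-of-the-envelope model, not anything taken from the Xbit Labs review: it assumes run time scales inversely with clock speed, that the parallel part of a program splits perfectly across cores, and that memory and bus effects can be ignored. The function names are made up for illustration.

```python
def relative_time(parallel_fraction, cores, clock_ghz):
    """Idealized run time (arbitrary units) under Amdahl's law.

    Assumption: the serial part runs on one core, the parallel part is
    divided evenly across all cores, and everything scales with clock speed.
    """
    serial_fraction = 1.0 - parallel_fraction
    return (serial_fraction + parallel_fraction / cores) / clock_ghz


def break_even_fraction(cores_a, clock_a, cores_b, clock_b):
    """Parallel fraction at which machines A and B take the same time.

    Derived by setting relative_time(p, cores_a, clock_a) equal to
    relative_time(p, cores_b, clock_b) and solving for p.
    """
    k_a = 1.0 - 1.0 / cores_a
    k_b = 1.0 - 1.0 / cores_b
    denominator = clock_a * k_b - clock_b * k_a
    if denominator == 0:
        raise ValueError("the two machines never break even in this model")
    return (clock_a - clock_b) / denominator


# Dual-core at 3.0 GHz versus quad-core at 2.4 GHz, as in the comparison above.
p = break_even_fraction(2, 3.0, 4, 2.4)
print(f"quad-core wins only if more than {p:.0%} of the work is parallel")
```

Under these idealized assumptions the break-even point works out to 4/7, or about 57%: a program has to spend more than half of its time in perfectly parallel code before the slower quad-core pulls ahead.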

Posted by Jeff Atwood
Comments

Thanks for that! What about servers? web applications, database... Will quad cores systems add benefit there?

Pierre on September 3, 2007 10:36 AM

depends if you want a single application to go faster or you have several apps you want to go faster.

Say.... Running Several instances of Visual Studio and a VMWare... etc etc

Keith Nicholas on September 3, 2007 10:44 AM

Stuff like 3D rendering, or compositing applications, or pretty much anything dealing with processing images, can very easily be split into regions, for rendering by separate cores.
Modo (3d app) is the most obvious example of this.
When you render something with two cores, you see two little blue boxes processing a segment of the image. If you have 4, you see four little boxes. If you have two quad-core machines doing network-rendering, you see four blue boxes (local cores rendering), and four orange boxes (the remote box rendering)

Even me, who isn't the best coder in the world, could work out how to write a renderer to take advantage of multiple cores.
Whereas with games, I can't think of anything that could utilize the spare CPU cores..
I wonder if it's even remotely possible, but: To use extra cores as "software-graphics-cards". Since graphics are the only thing that really needs lots more processing power in games, it'd make sense to, say, divide the screen up between them, and use the remaining two cores to process extra effects on their area on screen. Biggest problem being that CPUs aren't as fast at drawing stuff to the screen as graphics cards are...

But, yeh.. For gaming, dual (or even single) core processors are more than enough. CPUs are generally not the bottleneck for games.
Buut... For 3D/compositing workstations, a quad-core CPU (or dual-CPU quad-core) does substantially speed up rendering.

Another thought, to add to this slightly rambling comment:
MP3 encoding. Instead of speeding up a single-MP3 encoding, why not have the application process 4 different files at once? It'd be much simpler to code, since you don't need to worry about parallelizing the encoding process; you basically just need to fork the encoder once for each core.
Since encoding the same MP3 over 4 cores probably wouldn't speed it up that much (the code would spend more time starting the next file than actually processing bits), completing four files at a time would complete the task faster.

dbr on September 3, 2007 10:49 AM
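
The "one encoder per file" idea from the comment above can be sketched in a few lines. This is a minimal illustration, not how any particular MP3 front end works: `encode_one` and `encode_many` are hypothetical names, and the caller supplies whatever command line the encoder actually needs. Each worker thread just waits on an external process, so the operating system is free to schedule the encoder processes on separate cores.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def encode_one(command):
    """Run one external encoder process and return its exit code."""
    return subprocess.run(command, check=False).returncode


def encode_many(commands, workers=None):
    """Run the given commands with at most `workers` processes at a time.

    Defaults to one process per logical CPU. Results come back in the
    same order as the input commands.
    """
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_one, commands))


if __name__ == "__main__":
    # Stand-in jobs: each one just starts a Python interpreter that exits.
    # A real run would pass the encoder's own command line for each file.
    placeholder = [sys.executable, "-c", "pass"]
    print(encode_many([placeholder] * 4))
```

No changes to the encoder itself are needed, which is the appeal of the approach; the cost is that a single large file still encodes at single-core speed.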

Please don't use red and green text that are otherwise identical (same saturation, value, font, and so on), to differentiate positive and negative results. Yes, there is a minus sign in front of the negative results, but this is slow for the brain to latch on to, especially since the font is rather thin.

There are many other ways to visually separate good and bad results, almost all of which are better than just red and green and no other differentiation. I've seen some beautiful and effective choices, though many tend to bias the reader (bolding the bad results, for instance). Personally I find just replacing the green with blue to be quite effective.

Geoff Broadwell on September 3, 2007 10:53 AM

This is the sort of comparison you see all the time, and it may be an incredibly stupid question, but instead of seeing how one application does across multiple CPU cores, I'd like to know how the Operating System goes distributing several applications across cores.

Or is that not how it works?

Because if I can get four apps working at higher performance, sometimes that's a better scenario.

Of course there is also the point to be made of how many apps have ever been written to take advantage of multiple cores yet?

For a lot of the ones I work with, there just isn't the need.

Actually with all the background processing and everything I'd love to see how some of the WPF apps coming out are going to go.

Andrew Tobin on September 3, 2007 10:59 AM

The issue is certainly the software. I think that pretty much everything is parallelizable. The issue is that they aren't within our current programming paradigms. I think the question you should be asking is why we really need more powerful computers. The answer, I believe coincides much with the list of things multiple cores are good for!

You say "Unfortunately, CPU parallelism is inevitable.". Unfortunately? I think not! This is the opening for revolution in computer architecture and programming languages.

dbr - Games can actually be very easily parallelizable, and could easily take advantage of all the power offered by quad, 8, 16 cores, etc. The issue is that the game engine would have to be written with parallelization in mind (looks like Valve is doing this), and the benefit isn't huge when not too many have dual/quads. It wouldn't be unreasonable to devote one core to managing the graphics card(s), push load content, etc. One or two cores could do physics stuff, in the absence of a PPU. I'd love to see a game that actually benefits from an entire core devoted to AI and game play. It is nearly impossible to design a game such that it smoothly scales from single core to several cores, though. It changes the game too much.

Michael Sloan on September 3, 2007 11:14 AM

@dbr:

There is actually quite a bit that can be sped up in games, outside of the pure rendering aspect. For instance, depending on the algorithms used, AI can often be separated into global and individual "thinking" -- the latter can be distributed across cores. Even with a purely global AI design, simply moving the entire AI subsystem to a separate core may work well.

Then of course there's the sound subsystem, which can decently chew CPU when a great many environmental sound effect tracks are mixed by a 3D audio engine. Again, that can be thrown in its own thread.

And then there is physics. Some of it can be parallelized, and other pieces can't -- but certainly physics can be overlapped with rendering. Because the non-destructible portions of the environment will be unaffected by the results of the physics calculations, those can be rendered while physics is still being run for a given frame. Also, once the physics is completed for a given frame, the results can be passed off to the renderer while the physics computation begins on the *next* frame.

Weather and other complex but slowly-changing environmental effects can be computed in separate threads that post results asynchronously to the main engine. Networking/world synchronization can run in a thread of its own.

And the list goes on .... Mind you, such a heavily threaded design is not necessarily *easy*, but it certainly is worth the work, as Valve has been quick to point out.

Geoff Broadwell on September 3, 2007 11:16 AM

Yep. Main benefit of multi-core on the desktop is avoiding excessive context switching. As you say, diminishing returns above two unless the software has been explicitly parallelized.

Evan on September 3, 2007 11:34 AM

What about servers?

Servers are totally different scenarios. There are plenty of users who believe their desktop usage scenarios are similar to servers, but it's utter wishful thinking on their part.

if you want a single application to go faster or you have several apps you want to go faster.

Within reason, yes, but dual-core gets you 99% of the benefit of (n) core. If you're not careful, this becomes the wishful thinking scenario I just described. No matter how much of an ultra-elite-ninja single user you are, I guarantee you're not generating anything close to the kind of load that a server would experience under even the mildest of loads. Desktops aren't servers.

Where as with games, I can't think of anything that could utilize the spare CPU cores..

http://news.zdnet.com/2100-9584_22-6119913.html
--
One such company is Remedy, which demonstrated a game called "Alan Wake" at the Intel show.

The game is designed to farm tasks to different processor cores, said Markus Maki, director of development, in an interview. There are three major program threads and each can occupy a core of its own: one for the main game action, one for simulating physics of game objects and one for preparing terrain information that's later sent to the graphics chip for rendering. A fourth core can handle other threads, including playing sound and retrieving data from a DVD, Maki said
--

I have yet to see a single game that shows *anything* close to the kind of scaling that we regularly see with rendering or encoding.

Approaches like this sound good on paper, but developers are seriously hobbled by the existing market of single and dual core CPUs. They have to write AI that can scale between an entire core on a quad-core machine, 1/2 of a core on a dual, or 15% of CPU time on a single.

Jeff Atwood on September 3, 2007 11:34 AM

Why look at today's programs' performance with tomorrow's CPU setups? Surely after time programs will be written to take advantage of multiple cores. Remember there was a time when "no user of a pc" would need more than 640k of RAM ;)

kenny on September 3, 2007 11:45 AM

Where's the 2 to 4 core comparison for Visual Studio and other compilers?

If you can find these kinds of benchmarks, then godspeed. They're rare. The very first link in this post contains one compilation benchmark, but it's dual-core:

http://www.codinghorror.com/blog/archives/000285.html

This review shows no scaling improvement for quad-core in Visual Studio 2005 compilation:

http://xtreview.com/review212.htm

The gcc compiler does support multiple cores and seems to scale fairly well:

http://www.phoronix.com/scan.php?page=article&item=585&num=4

Cheat sheet for the last graph: E5320 is quad 1.86 Ghz; E5150 is dual 2.66 Ghz.

single E5150 -- 12.06 sec
single E5320 -- 11.08 sec

Jeff Atwood on September 3, 2007 11:49 AM

http://techreport.com/articles.x/11237

I think there's a better article though, I don't have time to find it now. One of the interviews with Valve about multi-core support explains some of the benefits and difficulties with programming for multi-core (or more adapting existing code for multi-core).

[ICR] on September 3, 2007 11:52 AM

Con: Amdahl's Law
http://en.wikipedia.org/wiki/Amdahl%27s_law

Pro: Reevaluating Amdahl's Law
http://www.scl.ameslab.gov/Publications/Gus/AmdahlsLaw/Amdahls.html

Jeff Atwood on September 3, 2007 12:07 PM
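
The two links above take opposite views of the same arithmetic. As a rough sketch (the function names are illustrative, not from either source): Amdahl's law holds the problem size fixed and asks how much faster it finishes on more cores, while the scaled-speedup formula from "Reevaluating Amdahl's Law" assumes the problem grows with the core count so that run time stays constant.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Fixed-size speedup: the serial part caps the gain from more cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def scaled_speedup(parallel_fraction, cores):
    """Scaled (Gustafson-style) speedup: the parallel work grows with the cores.

    Here parallel_fraction is measured on the parallel machine, which is a
    different baseline from the one Amdahl's law uses.
    """
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * cores


for cores in (2, 4, 8, 16):
    print(f"{cores:2d} cores: fixed-size {amdahl_speedup(0.9, cores):5.2f}x, "
          f"scaled {scaled_speedup(0.9, cores):5.2f}x")
```

Which model applies depends on the workload: a game frame or a build is closer to a fixed-size problem, while rendering at higher resolution or simulating a bigger scene is closer to a scaled one.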

The Xbit Labs review can't have activated the "threads=x" option for xvid. xvid encoding on a quad core Mac Pro either from command line or from Handbrake maxes out all four cores, and hits about 95fps encoding rate (with all the quality options on).

Matthew on September 3, 2007 12:12 PM

But in the meantime, clock speed wins most of the time. More CPU cores isn't automatically better.

More CPU cores still allow you to run more applications with less contention for CPU resources (you may get starved for memory bandwidth though).

In this day and age of Firefox and other IntelliJ/Eclipse/Visual Studio (while I do love them you can't consider them lightweight on either memory or resources), having more CPU cores allows your computer to still be responsive even though you're running Firefox *and* your IDE *and* some expensive compilation *and* even more without having to rely on nicing processes.

Masklinn on September 3, 2007 12:13 PM

.NET compilation gets some multi-core love with the 3.5 Framework[1]. I've been using this for a while on my home projects. It helps a bit, but not a ton. If you have a lot of projects and a clean dependency graph, it can shave a decent amount of time off the total build, but it varies a lot.

1. http://blogs.msdn.com/msbuild/archive/2007/04/26/building-projects-in-parallel.aspx

Now, even there the drop-off is significant after /m:2. On my Q6600@3.3Ghz, running with 4 build nodes (/m:4) is rarely any faster than running with 2 (/m:2). Here are some fresh timings for a clean build on a small-to-medium size project:

/m:1 - 4.39s, 4.24s, 4.71s (4.45s avg)
/m:2 - 3.58s, 3.65s, 3.60s (3.61s avg)
/m:3 - 3.86s, 3.52s, 3.74s (3.70s avg)
/m:4 - 3.19s, 3.75s, 3.86s (3.60s avg)

This is around 2.5 MB of source code spread out over 16 projects.

Even so, I'm pleased with my Q6600. It wasn't very much more expensive than the dual core, and usually there are quite a few things going on besides compilation to take advantage of the extra power.

Derek on September 3, 2007 12:23 PM
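
Derek's timings can be turned into speedup figures with a little arithmetic. The sketch below copies the twelve measurements from the comment above and, purely as an assumption, fits them to Amdahl's law to estimate what fraction of the build behaves as if it were parallel. That model ignores disk I/O and build-node startup cost, so the fitted fraction is only a rough indicator.

```python
from statistics import mean

# Wall-clock timings in seconds, copied from the comment above,
# keyed by the number of MSBuild nodes (/m:N).
timings = {
    1: [4.39, 4.24, 4.71],
    2: [3.58, 3.65, 3.60],
    3: [3.86, 3.52, 3.74],
    4: [3.19, 3.75, 3.86],
}


def speedup(nodes):
    """Average single-node build time divided by average N-node build time."""
    return mean(timings[1]) / mean(timings[nodes])


def implied_parallel_fraction(nodes):
    """Invert Amdahl's law: p = (1 - 1/S) / (1 - 1/n)."""
    return (1.0 - 1.0 / speedup(nodes)) / (1.0 - 1.0 / nodes)


for nodes in (2, 3, 4):
    print(f"/m:{nodes}: speedup {speedup(nodes):.2f}x, "
          f"implied parallel fraction {implied_parallel_fraction(nodes):.2f}")
```

The speedup comes out at about 1.23x for both /m:2 and /m:4, which matches the observation that the third and fourth build nodes add almost nothing for this project.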

I'm very surprised that the Erlang fan boys haven't jumped in here yet.

They don't care, their software makes use of multiple cores just fine already.

I wonder, what about Stackless Python?..

Vladimir on September 4, 2007 2:06 AM

Re. Visual Studio, isn't it a breach of the license terms to publish performance information? It certainly is with SQL Server! This would explain why there isn't any data out there.

Syd on September 4, 2007 2:07 AM

One area where quad is definitely better is for optimization. With dual core, it's difficult to optimize for quad, but with quad you can optimize for quad, dual, uni-core systems.

Anytime you're optimizing parallel code for a wide audience, you want more cores. And with quads about to become the baseline (and already very cheap), choosing dual cores is probably no longer the right choice.

Andrew Binstock on September 4, 2007 2:10 AM

Hi, first time poster here. I like the blog, very interesting and thought provoking (although I don't always agree!).

I think the problem with current software and multicore CPUs is the threading model the OSes use. It's not easy to scale threads dynamically, let alone load balance them. I've been toying around with the idea of a system based on small chunks of work which are given to whichever processor is free - i.e. a single queue with multiple servers. It's not easy. Incidentally, I came up with the idea whilst working on PS3 hardware.

Skizz

Skizz on September 4, 2007 2:24 AM
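
The "single queue with multiple servers" idea Skizz describes can be sketched with a shared queue and a fixed pool of workers, each pulling the next small chunk of work whenever it becomes free. This is only an illustration of the scheduling shape, with a hypothetical function name; in CPython the global interpreter lock keeps pure-Python chunks from running truly in parallel, so a real implementation would use processes or native threads.

```python
import queue
import threading


def run_work_queue(tasks, worker_count):
    """Run zero-argument callables from one shared queue on several workers.

    Whichever worker is idle takes the next task, so the load balances
    itself. Results are returned in completion order, not input order.
    """
    work = queue.Queue()
    for task in tasks:
        work.put(task)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                task = work.get_nowait()
            except queue.Empty:
                return  # no work left, this worker is done
            value = task()
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    chunks = [lambda i=i: i * i for i in range(10)]
    print(sorted(run_work_queue(chunks, worker_count=4)))
```

The same code runs unchanged with one worker or sixteen, which is the attraction of the design: the chunk size, not the thread count, determines how well the work spreads out.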

The implications of quad and more core processors are limitless.

Someone above said how they couldn't imagine games taking advantage of multi-cores. How about 'bad guys' running in one process (ie: on one core) and you/good guys running in another? Talk about awesome AI. They could respond and learn in real time. How about running a game 'server' locally while multiple people are connected to you?

As far as these benchmarks go, they're silly. Someone already said it, but...

Yesterday's software on tomorrow's technology. It just doesn't matter much. I guarantee you there are aspects of almost every piece of software that could be improved by spreading the workload across multiple cores.

It is physically impossible for single-core processors to execute more than one instruction at a time. With the theoretical ceiling of clock speed fast approaching this means we may be at our limit of speed... but WAIT! we can now process more than one instruction at a time due to multiple cores (CPU's).

How about multi-tasking? How did the quad core hold up against the dual core when doing 4 things at once? How about the quad core vs the dual core with hyperthreading?

Burning a Disk, Encoding a DVD, Playing FarCry and streaming music via Rhapsody?

Those are the sort of comparisons that show the true potential of the multi-core processors. We will eventually hit a core-ceiling where it just doesn't make practical sense to go further but just like the clock speed was 15 years ago that's a long way off.

I say bring on the Cores!

Randy Aldrich on September 4, 2007 2:37 AM

I have yet to see a single game that shows *anything* close to the kind of scaling that we regularly see with rendering or encoding.

Main gaming platforms (Xbox360, PS3) have been multicore for quite some time (XBox has 3 cores/6 hw threads. PS3 has a PPU and 6 SPUs). Most of the games these days are multithreaded on these platforms, simply because they have to be in order to survive! "Free GHz ride" never existed on consoles.


Boyan on September 4, 2007 2:44 AM

Erlang fanboys think their software uses multiple cores, but in fact you need to have multiple Erlang interpreter processes running to do that. The code to start Erlang processes in different native processes is different to just starting them in one process, and Erlang cannot move its processes from one native process to another if one is busy and the other is idle.

Asd on September 4, 2007 3:09 AM

If Intel is going to be pumping these XX-cores out, I'd imagine their friends at Microsoft and elsewhere would feel a push to write software that fully utilises these cores, so people will still enjoy faster experiences when buying their shiny new computers.

Else I'm sure word would get around pretty quickly, from friends and family, that those new XX-cores on the TV from Dell aren't much faster than the box under their desk.

transcriber on September 4, 2007 3:32 AM

What about statistical software - "GNU R project" and such - these are the power-hungry applications and they are used by universities, who are always trigger happy to upgrade (and waste money in the process).

jinxs on September 4, 2007 3:35 AM

The noises coming from Intel for their 45nm process generation suggest that they may have cracked the leakage current problems that plagued the 90nm and 65nm generations, which I believe was largely the reason that the clock speed couldn't be ramped up (without causing massive heat dissipation problems).

If they really have fixed it, we could start seeing clock speeds going up again, although we might see lower-voltage, lower-power parts at current clock rates as well. Of course, now that multi-core has been introduced, it won't be taken away - the transistor budget for an out-of-order superscalar processor core is already well below the number of transistors it's possible to put on a chip-sized piece of silicon, so those transistors are effectively going free. What else are you going to do with them, add even more cache?

Mike Dimmick on September 4, 2007 3:51 AM

Jeff,

One can try OpenMP from http://www.openmp.org to parallelize C/C++ applications. I've never tried it, but it would be a cool blog to test it out.

Kashif

Kashif Shaikh on September 4, 2007 3:54 AM

I just like the idea that I wouldn't have to worry about my processor getting bogged down by background tasks while I'm gaming. On the other hand, I can't think of a lot of things I'd want to do in the background that wouldn't be using much more precious game resources, like networking and RAM...

My experience so far with the Dual Cores is that they're impressive when compared with a single core, but not so impressive that I'd avoid going with a quad core on a gaming rig, even if just playing a game doesn't stress all 4 cores. I'd rather have the extra overhead available and stop spending so much time optimizing my gaming systems, especially when it looks like many of the actual game benchmarks are making good utilization of 2 of the available cores (something you wouldn't have been able to say of a dual CPU system 5 years ago).

By the time I can afford the quad-core systems, someone will have figured out how to get some use out of it in the big 3 game engines (and many of the others).

Vizeroth on September 4, 2007 4:05 AM

If I were you I would repeat the DivX 6.6.1.4 test to make sure multi-threading is enabled (not by auto but by setting the number of threads manually in encoder properties). I am not sure what you used as a host application but I suggest VirtualDub.

Lame test is also suspicious, I would repeat that too.

Quake 4 should (obviously) be tested with MT patch.

Photoshop is generally a bandwidth bound application. You need to make sure that you perform operations which are not bandwidth but compute bound to see the effects of more cores.

Add some audio processing application (Sonar or Cubase + many VST software synths and effects come to mind). Sound Forge 9.0 too, then flac or MonkeysAudio lossless compression.

Igor on September 4, 2007 4:12 AM

Two things:
1, Quad Core 2.4 GHz prices are roughly comparable with Dual Core 3.0 GHz, so you would expect similar performance. You'd also expect older applications that weren't built with parallelism in mind not to take full advantage of the full complement of cores. That's likely to change drastically over the next 18 months to 2 years.

2, From memory the 2.4 Ghz Quad is far more overclockable than the 3.0 GHz Dual (all other things being equal) - indeed, in your own blog : http://www.codinghorror.com/blog/archives/000908.html : you overclocked a Quad to 3.0 GHz. I'd be interested how a 2.4 clocked to 3.0 fared in the comparisons.

Dino on September 4, 2007 5:01 AM

@Asd
Erlang fanboys think their software uses multiple cores, but in fact
you need to have multiple Erlang interpreter processes running to do
that. The code to start Erlang processes in different native processes
is different to just starting them in one process, and Erlang cannot
move its processes from one native process to another if one is busy
and the other is idle.

That is not entirely true.

The Erlang VM automatically starts a thread for each core, each of which handles execution of one of the upcoming scheduled Erlang processes. Erlang code which has been written with concurrency in mind (that is, several processes that do different things at the same time, which is normal procedure in Erlang) will scale perfectly well on multi-core systems without any modification at all.

See:
http://www.ericsson.com/technology/opensource/erlang/news/archive/erlang_goes_multi_core.shtml

Adam on September 4, 2007 5:14 AM

That's nice... generalization based on one type of multi-core CPU.

How much of this is directly due to:

1 - non-parallelized code (granted, this is acknowledged)

and

2 - bus saturation

Cross comparisons using only parallelized code between different multi-core architectures PLEASE!

anonymoustroll on September 4, 2007 5:24 AM

Well it is at least good to see that Valve has their code going in the right direction. Let's hope that makes TeamFortress 2 that much more fun!

As for compile speed, disk is still the slow part of that process. I would rather have an SSD for a build drive any day for their high I/O per second compared to spinning platters. Processors, dual or quad, are still so far ahead of storage it is sick. Even tons of RAM for cache didn't seem to help our builds with disk continuing to be the bottleneck.

Even large scale vm systems are using multiple host adapters and more to get a performant virtual array that doesn't choke the vm. In fact an HBA per vm is not uncommon to get speeds acceptable.

And like another poster said, the cost difference is so low, why not just get the quad?

Chris Patterson on September 4, 2007 5:42 AM

your data seems contrary to something the inquirer posted this morning

http://www.theinquirer.net/default.aspx?article=42114

This suggests that in fact Lost Planet's performance increases considerably with a quad core compared to a dual core.

abhorsen666 on September 4, 2007 5:47 AM

I believe that most of the next batch of games will take advantage of quad-cores. Bioshock already uses about 50-60% of each core in my quad Q6600. It probably means it would run about equally well at 100% in a dual core, but still, it parallelizes remarkably well.

Game developers are way too addicted to adding cool stuff into their games for them to ignore a source of computing power for long. They will eventually find something to take advantage of it. Take for instance how Valve's Gabe Newell went from complaining about the enormous difficulty of programming for multiple cores to loving quad-cores:

http://www.next-gen.biz/index.php?option=com_content&task=view&id=510&Itemid=2
http://www.extremetech.com/article2/0,1697,2050558,00.asp

So, I can't be sure how much games will take advantage of quad-cores in the next year, but I don't think it's a bad moment to get one.

Valls on September 4, 2007 5:55 AM

"Their answer? H.264 blu-ray video playback while "doing something else". Lame. How do you watch a movie and do something else at the same time?"

Well.. not that I disagree with the article, but I do this all the time. I'll catch up on a TV show or watch a movie while I'm doing some photo editing. Lightroom and Photoshop on one monitor, player on the other.

Ben on September 4, 2007 6:10 AM

One thing about LAME encoding: I realize they were evidently using a multithreaded version of LAME, but even single-threaded LAME can see an increase, as you can run multiple instances of it. Not "real" multithreading, but I use foobar to convert things to MP3 all the time, and it spawns as many LAME instances as I have threads, and the speedup from a C2D to a C2Q was huge; not perfectly linear scaling, but a large increase.

Hotdog on September 4, 2007 6:15 AM

Somebody should compare dual vs quad core performance advantages using development tools such as Visual Studio 2005, SQL Server 2005 (Analysis, Reporting, etc., Services) and running multiple instances of Virtual Machines (VMware and/or Virtual PC). This is more meaningful to many of us who use computers for a living... yes we do play games and encode audio/video but we spend more time developing applications.

I myself am wondering whether there is any advantage in going quad core using these dev tools... or whether I am just wasting electricity (quad cores are rated 135 watts vs 65 watts for dual cores) generating heat by running one.

cyclo on September 4, 2007 6:23 AM

There are also multithreaded applications designed to run certain aspects of a program on a certain core; that should significantly increase performance, IMO.

Danny on September 4, 2007 6:28 AM

The newer game engines (e.g. Crysis) are already starting to utilize quad cores so I would add games to the list.

Josh on September 4, 2007 6:31 AM

Real world applications would be Nikon Capture NX on both XP and Mac and the Canon equivalent, IrfanView and Bibble transcoding and tweaking directories of 12 MP files, using something like MS Publisher to add and delete pages. On a Mac run Tiger and XP at the same time.

Another stress test would be to have all these apps open, plus Firefox with a dozen tabs, throw in QuickBooks, Windows Explorer, a couple of large spreadsheets, and burn a DVD.

I'm looking to replace a G4 Mac and a 3 year old XP box. Should I get a Mac mini, or go for an Xserve or Mac Pro?

Help me, folks.

Mark on September 4, 2007 6:33 AM

Compiling a C++ project in Visual Studio 2005 with a Quadcore requires to use Incredibuild, otherwise 3 CPUs sit idle for 80% of the time.

Visual Studio's C++ compiler doesn't scale out-of-the-box for projects with a lot of dependencies, because building projects in parallel doesn't fit here. But by adding the undocumented /MP(X) compiler option under "advanced command-line options" there is a multi-CPU performance gain even at the project level:
/MP2 for dual-core, /MP4 for quad-core

It utilizes 100% of all available CPUs over here, so the statement that the CPUs sit idle is not correct, at least with VC++ 8.0.


_tobi on September 4, 2007 6:37 AM

As Mark already commented on 12:58 AM: the last 4 entries in the XBit comparison have their sign reversed -- Excel is actually 63% SLOWER on 4 cores.

Plus, while I understand the reasoning behind comparing 2.4GHz to 3.0GHz, I find it worth noting that in many cases in which the quad performs worse, it's still performing better than the CPU ratio would predict.

Hex Err on September 4, 2007 6:38 AM

Massive parallelism is coming. What will drive it is robotics, and the need for whatever the desktop evolves into to program and perhaps control it.
In ten years machines with hundreds, perhaps even thousands of CPUs will be either on the drawing boards or in production, designed specifically for this application. The massively parallel revolution is coming, and it's coming in a big way.
Massively parallel machines are probably the only way to simulate cognitive functions.
Even at the level of an insect's cognitive abilities, today's supercomputers still fall far short in comparative processing power. In order to build robotic applications that are truly useful, they will have to be at least as smart as your typical insect, and massively parallel machines at the micro level are the only way to achieve this goal.

Mac on September 4, 2007 6:56 AM

I run lots of apps at the same time, not being on the same core is very helpful to me and improves latency. Writing for a single core isn't that bad.

Deathbyfire on September 4, 2007 7:00 AM

One of the tests I'm personally interested in (and which is always missing) is using a sequencer (Cubase, Ableton, Logic, etc.) and running VST plugins. For instance, something like Native Instruments' Massive totally bogs down my old (XP2400+) CPU if it's set to high quality mode, and the number of instances playing simultaneously or doing neat tricks like convolution reverb would be a good workout. With the software studio, lots more people are making music.

Today I'll get a Quad system, so it's interesting to see how the workload is divided; simultaneous audio streams should be parallelized rather easily.

Rob Janssen on September 4, 2007 7:17 AM

What about increasing the cache size (as mentioned without detail by others above), increasing the register set, or adding special-purpose instructions such as matrix manipulations?

On a related note, has anyone ever done a post-mortem on the RISC vs. CISC wars? Lessons learned?

Michelle on September 4, 2007 7:21 AM

"Thanks for that! What about servers? web applications, database... Will quad cores systems add benefit there?"

Pardon if someone's already covered this, but applications that can handle more simultaneous threads of execution will benefit, otherwise not. A database I use with my day job, Progress, can start up multiple server processes. The last several places I've worked have had quad-cpu machines, and the database will cheerfully spawn multiple servers to handle user requests, and typically use 3 or 4 of those cpus. A multithreaded web server would probably see benefits, for the same reason.

Rick C on September 4, 2007 7:30 AM

It won't make that much of a difference on current systems, but that's because games specifically are normally not very multithreaded. This is changing.

Every engine that is designed around the PS3 or XBOX 360 will most likely feature a very multithreaded design that will benefit significantly from a quad-core PC.

So I wouldn't buy a quad core now, but I'd expect those utilization numbers to change over the next year or two.

Yrro on September 4, 2007 7:33 AM

I can't really see a "negative" to getting a quad-core processor beyond price, and I think it's a poor argument. The difference in price between clock-speed and core number is negligible, as is the performance difference.

Having recently had to purchase an "emergency" replacement for the home computer, I had "no choice" but to spring for a quad-core (the Q6600 mentioned above) from Gateway because the gross price of the system was phenomenally lower than anything I could have assembled by myself ($1000 flat for a complete multimedia PC with decent RAM, video and hard drive components!)

I'm counting on the fact that this PC will be in use in three years, at which time more mainstream software should leverage multi-core.

Rick Cabral on September 4, 2007 7:51 AM

As is the case for so many other things with me, it all comes down to actual performance. I love reading benchmarks and watching the wars that start between fanboys/girls over the accuracy of the results and the ensuing "My hardware is better than your hardware" mud-slinging.

The way I see it, your average Joe (at this point) doesn't need a quad-core system. We have a situation where the hardware in question is actually ahead of the software being developed to use that hardware. At this point, I can't think of a single scenario where an average computer user would need that much power.

Is this -really- a problem, though? If we, as developers, know that these types of processors are available, then why not write to take advantage of that power? Let's face it...single-core CPUs are on the way out, unless clock speeds start improving significantly with the next generations of CPUs.

Back to my starting comment, though. I've owned my share of machines in my relatively short lifetime, and I've done development on every one of them. As it stands now, I wouldn't dream of having anything less than a dual-core, and the next machine I plan on building will feature a quad-core. It's not that I need that power -right now-, but when the time comes that I will need it, it will already be there.

My old machine (which at this point, is essentially my test machine), has a 2.0 GHz AMD 2000+. I installed the VS Orcas Beta this weekend, and it runs like a pig through molasses. I know if I put it on my brother's dual-core 2.0 it would run better. And if I were to install it on a quad-core, then it would run even better than that, as well as the 40,000 other things I'm doing while hammering on my keyboard. Moral of the story? There -are- people who can take advantage of that technology who exist, so why not harness it?

Benchmarks are helpful, but not what I base my hardware purchases on. Now, gimmie one of them quad-cores. :)

James on September 4, 2007 8:15 AM

Core Unaffinity:

Some multicore processors have a shared L2 cache, while some keep them separate for each core. Skizz's idea of a "system based on small chunks of work which are given to whichever processor is free" (Skizz on September 4, 2007 01:24 AM) is fine when there is a shared cache. Otherwise, bouncing a thread's stack and active data among the cores' caches will waste a lot of cycles.

Running multiple applications:
Who runs two processes?

Whether with Windows Task Manager (Ctrl-Alt-Del) or ps -ef on Linux, check how many processes are running; many of them have multiple threads.
Typically, Windows desktop users are probably running 50-100 processes and 500+ threads.

Affinity:

Unless you force core affinity, I find that Windows does not do a good job of keeping a CPU-intensive process on one core (even without any Windows calls in the intense loop).
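Forcing affinity does not have to go through Task Manager. Here is a minimal sketch, assuming a Linux box where Python exposes os.sched_setaffinity (it is not available on Windows, where Task Manager's "Set Affinity" or the Win32 SetProcessAffinityMask call plays the same role). The helper name pin_to_core is made up for illustration:

```python
import os


def pin_to_core(core):
    """Pin the calling process to one core and return the resulting affinity set.

    Assumption: the platform provides os.sched_setaffinity
    (Linux does; Windows and macOS do not).
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is not available on this platform")

    allowed = os.sched_getaffinity(0)  # pid 0 means "the calling process"
    if core not in allowed:
        raise ValueError(f"core {core} is not in the allowed set {sorted(allowed)}")

    # From here on the scheduler may only run this process on the chosen core.
    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)
```

Calling pin_to_core with a core number from the allowed set before a CPU-heavy loop should keep that loop from hopping between cores, at the price of not being able to use any other core that happens to be idle.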

Power Savings:

In the future, when the HW and OS work together to shut down cores that are not needed that microsecond, multicore will be very helpful in lowering the average power consumption of the processor.

Responsiveness:

I have had a Pentium D machine for two years, and have been very happy with it. It stays very responsive even when doing interactive work while there is a CPU intensive application running, e.g. MP3 encoding, audio filtering, etc.

See more about the multicore topic at http://www.2cpu.com/

David

David on September 4, 2007 8:31 AM

Maybe we'll measure performance in cores and not Hz in a not so distant future. "I have a 2 MegaCore computer, what do you have?"

Adam on September 4, 2007 8:32 AM

All these benchmarks seem to focus on running a single application, which is hardly ever the case! I usually have at least 4 applications open (web browser, email, IM, download manager etc), not to mention all the OS processes that run in the background.

Surely these different applications can be run on different cores? While the performance of an individual application may not be improved by additional cores surely the performance of the whole system will be?

Ben on September 4, 2007 8:32 AM

As others pointed out, single applications may be slow in handling more than two processors at once (which is what a dual core chip really is). That will depend upon the software writers taking advantage of multiple core processes and using threads. However, that doesn't mean there isn't improvement if multiple applications are running.

One of the biggest pains for a developer is building an application in parallel. I know that XCode and gcc on Linux can take advantage of quad core systems, and it may simply take Windows a while to catch up. Visual Studio 2008 "Orcas" will be able to handle parallel builds, and probably handle quad cores.

As for games, as more and more people start using quad core systems, games will be rewritten to take advantage of them. Rendering engines certainly could be optimized to allow more than a single object to render at once (think of the Cell processor). I am not too familiar with the gaming environment, but I expect that most games will start to take advantage of the new hardware within a year.

As one of the department heads once told me, "The Hardware Fairy only comes once every few years, so always overspec what you need because you may be stuck with that system for five years." You may be right now that today's Windows software may be unable to take advantage of quad core systems, but what about the next twelve months?

I suspect that Visual Studio 2008 "Orcas" will be able to once it finally comes out, and that most games will quickly put out newer revisions that will take advantage of the quad cores. There may even be a service pack for Vista that will take better advantage of quad cores in the next 12 months. I personally would opt for a quad core system based upon my experience with software and operating systems. I suspect that even if there is no improvement in speed now, there certainly will be within six to twelve months.

David on September 4, 2007 8:35 AM

Mark posted:
'Your bottom four comparison percentages seem to have the wrong "polarity".'
---

Actually, if you look at the source article, the bottom 4 comparisons are measuring run-time in seconds (lower is better), whereas some of the other comparisons are measuring speed (or some other quantity such as FPS, where higher is better).

So the relative percentages are correct, but the numbers reproduced in this blog post lack important contextual information - the actual quantities being measured, and how to interpret them (i.e. which is better - higher numbers or lower numbers).

Will on September 4, 2007 8:59 AM

You are way off target here.

The reason things like games don't use the quad core is that most of them were written before quad cores. Most of them don't easily scale to more cores so you'll have to wait for the games to catch up.

There also is the very real issue of being able to do more things at once.

Someone asked how you can watch a movie and do other things--quite easily. Nothing says that what else you are doing needs a human. There are plenty of things you could leave running while you're watching that movie.

Loren Pechtel on September 4, 2007 9:54 AM

In my previous comment, I should've clarified that the given relative percentages are correct as long as you interpret a positive value to mean "better performance", and a negative value to mean "worse performance". Again, the problem is that the raw numbers, with no units, are meaningless - you have to go back to the original article to see whether "higher" or "lower" means "better performance" for any given comparison.

Will on September 4, 2007 10:07 AM

I wonder, what about Stackless Python?..

Dunno, does the stackless modification remove the GIL? If it doesn't, then the tasklets still run in a single thread.

Erlang fanboys think their software uses multiple cores, but in fact you need to have multiple Erlang interpreter processes running to do that.

Failed troll is failed, the Erlang runtime is natively multithreaded since the release of R11B-0 in May 2006. Since that time, the runtime automatically spawns a thread per core (default, you can ask for more or less) and dynamically maps your erlang processes on the OS threads.

You should update your knowledge before trying your trolls.

Masklinn on September 4, 2007 10:16 AM

You didn't review any web browsers! The majority of the time that I'm on my computer, I'm in Firefox.

I often open 5, 15, 30 or more tabs at once, after which Firefox becomes extremely unresponsive for several seconds. By far the most slowdown and unresponsiveness I ever see is in Firefox, from opening lots of tabs at once.

If more cores can help this situation, it's a definite plus for me!

James Justin Harrell on September 4, 2007 10:21 AM

"Physics" is a singular noun. "The physics don't work" don't work with me.

Howard on September 4, 2007 10:41 AM

As other people have pointed out, the build system for VC++ 2008 will build multiple targets simultaneously. Here's an article on Valve using multiple cores.
http://arstechnica.com/articles/paedia/cpu/valve-multicore.ars
executive summary:
Even if you don't do anything special, you get some benefit, because your subsystems often run in multiple threads.
The graphics subsystem can really benefit from multiple cores. Other subsystems less so.
A side benefit is that the rest of the game gets more CPU, so you can have a more complex AI.
The practical max right now is 4 CPUs. More than that, and you're running into memory starvation issues.

Mike Swaim on September 4, 2007 10:58 AM

The last 4 entries in the XBit comparison have their sign reversed -- Excel is actually 63% SLOWER on 4 cores

The numbers are correct; it's a bit confusing because some units are larger-better, others are smaller-better.

Jeff Atwood on September 4, 2007 11:15 AM

So I guess you would want a quad core for development and not so much for gaming.

I guess in scenarios where the extra cores are getting pinned to a virtual machine.

I'm sure the payoff for multiple cores has diminishing returns in a normal desktop scenario.

brian on September 4, 2007 11:23 AM

Ok, now try running an Effect in Paint.NET (which is heavily optimized for "N" cores)!

http://www.getpaint.net/misc/pdn_4x_faster.png

Rick Brewster on September 4, 2007 11:23 AM

Jeff wrote:"The numbers are correct; it's a bit confusing because some units are larger-better, others are smaller-better."

Maybe you should add the units to the numbers in your post, otherwise people have no way of knowing what those numbers mean without looking at the original article.

Will on September 4, 2007 11:29 AM

Even without intelligently-threaded applications, I think most users can make good use of dual-core desktops simply due to multitasking. Quad-cores definitely take more work to utilize, but certain classes of users could definitely use this... video editors, web developers, and other types of programmers and creative professionals often have several distinct intense processes at once.

Gabe da Silveira on September 4, 2007 11:35 AM

It's only a matter of time until all applications are written to take advantage of multiple cores, in which case we'll probably see a leap in software engineering. Maybe the Windows of the future won't take an eternity to boot anymore?

I should think you could occupy even more than 4 cores in game programming.

It's going to take some serious talent to harness the multitude of threads that will be available in the future and then apply them to multiple GPUs. Of course most people won't have SLI or Quad-SLI setups, but I think someone somewhere should write a "Super Game" (read up on your Nietzsche, kids) to demonstrate the sheer power available without worrying about scalability across single cores. I bet John Carmack would do it, he already has enough Ferraris so he doesn't have to worry about sales volume.

Mattkins on September 4, 2007 11:41 AM

AMD issues-

AMD vs Intel - IIRC AMD "Barcelona" is a "true" quad core, while current Intel "quad" cores are two dual-core units in one package, typical Intel/Microsoft-style marketing-over-technical trickery.

Also, AMD quad core opterons will support nested page tables, making virtualization perform significantly better. Thus, if you want to play with virtualization, getting a quad core AMD might be your best short term option.

Zonky Zizzymouse on September 4, 2007 11:47 AM

Anybody could test 7zip? Would it benefit from more cores?

MaS on September 4, 2007 12:36 PM

For something really controversial, why not do the same test with browsers? :)

Monkey on September 4, 2007 12:56 PM

I test my own OS and play around with other OS's in emulators all the time. I'm only on a single-core CPU at the moment (to upgrade means new everything, pretty much) and a friend with a dual-core allowed me to try some emulation on his system.

First thing I noticed was the difference moving the emulator's process onto the second core (via task manager's "Set Affinity" option) made to the running of the rest of the system. Note that this isn't Virtual PC (which can run at very low CPU usage), these are emulators such as Bochs, QEMU and PearPC, all of which enjoy eating up valuable CPU time.

Why look at today's programs performance with tomorrow's cpu setups?

Agreed.

pcmattman on September 4, 2007 1:04 PM

It's almost pointless to worry about multicore performance in standard apps at this stage of the game. One can't just go back and "add in" support for multicore in any significant application, beyond trivial stuff that can be stuck in a background thread (which should be done already, for UI interactivity).

To really see the benefit of multicore, applications will not only have to be largely rewritten, but devs will have to start thinking in a completely different way. Not just in a "how can we thread this algorithm" way, but in a "what should we pre-emptively compute just in case the user wants it" way. The latter is where multicore starts to make sense, but it's much harder than the former (which is already pretty hard). As far as I know, the only people really thinking about this are at Microsoft Research...

Regardless, massive redesigns won't be justifiable to the shareholders until everyone *has* the multicore systems. It's chicken-and-egg. Hence, it's our job as forward-thinking developers to convince everyone we know to buy N-core over (N/2)-core, so that we'll have more cores to play with.

Remember, more cores == more awesome. For the children.


RMS on September 4, 2007 1:08 PM

I'm actually working on a single, dual, quad MSBUILD benchmark now...soon.

Scott Hanselman on September 4, 2007 1:10 PM

Regardless, massive redesigns won't be justifiable to the shareholders until everyone *has* the multicore systems.

In an ideal world.

In reality you start redesigning as soon as one of the shareholders has one of the multicore systems.

Stefan Scholl on September 4, 2007 1:25 PM

Here in the Netherlands, the price for a quadcore 6600 is 50 euros higher than the dual core 6600... it's really a no-brainer.

Frans Bouma on September 4, 2007 1:33 PM

I'm very surprised that the Erlang fan boys haven't jumped in here yet.

They don't care, their software makes use of multiple cores just fine already.

Masklinn on September 4, 2007 1:38 PM

I don't consider performance tests run in a vacuum to be a great measure of true performance.

I would think running the tests while listening to music, surfing performance tuning sites in IE, Outlook periodically checking for mail, an IM chat app running, a sidebar full of gadgets, and a task bar of the normal bloatware (Adobe or Steam) would be a better representation of the performance gains.

brian on September 4, 2007 1:42 PM

Even Anandtech has a hard time coming up with realistic multitasking benchmarks that stress a quad-core machine:

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2879p=12

--
When we were trying to think up new multitasking benchmarks to truly stress Kentsfield and Quad FX platforms we kept running into these interesting but fairly out-there scenarios that did a great job of stressing our test beds, but a terrible job and making a case for how you could use quad-core today.
--

Their answer? H.264 blu-ray video playback while "doing something else". Lame. How do you watch a movie and do something else at the same time?

On the other hand, doing a lot of rendering or encoding in the background makes sense. But I'd argue this is an extraordinarily rare activity for mainstream computer users. Perhaps if video editing really takes off, and everyone's a star on YouTube with their own show..

Jeff Atwood on September 4, 2007 1:51 PM

Your bottom four comparison percentages seem to have the wrong "polarity".

Mark on September 4, 2007 1:58 PM

Compiling a C++ project in Visual Studio 2005 with a Quadcore requires to use Incredibuild, otherwise 3 CPUs sit idle for 80% of the time.

Going from a 7200 to a 10000 RPM boot disk has the same effect as going from a Dualcore to a Quadcore machine (from ~17 minutes down to ~13 minutes). So you will probably benefit the same or even more from a faster Dualcore AND a 10k RPM hard drive than from a Quadcore alone. Of course, if you can have both, go for it. But unless you use Incredibuild with Visual Studio, even if only as a standalone solution, you just won't get very good parallelism out of Visual Studio (C++ compiler) alone.

steffenj on September 4, 2007 1:58 PM

We've seen dual versus quads. But how about a dual processor rig with the dual core processors on it versus a single quad-core platform?
I'd guess it will be a bit slower due to the fact that the L1 cache is not shared between all the cores. But also I'd guess it will be faster when more applications start since the 2 cpus have more I/O bandwidth /controllers/magic cpu stuff.

smsorin on September 5, 2007 4:02 AM

I see the benefit when I'm actually doing a couple of things at the same time, such as compiling something with -j3 on 3 cores and surfing the net or using an editor on the other core.

Chris G on September 5, 2007 4:43 AM

Why is scalability to N cores something that should concern the application developer at all? Multithreading goes hand in hand with multitasking, but it has to be supported at the OS level. The most responsive GUI I've ever used---making the Windows and pre-OS X Mac experiences seem prehistoric---was on a dual 66-MHz PowerPC with 16 Mb of RAM, running BeOS. Not coincidentally, that was a (so-called) "pervasively multithreaded" OS, where applications, including the GUI, were encouraged to spawn as many threads as necessary, which were then distributed across CPUs as needed. If you ran out of memory or cycles, performance degradation was gradual---no more of the freezes and skipping mouse pointers that still plague an overloaded Windows box. My point is that you couldn't help but write an app on BeOS that made efficient use of however many CPUs there were. For a programmer to have to think in terms of "how can I make use of N cores" is premature optimization. (Of course, a lot of people would first have to rid themselves of the mentality that their code is the only thing the user's ever going to run, so playing nice with the OS, drivers, or other apps is irrelevant.)

Alex Chamberlain on September 5, 2007 4:58 AM

re: Foxyshadis: "In fact, the difference between a fast single core and a slow dual core is pretty low (but visible) in this kind of light usage scenario."

There's a difference between queuing theory and reality here. Simple queuing theory assumes that a processor services a request until it is done, without interruption. It assumes that queue management and assignment of work to a processor is not done by one of the processors. Reality includes I/O and timeslice completion interrupts and thread dispatching managed by a kernel dispatcher that also runs on a processor.

Given this reality, and the fact that each core in a dual core processor is NOT half as fast as an equal price uniprocessor, dual-core will make most users happier.

In scientific applications on RISC processors, it is common to run the compute on one core and the I/O on the other.

See the Queue.xls at http://forum.johnson.cornell.edu/faculty/mcclain/Software/Software.htm

DAKra

DAKra on September 5, 2007 7:21 AM

Instead of speeding up a single-MP3 encoding, why not have the application process 4 different files at once.

Yea, that also works great for mass PNG crunching (via pngout or pngcrunch and the like). With a single core this can take hours or even days. But then again it isn't much of an issue since speed isn't really critical.

With some book-keeping you can let it run every now and then with idle priority. Well, that's what I'm doing on my single core machine.
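The "one file per core" idea can be sketched in a few lines. This is only an illustration under stated assumptions: encode_file is a hypothetical stand-in for whatever single-threaded encoder or PNG cruncher you actually run, and encode_all is a made-up helper name. A process pool is the default because CPU-bound Python work does not spread across cores with threads alone:

```python
import os
from concurrent.futures import ProcessPoolExecutor


def encode_file(path):
    """Hypothetical stand-in for one single-threaded encode job.

    A real version would run the encoder on `path`; this one only does
    some placeholder arithmetic so the sketch stays self-contained.
    """
    checksum = sum(ord(ch) for ch in path)
    return path, checksum


def encode_all(paths, workers=None, executor_cls=ProcessPoolExecutor):
    """Run encode_file over all paths, up to `workers` jobs at a time.

    Results come back in the same order as `paths`.
    """
    workers = workers or os.cpu_count() or 1
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(encode_file, paths))
```

With four files and four cores, encode_all(files, workers=4) keeps every core busy even though no single encode job is any faster. On platforms that start worker processes by spawning, the call belongs under an if __name__ == "__main__": guard.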

Jos Hirth on September 5, 2007 8:21 AM

Basically, you've ascertained that not many applications currently take much advantage of more than two cpu's. This is a surprise, how?

Quad core CPU's have maybe been on the market a year, if that, and cost a good bit over $1K for a good portion of that time. The market saturation for quad core PC's is probably still a fraction of 1%.

Just because you write a multithreaded program, that doesn't mean it will simply, automatically, and efficiently use all available cores. In many applications, writing code to efficiently use 1-4 cores is more effort than optimizing for just 1-2 cores. And given the low market presence yet for quad-cores, it probably hasn't been a worthwhile cost.

As the quads get cheaper and more commonplace (and eventually 8+ core cpu's appear) more and more software will take advantage of the extra CPU power.

This is no different than, say, putting 2+GB of RAM in a PC from 4 years ago: it might help a few apps, but it wouldn't make much difference for most software since it wasn't written to need/use it.
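A rough way to put numbers on "doesn't automatically use all available cores" is Amdahl's law, which nobody in this thread names, so treat it as my own framing. The sketch below assumes a program where a fixed fraction of the work parallelizes perfectly and the rest is serial; the function names and the 0.9 figure are illustrative, not measured from any of the benchmarks:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when `parallel_fraction` of the work scales perfectly."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def gain_2_to_4(parallel_fraction):
    """How much faster 4 cores are than 2 cores for the same program."""
    return amdahl_speedup(parallel_fraction, 4) / amdahl_speedup(parallel_fraction, 2)


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9, 1.0):
        print(f"parallel fraction {p:.1f}: 2 -> 4 cores gives {gain_2_to_4(p):.2f}x")
```

Under this model a program has to be about 90% parallel before going from 2 to 4 cores buys roughly 1.7x, and a fully serial one gains nothing, which is the same spread the benchmark table shows.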

LintMan on September 5, 2007 9:26 AM

I see usefulness in large numbers of cores when a single processor is handling many many tasks at once. But this is generally rare (in the name of making a more failsafe system, it's better to give individual devices their own processors than to have a big mainframe for each company/home/etc)

But I need to ask, why aren't they specializing these cores to specific tasks? Adding a specialized GPU can yield huge results; why not try to do the same for CPUs?

Jim Robert on September 5, 2007 10:20 AM

Very good tests, even if the results are unexpected, you've got to start somewhere.

For a more practical use of the quad core cpu's, how about building your own supercomputer ;)

http://www.clustermonkey.net//content/view/211/1/

Wesley W on September 5, 2007 12:28 PM

If you can't terminate an application that eats all CPU (rather: will take all it gets), then that is a bug in the OS, not a lack of cores.

Andreas Krey on September 5, 2007 1:32 PM

"All these benchmarks seem to focus on running a single application, which is hardly ever the case! I usually have at least 4 applications open (web browser, email, IM, download manager etc), not to mention all the OS processes that run in the background."

You would never be able to tell whether you were on a 2-core or 80-core box, trust me. In fact, the difference between a fast single core and a slow dual core is pretty low (but visible) in this kind of light usage scenario. None of the apps are ever using cpu at the same time, and barely ever use it during the course of their run; if they ever do contend, it's always for memory or disk instead.

"If more cores can help this situation, it's a definite plus for me!"

It doesn't. This is supposed to be addressed in 3.0, if it doesn't end up on the chopping block, but in 2.0 most of the tab session bookkeeping is serialized through a single thread.

"Photoshop is generally a bandwidth bound application. You need to make sure that you perform operations which are not bandwidth but compute bound to see the effects of more cores."

That's creating a synthetic benchmark; fact is Photoshop is just plain I/O bound nearly all the time and fiddling with the benchmark to remove that is disingenuous.

Note that this is also true for encoding: You can scale amazingly to 4-8 cpus, but then you hit a brick wall somewhere in there because the I/O just plain can't keep up, the disks, bus, or network are just flooded. In the specific case of video, anything that uses avisynth as a backend will be hobbled by its single-threaded execution unless it moves to experimental new versions, this especially includes the popular gordian knot/Auto GK apps, as well as newer Staxrip/Megui. Mencoder GUIs are more multi-threaded but still not optimal.

These benchmarks are good for at least figuring out which one you should buy, if you're torn between the two. Massively multicore is going to pave the way for event-driven separable software models and slowly change everything, I expect, but we're still years away from seeing that happen. The language experiments going on now are pretty heartening, I hope to see more soon. :D

Foxyshadis on September 5, 2007 1:53 PM

Geoff: "Please don't use red and green text that are otherwise identical (same saturation, value, font, and so on), to differentiate positive and negative results. Yes, there is a minus sign in front of the negative results, but this is slow for the brain to latch on to, especially since the font is rather thin.

There are many other ways to visually separate good and bad results, almost all of which are better than just red and green and no other differentiation. I've seen some beautiful and effective choices, though many tend to bias the reader (bolding the bad results, for instance). Personally I find just replacing the green with blue to be quite effective."

Personally, I had no problem comprehending the difference between the values. Traditionally, red ink is used to indicate negative, especially in the financial world (you've surely heard the phrase "in the red", haven't you?).

The problem I have is with people telling others how they should design *their own* websites. If you don't like the design, don't read the post. It doesn't matter how positives/negatives are differentiated; if you are unable to figure out the difference, you probably won't comprehend the post well enough to get any value from it anyway.

KenW on September 6, 2007 9:09 AM

KenW, did it occur to you that Geoff might be red-green colorblind?

Alex Chamberlain on September 6, 2007 10:15 AM

I work at a big game development studio, and you can trust me when I say one of the most important things we do for speed is to move things off the main thread. On console, our code already has to be highly parallelized to make use of what's available. Of course the same thing goes for PC game development, as quad cores (and soon, more) are becoming the standard for high-end gaming PC's.

We use IncrediBuild for parallelized distributed builds. It speeds up the development process a ton.

Johan on September 8, 2007 1:51 PM

1. it's been said that supercomputers (multi-multi-multi-multi-core systems) are tools for transforming cpu-bound problems into io-bound problems.
2. there's this part of maths called Mass Service Systems that basically proves that having 1 processor of speed nX is always better than having n processors of speed X, for any n > 1. therefore, if there's a way to increase single core speed, manufacturers should always take it... the problem is, it's easier to just glue several cores together.

Baczek on September 10, 2007 3:19 AM
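
Baczek's second point can be illustrated with a small calculation from what is usually called queueing theory in English. The sketch below is not from the comment itself: it assumes the textbook M/M/1 and M/M/n models (Poisson arrivals, exponentially distributed service times, one shared queue), and the function names are made up for illustration. Under those assumptions it compares the mean time a job spends in the system with one core of speed n·mu against n cores of speed mu each.

```python
from math import factorial

def erlang_c(n, a):
    """Probability that an arriving job has to wait in an M/M/n queue.

    n is the number of servers and a = lam / mu is the offered load (must be < n).
    """
    top = a**n / factorial(n) * n / (n - a)
    bottom = sum(a**k / factorial(k) for k in range(n)) + top
    return top / bottom

def mean_response_one_fast(lam, mu, n):
    """Mean time in system for ONE server of speed n * mu (M/M/1 model)."""
    return 1.0 / (n * mu - lam)

def mean_response_n_slow(lam, mu, n):
    """Mean time in system for n servers of speed mu each (M/M/n model).

    Mean service time plus mean time spent waiting in the queue.
    """
    a = lam / mu
    return 1.0 / mu + erlang_c(n, a) / (n * mu - lam)
```

For example, with an arrival rate of 1 job per second, cores that each finish 1 job per second, and n = 2, the single fast core gives a mean response time of 1 second, while the two slower cores give about 1.33 seconds. The model says nothing about cost, heat, or how hard the faster core is to build, which is the practical catch the comment ends on.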

Do any rendering apps, such as 3D Studio MAX or V-Ray, utilise two physical dual-core processors? I'm thinking of buying a new machine with the Intel dual-core 5160, but I'm wondering whether it is worth getting the second CPU.

Thanks in advance for any advice!

Gavin on September 11, 2007 12:56 PM

How about testing audio apps such as Pro Tools, Cubase SX 3, or Cubase 4, or Cubase and Reason 3.0 connected through ReWire? Those are great programs to test with because they use so much CPU power: not just the application itself, but also several instances of plug-ins (compressors, synthesizers, distortions, etc.) running across multiple tracks of audio and MIDI. That would be a great test. Let me know if you ever do a test like that. I want to know what the best is.

Khhryst on September 12, 2007 5:44 AM

"Whereas with games, I can't think of anything that could utilize the spare CPU cores...
I wonder if it's even remotely possible, but: to use extra cores as 'software graphics cards'. Since graphics are the only thing that really needs lots more processing power in games, it'd make sense to, say, divide the screen up between them, and use the remaining two cores to process extra effects on their area of the screen. The biggest problem being that CPUs aren't as fast at drawing stuff to the screen as graphics cards are..."

There is a related example here. Systems that run the PhysX engine without a PhysX PCI card installed (software rendering) take a huge hit in performance in PhysX-supported games. One such upcoming title is Fury, though mind you, it is still in beta. When running this game without a PhysX PCI card installed, most rendering, if not all, is done on your CPU. One way to test this theory on a dual-core (or possibly quad-core or larger) system is to press Ctrl-Alt-Del, open Task Manager, and set the game process's affinity to only one CPU. You will notice that your framerate drops to roughly half of what it is with both processors. Hands down, PhysX-based applications put a huge load on a CPU due to software rendering. If I could get my hands on a quad-core system, it'd be great to test this theory further. However, take into consideration that I'm assuming Fury is coded to spread this work across cores.

Josh Vining on September 14, 2007 2:50 AM
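
The affinity experiment Josh describes can also be scripted rather than clicked through in Task Manager. The sketch below is only an illustration and makes some assumptions: the helper names are hypothetical, and the pinning call it uses, `os.sched_setaffinity`, exists in Python's standard library only on Linux, so on Windows the function simply reports that it cannot pin. The first helper shows the bitmask form in which an affinity set is commonly expressed, where bit i stands for CPU i.

```python
import os

def affinity_mask(cpus):
    """Build a bitmask from CPU indices: bit i set means CPU i is allowed."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask

def pin_current_process(cpus):
    """Restrict the current process to the given CPU indices.

    Returns the previously allowed set so the caller can restore it,
    or None on platforms where os.sched_setaffinity is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # assumption: no standard-library fallback is attempted here
    previous = os.sched_getaffinity(0)  # pid 0 means "this process"
    os.sched_setaffinity(0, cpus)
    return previous
```

To repeat the comparison, one would time the same CPU-bound, multithreaded workload twice: once after `pin_current_process({0})` and once with the original set restored. A framerate or throughput that roughly halves when pinned to one core suggests the workload really was using both.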

Thank you for your nice article.

Abel on September 28, 2007 2:19 AM


The comments to this entry are closed.

Content (c) 2012 . Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.