I <3 Steve McConnell*
Coding Horror
programming and human factors
by Jeff Atwood


16 posts from July 2008

July 30, 2008

Alpha, Beta, and Sometimes Gamma

As we begin the private beta for Stack Overflow later this week, I wondered: where do the software terms alpha and beta come from? And why don't we ever use gamma?

alpha character beta character

Alpha and Beta are the first two characters of the Greek alphabet. Presumably these characters were chosen because they refer to the first and second rounds of software testing, respectively.

But where did these terms originate? There's an uncited Wikipedia section that claims the alpha and beta monikers came, as did so many other things, from the golden days of IBM:

The term beta test comes from an IBM hardware product test convention, dating back to punched card tabulating and sorting machines. Hardware first went through an alpha test for preliminary functionality and small scale manufacturing feasibility. Then came a beta test, by people or groups other than the developers, to verify that the hardware correctly performed the functions it was supposed to, and that it could be manufactured at scales necessary for the market. And finally, a c test to verify final safety. With the advent of programmable computers and the first shareable software programs, IBM used the same terminology for testing software. As other companies began developing software for their own use, and for distribution to others, the terminology stuck -- and is now part of our common vocabulary.

Based on the software release lifecycle page, and my personal experience, here's how I'd characterize each phase of software development:

  1. Pre-Alpha

    The software is still under active development and not feature complete or ready for consumption by anyone other than software developers. There may be milestones during the pre-alpha which deliver specific sets of functionality, and nightly builds for other developers or users who are comfortable living on the absolute bleeding edge.

  2. Alpha

    The software is complete enough for internal testing. This is typically done by people other than the software engineers who wrote it, but still within the same organization or community that developed the software.

  3. Beta

    The software is complete enough for external testing -- that is, by groups outside the organization or community that developed the software. Beta software is usually feature complete, but may have known limitations or bugs. Betas are either closed (private) and limited to a specific set of users, or they can be open to the general public.

  4. Release Candidate (aka gamma or delta)

    The software is almost ready for final release. No feature development or enhancement of the software is undertaken; tightly scoped bug fixes are the only code you're allowed to write in this phase, and even then only for the most heinous and debilitating of bugs. One of the most experienced software developers I ever worked with characterized the release candidate development phase thusly: "does this bug kill small children?"

  5. Gold

    The software is finished -- and by finished, we mean there are no show-stopping, little-children-killing bugs in it. That we know of. There are probably numerous lower-priority bugs triaged into the next point release or service pack, as well.

These phases all sound perfectly familiar to me, although there are two clear trends:

  • The definition of beta grows more all-encompassing and elastic every year.
  • We are awfully eager to throw alpha quality code over the wall to external users and testers.

In the brave new world of Web 2.0, the alpha and beta designations don't mean quite the same things they used to. Perhaps the most troubling trend is the perpetual beta. So many websites stay in perpetual beta, it's almost become a running joke. Gmail, for example, is still in beta after over four years!

Although I've seen plenty of release candidates in my day, I've rarely seen a "gamma" or "delta". Apparently Flickr used it for a while in their logo, after heroically soldiering on from beta:

flickr: beta, gamma, love

"loves you" is certainly more fun than "gold", but I'm not sure it's ever the same as done. Maybe that's the way it should be.

Posted by Jeff Atwood    103 Comments

July 28, 2008

Is Money Useless to Open Source Projects?

In April I donated $5,000 of the ad revenue from this website to an open source .NET project. It was exciting to be able to inject some of the energy from this blog into the often-neglected .NET open source ecosystem.

As I mentioned at the time, I used a very hands off approach. While I did have some up-front criteria for the award (open source license, public source control, accepts outside source contributions) it's basically a no-strings grant.

The real money is being sent via wire transfer to Dario Solera, the ScrewTurn Wiki project coordinator. What's Dario going to do with this money? You'll have to ask him. That's not for me to decide. There are no strings attached to this money of any kind. I trust the judgment of a fellow programmer to run their project as they see fit.

When I said the project could do whatever they saw fit with the money, I meant it. Buy liquor and cigarettes, throw a huge party, play it on the ponies. I'm not kidding. As long as the project team believes it's a valid way to move their project forward, whatever they say goes. It's their project, and their grant.

I hadn't heard anything from Dario, and I was curious, so I followed up with him via email. He sent back this response:

The grant money is still untouched. It's not easy to use it. Website hosting fees are fully covered by ads and donations, and there are no other direct expenses to cover. I thought it would be cool to launch a small contest with prizes for the best plugins and/or themes, but that is not easy because of some laws we have here in Italy that render the handling of a contest quite complex.

What would you suggest?

I was crushingly disappointed to find out the $5,000 in grant money has been sitting in the bank for the last four months, totally unused. That's painful to hear, possibly the most painful of all outcomes. Why did we bother doing this if nothing changes?

My friend Jon Galloway warned me this might happen. I didn't believe him. But what other conclusion can I draw at this point? He was right:

Open Source is to Traditional Software as Terror Cells are to Large Standing Armies – if you gave a terrorist group a fighter jet, they wouldn't know what to do with it. Open source teams, and culture, have been developed such that they're almost money-agnostic. Open source projects run on time, not money. So, the way to convert that currency is through bounties and funded internships. Unfortunately, setting those up takes time, and since that's the element that's in short supply, we're back to square one.

I had hoped that $5,000 grant money would be converted into something that furthered an open source project -- perhaps something involving the community and garnering more code contributions. But apparently that's more difficult than anyone realized.

Jon offered these ideas:

  1. Can they turn the money over to a company or organization who is familiar with this kind of thing, like the Google Summer of Code, etc.?
  2. Often times, documentation and marketing are in really short supply. Could they just hire a technical writer and / or marketing expert with the $5k?
  3. SourceForge has a donations program in which people can make donations to pay developers. Maybe he can run the money through there?

I must admit I'm at a bit of a loss here. Do you have any ideas for how the ScrewTurn Wiki project can use their $5,000 open source grant effectively? If so, please share them in the comments here, or on the ScrewTurn forum -- in the Suggestions and Feature Requests area.

Even I'm not naive enough to suggest that money can solve every open source software problem. But I don't have a lot of time to contribute; I only have advertising revenue. I'm absolutely dumbfounded to learn that contributing money isn't an effective way to advance an open source project. Surely money can't be totally useless to open source projects... can it?

Posted by Jeff Atwood    269 Comments

July 26, 2008

Understanding The Hardware

I got a call from Rob Conery today asking for advice on building his own computer. Rob works for Microsoft, but lives in Hawaii. I'm not sure how he managed that, but being so far from the mothership apparently means he has the flexibility to spec his own PC. Being stuck in Hawaii is, I'm sure, a total bummer, dude.

Rob and I may disagree on pretty much everything from a coding perspective, but we can agree on one thing: we love computers. And what better way to celebrate that love than by building your own? It's not hard. This industry was built on the commodification of hardware. If you can snap together a Lego kit, you can build a computer.

Maybe this is a minority opinion, but I find understanding the hardware to be instructive for programmers. Peter Norvig -- now director of research at Google -- appears to concur.

Understand how the hardware affects what you do. Know how long it takes your computer to execute an instruction, fetch a word from memory (with and without a cache miss), transfer data over ethernet (or the internet), read consecutive words from disk, and seek to a new location on disk.

In my book, one of the best ways to understand the hardware is to get your hands dirty and put one together, including installing the OS, yourself. It's a shame Apple programmers can't do this, as their hardware has to be blessed by the Cupertino DRM gods. Or, you could build a frankenmac, though you'll run the risk of running a "patched" OS X indefinitely.

As Rob and I were talking about the philosophy of building your own development PC -- something I also discussed on a Hanselminutes podcast -- he said you know, you should blog this. But Rob -- I already have, many times over! Let's walk down the core list of components I recommended for Rob, and I'll explain my choices with links to the relevant blog posts I've made on that particular topic.

ASUS P5E Intel X38 motherboard ($225)

I'm a big triple monitor guy, so I insist on motherboards that are capable of accepting two video cards -- in other words, they have two x8 or x16 PCI Express card slots suitable for video cards. I also demand quiet from my PC, which means a motherboard with all passive cooling. Beyond that, I don't like to pay a lot for a fancy motherboard. After spending the last five years with motherboards packing scads of features I never end up using (two ethernet ports, anyone?), I've realized there are better ways to invest your money. People tend to respect ASUS as one of the largest and most established Taiwanese OEMs, so it's usually a safe choice. I'd go as far down on price on the motherboard as you can without losing whatever essential features you truly need. Save that money for the other parts.

Intel Core 2 Duo E8500 3.16 GHz CPU ($190)
Intel Core 2 Quad Q9300 2.5 GHz CPU ($270)

Ah, the eternal debate: dual versus quad. Despite what Intel's marketing weasels might want you to believe, clock speed still matters very much. Here's an example: SQL Server 2005 queries on my local box, a 3.5 GHz dual core, execute more than twice as fast as on our server, a 1.8 GHz eight core machine. Sadly, very few development environments parallelize well, with the notable exception of C++ compilers. Outside of a few niche activities, such as video encoding and professional 3D rendering, most computing tasks don't scale worth a damn beyond two cores. Yes, it's exciting to see those four graphs in Task Manager (and even I get a little giddy when I see sixty-four of 'em), but take a look at the cold, hard benchmark data and the contents of your wallet before letting that seductive 4 > 2 math hijack the rational parts of your brain.
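If you want to put rough numbers on the dual versus quad question, one back-of-the-envelope model is Amdahl's law. To be clear, that's my choice of model, not something the benchmarks above are based on. The sketch below assumes performance scales linearly with clock speed and that a fixed fraction of the work parallelizes perfectly. The class name, the method names, and the 30% parallel fraction are all illustrative assumptions, not measurements.

```java
// Back-of-the-envelope comparison of a fast dual core and a slower quad core.
// Assumptions (not from any benchmark): performance scales linearly with
// clock speed, and a fixed fraction of the workload parallelizes perfectly.
class CoreCountEstimate {

    // Amdahl's law: speedup over a single core when parallelFraction of the
    // work is spread evenly across the given number of cores.
    static double amdahlSpeedup(double parallelFraction, int cores) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / cores);
    }

    // Relative throughput: clock speed multiplied by the Amdahl speedup.
    static double relativeThroughput(double clockGhz, int cores, double parallelFraction) {
        return clockGhz * amdahlSpeedup(parallelFraction, cores);
    }

    public static void main(String[] args) {
        double parallelFraction = 0.30; // hypothetical: 30% of the work parallelizes
        double dual = relativeThroughput(3.16, 2, parallelFraction); // E8500 clock
        double quad = relativeThroughput(2.5, 4, parallelFraction);  // Q9300 clock
        System.out.printf("dual: %.2f, quad: %.2f%n", dual, quad);
    }
}
```

Under these simplified assumptions, the quad only pulls ahead once somewhere around 60% of the work parallelizes. Real workloads are messier than this, so treat it as a sanity check and not a benchmark.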

It's also smart to buy a little below the maximum, with the ultimate goal of upgrading to a whizzy-bang 4 GHz quad core CPU sometime in the future. One of the hidden value propositions in building your own PC is the ability to easily upgrade it later. CPU is one of the most obvious upgrade points where you want to intentionally underbuy a little. Give yourself some room for future upgrades. Until a quad costs the same as a dual at the same clock speed, my vote still goes to the fastest dual core you can afford.

Kingston 4GB (2 x 2GB) DDR2 800 x 2 ($156)

Memory is awesomely cheap. When it comes to memory, I like to buy a few notches above the cheapest stuff, and Kingston has been a consistently reliable brand for me at that pricing level. There's no reason to bother with anything under 8 GB these days. Don't get hung up on memory speed, though. Quantity is more important than a few extra ticks of speed. But don't take my word for it. As an experiment, Digit-Life cut the speed of memory in half, with a resulting overall average performance loss of merely three percent. By the time your system has to reach outside of the L1, L2, and possibly even L3 cache -- it's already so slow from the system's perspective as to be academic. Memory that is a few extra nanoseconds faster isn't going to make any difference. This is also why I specified the latest and greatest Intel CPUs with larger 6 MB L2 caches. Remember, kids, Caching Is Fundamental!

Western Digital VelociRaptor 300 GB 10,000 RPM Hard Drive ($290)

This is arguably the only indulgence on the list. The Velociraptor is an incredibly expensive drive, but it's also a rocket of a hard drive. I'm a big believer in the importance of disk speed to overall system performance, particularly for software developers. At least Scott Guthrie backs me up on this one. Trust me, you want a 10,000 RPM boot drive. Buy a slower large drive for your archiving needs. You want two drives, anyway; having two spindles will give you a lot of flexibility and also help your virtual machine performance immensely.

This new raptor model is the best of the series. It's much quieter, uses less power, generates less heat, and is by far the fastest -- embarrassingly fast. It's expensive, yes. I won't hold it against you if you decide to disregard this advice and go with a respectably fast, less expensive hard drive. But to me, it's all about putting the money where the most significant bottlenecks are, and considered in that light -- man, this thing is so worth it. As Storage Review said, "[its] single-user scores .. blow away those of every other [hdd]".

Radeon HD 4850 512MB video card ($155 after rebate)

Even if you're not a gamer, it's hard to ignore the charms of this amazing powerhouse of a video card. The brand new ATI 4850 delivers performance on par with the very fastest $500+ video card you can buy for a measly hundred and fifty bucks! Modern operating systems require video grunt, either for windowing effects or high-definition video playback. Beyond that, it's looking more and more like some highly parallelizable tasks may move to the GPU. Have you ever read stuff like "even the slowest GPU implementation was nearly 6 times faster than the best-performing CPU version"? Get used to reading statements like that; I expect you'll be reading a lot more of them in the future as general purpose APIs for GPU programmability become mainstream. That's another reason, as a programmer and not necessarily a gamer, you still want a modern video card. For all this talk of coming 8 and 16 core CPUs, eventually the GPU could be the death of the general purpose CPU.

We also want our video card to be efficient. Many don't realize this, but your video card can consume as much power as your CPU. Sometimes even more! The 4850, for all its muscle, is remarkably efficient as well. According to a recent AnandTech roundup, it's on par with the most efficient cards of this generation. Pay attention to your idle power consumption, because power consumed means heat produced, which in turn means additional noise and possible instability.

Corsair 520HX 520W Power Supply ($100 after rebate)

The power supply is probably one of the most underrated and misunderstood components of a modern PC. People tend to focus on the "watts" number when the really important number is actually efficiency -- a certain percentage of the energy that goes into every power supply is turned into waste heat. An efficient power supply will run cooler and more reliably because it uses higher quality parts. People think you need 1.21 Jigawatts to run a powerful desktop system, but that's just not true. Unless you have a bleeding-edge CPU paired with two high-end, top of the line gaming class video cards, trust me -- even 500 watts is overkill.

The Corsair model I recommend gets stellar reviews. It has modular cables and the 80 PLUS designation, which means it's at least 80% efficient at 20%, 50%, and 100% of its rated load. Note that a quality power supply is not a substitute for a quality UPS or surge protector, but it helps.
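To see why efficiency matters more than the headline wattage, here's a rough sketch of the arithmetic. It assumes a flat 80% efficiency and a hypothetical 250 watt load; real efficiency varies with load, and the class and method names are made up for illustration.

```java
// Rough power supply arithmetic. The 250 W load and the flat 80% efficiency
// are illustrative assumptions; real efficiency varies with load.
class PowerSupplyMath {

    // Power drawn from the wall in order to deliver dcLoadWatts to the parts.
    static double wallDrawWatts(double dcLoadWatts, double efficiency) {
        return dcLoadWatts / efficiency;
    }

    // Whatever is drawn from the wall but not delivered ends up as heat.
    static double wasteHeatWatts(double dcLoadWatts, double efficiency) {
        return wallDrawWatts(dcLoadWatts, efficiency) - dcLoadWatts;
    }

    public static void main(String[] args) {
        double load = 250.0; // hypothetical DC load in watts
        System.out.printf("wall draw: %.1f W, waste heat: %.1f W%n",
                wallDrawWatts(load, 0.80), wasteHeatWatts(load, 0.80));
    }
}
```

At 80% efficiency, a 250 watt load pulls 312.5 watts from the wall and sheds 62.5 watts as heat inside the case. A 70% efficient unit would shed about 107 watts for the same load, and every one of those watts has to be removed by a fan.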

Scythe "Ninja" SCNJ-2000 cooler ($50)
Scythe "Ninja Mini" SCMNJ-1000 cooler ($35)

I'll be honest with you. I have a giant heatsink fetish. These giant hunks of aluminum and copper, and the liquid-filled heatpipes that drive them, fascinate me. But there's a more practical reason, as well: if you want a quiet computer, you don't even bother with the stock coolers that are bundled with the CPU. Over the last few years, I keep coming back to Scythe's classic "Ninja" tower cooler, which is available in tall and short varieties. They're so astoundingly efficient that, with adequate case ventilation, they can be run fanless. I even (barely) managed to squeeze the Ninja Mini into my home theater PC build, and it's now mercifully fanless as well. There are plenty of other great tower/heatpipe coolers on the market, but the Ninja is still one of the best, a testament to its pioneering design. The CPU is (usually) the biggest consumer of power in your PC, so it's sensible to invest in a highly efficient aftermarket cooler to keep noise and heat at bay under load.

There you have it. More than you ever possibly wanted to know about how an obsessive geek builds a PC -- painstakingly analyzing every single part that goes into it. Now, like Rob, you're probably sorry you asked; who needs all the philosophical digressions, just give us the damn parts list! OK, here it is:

The best bang for the buck developer x86 box I can come up with, all for around $1100.

I try to avoid posting about hardware too much, but sometimes I can't help myself. I blame Rob. Enjoy your new system, Mr. Conery.

Posted by Jeff Atwood    179 Comments

July 24, 2008

Coding Without Comments

If peppering your code with lots of comments is good, then having zillions of comments in your code must be great, right? Not quite. Excess is one way good comments go bad:

'*************************************************
' Name: CopyString
'
' Purpose: This routine copies a string from the source
' string (source) to the target string (target).
'
' Algorithm: It gets the length of "source" and then copies each
' character, one at a time, into "target". It uses
' the loop index as an array index into both "source"
' and "target" and increments the loop/array index
' after each character is copied.
'
' Inputs: input The string to be copied
'
' Outputs: output The string to receive the copy of "input"
'
' Interface Assumptions: None
'
' Modification History: None
'
' Author: Dwight K. Coder
' Date Created: 10/1/04
' Phone: (555) 222-2255
' SSN: 111-22-3333
' Eye Color: Green
' Maiden Name: None
' Blood Type: AB-
' Mother's Maiden Name: None
' Favorite Car: Pontiac Aztek
' Personalized License Plate: "Tek-ie"
'*************************************************

I'm constantly running across comments from developers who don't seem to understand that the code already tells us how it works; we need the comments to tell us why it works. Code comments are so widely misunderstood and abused that you might find yourself wondering if they're worth using at all. Be careful what you wish for. Here's some code with no comments whatsoever:

r = n / 2;
while ( abs( r - (n/r) ) > t ) {
  r = 0.5 * ( r + (n/r) );
}
System.out.println( "r = " + r );

Any idea what that bit of code does? It's perfectly readable, but what the heck does it do?

Let's add a comment.

// square root of n with Newton-Raphson approximation
r = n / 2;
while ( abs( r - (n/r) ) > t ) {
  r = 0.5 * ( r + (n/r) );
}
System.out.println( "r = " + r );

That must be what I was getting at, right? Some sort of pleasant, middle-of-the-road compromise between the two polar extremes of no comments whatsoever and carefully formatted epic poems every second line of code?

Not exactly. Rather than add a comment, I'd refactor to this:

private double SquareRootApproximation(double n) {
  double r = n / 2;
  while ( abs( r - (n/r) ) > t ) {
    r = 0.5 * ( r + (n/r) );
  }
  return r;
}
System.out.println( "r = " + SquareRootApproximation(n) );

I haven't added a single comment, and yet this mysterious bit of code is now perfectly understandable.
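If you'd like to actually run it, here's one way the refactored routine might look as a complete Java program. The tolerance parameter, the class name, and the guards for zero and negative input are my additions for the sake of a self-contained example; the snippets in this post leave t undefined.

```java
// A runnable take on the refactored routine. The tolerance parameter and
// the input guards are additions made so the example is self-contained.
class SquareRoot {

    // Approximates the square root of n with Newton-Raphson iteration,
    // stopping once r and n/r agree to within the given absolute tolerance.
    static double squareRootApproximation(double n, double tolerance) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n == 0) {
            return 0; // avoids dividing by zero in the loop below
        }
        double r = n / 2;
        while (Math.abs(r - (n / r)) > tolerance) {
            r = 0.5 * (r + (n / r));
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println("r = " + squareRootApproximation(2.0, 1e-9));
    }
}
```

The absolute tolerance is fine for moderately sized inputs. For very large values of n you would probably want a relative tolerance instead, since the loop may never get within a fixed tiny distance.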

While comments are neither inherently good nor bad, they are frequently used as a crutch. You should always write your code as if comments didn't exist. This forces you to write your code in the simplest, plainest, most self-documenting way you can humanly come up with.

When you've rewritten, refactored, and rearchitected your code a dozen times to make it easy for your fellow developers to read and understand -- when you can't possibly imagine any conceivable way your code could be changed to become more straightforward and obvious -- then, and only then, should you feel compelled to add a comment explaining what your code does.

As Steve points out, this is one key difference between junior and senior developers:

In the old days, seeing too much code at once quite frankly exceeded my complexity threshold, and when I had to work with it I'd typically try to rewrite it or at least comment it heavily. Today, however, I just slog through it without complaining (much). When I have a specific goal in mind and a complicated piece of code to write, I spend my time making it happen rather than telling myself stories about it [in comments].

Junior developers rely on comments to tell the story when they should be relying on the code to tell the story. Comments are narrative asides; important in their own way, but in no way meant to replace plot, characterization, and setting.

Perhaps that's the dirty little secret of code comments: to write good comments you have to be a good writer. Comments aren't code meant for the compiler, they're words meant to communicate ideas to other human beings. While I do (mostly) love my fellow programmers, I can't say that effective communication with other human beings is exactly our strong suit. I've seen three-paragraph emails from developers on my teams that practically melted my brain. These are the people we're trusting to write clear, understandable comments in our code? I think maybe some of us might be better off sticking to our strengths -- that is, writing for the compiler, in as clear a way as we possibly can, and reaching for the comments only as a method of last resort.

Writing good, meaningful comments is hard. It's as much an art as writing the code itself; maybe even more so. As Sammy Larbi said in Common Excuses Used To Comment Code, if you feel your code is too complex to understand without comments, your code is probably just bad. Rewrite it until it doesn't need comments any more. If, at the end of that effort, you still feel comments are necessary, then by all means, add comments. Carefully.

Posted by Jeff Atwood    267 Comments

July 22, 2008

Building Tiny, Ultra Low Power PCs

In previous posts, I've talked about building your own desktop PC, and building your own home theater PC. I'm still very much in love with that little HTPC I built. Not only does it have a modern dual-core CPU, and fantastic high-definition capable integrated video -- it's an outstanding general purpose media sharing server, too. But the real punchline is that I eventually got that box down to an insanely low 44 watts at idle. That's in the ballpark for a powerful laptop, and far better than your garden variety desktop PC, which will draw somewhere between 100 and 200 watts of power.

44 watts is impressive, but what if you want to build a PC that uses even less power -- radically less?

That's when you turn to something like AMD's Geode platform in the Nano-ITX form factor. It uses five watts of power at idle. That's almost ten times less than my HTPC build I was so proud of!
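To put those idle numbers in perspective, here's a small sketch of the yearly energy arithmetic for a box that runs around the clock. The class and method names are made up for illustration, and the electricity price is a placeholder assumption; substitute your own rate.

```java
// Yearly energy use for a machine that idles 24 hours a day, all year.
// The price per kWh is a placeholder assumption, not a quoted rate.
class IdlePowerCost {

    static final double HOURS_PER_YEAR = 24 * 365;

    // Converts a constant draw in watts into kilowatt-hours per year.
    static double kilowattHoursPerYear(double watts) {
        return watts * HOURS_PER_YEAR / 1000.0;
    }

    // Yearly cost at the given price per kilowatt-hour.
    static double costPerYear(double watts, double pricePerKwh) {
        return kilowattHoursPerYear(watts) * pricePerKwh;
    }

    public static void main(String[] args) {
        double pricePerKwh = 0.10; // hypothetical rate in dollars
        double[] idleWatts = {5, 44, 150}; // Geode board, my HTPC, a typical desktop
        for (double watts : idleWatts) {
            System.out.printf("%.0f W: %.1f kWh/year, $%.2f/year%n",
                    watts, kilowattHoursPerYear(watts), costPerYear(watts, pricePerKwh));
        }
    }
}
```

Running continuously, the 5 watt board uses about 44 kWh a year, against roughly 385 kWh for the 44 watt HTPC.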

Nano-ITX motherboard

This is the JetWay J8F9 AMD Geode LX800 motherboard. I can't say "this is actual size" with a straight face without knowing the size and aspect ratio of your monitor, but it's probably darn close. The actual dimensions are just under five inches on each side. It may not look like much, but consider the specs:

  • 500 MHz AMD x86 Geode LX 800 CPU
  • 200 pin SO-DIMM memory slot, 1 GB DDR-400 max
  • Two ATA-100 drive connections
  • mini-PCI expansion slot
  • CompactFlash memory card slot
  • onboard audio / VGA / fast ethernet / USB

This thing is, for all intents and purposes, a complete, standalone x86 PC that fits in the palm of your hand and sips five watts of power. Well, assuming you have an enormous hand.

You will need memory and a storage device, of course. You could pick up a laptop hard drive, but another clever thing about this board is that it allows you to use a cheap CompactFlash card as your storage medium -- for the optimal low power, no moving parts install.

  1. AMD Geode LX 800 Nano ITX Motherboard/CPU Combo $154
  2. 512MB 200-pin SO-DIMM DDR-400 $20
  3. 4GB compact flash card $14
  4. 12vdc AC/DC external wall wart $18

So we can put together our own tiny utility PC for right at 200 bucks. Not bad. Unbox it, snap in the memory and CF card, plug in the wall wart, and you're ready to install and boot your operating system of choice. It's that simple.

Naturally, you won't get barn-burning performance, but if you remember the Pentium II 300 MHz systems of yesteryear, you'll know what to expect. You may recall those now-ancient boxes were still able to do some pretty amazing things in their day. I would not build an ultra-low power PC assuming it will be tolerable for day-to-day web browsing and email reading, unless you're comfortable using text mode or command-line interfaces exclusively.

This must be a market segment JetWay specializes in; they have a surprisingly large number of Mini-ITX motherboards to choose from. I don't think you'll find anything more power-efficient than the Geode LX 800 model, but there are some less expensive choices that get close. Lots of variety!

If the 5" x 5" profile of the Nano-ITX is far too large for your tastes, how do you feel about Pico-ITX? It's even smaller at 10cm x 7.2cm.


I've been following the ultra low power, tiny form factor PC segment for quite a few years now. With the emergence of Intel's Atom and "netbooks" like the ASUS Eee, it's a segment that is dangerously close to becoming mainstream. If you're interested, mini-itx.com is still one of the best sources of hands-on reviews, information, and community projects. It's fun stuff.

What could you do with a tiny, highly efficient x86 PC that boots up in under a minute?

Posted by Jeff Atwood    97 Comments

July 20, 2008

Web Development as Tag Soup

As we work with ASP.NET MVC on Stack Overflow, I find myself violently thrust back into the bad old days of tag soup that I remember from my tenure as a classic ASP developer in the late 90's. If you're not careful -- bordering on manically fastidious -- in constructing your Views, you'll end up with a giant mish-mash of HTML, JavaScript, and server-side code. Classic tag soup: difficult to read, difficult to maintain.

I don't mean tag soup in the sense of badly formed HTML, or the malformed world we live in. I mean tag soup in the sense of mixing HTML markup and server-side code. Now you can double your pleasure: badly formed HTML, meet badly written code.

The tag soup problem seems to be endemic to all modern web development stacks. I see that Ruby on Rails apps have the same problem; here's a slice of representative RHTML from Typo, a Ruby blogging engine.

Ruby RHTML markup and code

Do you find this readable? Can you see where the code begins and the markup ends? Are you confident you could change the code structure without breaking the HTML, or change the HTML structure without breaking the code?

Sometimes editing this stuff makes me feel like I'm playing Operation. I have to ever so carefully maneuver my metal tweezers into one tiny slice of code or HTML and make my changes without touching the edges and setting off that blasted electrical buzzer.

Operation game

I'm not trying to single out Rails or Typo here; I could easily show you an ASP.NET MVC view that's just as confusing (or as "clear", if you think that's perfectly readable, I guess). Tag soup is everywhere; take a look at the Python Django framework templates:

<h1>Archive for {{ year }}</h1>

{% for date in days %}
    {% ifchanged %}<h3>{{ date|date:"F" }}</h3>{% endifchanged %}
    <a href="{{ date|date:"M/d"|lower }}/">{{ date|date:"j" }}</a>
{% endfor %}

Perhaps when it comes to mixing HTML and server-side code, some form of soup is unavoidable, a necessary evil. The soup can be quite palatable; maybe even delicious. It's certainly possible to write good tag soup and bad tag soup.

But I have to wonder: is there a better way? Is there something beyond RHTML, Views, and Templates? What examples would you point to of web development stacks that avoided degenerating into yet more hazardous, difficult to maintain tag soup? Is there anything truly better on the horizon?

Or is this year's newer, fancier, even-more-delicious iteration of tag soup as good as it ever gets for web development?

Posted by Jeff Atwood    267 Comments

July 17, 2008

Dealing With Bad Apples

Robert Miesen sent in this story of a project pathology:

I was part of a team writing a web-based job application and screening system (a job kiosk the customer called it) and my team and our customer signed on to implementing this job kiosk using Windows, Apache, PHP5, and the ZendFramework -- everyone except one of our team members, who I will refer to as "Joe". Joe kept advocating the use of JavaScript throughout the technology deliberation phase, even though the customer made it quite clear that he expected the vast majority of the job kiosk to be implemented using a server-side technology and all the validation should be done using server-side technology.

The fact that the customer signed off on this, however, did nothing to deter Joe from advocating JavaScript -- abrasively. Every time our project hit a bump in the road, Joe would go off on some tirade on how much easier our lives would be if we were only writing this job kiosk in JavaScript. Joe would constantly bicker about how we were all doing this all wrong because we weren't doing it in JavaScript, not even bother to learn the technologies we were actually using, and, whenever fellow teammates would try and gently bring him back into the fold (usually via email), Joe would just flame the poor guy. At the height of Joe's pro-JavaScript bigotry, he would regularly belt off comments like, "Well, if we had only done it in JavaScript," to such an extent that the team would have been better off if he had just quit (or was reassigned or fired.)

After reading this story, I had to resist the urge to lean forward, hand placed thoughtfully under my chin, brow furrowed, and ask -- have you tried JavaScript?

Robert thought this story was a cautionary tale about technology dependence, but I see something else: a problem team member, a classic bad apple.

an apple goes bad

I'm sure "Joe" had the best of intentions, but at the point where you're actively campaigning against the project, and working against your teammates -- you're a liability to the project.

The cost of problem personnel on a project is severe, as noted in Chapter 12 of McConnell's Rapid Development: Taming Wild Software Schedules.

If you tolerate even one developer whom the other developers think is a problem, you'll hurt the morale of the good developers. You are implying that not only do you expect your team members to give their all; you expect them to do it when their co-workers are working against them.

In a review of 32 management teams, Larson and LaFasto found that the most consistent and intense complaint from team members was that their team leaders were unwilling to confront and resolve problems associated with poor performance by individual team members. (Larson and LaFasto 1989). They report that, "more than any other single aspect of team leadership, members are disturbed by leaders who are unwilling to deal directly and effectively with self-serving or noncontributing team members." They go on to say that this is a significant management blind spot because managers nearly always think their teams are running more smoothly than their team members do.

How do we identify problem personnel? It's not as difficult as you might think. I had a friend of mine once describe someone on his team as -- and this is a direct quote -- "a cancer". At the point at which you, or anyone else on your team, are using words like cancer to describe a teammate, you have a serious project pathology. You don't have to be friends with everyone on your team, although it certainly helps, but a level of basic personal and professional respect is mandatory for any team to function normally.

Steve outlines a few warning signs that you're dealing with a bad apple on your team:

  1. They cover up their ignorance rather than trying to learn from their teammates. "I don't know how to explain my design; I just know that it works." or "My code is too complicated to test." (These are both actual quotes.)

  2. They have an excessive desire for privacy. "I don't need anyone to review my code."

  3. They are territorial. "No one else can fix the bugs in my code. I'm too busy to fix them right now, but I'll get to them next week."

  4. They grumble about team decisions and continue to revisit old discussions long after the team has moved on. "I still think we ought to go back and change the design we were talking about last month. The one we picked isn't going to work."

  5. Other team members all make wisecracks or complain about the same person regularly. Software developers often won't complain directly, so you have to ask if there's a problem when you hear many wisecracks.

  6. They don't pitch in on team activities. On one project I worked on, two days before our first major deadline, a developer asked for the day off. The reason? He wanted to spend the day at a men's clothing sale in a nearby city -- a clear sign he hadn't integrated with the team.

Let me be quite clear on this point: if your team leader or manager isn't dealing with the bad apples on your project, she isn't doing her job.

You should never be afraid to remove -- or even fire -- people who do not have the best interests of the team at heart. You can develop skill, but you can't develop a positive attitude. The longer these disruptive personalities stick around on a project, the worse their effects get. They'll slowly spread poison throughout your project, in the form of code, relationships, and contacts.

Removing someone from a team is painful; it's not fun for anyone. But realizing you should have removed someone six months ago is far more painful.

Posted by Jeff Atwood    146 Comments

July 15, 2008

The Ultimate Software Gold Plating

Some developers love to gold plate their software. There are various shades of .. er, gold, I guess, but it's usually considered wasteful to fritter away time gold plating old code in the face of new features that need to be implemented, or old bugs that could be squashed.

Developers are fascinated by new technology and are sometimes anxious to try out new features of their language or environment or to create their own implementation of a slick feature they saw in another product -- whether or not it's required in their product. The effort required to design, implement, test, document, and support features that are not required lengthens the schedule.

But gold plating your code isn't all bad. Perhaps the most remarkable tale of successful developer gold plating I've ever read is the one Blake Patterson outlines:

Not long ago I purchased a new-in-box Atari Jaguar, complete with Jeff Minter's psychedelic sequel to Tempest, Tempest 2000. It's an amazing game that's been ported to many other platforms, but the consensus is that none are as solid as the Jaguar original. Having played several of the ports, I'd have to agree.

tempest 2000

An interesting thing about "the world's first 64-bit console" -- its controller was, as the Brits would say, fairly pants. It was large, sported a calculator-button array for game overlays (like the Intellivision controller), had no shoulder buttons, and featured only a D-pad for directional control. (ed: certainly one of the weirdest members of the game console controller family tree, to be sure)

atari jaguar controller

As the arcade original is controlled with a rotary spinner knob, the D-pad falls rather short of providing ideal game control.

tempest spinner

But, of course, being such a savvy chap, Jeff Minter realized this.

Jeff wrote in support for an analog rotary controller ... that did not exist. Neither Atari nor third party manufacturers produced such a controller in the Jaguar's heyday. Jeff, as I understand it, hacked his own together by wiring an Atari paddle controller into a Jaguar controller. In the years since the Jaguar's passing, a few small operations have offered modified Jaguar controllers with spinners wired into them for purchase.

Jeff Minter's an interesting historical figure in the computer gaming community, as the author of several 8-bit computer era game classics. I've talked about his long-standing interest in audio visualization here once before. He's still creating games today; his latest is the Xbox Live downloadable title Space Giraffe. Jeff has a blog that he updates fairly regularly.

Still, I'm amazed that Jeff added code to a commercially shipped console game to support a completely optional homebrew spinner controller of his own creation. That's the very definition of "not required". This code lay dormant in the game until a handful of enthusiasts, fourteen years later, cobbled together custom controllers to play the game as it was originally intended by the author.

If that isn't the ultimate case of gold plating your software, I don't know what is. My hat is off to you, Mr. Minter.

Posted by Jeff Atwood    61 Comments

July 14, 2008

Maybe Normalizing Isn't Normal

One of the items we're struggling with now on Stack Overflow is how to maintain near-instantaneous performance levels in a relational database as the amount of data increases. More specifically, how to scale our tagging system. Traditional database design principles tell you that well-designed databases are always normalized, but I'm not so sure.

Dare Obasanjo had an excellent post When Not to Normalize your SQL Database wherein he helpfully provides a sample database schema for a generic social networking site. Here's what it would look like if we designed it in the accepted normalized fashion:

social network database example, normalized

Normalization certainly delivers in terms of limiting duplication. Every entity is represented once, and only once -- so there's almost no risk of inconsistencies in the data. But this design also requires a whopping six joins to retrieve a single user's information.

select * from Users u
inner join UserPhoneNumbers upn
on u.user_id = upn.user_id
inner join UserScreenNames usn
on u.user_id = usn.user_id
inner join UserAffiliations ua
on u.user_id = ua.user_id
inner join Affiliations a
on a.affiliation_id = ua.affiliation_id
inner join UserWorkHistory uwh
on u.user_id = uwh.user_id
inner join Affiliations wa
on uwh.affiliation_id = wa.affiliation_id

(Update: this isn't intended as a real query; it's only here to visually illustrate the fact that you need six joins -- or six individual queries, if that's your cup of tea -- to get all the information back about the user.)

Those six joins aren't doing anything to help your system's performance, either. Full-blown normalization isn't merely difficult to understand and hard to work with -- it can also be quite slow.
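To make the join cost a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names come from the query above, but every column name and all of the sample data are assumptions invented for illustration; Dare's actual schema may well differ.

```python
import sqlite3

# Hypothetical schema: table names follow the query above, but the
# columns are assumptions made up for this sketch.
SCHEMA = """
CREATE TABLE Users (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Affiliations (affiliation_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE UserPhoneNumbers (
    user_id INTEGER REFERENCES Users(user_id), phone TEXT);
CREATE TABLE UserScreenNames (
    user_id INTEGER REFERENCES Users(user_id), screen_name TEXT);
CREATE TABLE UserAffiliations (
    user_id INTEGER REFERENCES Users(user_id),
    affiliation_id INTEGER REFERENCES Affiliations(affiliation_id));
CREATE TABLE UserWorkHistory (
    user_id INTEGER REFERENCES Users(user_id),
    affiliation_id INTEGER REFERENCES Affiliations(affiliation_id));
"""

# The same six joins as above, restricted to one user.
SIX_JOINS = """
SELECT u.name, upn.phone, usn.screen_name, a.name, wa.name
FROM Users u
JOIN UserPhoneNumbers upn ON u.user_id = upn.user_id
JOIN UserScreenNames usn ON u.user_id = usn.user_id
JOIN UserAffiliations ua ON u.user_id = ua.user_id
JOIN Affiliations a ON a.affiliation_id = ua.affiliation_id
JOIN UserWorkHistory uwh ON u.user_id = uwh.user_id
JOIN Affiliations wa ON uwh.affiliation_id = wa.affiliation_id
WHERE u.user_id = ?
"""


def build_normalized_db():
    """Create an in-memory database with one user and made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Users VALUES (1, 'alice')")
    conn.executemany("INSERT INTO UserPhoneNumbers VALUES (1, ?)",
                     [("555-0100",), ("555-0101",)])
    conn.execute("INSERT INTO UserScreenNames VALUES (1, 'alice_im')")
    conn.executemany("INSERT INTO Affiliations VALUES (?, ?)",
                     [(10, "Chess Club"), (20, "Acme Corp")])
    conn.execute("INSERT INTO UserAffiliations VALUES (1, 10)")
    conn.execute("INSERT INTO UserWorkHistory VALUES (1, 20)")
    return conn


conn = build_normalized_db()
rows = conn.execute(SIX_JOINS, (1,)).fetchall()
for row in rows:
    print(row)
```

In this sketch the user's two phone numbers come back as two nearly identical rows, because each one-to-many join multiplies the result set. That is one more reason to treat the six-join query as an illustration rather than something you'd run for real.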

As Dare points out, the obvious solution is to denormalize -- to collapse a lot of the data into a single Users table.

Social database example, denormalized

This works -- queries are now blindingly simple (select * from users), and probably blindingly fast, as well. But you'll have a bunch of gaping blank holes in your data, along with a slew of awkwardly named field arrays. And all those pesky data integrity problems the database used to enforce for you? Those are all your job now. Congratulations on your demotion!
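Here is a rough sketch of what that tradeoff can look like, again using Python's built-in sqlite3 module. The column names (phone_1, phone_2, and so on) are hypothetical stand-ins for the "field arrays" and do not come from Dare's diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us look up columns by name

# Hypothetical wide table: the numbered columns are assumptions made up
# for this sketch, standing in for the awkwardly named field arrays.
conn.execute("""
    CREATE TABLE Users (
        user_id INTEGER PRIMARY KEY,
        name TEXT,
        phone_1 TEXT, phone_2 TEXT,
        screen_name_1 TEXT, screen_name_2 TEXT,
        affiliation_1 TEXT, affiliation_2 TEXT,
        employer_1 TEXT, employer_2 TEXT
    )
""")
conn.execute("""
    INSERT INTO Users (user_id, name, phone_1, screen_name_1,
                       affiliation_1, employer_1)
    VALUES (1, 'alice', '555-0100', 'alice_im', 'Chess Club', 'Acme Corp')
""")
conn.execute("""
    INSERT INTO Users (user_id, name, phone_1, employer_1)
    VALUES (2, 'bob', '555-0199', 'Acme Corp')
""")

# One trivial query returns everything about a user...
row = conn.execute("SELECT * FROM Users WHERE user_id = 1").fetchone()

# ...but unused slots are blank holes in the row.
blank_columns = [key for key in row.keys() if row[key] is None]

# The employer name is now copied into every user's row, so renaming it
# means finding and updating each copy yourself (employer_2 would need
# the same treatment).
conn.execute(
    "UPDATE Users SET employer_1 = 'Acme Inc' WHERE employer_1 = 'Acme Corp'")
employers = [r["employer_1"] for r in
             conn.execute("SELECT employer_1 FROM Users ORDER BY user_id")]
```

The read side is as simple as promised. The write side is where the integrity work that the database used to do lands on you.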

Both solutions have their pros and cons. So let me put the question to you: which is better -- a normalized database, or a denormalized database?

Trick question! The answer is that it doesn't matter! Until you have millions and millions of rows of data, that is. Everything is fast for small n. Even a modest PC by today's standards -- let's say a dual-core box with 4 gigabytes of memory -- will give you near-identical performance in either case for anything but the very largest of databases. Assuming your team can write reasonably well-tuned queries, of course.

There's no shortage of fascinating database war stories from companies that made it big. I do worry that these war stories carry an implied tone of "I lost 200 pounds and so could you!"; please assume the tiny-asterisk disclaimer "results may not be typical" is in full effect while reading them. Tim O'Reilly compiled a series of them.

There's also the High Scalability blog, which has its own set of database war stories.

First, a reality check. It's partially an act of hubris to imagine your app as the next Flickr, YouTube, or Twitter. As Ted Dziuba so aptly said, scalability is not your problem, getting people to give a shit is. So when it comes to database design, do measure performance, but try to err heavily on the side of sane, simple design. Pick whatever database schema you feel is easiest to understand and work with on a daily basis. It doesn't have to be all or nothing as I've pictured above; you can partially denormalize where it makes sense to do so, and stay fully normalized in other areas where it doesn't.

Despite copious evidence that normalization rarely scales, I find that many software engineers will zealously hold on to total database normalization on principle alone, long after it has ceased to make sense.

When growing Cofax at Knight Ridder, we hit a nasty bump in the road after adding our 17th newspaper to the system. Performance wasn't what it used to be and there were times when services were unresponsive.

A project was started to resolve the issue, to look for 'the smoking gun'. The thought being that the database, being as well designed as it was, could not be of issue, even with our classic symptom being rapidly growing numbers of db connections right before a crash. So we concentrated on optimizing the application stack.

I disagreed and waged a number of arguments that it was our database that needed attention. We first needed to tune queries and indexes, and be willing to, if required, pre-calculate data upon writes and avoid joins by developing a set of denormalized tables. It was a hard pill for me to swallow since I was the original database designer. Turned out it was harder for everyone else! Consultants were called in. They declared the db design to be just right - that the problem must have been the application.

After two months of the team pushing numerous releases thought to resolve the issue, to no avail, we came back to my original arguments.

Pat Helland notes that people normalize because their professors told them to. I'm a bit more pragmatic; I think you should normalize when the data tells you to:

  1. Normalization makes sense to your team.
  2. Normalization provides better performance. (You're automatically measuring all the queries that flow through your software, right?)
  3. Normalization prevents an onerous amount of duplication or avoids risk of synchronization problems that your problem domain or users are particularly sensitive to.
  4. Normalization allows you to write simpler queries and code.
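If "automatically measuring all the queries" sounds abstract, one way to sketch it is a thin wrapper that times every statement passing through it. TimedConnection below is a hypothetical helper written for this example, not part of sqlite3 or any library, and a real system would more likely lean on its database's own profiling tools.

```python
import sqlite3
import time
from collections import defaultdict


class TimedConnection:
    """Hypothetical wrapper that records wall-clock time per SQL statement."""

    def __init__(self, conn):
        self._conn = conn
        self.timings = defaultdict(list)  # SQL text -> durations in seconds

    def execute(self, sql, params=()):
        start = time.perf_counter()
        try:
            # fetchall() so the measurement includes reading the rows
            return self._conn.execute(sql, params).fetchall()
        finally:
            self.timings[sql].append(time.perf_counter() - start)

    def slowest(self, n=5):
        """Return (sql, total_seconds, calls) tuples, largest total first."""
        totals = [(sql, sum(d), len(d)) for sql, d in self.timings.items()]
        return sorted(totals, key=lambda t: t[1], reverse=True)[:n]


db = TimedConnection(sqlite3.connect(":memory:"))
db.execute("CREATE TABLE t (n INTEGER)")
db.execute("INSERT INTO t VALUES (?)", (1,))
db.execute("SELECT * FROM t")
db.execute("SELECT * FROM t")

report = db.slowest()
for sql, total_seconds, calls in report:
    print(f"{total_seconds:.6f}s over {calls} call(s): {sql}")
```

With numbers like these in hand, "normalization provides better performance" stops being an article of faith and becomes something you can check per query.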

Never, never should you normalize a database out of some vague sense of duty to the ghosts of Boyce-Codd. Normalization is not magical fairy dust you sprinkle over your database to cure all ills; it often creates as many problems as it solves. Fear not the specter of denormalization. Duplicated data and synchronization problems are often overstated and relatively easy to work around with cron jobs. Disks and memory are cheap and getting cheaper every nanosecond. Measure performance on your system and decide for yourself what works, free of predispositions and bias.
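As a rough illustration of the cron job idea, here is a sketch of a resync routine that recomputes a duplicated value from the rows it was derived from. The Tags and PostTags tables and the post_count column are hypothetical names invented for this example, and the scheduling itself (cron or anything else) is not shown.

```python
import sqlite3


def resync_tag_counts(conn):
    """Recompute the cached post_count column from the source-of-truth rows.

    Returns the number of tags whose cached count had drifted.
    """
    actual = "(SELECT COUNT(*) FROM PostTags pt WHERE pt.tag_id = Tags.tag_id)"
    drifted = conn.execute(
        f"SELECT COUNT(*) FROM Tags WHERE post_count != {actual}"
    ).fetchone()[0]
    conn.execute(f"UPDATE Tags SET post_count = {actual}")
    conn.commit()
    return drifted


# Hypothetical data: the 'csharp' tag claims 5 posts but only has 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tags (
        tag_id INTEGER PRIMARY KEY,
        name TEXT,
        post_count INTEGER NOT NULL DEFAULT 0  -- denormalized copy
    );
    CREATE TABLE PostTags (post_id INTEGER, tag_id INTEGER);
    INSERT INTO Tags VALUES (1, 'sql', 2), (2, 'csharp', 5);
    INSERT INTO PostTags VALUES (100, 1), (101, 1), (102, 2);
""")

fixed = resync_tag_counts(conn)
counts = conn.execute(
    "SELECT tag_id, post_count FROM Tags ORDER BY tag_id").fetchall()
```

Reads get the fast cached column; the periodic job bounds how long a stale copy can survive. Whether that staleness window is acceptable depends on your problem domain.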

As the old adage goes, normalize until it hurts, denormalize until it works.

Posted by Jeff Atwood    301 Comments

July 12, 2008

Monkeypatching For Humans

Although I love strings, sometimes the String class can break your heart. For example, in C#, there is no String.Left() function. Fair enough; we can roll up our sleeves and write our own function lickety-split:

public static string Left(string s, int len)
{
    if (len == 0 || s.Length == 0)
        return "";
    else if (s.Length <= len)
        return s;
    else
        return s.Substring(0, len);
}

And call it like so:

var s = "Supercalifragilisticexpialidocious";
s = Left(s, 5);

Fairly painless, right?

But with the advent of C# 3.0, there's an even better way -- extension methods. With an extension method, we "extend" the String to add the missing function. The code is fairly similar; the key change is the this keyword in front of the first parameter.

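// Extension methods must live in a static, non-generic, top-level class.
// The class name StringExtensions is arbitrary.
public static class StringExtensions
{
    public static string Left(this string s, int len)
    {
        if (len == 0 || s.Length == 0)
            return "";
        else if (s.Length <= len)
            return s;
        else
            return s.Substring(0, len);
    }
}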

And now we can call it as if this very method existed on the String class as shipped:

var s = "Supercalifragilisticexpialidocious";
s = s.Left(5);

Pretty slick. It's difficult not to fall in love with extension methods, as they allow you to mold classes into exactly what you think they should be. This is fairly innocuous in C#, as extension methods only allow you to add new functionality to classes, not override, remove, or replace anything.

But imagine if you could.

Well, that's exactly how it is in other, more dynamic languages such as JavaScript, Python, Perl, and Ruby. Something as prosaic as C# extensions is old hat to these folks. In those languages, you could redefine everything in the String class if you wanted to. This is commonly known in dynamic language circles as monkeypatching.
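For a feel of what this looks like in practice, here is a small Python sketch. Greeter and library_code are hypothetical stand-ins for a class and a caller you don't own. One caveat: CPython refuses attribute assignment on built-in types such as str, so the sketch patches an ordinary class; Ruby has no such restriction on its String.

```python
class Greeter:
    """Stand-in for a class owned by some third-party library (hypothetical)."""

    def greet(self, name):
        return "Hello, " + name


def library_code():
    """Code elsewhere that relies on Greeter's documented behaviour."""
    return Greeter().greet("world")


before = library_code()

# The monkeypatch: replace the method on the class itself, at runtime.
original_greet = Greeter.greet


def shouting_greet(self, name):
    return original_greet(self, name).upper() + "!"


Greeter.greet = shouting_greet

# Same call, same source code, different behaviour. Nothing at the call
# site hints that anything changed.
after = library_code()

# Built-in types are protected in CPython: this assignment raises TypeError.
try:
    str.left = lambda self, n: self[:n]
    patched_builtin = True
except TypeError:
    patched_builtin = False

# Undo the patch so later code sees the original behaviour again.
Greeter.greet = original_greet
restored = library_code()

print(before, "|", after, "|", restored)
```

The patch is three lines, and every caller in the process inherits it whether they asked for it or not.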

monkey patching

If the idea of monkeypatching scares you a little, it probably should. Can you imagine debugging code where the String class had subtly different behaviors from the String you've learned to use? Monkeypatching can be incredibly dangerous in the wrong hands, as Avdi Grimm notes:

Monkey patching is the new black [in the Ruby community]. It's what all the hip kids are doing. To the point that smart, experienced hackers reach for a monkey patch as their tool of first resort, even when a simpler, more traditional solution is possible.

I don't believe this situation to be sustainable. Where I work, we are already seeing subtle, difficult-to-debug problems crop up as the result of monkey patching in plugins. Patches interact in unpredictable, combinatoric ways. And by their nature, bugs caused by monkey patches are more difficult to track down than those introduced by more traditional classes and methods. As just one example: on one project, it was a known caveat that we could not rely on class inheritable attributes as provided by ActiveSupport. No one knew why. Every Model we wrote had to use awkward workarounds. Eventually we tracked it down in a plugin that generated admin consoles. It was overwriting Class.inherited(). It took us months to find this out.

This is just going to get worse if we don't do something about it. And the "something" is going to have to be a cultural shift, not a technical fix. I believe it is time for experienced Ruby programmers to wean ourselves off of monkey patching, and start demonstrating more robust techniques.

Try to imagine a world where every programmer you know is a wannabe language designer, bent on molding the language to their whims. When I close my eyes and imagine it, I have a vision of the apocalypse, a perfect, pitch-black storm of utterly incomprehensible, pathologically difficult to debug code.

I was just looking at random PHP plugin code the other day, and it was, frankly, crap. But that's because most code is crap. Including my own. It is, sadly, the statistical norm. That's why sites like The Daily WTF are guaranteed to have more material than they can possibly ever publish for the next millennium. (Note to self: invest in this website). I can only imagine what that PHP plugin code would have looked like, had its developer been granted the ability to redefine fundamental PHP keywords and classes at will. These are the sort of thoughts that drive me to drink Bawls. And that stuff is disgusting.

You might say that PHP, sans the fundamental dynamic language ability to monkeypatch, is just another crappy Blub language. But there's also a ton of incredibly useful PHP code out there. So it seems to me that the ability to monkeypatch doesn't stop people from producing a huge volume of useful code, even in a kind of.. horrible language. Some of it is even good!

While I acknowledge the power and utility of dynamic language monkeypatching, I know enough about programmers -- myself absolutely included -- to know the vast majority of us have absolutely no business whatsoever re-designing a programming language. There's a reason some of the most deeply respected computer scientists in the world end up as language designers.

Perhaps then, given the risks, monkeypatching should mean reaching for the meta-hammer as infrequently as humanly possible. This is a position that Avdi himself espouses in a followup comment:

I'm afraid a lot of people have missed the actual meat of my argument -- that dynamic extension of classes is currently overused in Ruby, in ways that are:

  • Needless - another technique (such as a mixin, or locally extending individual objects) would have worked as well or better.
  • Overcomplicated - the use of a monkey patch actually created more work for the author.
  • Fragile - the solution is tightly bound to third-party internals, reducing the usefulness of the plugin or gem because it is prone to breakage.
  • Excessively wide in scope - by hardcoding extensions to core classes, the author takes the choice to scope the change out of the plugin/gem user's hands, further limiting utility.

My point is that there are alternatives - often alternatives which are actually easier to implement and will make your plugin or gem more useful to the user.
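Avdi is talking about Ruby, but the two alternatives he names (a mixin, or locally extending individual objects) translate to other dynamic languages. Here is a loose Python analogue, offered as a sketch rather than as what Avdi had in mind; Report and the other names are hypothetical.

```python
import types


class Report:
    """Stand-in for a third-party class we do not own (hypothetical)."""

    def __init__(self, title):
        self.title = title

    def render(self):
        return self.title


# Alternative 1: a mixin applied to our own subclass. Only code that
# explicitly asks for LoudReport sees the new behaviour.
class ShoutingMixin:
    def render(self):
        return super().render().upper()


class LoudReport(ShoutingMixin, Report):
    pass


# Alternative 2: extend one object only, leaving the class untouched.
def render_with_footer(self):
    return Report.render(self) + " (draft)"


draft = Report("Q3 numbers")
draft.render = types.MethodType(render_with_footer, draft)

plain = Report("Q3 numbers")

print(LoudReport("Q3 numbers").render())
print(draft.render())
print(plain.render())
```

In both cases the change is scoped: Report itself is never modified, so code elsewhere that creates a plain Report gets exactly the behaviour it was written against.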

While I enjoy the additive nature of C# extensions, even those are enough to make me a little nervous, as mild as they are. Full-blown dynamic language monkeypatching goes even further; it might even be the ultimate expression of programming power. Is there anything more pure and godlike than programming your own programming language?

But if wielding that power doesn't scare and humble you a little, too, then maybe you should leave the monkeypatching to the really smart monkeys.

Posted by Jeff Atwood    142 Comments
Content (c) 2009 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.