February 22, 2010
The Non-Programming Programmer
I find it difficult to believe, but the reports keep pouring in via Twitter and email: many candidates who show up for programming job interviews can't program. At all. Consider this recent email from Mike Lin:
The article Why Can't Programmers... Program? changed the way I did interviews. I used to lead off by building rapport. That proved to be too time-consuming when, as you mentioned, the vast majority of candidates were simply non-technical. So I started leading off with technical questions. First progressing from easy to hard questions. Then I noticed I identified the rejects faster if I went the other way -- hard questions first -- so long as the hard questions were still in the "if you don't know this then you can't work here" category. Most of my interviews still took about twenty minutes, because tough questions take some time to answer and evaluate. But it was a big improvement over the rapport-building method; and it could be done over the phone.
After reading your article, I started doing code interviews over the phone, using web meetings. My interview times were down to about 15 minutes each to identify people who just can't code -- the vast majority.
I wrote that article in 2007, and I am stunned, but not entirely surprised, to hear that three years later "the vast majority" of so-called programmers who apply for a programming job interview are unable to write the smallest of programs. To be clear, hard is a relative term -- we're not talking about complicated, Google-style graduate computer science interview problems. This is extremely simple stuff we're asking candidates to do. And they can't. It's the equivalent of attempting to hire a truck driver and finding out that 90 percent of the job applicants can't find the gas pedal or the gear shift.
I agree, it's insane. But it happens every day, and is (apparently) an epidemic hiring problem in our industry.
You have to get to the simple technical interview questions immediately to screen out the legions of non-programming programmers. Screening over the telephone is a wise choice, as I've noted before. But screening over the internet is even better, and arguably more natural for code.
I still wasn't super-happy with having to start up the web meeting and making these guys share their desktops with me. I searched for other suitable tools for doing short "pen-and-paper" style coding interviews over the web, but I couldn't find any. So I did what any self-respecting programmer would do. I wrote one.
Man, was it worth it! I schedule my initial technical screenings with job applicants in 15-minute blocks. I'm usually done in 5-10 minutes, sadly. I schedule an actual interview with them if they can at least write a simple 10-line program. That doesn't happen often, but at least I don't have to waste a whole lot of time anymore.
Mike adds a disclaimer that his homegrown coding interview tool isn't meant to show off his coding prowess. He needed a tool, so he wrote one -- and thoughtfully shared it with us. There might well be others out there; what online tools do you use to screen programmers?
Three years later, I'm still wondering: why do people who can't write a simple program even entertain the idea they can get jobs as working programmers? Clearly, some of them must be succeeding. Which means our industry-wide interviewing standards for programmers are woefully inadequate, and that's a disgrace. It's degrading to every working programmer.
At least bad programmers can be educated; non-programming programmers are not only hopeless but also cheapen the careers of everyone around them. They must be eradicated, starting with simple technical programming tests that should be a part of every programmer interview.
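For a sense of scale, the 2007 article used FizzBuzz as its example of a screening question: print the numbers from 1 to 100, but print "Fizz" for multiples of three, "Buzz" for multiples of five, and "FizzBuzz" for multiples of both. Here is a minimal sketch in Python. The function name `fizzbuzz` is just an illustrative choice, and any working variation would pass the screen.

```python
def fizzbuzz(n):
    """Return the FizzBuzz word for n, or n itself as a string."""
    if n % 15 == 0:          # multiple of both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


if __name__ == "__main__":
    for i in range(1, 101):
        print(fizzbuzz(i))
```

That is the whole test. It is roughly the "simple 10-line program" scale Mike describes.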
February 12, 2010
Welcome Back Comments
I apologize for the scarcity of updates lately. There have been two things in the way:
- Continuing fallout from International Backup Awareness Day, which meant all updates to Coding Horror from that point onward were hand-edited text files. Which, believe me, isn't nearly as sexy as it … uh … doesn't sound.
- I am presenting and conducting a workshop at Webstock 2010 in New Zealand. This is a two-week trip I'm taking with the whole family, including our little buddy Rock Hard Awesome, so the preparations have been more intense than usual.
On top of all that, according to the program, I just found that my presentation involves interpretive dance, too. Man. I wish someone had told me! My moves are so rusty, they've barely improved from Electric Boogaloo. But hey, at least I don't have to sing Andrews Sisters songs like poor Brian Fling.
And then, of course, there's that crazy Stack Overflow thing I'm always yammering on about. Very busy there, our team is expanding, and we have big plans for this year, too.
But, there is hope!
Thanks to the fine folks at Six Apart -- and more specifically the herculean efforts of one Michael Sippey -- Coding Horror is now hosted in the TypePad ecosystem. Which means, at least in theory, better "cloud" type reliability in the future. (cough)
One accidental bit of collateral damage was that comments, by necessity, were disabled during this two month period. At first, I was relieved. This may seem a bit hypocritical, since I originally wrote A Blog Without Comments is Not a Blog. And I still believe it too. But as I prophetically noted in the very same post:
I am sympathetic to issues of scale. Comments don't scale worth a damn. If you have thousands of readers and hundreds of comments for every post, you should disable comments and switch to forums, probably moderated forums at that. But the number of bloggers who have that level of readership is so small as to be practically nil. And when you get there, believe me, you'll know. Until then, you should enable comments.
I guess you can put this in the "nice problems to have" category, but let me tell you, it's not so nice of a problem when it's on your plate. At a large enough scale, comments require active moderation or they rapidly go sour. People get mean, the crazies come out in full force, and the comments start to resemble an out of control trailer park reality show brawl. It's fun, I suppose, but in a way that drives out all the sane people. Left unchecked, the best you can hope for is to end up head resident at the sanitarium. And that's a hell of a way to go out.
(the above is from Mike Reed's amazing Flame Warriors series, by the way. Well worth your time if you haven't seen it already.)
The degeneration of comments was a shame, because it undermined my claim that comments are awesome.
It's an open secret amongst bloggers that the blog comments are often better than the original blog post, and it's because the community collectively knows far more than you or I will ever know.
The best part of a blog post often begins where the blog post ends. If you are offended by that, I humbly submit you don't understand why blogs work.
Why would I have bothered to found Stack Overflow with Joel Spolsky if I didn't believe in the power of community -- that none of us is as dumb as all of us? Honestly, a lot of the design of Stack Overflow comes from my personal observations about how blog comments work. But my creaky old Coding Horror comments offered none of the fancy voting and moderation facilities that make Stack Overflow work. And without ample free personal time and attention from me to weed the comment garden, the comments got out of control.
Most of all, I blame myself.
I got some amazing emails in lieu of comments on my last few blog posts, and it positively kills me that these emails were only seen by two sets of eyes instead of the thousands they deserve. That's a big part of why I hate email silos. And really, email in general.
But there was another unanticipated side effect of having comments disabled that Stéphane Charette pointed out to me in email.
Here is an interesting "silver lining" to the crash you had. Without comments, it forces us, your faithful readers, to think more about what you have to say.
In a way, things are back to how your blog used to be. In recent years, the huge influx of comments means that we -- or just I? -- end up spending 1/4 of my time reading what you wrote, and then merging in what everyone else wrote. Depending on how I feel about the topic and your approach to the issue, the weight values may be very different than 50/50. But regardless, I always have to consider when clicking on my Coding Horror bookmark: "Is now the right time to check if he has a new entry? Do I have enough time to read through a hundred comments? Should I wait until later tonight when the kids are in bed to go read his latest article?"
I never thought about it until recently. Your crash is what brought this up to light. Like tonight, when I saw your new headline in my iGoogle page, I didn't have to consider whether or not it was the right time. I read the article, and then thought for myself. I didn't let other people's comments steer my thoughts. How nice!
I'm not certain why it works like this. Often, the sheer number of comments distracts from what you wrote, but for some reason, it is impossible not to at the very least scroll through what people say. In a way, your blog has ended up like a slashdot article, with a paragraph or two of content at the top, and then everyone wanting to insert their $0.02.
Thinking for yourself. Now there's a novel idea. In the reverberating echo chamber that is the internet, I think we would all do well to remind ourselves of that periodically.
He's also right that the psychic burden of all those comments was weighing not just on readers, but on me, the writer, too. That's why I had a false sense of freedom when comments were disabled. You mean I can say whatever I want, and nobody can contradict me underneath my very own post? Revolutionary!
There are some absolute gems of insight and observation in comments, but sometimes extracting them was too much like pulling teeth. At the same time, I felt obligated to read all the comments on every post of mine. If I was asking people to read the random words I'm spewing all over the internet, how could I not extend my commenters the same courtesy? That's just rude.
It seems the only thing worse than comments being on was comments being off. It started to feel empty. As if I was in an enormous room, presenting to an eerily mute audience.
So, while I am very glad to have comments back, and I welcome dialog with the community, there will be … changes. For the benefit of everyone's mental health.
No more anonymous comments. While I would prefer to allow anonymous comments, it is clear that at this scale I don't have time to deal properly with anonymous comments. If you want to say something, you'll need to authenticate. If what you have to say isn't worth authenticating to post, it's probably best for both of us if you keep it to yourself anyway.
The good news is that the TypePad commenting system supports a veritable laundry list of authentication mechanisms -- OpenID (naturally), Twitter, Facebook, Google, Yahoo, and many others. So authenticating to post a comment should only present a mild, but necessary, barrier to conversation.
Comment moderation will be more stringent. If you don't have something useful and reasonably constructive to say in your comment, it will be removed without hesitation. You can be as critical of me (or, better still, my arguments and ideas) as you like, but you must convince me that you're contributing to the conversation and not just yelling at me or anyone else.
I'm not looking for sycophants, but shrill argument is every bit as bad. When you comment here, try to show the class something interesting they can use. That's all I'm asking.
It feels good to be back. Thanks to Six Apart for making it happen.
And, most of all, thanks to you for reading.
January 25, 2010
Cultivate Teams, Not Ideas
How much is a good idea worth? According to Derek Sivers, not much:
It's so funny when I hear people being so protective of ideas. (People who want me to sign an NDA to tell me the simplest idea.) To me, ideas are worth nothing unless executed. They are just a multiplier. Execution is worth millions.
To make a business, you need to multiply the two. The most brilliant idea, with no execution, is worth $20. The most brilliant idea takes great execution to be worth $20,000,000. That's why I don't want to hear people's ideas. I'm not interested until I see their execution.
I was reminded of Mr. Sivers' article when this email made the rounds earlier this month:
I feel that this story is important to tell you because Kickstarter.com copied us. I tried for 4 years to get people to take Fundable seriously, traveling across the country, even giving a presentation to FBFund, Facebook's fund to stimulate development of new apps. It was a series of rejections for 4 years. I really felt that I presented myself professionally in every business situation and I dressed appropriately and practiced my presentations. That was not enough. The idiots wanted us to show them charts with massive profits and widespread public acceptance so that they didn't have to take any risks.
All it took was 5 super-connected people at Kickstarter (especially Andy Baio) to take a concept we worked hard to refine, tweak it with Amazon Payments, and then take credit. You could say that that's capitalism, but I still think you should acknowledge people that you take inspiration from. I do. I owe the concept of Fundable to many things, including living in cooperative student housing and studying Political Science at Michigan. Rational choice theory, tragedy of the commons, and collective action are a few political science concepts that are relevant to Fundable.
Yes, Fundable had some technical and customer service problems. That's because we had no money to revise it. I had plans to scrap the entire CMS and start from scratch with a new design. We were just so burned out that motivation was hard to come by. What was the point if we weren't making enough money to live on after 4 years?
The disconnect between idea and execution here is so vast it's hard to understand why the author himself can't see it.
I wouldn't call ideas worthless, per se, but it's clear that ideas alone are a hollow sort of currency. Success is rarely determined by the quality of your ideas. But it is frequently determined by the quality of your execution. So instead of worrying about whether the Next Big Idea you're all working on is sufficiently brilliant, worry about how well you're executing.
The criticism that all you need is "super-connected people" to be successful was also leveled at Stack Overflow. In an email to me last year, Andy Baio -- ironically, the very person being cited in the email -- said:
I very much enjoyed the Hacker News conversation about cloning the site in a weekend. My favorite comments were from the people that believe Stack Overflow is only successful because of the Cult of Atwood & Spolsky. Amazing.
I don't care how internet famous you are; nobody gets a pass on execution. Sure, you may have a few more eyeballs at the beginning, but if you don't build something useful, the world will eventually just shrug its collective shoulders and move along to more useful things.
One of my all time favorite software quotes is from Wil Shipley:
This is all your app is: a collection of tiny details.
In software development, execution is staying on top of all the tiny details that make up your app. If you're not constantly obsessing over every aspect of your application, relentlessly polishing and improving every little part of it -- no matter how trivial -- you're not executing. At least, not well.
And unless you work alone, which is a rarity these days, your ability to stay on top of the collection of tiny details that makes up your app will hinge entirely on whether or not you can build a great team. Great teams are the building blocks of any successful endeavor. This talk by Ed Catmull is almost exclusively focused on how Pixar learned, through trial and error, to build teams that can execute.
It's a fascinating talk, full of some great insights, and you should watch the whole thing. In it, Mr. Catmull amplifies Mr. Sivers' sentiment:
If you give a good idea to a mediocre group, they'll screw it up. If you give a mediocre idea to a good group, they'll fix it. Or they'll throw it away and come up with something else.
Execution isn't merely a multiplier. It's far more powerful. How your team executes has the power to transform your idea from gold into lead, or from lead into gold. That's why, when building Stack Overflow, I was so fortunate to not only work with Joel Spolsky, but also to cherry-pick two of the best developers I had ever worked with in my previous jobs and drag them along with me. Kicking and screaming if necessary.
If I had to point to the one thing that made our project successful, it was not the idea behind it, our internet fame, the tools we chose, or the funding we had (precious little, for the record).
It was our team.
The value of my advice is debatable. But you would do well to heed the advice of Mr. Sivers and Mr. Catmull. If you want to be successful, stop worrying about the great ideas, and concentrate on cultivating great teams.
January 18, 2010
The Great Newline Schism
Have you ever opened a simple little ASCII text file to see it inexplicably displayed as onegiantunbrokenline?
Opening the file in a different, smarter text editor results in the file displayed properly in multiple paragraphs.
The answer to this puzzle lies in our old friend, invisible characters that we can't see but that are totally not out to get us. Well, except when they are.
The invisible problem characters in this case are newlines.
Did you ever wonder what was at the end of your lines? As a programmer, I knew there were end of line characters, but I honestly never thought much about them. They just … worked. But newlines aren't a universally accepted standard; they are different depending on who you ask, and what platform they happen to be computing on:
| Platform | Name | Escape | Hex |
| DOS / Windows | CR LF | \r\n | 0x0D 0x0A |
| Mac (early) | CR | \r | 0x0D |
| Unix | LF | \n | 0x0A |
The Carriage Return (CR) and Line Feed (LF) terms derive from manual typewriters, and old printers based on typewriter-like mechanisms (typically referred to as "Daisywheel" printers).
On a typewriter, pressing Line Feed causes the carriage roller to push up one line -- without changing the position of the carriage itself -- while the Carriage Return lever slides the carriage back to the beginning of the line. In all honesty, I'm not quite old enough to have used electric typewriters, so I have a dim recollection, at best, of the entire process. The distinction between CR and LF does seem kind of pointless -- why would you want to move to the beginning of a line without also advancing to the next line? This is another analog artifact, as Wikipedia explains:
On printers, teletypes, and computer terminals that were not capable of displaying graphics, the carriage return was used without moving to the next line to allow characters to be placed on top of existing characters to produce character graphics, underlines, and crossed out text.
So far we've got:
- Confusing terms based on archaic hardware that is no longer in use, and is confounding to new users who have no point of reference for said terms;
- Completely arbitrary platform "standards" for what is exactly the same function.
Pretty much business as usual in computing. If you're curious, as I was, about the historical basis for these decisions, Wikipedia delivers all the newline trivia you could possibly want, and more:
The sequence CR+LF was in common use on many early computer systems that had adopted teletype machines, typically an ASR33, as a console device, because this sequence was required to position those printers at the start of a new line. On these systems, text was often routinely composed to be compatible with these printers, since the concept of device drivers hiding such hardware details from the application was not yet well developed; applications had to talk directly to the teletype machine and follow its conventions. The separation of the two functions concealed the fact that the print head could not return from the far right to the beginning of the next line in one-character time. That is why the sequence was always sent with the CR first. In fact, it was often necessary to send extra characters (extraneous CRs or NULs, which are ignored) to give the print head time to move to the left margin. Even after teletypes were replaced by computer terminals with higher baud rates, many operating systems still supported automatic sending of these fill characters, for compatibility with cheaper terminals that required multiple character times to scroll the display.
CP/M's use of CR+LF made sense for using computer terminals via serial lines. MS-DOS adopted CP/M's CR+LF, and this convention was inherited by Windows.
This exciting difference in how newlines work means you can expect to see one of three (or more, as we'll find out later) newline characters in those "simple" ASCII text files.
If you're fortunate, you'll pick a fairly intelligent editor that can detect and properly display the line endings of whatever text files you open. If you're less fortunate, you'll see onegiantunbrokenline, or a bunch of extra ^M characters in the file.
Even worse, it's possible to mix all three of these line endings in the same file. Innocently copy and paste a comment or code snippet from a file with a different set of line endings, then save it. Bam, you've got a file with multiple line endings. That you can't see. I've accidentally done it myself. (Note that this depends on your choice of text editor; some will auto-normalize line endings to match the current file's settings upon paste.)
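If your editor won't tell you, a few lines of script will. Here is a rough sketch in Python that counts each kind of line ending in a blob of bytes and flags a mixed file. The function names `count_line_endings` and `has_mixed_line_endings` are made up for this example, and it only knows about the three ASCII conventions in the table above.

```python
import re

# Try CRLF first so that "\r\n" is counted once, not as a CR plus an LF.
_EOL = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def count_line_endings(data: bytes) -> dict:
    """Count CRLF, lone CR, and lone LF line endings in raw bytes."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    for match in _EOL.finditer(data):
        counts[_NAMES[match.group()]] += 1
    return counts


def has_mixed_line_endings(data: bytes) -> bool:
    """True if more than one line-ending convention appears in data."""
    kinds_present = [n for n in count_line_endings(data).values() if n > 0]
    return len(kinds_present) > 1


sample = b"windows\r\nold mac\runix\nunix again\n"
print(count_line_endings(sample))      # {'CRLF': 1, 'CR': 1, 'LF': 2}
print(has_mixed_line_endings(sample))  # True
```

To check a real file, read it in binary mode, for example `open(path, "rb").read()`. Python's default text mode translates line endings as it reads, which hides exactly what you are trying to see.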
This is complicated by the fact that some editors, even editors that should know better, like Visual Studio, have no mode that shows end of line markers. That's why, when attempting to open a file that has multiple line endings, Visual Studio will politely ask you if it can normalize the file to one set of line endings.
This Visual Studio dialog presents the following five (!) possible sets of line endings for the file:
- Windows (CR LF)
- Macintosh (CR)
- Unix (LF)
- Unicode Line Separator (LS)
- Unicode Paragraph Separator (PS)
The last two are new to me. I'm not sure under what circumstances you would want those Unicode newline markers.
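For reference, those two are the code points U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR). A quick sketch in Python, using a made-up sample string, suggests why an editor might care: some APIs treat them as line boundaries and others don't.

```python
LS = "\u2028"  # Unicode LINE SEPARATOR
PS = "\u2029"  # Unicode PARAGRAPH SEPARATOR

# Hypothetical sample text mixing the Unicode separators with a plain LF.
text = "one" + LS + "two" + PS + "three\nfour"

# str.splitlines() recognizes LS and PS as line boundaries.
print(text.splitlines())  # ['one', 'two', 'three', 'four']

# Splitting on "\n" alone leaves LS and PS embedded in the first element.
pieces = text.split("\n")
print(len(pieces))        # 2

# Unlike CR and LF, each separator takes three bytes in UTF-8.
print(LS.encode("utf-8"))  # b'\xe2\x80\xa8'
print(PS.encode("utf-8"))  # b'\xe2\x80\xa9'
```

So a file can look like one line to one tool and several lines to another, without a single CR or LF in sight.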
Even if you rule out Unicode and stick to old-school ASCII, like most Facebook relationships … it's complicated. I find it fascinating that the mundane ASCII newline has so much ancient computing lore behind it, and that it still regularly bites us in unexpected places.
If you work with text files in any capacity -- and what programmer doesn't -- you should know that not all newlines are created equal. The Great Newline Schism is something you need to be aware of. Make sure your tools can show you not just those pesky invisible white space characters, but line endings as well.
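And if your tools can't, it's not hard to improvise. The sketch below, in Python, makes line endings visible and rewrites them to a single convention. The names `mark_line_endings` and `normalize_line_endings` and the `<CRLF>`-style markers are my own inventions for illustration, not any editor's actual behavior.

```python
import re

# CRLF must come first so it is treated as one ending, not two.
_ANY_EOL = re.compile(rb"\r\n|\r|\n")
_MARKERS = {b"\r\n": b"<CRLF>", b"\r": b"<CR>", b"\n": b"<LF>"}


def mark_line_endings(data: bytes) -> bytes:
    """Put a visible marker in front of each line ending, keeping the original bytes."""
    return _ANY_EOL.sub(lambda m: _MARKERS[m.group()] + m.group(), data)


def normalize_line_endings(data: bytes, eol: bytes = b"\n") -> bytes:
    """Rewrite every CRLF, lone CR, or lone LF in data as eol."""
    return _ANY_EOL.sub(lambda m: eol, data)


mixed = b"a\r\nb\rc\n"
print(mark_line_endings(mixed))                  # b'a<CRLF>\r\nb<CR>\rc<LF>\n'
print(normalize_line_endings(mixed))             # b'a\nb\nc\n'
print(normalize_line_endings(mixed, b"\r\n"))    # b'a\r\nb\r\nc\r\n'
```

Both functions work on raw bytes on purpose, so nothing gets silently translated on the way in or out.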
January 10, 2010
A Democracy of Netbooks
As a long time reader of Joey DeVilla's excellent blog, Global Nerdy, I take exception to his post Fast Food, Apple Pies, and Why Netbooks Suck:
The end result, to my mind, is a device that occupies an uncomfortable, middle ground between laptops and smartphones that tries to please everyone and pleases no one. Consider the factors:
- Size: A bit too large to go into your pocket; a bit too small for regular day-to-day work.
- Power: Slightly more capable than a smartphone; slightly less capable than a laptop.
- Price: Slightly higher than a higher-end smartphone but lacking a phone's capability and portability; slightly lower than a lower-end notebook but lacking a notebook's speed and storage.
To summarize: Slightly bigger and pricier than a phone, but can't phone. Slightly smaller and cheaper than a laptop, but not that much smaller or cheaper. To adapt a phrase I used in an article I wrote yesterday, netbooks are like laptops, but lamer.
This is so wrongheaded I am not sure where to begin. I happen to agree with Dave Winer's definition of "netbook":
- Small size.
- Low price.
- Battery life of 4+ hours. Battery can be replaced by user.
- Rugged.
- Built-in wifi, 3 USB ports, SD card reader.
- Runs my software.
- Runs any software I want; no platform vendor to decide what's appropriate.
- Competition. Users have choice and can switch vendors at any time.
Netbooks are the endpoint of four decades of computing -- the final, ubiquitous manifestation of "A PC on every desk and in every home". But netbooks are more than just PCs. If the internet is the ultimate force of democratization in the world, then netbooks are the instrument by which that democracy will be achieved.
No monthly fees and contracts.
No gatekeepers.
Nobody telling you what you can and can't do with your hardware, or on their network.
To dismiss netbooks as like laptops, but lamer is to completely miss the importance of this pivotal moment in computing -- when pervasive internet and the mass production of inexpensive portable computers finally intersected. I'm talking about unlimited access to the complete sum of human knowledge, and free, unfettered communication with anyone on earth. For everyone.
It's true that smartphones are slowly becoming little PCs, but they will never be free PCs. They will forever be locked behind an imposing series of gatekeepers and toll roads and walled gardens. Anyone with a $199 netbook and access to the internet can make free Skype videophone calls to anywhere on Earth, for as long as they want. Meanwhile, sending a single text message on a smartphone costs 4 times as much as transmitting data to the Hubble space telescope.
I don't care how "smart" your smartphone is, it will never escape those corporate shackles. Smartphones are simply not free enough to deliver the type of democratic transformation that netbooks -- mobile PCs cheap enough and fast enough and good enough for everyone to afford -- absolutely will.
That's why I love netbooks. In all their cheap, crappy glory. And you should too. Because they're instruments of user power.
The truly significant thing is this -- the users took over.
Let me say that again: The users took over.
I always say this is the lesson of the tech industry, but the people in the tech industry never believe it, but this is the loop. In the late 70s and early 80s the minicomputer and mainframe guys said the same kinds of things about Apple IIs and IBM PCs that Michael Dell is saying about netbooks. It happens over and over again, I've recited the loops so many times that every reader of this column can recite them from memory. All that has to be said is that it happened again.
Once out, the genie never goes back in the bottle.
Netbooks aren't an alternative to notebook computers. They are the new computers.
Cheap and crappy? Maybe those early models were, but having purchased a new netbook for $439 shipped, it is difficult for me to imagine the average user ever paying more than $500 for a laptop.
For the price, this is an astonishingly capable PC:
- Dual Core 1.2 GHz Intel CULV Celeron processor
- 2 GB RAM
- Windows 7 Home Premium
- 11.6" screen with 1366 x 768 resolution
- Thin (1") and light (3.5 lbs)
- Good battery life (5 hours)
- 3 USB ports, WiFi, webcam, gigabit ethernet
Windows 7 is a fine OS, but this machine would surely be cheaper without the Microsoft Tax, too.
The Acer Aspire 1410 isn't just an adequate netbook, it's a damn good computer. At these specifications, it is a huge step up from those early netbook models in every way. But don't take my word for it; read the reviews at netbooked and Liliputing. (Caveat emptor -- there are lots of 1410 models, and the newer dual core CPU version is the one you want.)
Of particular note is the CPU. While the Intel Atom is a technological coup, I don't feel current Atom CPUs deliver quite enough performance for a modern, JavaScript-heavy, video-intensive internet experience. It is quite clear that Intel is intentionally hobbling newer iterations of the Atom CPU in the name of market segmentation, and to prevent too much netbook price erosion.
That's why the current Intel CULV CPUs are far more attractive options -- they're dramatically faster, and have become power-efficient marvels. I hooked up my watt meter to this Aspire 1410 and I was surprised to find it consuming between 13 and 16 watts of power in typical use -- while my wife was browsing the web in Firefox, over a wireless connection, with multiple tabs open. I fired up the Prime95 torture test to force the CPU to 100% load, and measured 21 watts with one CPU core fully loaded, and 26 watts when both were. These are wall measurements which reflect power conversion inefficiencies of at least 20%, so real consumption was between 10 and 20 watts. I was wondering why it ran so cool; now I know. It barely uses enough power to generate any heat!
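The back-of-the-envelope arithmetic behind that range looks like this. It's a sketch in Python that assumes a flat 20% conversion loss, the low end of "at least 20%", so each result is an upper bound on what the machine itself draws. The labels and the `delivered_watts` helper are invented for this example.

```python
LOSS = 0.20  # assumed conversion loss; the real figure is "at least" this much

# Wall readings in watts, taken from the measurements described above.
wall_watts = {
    "browsing (low)": 13,
    "browsing (high)": 16,
    "one core loaded": 21,
    "both cores loaded": 26,
}


def delivered_watts(wall: float, loss: float = LOSS) -> float:
    """Estimate the power reaching the machine after conversion losses."""
    return wall * (1 - loss)


for label, watts in wall_watts.items():
    print(f"{label}: {watts} W at the wall, at most ~{delivered_watts(watts):.1f} W delivered")
```

That works out to roughly 10.4 W at the low end and 20.8 W at the high end, which is where "between 10 and 20 watts" comes from.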
Modern netbooks are not cheap and crappy. They're remarkable computers in their own right, and they're getting better every day. Which makes me wonder:
A recurring question among Apple watchers for decades has been, “When is Apple going to introduce a low-cost computer?”
Steve Jobs answered that decades-old complaint by stating, "We don't know how to build a sub-$500 computer that is not a piece of junk."
They may be pieces of junk to Mr. Jobs, but to me, these modest little boxes are marvels -- inspiring evidence of the inexorable march of powerful, open computing technology to everyman and everywhere.
We have produced a democracy of netbooks. And the geek in me can't wait to see what happens next.
December 29, 2009
Responsible Open Source Code Parenting
I'm a big fan of John Gruber's Markdown. When it comes to humane markup languages for the web, I don't think anyone's quite nailed it like Mr. Gruber. His philosophy was clear from the outset:
Markdown is intended to be as easy-to-read and easy-to-write as is feasible.
Readability, however, is emphasized above all else. A Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions. While Markdown’s syntax has been influenced by several existing text-to-HTML filters -- including Setext, atx, Textile, reStructuredText, Grutatext, and EtText -- the single biggest source of inspiration for Markdown’s syntax is the format of plain text email.
If you're an ASCII-head of any kind, you will feel immediately at home in Markdown. It was so obviously designed by someone who has done a lot of writing online, as it apes common plaintext conventions that we've collectively been using for decades now. It's certainly far more intuitive than the alternatives I've researched.
With a year and a half of real world Markdown experience under our belts on Stack Overflow, we've been quite happy. I'd say that Markdown is the worst form of markup except for all the other forms of markup that I've tried. Of course, tastes vary, and there are plenty of viable alternatives, but I'd promote Markdown without hesitation as one of the best -- if not the best -- humane markup options out there.
Not that Markdown is perfect, mind you. After exposing it to a large audience, both Stack Overflow and GitHub independently discovered that Markdown had three particular characteristics that confused a lot of users:
- URLs are never hyperlinked without placing them in some kind of explicit markup.
- The underscore [_] can be used to delimit bold and italic, but also works for intra-word emphasis. While a typical use like "_italic_" is clear, there are disturbing and unexpected side effects in phrases such as "some_file_name" and "file_one and file_two".
- It is paragraph and not line oriented. Returns are not automatically converted to linebreaks. Instead, paragraphs are detected as one or more consecutive lines of text, separated by one or more blank lines.
Items #1 and #2 are so fundamental and universal that I think they deserve to be changed in the Markdown specification itself. There was so much confusion around unexpected intra-word emphasis and the failure to auto-hyperlink URLs that we changed these Markdown rules before we even came out of private beta. Item #3, the conversion of returns to linebreaks, is somewhat more debatable. I'm on the fence on that one, but I do believe it's significant enough to warrant an explicit choice either way. It should be a standard configurable option in every Markdown implementation that you can switch on or off depending on the intended audience.
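To make the three behaviors concrete, here is a minimal Python sketch of how they might be handled as small text transformations. This is not the code Stack Overflow or GitHub actually uses, and it is not part of any Markdown implementation; the function names (`autolink`, `emphasize`, `hard_breaks`) and the regular expressions are assumptions made up for illustration.

```python
import re

# Hypothetical helpers for illustration only; not taken from any real
# Markdown implementation.

# A bare URL that is not already inside <...>, (...) or [...] markup.
BARE_URL = re.compile(r'(?<![<(\[])\bhttps?://[^\s<>]+')

# Underscore emphasis only when the underscores are not glued to a letter
# or digit on the outside, so "some_file_name" is left alone.
WORD_BOUNDARY_EM = re.compile(
    r'(?<![A-Za-z0-9])_(?!\s)(.+?)(?<!\s)_(?![A-Za-z0-9])'
)


def autolink(text):
    """Wrap bare URLs in <...>, the explicit Markdown autolink syntax."""
    def wrap(match):
        url = match.group(0)
        trailing = ''
        # Sentence punctuation right after a URL is rarely part of it.
        while url and url[-1] in '.,;:!?)':
            trailing = url[-1] + trailing
            url = url[:-1]
        return f'<{url}>{trailing}'
    return BARE_URL.sub(wrap, text)


def emphasize(text):
    """Turn _word_ into <em>word</em>, ignoring intra-word underscores."""
    return WORD_BOUNDARY_EM.sub(r'<em>\1</em>', text)


def hard_breaks(text):
    """Optional: make a single return a line break (two trailing spaces).

    Blank lines that separate paragraphs are left untouched.
    """
    return re.sub(r'(?<!\n)\n(?!\n)', '  \n', text)
```

With these rules, `emphasize("some_file_name")` comes back unchanged while `emphasize("_italic_")` still produces emphasis, and `hard_breaks` is the piece you would switch on or off depending on the audience.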
Markdown was originally introduced in 2004, and since then it has gained quite a bit of traction on the web. I mean, it's no MediaWiki (thank God), but it's in active use on a bunch of websites, some of them quite popular. And for such a popular form of markup, it's a bit odd that the last official update to the specification and reference implementation was in late 2004.
Which leads me to the biggest problem with Markdown: John Gruber.
I don't mean this as a personal criticism. John's a fantastic writer and Markdown has a (mostly) solid specification, with a strong vision statement. But the fact that there has been no improvement whatsoever to the specification or reference implementation for five years is … kind of a problem.
There are some fairly severe bugs in that now-ancient 2004 Markdown 1.0.1 Perl implementation. Bugs that John has already fixed in eight 1.0.2 betas that have somehow never seen the light of day. Sure, if you know the right Google incantations you can dig up the unreleased 1.0.2b8 archive, surreptitiously posted May 2007, and start prying out the bugfixes by hand. That's what I've had to do to fix bugs in our open sourced C# Markdown implementation, which was naturally based on that fateful (and technically only) 1.0.1 release.
I'd also expect a reference implementation to come with some basic test suites or sample input/output files so I can tell if I've implemented it correctly. No such luck; the official archives from Gruber's site include the naked Perl file along with a readme and license. The word "test" does not appear in either. I had to do a ton more searching to finally dig up MDTest 1.1. I can't quite tell where the tests came from, but they seem to be maintained by Michel Fortin, the author of the primary PHP Markdown implementation.
But John Gruber created Markdown. He came up with the concept and the initial implementation. He is, in every sense of the word, the parent of Markdown. It's his baby.
As Markdown's "parent", John has a few key responsibilities in shepherding his baby to maturity. Namely, to lead. To set direction. Beyond that initial 2004 push, he's done precious little of either. John is running this particular open source project the way Steve Jobs runs Apple -- by sheer force of individual ego. And that sucks.
Since then, all I can find is sporadic activity on obscure mailing lists and a bit of passive-aggressive interaction with the community.
On 15 Mar 2008, at 02:55, John Gruber wrote:
I despise what you've done with Text::Markdown, which is to more or less make it an alias for MultiMarkdown, almost every part of which I disagree with in terms of syntax additions.
Wow, that's pretty strong language. I'm glad I'm provoking strong opinions, and it's nice to see you actively contributing to Markdown's direction ;)
Personally, I don't actually like (or use) the MultiMarkdown extensions. As noted several times on list, I do not consider what I've done to in any way be a good solution technically / internally in it's current form, and as such Markdown.pl is still a better 'reference' implementation.
However I find it somewhat ironic that you can criticise an active effort to actually move Markdown forwards (who's current flaws have been publicly acknowledged), when it passes more of your test suite than your effort does, and when you haven't even been bothered to update your own website about the project since 2004, despite having updated the code which can be found on your site (if you dig) much more recently than this.
I despise copy-pasted code, and forks for no (real) reason - seeing another two dead copies of the same code on CPAN made me sad, and so I've done something to take the situation forwards. Maybe if you'd put the effort into maintaining a community and taking Markdown.pl forwards at any time within the last 4 years, you wouldn't be in a situation where people have taken 'your baby' and perverted it to a point that you despise. If starting with Markdown.pl and going forwards with that had been an option, then that would have been my preferred route - but I didn't see any value in producing what would have been a fifth perl Markdown implementation.
It's almost at the point where John Gruber, the very person who brought us Markdown, is the biggest obstacle preventing Markdown from moving forward and maturing. It saddens me greatly to see such negligent open source code parenting. Why work against the community when you can work with it? It doesn't have to be this way. And it shouldn't be.
I think the most fundamental problem with Markdown, in retrospect, is that the official home of the project is a static set of web pages on John's site. Gruber could have hosted the Markdown project in a way that's more amenable to open source collaboration than a bunch of static HTML. I'm pretty sure SourceForge was around in late 2004, and there are lots of options for proper open source project hosting today -- GitHub, Google Code, CodePlex, and so forth. What's stopping him from setting up shop on any of those sites with Markdown, right now, today? Markdown is Gruber's baby, without a doubt, but it's also bigger than any one person. It's open source. It belongs to the community, too.
Right now we have the worst of both worlds. Lack of leadership from the top, and a bunch of fragmented, poorly coordinated community efforts to advance Markdown, none of which are officially canon. This isn't merely inconvenient for anyone trying to find accurate information about Markdown; it's actually harming the project's future. Maybe it's true that you can't kill an open source project, but bad parenting is surely enough to cause it to grow up stunted and maybe even a little maladjusted.
I mean no disrespect. I wouldn't bring this up if I didn't care, if I didn't think the project and John Gruber were both eminently worthy. Markdown is a small but important part of the open source fabric of the web, and the project deserves better stewardship. While the community can do a lot with the (many) open source orphan code babies out there, they have a much, much brighter future when their parents take responsibility for them.
December 17, 2009
Building a PC, Part VI: Rebuilding
I can't believe it's been almost two and a half years since I built my last PC. I originally documented that process in a series of posts:
- Building a PC, Part I: Minimal boot
- Building a PC, Part II: Burn in
- Building a PC, Part III: Overclocking
- Building a PC, Part IV: Now It's Your Turn
- Building a PC, Part V: Upgrading
Now, lest you think I am some kind of freakish, cave-dwelling luddite, what with my ancient two and a half year old PC, I have upgraded the CPU, upgraded the hard drive, and upgraded the video card since then. I also went from 4 GB of RAM to 8 GB of RAM, but I didn't happen to blog about that. Normal computers age in dog years -- every year they get seven years older -- but mine isn't so bad with all my upgrades! I swear!
Judge for yourself; here's a picture of it.
But seriously.
A big part of the value proposition of building your own PC is upgrading it in pieces and parts over time. When you're unafraid to pop the cover off and get your hands dirty with a little upgrading, you can spend a lot less to stay near the top of the performance heap over time. It's like the argument for buying a car versus renting it; the smart buyers keep the car for as long as possible to maximize the value of their investment. That's what we're doing here with our upgrades, and a rebuild is the ultimate upgrade.
In defense of my creaky old computer, the Core 2 series from Intel has been unusually strong over time, one of their best overall platforms in recent memory. It was almost good enough to banish the execrable Pentium 4 series from my mind. Man, those were horrible! But the Core 2 series was a solid design with some serious legs; it scaled brilliantly, from single to dual to quad core, and in frequency from 1 GHz to 3.5 GHz.
I was initially unimpressed with the new Core i7 architecture that Intel launched to replace the Core 2. While the new Nehalem architecture is a huge win on servers, it's kind of "meh" on the desktop. I have endless battles with overzealous developers who swear up and down that they use their desktops like servers. Sure you do! And you're building the space shuttle with it, right? Of course you are. Yeah.
Meanwhile, back on Planet Desktop, there were some other reasons that I started thinking seriously about upgrading from my overclocked Core 2 Duo to a Core i7 upgrade:
- The Core i7 platform uses triple channel DDR3 memory. While the benefits of the additional bandwidth are somewhat debatable on the desktop (as usual), one interesting side-effect is that motherboards have 6 memory slots. While 16 GB is theoretically possible on Core 2 systems, it required extremely expensive 8 GB DIMMs. But with 6 memory slots, we can achieve 12 GB without breaking the bank -- by using six 2 GB DIMMs.
- The Core i7 is Intel's first "real" quad-core architecture. Intel's previous quad core CPUs were basically two dual core CPUs duct taped together on the same die. No such shortcuts were taken with the i7. While the difference is sort of academic, there are some smallish real world performance implications.
- Mainstream software is finally ready for quad core CPUs. It's not uncommon today to find applications and games that can actually use two CPU cores reasonably effectively, and those that can use four or more cores are not the extreme rarity they used to be. Don't get me wrong, scaling well to four or more CPU cores is still rare, but it's no longer spit-take rare.
- Intel introduced the mainstream second generation Core i5 series, so the platform is fairly mature. All the new architecture bugs are worked out. It's also less prohibitively expensive than it was when it was introduced.
At this point, I had the seven year upgrade itch really bad. My 3.8 GHz Core 2 Duo with 8 GB of RAM was not exactly chopped liver, but I started fantasizing a lot about the prospect of having a next generation quad-core CPU (of similar clock speed) with hyperthreading and 12 GB of RAM.
If you're wondering why I need this, or why in fact anyone would need such an embarrassment of desktop power, then I'll refer you to my friend Loyd Case.
Don’t ask me why I need six cores and 24GB. To paraphrase a Zen master, if you have to ask, you do not know.
Loyd has indirectly brought up another reason to choose the i7 platform; it's pin-compatible with Intel's upcoming "Gulftown" high end 6-core CPU. So, your upgrade path is clear. (It's also rumored that the next iteration of the Mac Pro will have two of these brand new 6-core CPUs, before any other vendor gets access to them, which is totally plausible.)
As far as I'm concerned, until everything on my computer happens instantaneously, my computer is not nearly fast enough. Besides, relative to how much my time costs, these little $200-$500 upgrades to get amazing performance are freakin' chump change. If I save a measly 15 minutes a day, it's worth it. As I like to remind pointy-haired managers all over the world, Hardware is Cheap, and Programmers are Expensive. OK, maybe I'm biased, but the conclusion was overwhelmingly clear: it's UPGRAYEDD time!
This is more than an upgrade, though, it's a rebuild -- a platform upgrade. That means I'll be assembling the following …
- new Motherboard
- new RAM
- new CPU
- new heatsink
… and dropping that into my existing system, which is highly optimized for silence. The case, power supply, hard drives, DVD-R, etc won't change. On the outside, it'll look the same, but on the inside, it's a whole new PC. This is analogous to replacing the engine in a sports car, I suppose. On the outside, it will appear to be the same car, but there's a lot more horses under the hood.
As I said in the first part of my building your own PC series, if you can assemble a LEGO kit, you can build a PC.
Take your time, be careful, and go in the right order. So, first things first. Let's assemble the CPU, heatsink, and memory on the motherboard -- in that specific sequence, because modern heatsinks can be a pain to attach.
Man, check out all that hot, sweet, PC hardware! I get a little residual thrill just cropping the picture. Love this stuff! Anyway, that gives us a mountable motherboard with all the important bits pre-installed:
- ASRock X58 Extreme motherboard ($169)
Inexpensive, has all the essential features I care about, and is recommended by Tom's Hardware. I'm not into fancy, spendy motherboards; I think they're a ridiculous waste of money.
- XIGMATEK HDT-S1283 cooler ($35)
Direct contact between the CPU cooler heatpipes and the CPU surface is the new hotness, or rather, coolness. It really works, since all the top performing CPU coolers use it now. This one is fairly inexpensive at $35 and gets great reviews. Also, I highly recommend the optional screw mount kit ($8). Modern CPU coolers are large, and the mounting mechanism needs to be more solid than plastic pushpins.
- Kingston HyperX 4GB (2 x 2GB) DDR3 2000 ($135) × 3
I've had good luck with Kingston in the past. I went with their semi-premium brand this time, as I plan to do a bit of overclocking and the price difference is fairly small. Remember, this is a 12 GB build, so we'll need three of these kits to populate all 6 memory slots on the motherboard.
- Intel Core i7-960 3.2 GHz CPU ($590)
While you could make a very solid argument that the Core i7-920 CPU ($289) is a better choice because it's identical and overclocks to the same level, I was willing to spend a bit more here as "insurance" that I get to the magical 3.8 GHz level that my old Core 2 Duo was overclocked to.
update: since a few people asked, here are my case and power supply recommendations.
- Antec P183 Black Computer Case ($140)
I used the older P180/P182 Antec case in my original series; it's still one of my favorites. This version brings some much needed improvements to airflow to accommodate higher power CPUs and video cards, as documented in a recent Silent PC review article.
- CORSAIR CMPSU-650HX 650W Power Supply ($120)
You don't want to skimp on the power supply, but there's no need to spend exorbitant amounts, either. Forget the wattage rating and look at the quality. Corsair is known for very high quality power supplies. The HX series is a bit more, but has modular cables, which makes for a cleaner build.
It adds up to about $1000 all told. A rebuild is definitely more expensive than one-off upgrades of CPU, memory, and hard drive. But, remember, this is a rebuild of my PC -- and a fire-breathing, top of the line performance rebuild at that. That takes spending a moderate (but not exorbitant) amount of money.
Now that we've got all that stuff assembled, the next thing to do is open my existing PC, disconnect all the cables going to the motherboard, temporarily remove any expansion cards, unscrew the motherboard and lift it out.
Once the old motherboard assembly was pulled out, I plopped in the new motherboard, screwed it down, and reattached the cables and expansion cards. Don't close up the PC at this point, though. Before powering it on, double check and make sure all the cables are reattached correctly:
- Power cables from the PSU to the motherboard. There are usually at least two, on modern PCs.
- Hard drive cables from the HDDs to the motherboard.
- Power switch, Reset switch, Activity light cables. Without the power switch connected, good luck powering up. This motherboard happens to have built-in power and reset switches for testing, but most don't.
- Fan connectors from the Heatsink and case fans to the motherboard.
- Power cables from the PSU to the video card, if you have a fancy video card.
If anything is wrong, we'll just have to re-open the case again. On top of that, we need to monitor temperatures and airflow, and that's much easier with the case open.
Fortunately, my rebuild booted up on the first try. If you're not so lucky, don't fret! Disconnect the power cord, then go back and re-check everything. I get it wrong, sometimes, too; I actually forgot to reconnect the video card power connectors, and was wondering why only the secondary video card was booting up. Once I re-checked, I immediately saw my mistake, fixed it, and rebooted.
Once you have a successful boot, don't even think about booting into the operating system yet. Enter the BIOS (this is typically done by pressing F12 or Delete during bootup) and check the BIOS screens to make sure it's detecting your hard drives, memory, and any optical drives successfully. Browse around and do some basic reality checks. Then do not pass GO, do not collect $200, go straight to your motherboard manufacturer's website and download the latest BIOS. On another computer, obviously. Most modern motherboards allow updating the BIOS from a USB key, so just copy the BIOS files on the USB key, reboot, and use the BIOS menus to update. After you've updated the BIOS, set BIOS options to taste, and we're finally ready to boot into an operating system.
While this may sound like a lot of work, it really isn't. All told it was maybe an hour, tops. I'm fairly experienced at this stuff, but it's fundamentally not that complicated; it's still just a very fancy adult LEGO kit.
Courtesy of this $1000 rebuild, my ancient 2.5 year old PC is reborn as a completely new state-of-the-art PC, at least internally. That was always part of the plan! Next up -- once we've proven that it's stable in typical use -- overclocking, naturally. I'll have more on that in a future blog post, but I can tell you right now that Core i7 overclocking is … interesting.
December 14, 2009
International Backup Awareness Day
You may notice that commenting is currently disabled, and many old Coding Horror posts are missing images. That's because, sometime early on Friday, the server this blog is hosted on suffered catastrophic data loss.
Here's what happened:
- The server experienced routine hard drive failure.
- Because of the hard drive failure, the virtual machine image hosting this blog was corrupted.
- Because the blog was hosted in a virtual machine, the standard daily backup procedures at the host were unable to ever back it up.
- Because I am an idiot, I didn't have my own (recent) backups of Coding Horror. Man, I wish I had read some good blog entries on backup strategies!
- Because there were no good backups, there was catastrophic data loss. Fin, draw curtain, exeunt stage left.
At first, I was upset with our provider, CrystalTech.
I am still confused how the most common, routine, predictable, and mundane of server hardware failures -- losing a mechanical hard drive -- could cause such extreme data loss carnage. What about, oh, I don't know, a RAID array? Aren't they designed to prevent this kind of single point of failure drive loss catastrophe? Isn't a multi drive RAID array sort of standard on datacenter servers? I know we have multi-drive RAID arrays on all of our Stack Overflow servers.
I also wish their routine backup procedures had greater awareness of virtual machine images. While I'll grant you that backing up a live virtual machine is somewhat complex, and typically requires special operating system support and API hooks, it is not exactly an unknown science at this point in time. Heck, at the very least, just let us know when the backup has been regularly failing each day, every day, for years.
Then I belatedly realized that this was, after all, my data. And it is irresponsible of me to leave the fate of my data entirely in someone else's hands, regardless of how reliable they may or may not be. Responsibility for my data begins with me. If I haven't taken appropriate measures, who am I to cast aspersions on others for not doing the same? Glass houses and all that.
So, I absolve CrystalTech of all responsibility in this matter. They've given us a great deal on our dedicated server, and performance and reliability (with one recent, uh... exception) have been excellent to date. It is completely my fault that I neglected to have proper backups in place for Coding Horror. Well, technically, I did have a backup but it was on the virtual machine itself. Does that count? No? Halfsies?
Apparently, I was gambling that nothing bad would ever happen at the datacenter. Because that's what you're doing when you run without your own backups. Gambling.
I'll add gambling to the long, long list of things I suck at. I don't know when to hold 'em or when to fold 'em.
Now that I've apologized, it's time to let the healing begin. And by healing, I mean the excruciatingly painful process of reconstructing Coding Horror from internet caches and the few meager offsite backups I do have. My first order of business was to ask on SuperUser what strategies people recommend for recovering a lost website with no backup. Strategies other than berating me for my obvious mistake. Also, comments are currently disabled while the site is being reconstructed from static HTML. Oh, darn!
I'll let my son Rock Hard Awesome stand in for the zinger of a comment that I know some of you were just dying to leave.
I'm not saying I don't deserve it. Consider me totally zingatized.
I mentioned my woes on Twitter and I was humbled by the outpouring of community support. Thanks to everyone who reached out with support of any kind. It is greatly appreciated.
I was able to get a static HTML version of Coding Horror up almost immediately thanks to Rich Skrenta of blekko.com. He kindly provided a tarball of every spidered page on the site. Some people have goals, and some people have big hairy audacious goals. Rich's is especially awe-inspiring: taking on Google on their home turf of search. That's why he just happened to have a complete text archive of Coding Horror at hand. Rich, have I ever told you that you're my hero? Anyway, you're viewing the static HTML version of Coding Horror right now thanks to Rich. Surprisingly, there's not a tremendous amount of difference between a static HTML version of this site and the live site. One of the benefits of being a minimalist, I suppose.
That pretty much solved all my text post recovery problems in one fell swoop. Through this process, I've learned that anything even remotely popular you put on the web will be archived as text, forever, by a dozen different web spiders. I don't think you can actually lose text you post on the web. Not in any meaningful sense; I'm not sure it's possible. As long as you're willing to spend the time digging through web spider archives in some form (and yes, I did cheat mightily), you can always get textual content back, all of it.
The blog images, however, are another matter entirely. I have learned the hard way that there are almost no organizations spidering and storing images on the web. Yes, there is archive.org, and God bless 'em for that. But they have an impossible job they're trying to do with limited resources. Beyond that, there's ... well, frankly, a whole lot of nothing. A desperate, depressing void of nothing. In fact, if you can only back up one thing on your public website, it should be the images. Because that's the thing you'll have the most difficulty recovering when catastrophe happens. I'm planning to donate $100 to archive.org as I have a whole new appreciation for how rare an internet-wide full archive service -- one that includes images -- really is.
That said, there are some limited, painful avenues to explore for recovering lost website images. I started with an ancient complete backup from mid 2006 with full images. And then Maciej Ceglowski of the nifty full-archive bookmarking service pinboard.in generously contributed about 200 blog posts that he had images for.
I also went through a period when I was going on a bandwidth diet and experimenting with hosting Coding Horror images elsewhere on the web. I'm slowly going through and recovering images locally from there. Beyond that, several avid Coding Horror readers contributed some archived images -- so thanks to Yasushi Aoki, Marcin Gołębiowski, Peter Mortensen, and anybody else I've forgotten.
Also, I should point out that a few enterprising programmers have proposed clever schemes for automatic recovery of images, such as Niyaz with his blog post Get cached images from your visitors, and John Siracusa with his highly voted 304 idea. I haven't had time to follow up on these yet but they seem plausible to me.
I've restored all the images I have so far, but it's still woefully incomplete. The most important part of Coding Horror is definitely the text of the posts, but I do have some regrets that I've lost key images from many blog posts, including those about my son. It feels like irresponsible parenting, in the broadest possible sense of the words.
The process of image recovery is still ongoing. If you'd like to contribute lost Coding Horror images, please do. I'd be more than happy to mail stickers on my dime to anyone who contributes an image that is currently a 404 on the site. Update: That was fast. Carmine Paolino, a computer science student at the University of Bologna, somehow had a nearly complete mirror of the site backed up on his Mac! Thanks to his mirror, we've now recovered nearly 100% of the missing images and content. I've offered to donate $100 to the charity or open source project of Carmine's choice.
What can we all learn from this sad turn of events?
- I suck.
- No, really, I suck.
- Don't rely on your host or anyone else to back up your important data. Do it yourself. If you aren't personally responsible for your own backups, they are effectively not happening.
- If something really bad happens to your data, how would you recover? What's the process? What are the hard parts of recovery? I think in the back of my mind I had false confidence about Coding Horror recovery scenarios because I kept thinking of it as mostly text. Of course, the text turned out to be the easiest part. The images, which I had thought of as a "nice to have", were more essential than I realized and far more difficult to recover. Some argue that we shouldn't be talking about "backups", but recovery.
- It's worth revisiting your recovery process periodically to make sure it's still alive, kicking, and fully functional.
- I'm awesome! No, just kidding. I suck.
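For what it's worth, "do it yourself" and "revisit your recovery process" don't have to mean anything fancy. Here's a rough Python sketch of the idea, using only the standard library: archive the site directory, then immediately check that every file on disk is actually in the archive. The names `back_up` and `verify`, the `site-<timestamp>.tar.gz` naming scheme, and the directory layout are all hypothetical. In real life `backup_dir` should live on a different machine than the site itself, which is exactly the part I got wrong.

```python
import tarfile
import time
from pathlib import Path


def back_up(site_dir, backup_dir):
    """Write a timestamped .tar.gz of site_dir into backup_dir.

    Assumption: backup_dir is somewhere other than the server being
    backed up (another machine, an external drive, and so on).
    """
    site_dir, backup_dir = Path(site_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = backup_dir / f"site-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(site_dir, arcname=site_dir.name)
    return archive


def verify(archive, site_dir):
    """Return the files on disk that are missing from the archive,
    or whose size differs from the archived copy."""
    site_dir = Path(site_dir)
    with tarfile.open(archive, "r:gz") as tar:
        archived = {m.name: m.size for m in tar.getmembers() if m.isfile()}
    problems = []
    for path in sorted(site_dir.rglob("*")):
        if path.is_file():
            name = (Path(site_dir.name) / path.relative_to(site_dir)).as_posix()
            if archived.get(name) != path.stat().st_size:
                problems.append(name)
    return problems
```

An empty list from `verify` only says the archive matches the disk right now. It is not a substitute for occasionally restoring from the archive and looking at the result, images included.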
So when, exactly, is International Backup Awareness Day? Today. Yesterday. This week. This month. This year. It's a trick question. Every day is International Backup Awareness Day. And the sooner I figure that out, the better off I'll be.
December 10, 2009
Microformats: Boon or Bane?
I recently added microformat support to the free public CVs at careers.stackoverflow.com by popular demand.
Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards.
The official microformat "elevator pitch" tells us nothing useful. That's not a good sign. It doesn't get much better on the learn more link, either.
I'm left scratching my head, wondering why I should care. What problem, exactly, do microformats solve for me as a user? As a software developer? There's lots of hand-wavy talk about data, but precious little in the way of concrete stories or real world examples.
But I have a real world example: a CV. To some human resource departments the standard web interchange format for a CV or Resume is already established -- it's called "Microsoft Word". I have no beef with Word, but certainly we'd like to pick a more simple, open data format for our personal data than Microsoft Word -- and the hResume microformat seems to fit the bill. And if your CV is published on the web in a standard(ish) format, it's easier to take it with you wherever you need to go.
I had already implemented the tag and identity microformats on Stack Overflow many months ago. I wasn't convinced of the benefits, but the implementation was so easy that it seemed like more work to argue the point than to actually get it done. Judge for yourself:
<a href="http://www.codinghorror.com/" rel="me">codinghorror.com</a>
<a href="/questions/tagged/captcha" rel="tag">captcha</a>
Fairly clean and simple, right? That was the extent of my experience with microformats. Limited, but positive. Then I read through the hResume microformat spec. You should read it too. Go ahead. I'll wait here.
My first impression was not positive, to put it mildly. So you want me to take the ambiguous, crappy "HTML" markup we already have and layer some ambiguous, crappy "microformat" markup on top of it? And that's … a solution? If that's what microformats are going to be about, I think I might want off the microbus.
Let's take a look at a representative slice of hResume markup:
<div class="vcard">
  <a class="fn org url" href="http://example.com/">Example</a>
  <div class="adr">
    <span class="type">Work</span>:
    <div class="street-address">169 Maple Ave</div>
    <span class="locality">Anytown</span>,
    <abbr class="region" title="Iowa">IA</abbr>
    <span class="postal-code">50981</span>
    <div class="country-name">USA</div>
  </div>
</div>
As you can see, the crux of microformats is overloading CSS classes. When you give something the "adr" class within the "vcard" class, that means it's the address data field within the hCard, within the hResume.
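To see what "overloading CSS classes" means for anything consuming the markup, here is a small Python sketch that pulls fields out of the sample above using the standard library's `html.parser`. It is a toy, not a conforming hCard parser: the `LEAF_FIELDS` set, the `HCardFieldParser` class and the `parse_hcard` function are my own hypothetical names, and the only special cases handled are the `title` attribute on `abbr` and the `href` for `url`.

```python
from html.parser import HTMLParser

# The class names this toy treats as data fields (an assumption for this
# sketch; the real hCard spec defines many more).
LEAF_FIELDS = {
    "fn", "org", "url", "type", "street-address",
    "locality", "region", "postal-code", "country-name",
}

SAMPLE = """
<div class="vcard">
  <a class="fn org url" href="http://example.com/">Example</a>
  <div class="adr">
    <span class="type">Work</span>:
    <div class="street-address">169 Maple Ave</div>
    <span class="locality">Anytown</span>,
    <abbr class="region" title="Iowa">IA</abbr>
    <span class="postal-code">50981</span>
    <div class="country-name">USA</div>
  </div>
</div>
"""


class HCardFieldParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.fields = {}   # field name -> extracted value
        self._stack = []   # open elements as (tag, [field names])

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        names = [c for c in (attrs.get("class") or "").split()
                 if c in LEAF_FIELDS]
        self._stack.append((tag, names))
        # <abbr title="..."> carries the full value in its title.
        if tag == "abbr" and attrs.get("title"):
            for name in names:
                self.fields[name] = attrs["title"]
        # For "url", the link target is the value, not the link text.
        if "url" in names and attrs.get("href"):
            self.fields["url"] = attrs["href"]

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerates unclosed tags.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            for name in self._stack[-1][1]:
                self.fields.setdefault(name, text)


def parse_hcard(markup):
    parser = HCardFieldParser()
    parser.feed(markup)
    parser.close()
    return parser.fields
```

Note that the parser has no way to know which class names are "just styling" and which are data. Feed it `SAMPLE.replace("postal-code", "postalcode")` and the postal code silently disappears from the result, with no error anywhere.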
While I can see the attraction, this approach makes me uneasy:
- We're overloading the class attribute with two meanings. Is "adr" the name of a class that contains styling information, or the name of a data field? Or both? It's impossible to tell. The minute you introduce a microformat into your HTML, the semantics of the class attribute have been permanently altered.
- The microformat css class names may overlap with existing css classes. Woe betide the poor developer who has to retrofit a microformat on an established site where "locality" or "region" have already been defined in the CSS and are associated with elements all over the site. And let me tell you, many of the microformat css field names are, uh, conveniently named what you've probably already used in your HTML somewhere. In the wrong way, inevitably.
- There's no visual indication whatsoever that any given css class is a microformat. If you hire a new developer, how can they possibly be expected to know that "postal-code" isn't just an arbitrarily chosen CSS class name, it's a gosh darned officially blessed microformat? What if they decide they don't like dashes in CSS class names and rename the style "postalcode"? Wave bye bye to your valid microformat. If it seems fragile and obtuse, that's because it is.
- The spec is incredibly ambiguous. I read through the hResume, hCard, and hCalendar specs multiple times, checked all the samples, viewed source on existing sites, used all the validators I could find, and I still got huge swaths of the format wrong! For a "simple" and "easy" format, it's … anything but, in my experience. The specification is full of ambiguities and requires a lot of interpretation to even get close. I'm not the world's best developer, but I'm theoretically competent, and if I can't implement hResume without wanting to cut myself and/or writing snarky blog posts like this, how can we expect everyone else to?
- It doesn't handle unstructured data well. On Stack Overflow, we have a single "location" field. No city, state, zip, lat, long, and all that jazz: just an unstructured, freeform, enter-whatever-pleases-you "location" field. This was awkward to map in hCard, which practically demands that addresses be chopped up into meticulous little sub-fields. This is a bit ironic for a format supposedly designed to work with the loose, unstructured world wide web. Oh, and this goes double for dates. If you don't have an ISO datetime value, good luck.
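The name-collision worry in the list above is at least cheap to check for before a retrofit. The following is a rough Python sketch, not a real CSS parser: `MICROFORMAT_NAMES` is just the set of class names from the hCard sample, `colliding_classes` is a hypothetical helper, and `SITE_CSS` is an invented stylesheet. The regular expression grabs anything that looks like a `.class` selector, which is good enough for a quick audit but will also pick up harmless noise such as file extensions.

```python
import re

# Assumption: only the class names that appear in the hCard sample above.
MICROFORMAT_NAMES = {"vcard", "adr", "fn", "org", "url", "type",
                     "street-address", "locality", "region",
                     "postal-code", "country-name"}


def colliding_classes(css_text):
    """Return microformat names already used as class selectors in css_text."""
    # Crude match for ".name" tokens; a real audit would use a CSS parser.
    selectors = set(re.findall(r"\.(-?[_a-zA-Z][\w-]*)", css_text))
    return sorted(selectors & MICROFORMAT_NAMES)


# Hypothetical existing site stylesheet.
SITE_CSS = """
.region { float: left; }
.sidebar .locality, .nav-item { color: #333; }
.hero { background: url(img/hero.png); }
"""

print(colliding_classes(SITE_CSS))
```

Any name this reports is a class that already carries styling somewhere on the site, so adding the microformat would give it a second, unrelated meaning.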
Maybe I have a particular aversion to getting my chocolate data structure mixed up with my peanut butter layout structure, but it totally skeeves me out that the microformat folks actually want us to design our CSS and HTML around these specific, ambiguous and non-namespaced microformat CSS class names. It feels like a hacky overload. While you could argue this is no different than the web and HTML in general -- a giant wobbly teetering tower of nasty, patched-together hacks -- something about it fundamentally bothers me.
Now, all that said, I still think microformats are useful and worth implementing, if for no other reason than it's too easy not to. If you have semi-structured data, and it maps well to an existing microformat, why not? Yes, it is kind of a hack, but it might even be a useful hack if Google starts indexing your microformats and presenting them in search results. While I'm unclear on the general benefits of microformats for end users or developers, seeing stuff like this in search results …
… is enough to convince me that microformats are a step in the right direction. Warts and all. While we're waiting for HTML5 and its mythical data attributes to ship sometime this century, it's better than nothing.
December 3, 2009
Version 1 Sucks, But Ship It Anyway
I've been unhappy with every single piece of software I've ever released. Partly because, like many software developers, I'm a perfectionist. And then, there are inevitably … problems:
- The schedule was too aggressive and too short. We need more time!
- We ran into unforeseen technical problems that forced us to make compromises we are uncomfortable with.
- We had the wrong design, and needed to change it in the middle of development.
- Our team experienced internal friction between team members that we didn't anticipate.
- The customers weren't who we thought they were.
- Communication between the designers, developers, and project team wasn't as efficient as we thought it would be.
- We overestimated how quickly we could learn a new technology.
The list goes on and on. Reasons for failure on a software project are legion.
At the end of the development cycle, you end up with software that is a pale shadow of the shining, glorious monument to software engineering that you envisioned when you started.
It's tempting, at this point, to throw in the towel -- to add more time to the schedule so you can get it right before shipping your software. Because, after all, real developers ship.
I'm here to tell you that this is a mistake.
Yes, you did a ton of things wrong on this project. But you also did a ton of things wrong that you don't know about yet. And there's no other way to find out what those things are until you ship this version and get it in front of users and customers. I think Donald Rumsfeld put it best:
As we know,
There are known knowns.
There are things we know we know.
We also know
There are known unknowns.
That is to say
We know there are some things
We do not know.
But there are also unknown unknowns,
The ones we don't know
We don't know.
In the face of the inevitable end-of-project blues -- rife with compromises and totally unsatisfying quick fixes and partial solutions -- you could hunker down and lick your wounds. You could regroup and spend a few extra months fixing up this version before releasing it. You might even feel good about yourself for making the hard call to get the engineering right before unleashing yet another buggy, incomplete chunk of software on the world.
Unfortunately, this is an even bigger mistake than shipping a flawed version.
Instead of spending three months fixing up this version in a sterile, isolated lab, you could be spending that same three-month period listening to feedback from real live, honest-to-god, dedicated users of your software. Not the software as you imagined it, and the users as you imagined them, but as they exist in the real world. You can turn around and use that directed, real world feedback to not only fix all the sucky parts of version 1, but spend your whole development budget more efficiently, predicated on hard usage data from your users.
Now, I'm not saying you should release crap. Believe me, we're all perfectionists here. But the real world can be a cruel, unforgiving place for us perfectionists. It's saner to let go and realize that when your software crashes on the rocky shore of the real world, disappointment is inevitable … but fixable! What's important isn't so much the initial state of the software -- in fact, some say if you aren't embarrassed by v1.0 you didn't release it early enough -- but what you do after releasing the software.
The velocity and responsiveness of your team to user feedback will set the tone for your software, far more than any single release ever could. That's what you need to get good at. Not the platonic ideal of shipping mythical, perfect software, but being responsive to your users, to your customers, and demonstrating that through the act of continually improving and refining your software based on their feedback. So to the extent that you're optimizing for near-perfect software releases, you're optimizing for the wrong thing.
There's no question that, for whatever time budget you have, you will end up with better software by releasing as early as practically possible, and then spending the rest of your time iterating rapidly based on real world feedback.
So trust me on this one: even if version 1 sucks, ship it anyway.



