Coding Horror
programming and human factors
by Jeff Atwood

June 22, 2006

Text Columns: How Long is Too Long?

Ian Griffiths recently wrote a proof of concept WPF browser for the MSDN online help. One of the improvements cited is multi-column text:

This is why WPF offers a column-based reading experience. We know from experience in the print world that breaking text into columns can make it much easier to read. Indeed, once you've got used to reading in columns, going back to the long unbroken lines offered by the standard SDK viewer feels like punishment!

Code is a highly specialized form of writing, but the same sort of questions always come up. Should we use short lines or long lines? A recent comment from Buggy Fun Bunny* describes this conundrum well:

What I've found amusing for the last 20 years is this insistence, by people who should know better, on 80 column (or less) line limits. That was invented by COBOL (which only has ~66 characters available after you subtract the line number from 73) around 1960. It's just silly. I remember how wonderful it was when the VT-220 came along and I could use 132 character lines. If nothing else, formatting a report both in code and as output was a piece of cake.

Beyond the character world, we have wide screen taking over, and folks still think that 80 columns is King. There's more horizontal real estate, and always was. Use it well.

And as to scanning narrow: not everybody does; I've had the Great Books set and can't read them because those narrow double columns drive my eyes over the edge.

Hey, I'm a Great Books fan from way back in the day. Nothing blows a seventh grader's mind quite like reading Graham Greene's The Destructors, or Carson McCullers's Sucker. No wonder people want to burn books. They're subversive.

So what's the best way to structure columns of text on a computer screen? How long is too long? Luckily, the Software Usability Research Laboratory at Wichita State has already researched-- and answered-- this question.

But before I reveal the answer, what do you think? Which of these passages is easier to read and comprehend?

Single column, left aligned
[Image: Alice in Wonderland, single column, left aligned]

Single column, justified
[Image: Alice in Wonderland, single column, justified]

Double column, left aligned
[Image: Alice in Wonderland, double column, left aligned]

Double column, justified
[Image: Alice in Wonderland, double column, justified]

The answer is more complex than you might think. The SURL paper first summarizes the findings of previous research:

  • Longer line lengths generally facilitate faster reading speeds.
  • Shorter line lengths result in increased comprehension.
  • The optimal number of characters per line is between 45 and 65.
  • Paging through online text generally results in better comprehension than scrolling.
  • Reading speed is faster for both single and multiple columns, but preference is for multiple short columns.
  • Left-justified text is read faster than full-justified text.

And here's what they found in their study:

[Figure: reading speed results graph]

The results of this study suggest that there is not one best way to present text online. Although fast readers performed best at the two-column full-justified condition, slow readers benefited from a single column non-justified layout.

Personally, I was surprised that justification had such a strong influence on the results. Simply breaking up text into columns actually hurts reading speed noticeably for both slow and fast readers. However, if you fully justify the columns, then-- and only then-- reading speed increases dramatically for everyone. And clearly, three column layouts aren't worth it, no matter how you align the text.

So what conclusion can we draw for coders? Probably not much. Code is always left aligned, and in a single column. A related SURL study, which examines the effect of line length alone, might be more relevant:

This study examined the effects of line length on reading performance. Reading rates were found to be fastest at 95 cpl. Readers reported either liking or disliking the extreme line lengths (35 cpl, 95 cpl). Those that liked the 35 cpl indicated that the short line length facilitated “faster” reading and was easier because it required less eye movement. Those that liked the 95 cpl stated that they liked having more information on a page at one time. Although some participants reported that they felt like they were reading faster at 35 cpl, this condition actually resulted in the slowest reading speed.

Furthermore, line length had no effect on comprehension. Although I'm hesitant to draw broad parallels between source code and a news article, perhaps arbitrarily short lines aren't always necessary in source code to achieve good readability and comprehension.

* I just write this stuff down. I don't make it up.

Comments

So at three columns, justified, the fast readers were slower than the slow readers?!

I'm a fast reader and shorter lines of text seem faster to me since my eyes can chain easily to the next first line of text, where with longer lines I tend to hunt for the next line when I'm done with the current one.

However, I think this has doubtful applicability to code. IMO, each line should be a new thought and code should normally be browsed with word wrap off and lines extending to infinity. Once a thought needs to be explored, word wrap should be turned on, but otherwise not.

I do think it's important that the line express its major intent in the first 80 characters or so, though. Anything beyond that should be either further parameters that are of little importance or a continuation of the thought expressed in the beginning of the line.

I think doing it like that makes it easier for the brain to scan (thought 1, thought 2, thought 3, not thought 1, thought ... er ... 1, thought 2).

Anthony Mills on June 23, 2006 03:50 PM

Hmm, your conclusion seems strange. Reading speed would increase with longer lines, but not comprehension. That's what the research seemed to say, anyway.

What do we want, code-wise? Should people be able to understand the code, or read it fast?

Simon on June 23, 2006 03:50 PM

> Reading speed would increase with longer lines, but not comprehension. That's what the research seemed to say, anyway.

Actually, they didn't find any statistically significant relationship between # of columns and comprehension:

"A 2x3 randomized block ANOVA found no significant main effects or interaction for either justification or number of columns for total comprehension."

Nor was there any statistically significant result for "satisfaction", eg, the reader's opinion of how easy the page was to read. The only statistically significant results were for reading *efficiency*, which they define as reading speed divided by comprehension scores:

"an interaction approaching significance for justification x number of columns was found"

Whether the columns are justified or not turns out to be hugely important!

Jeff Atwood on June 23, 2006 04:01 PM

> I do think it's important that the line express its major intent in the first 80 characters or so

Definitely.

The comparison is a little shaky, because you NEVER see code formatted in multiple columns. What we're really comparing is a single column with long lines to a single column with short lines. And this study doesn't cover that.

It does, however, provide some decent evidence that arbitrarily short lines aren't *always* superior to long lines.

[Note: I modified my conclusion in the post to clarify this point]

Jeff Atwood on June 23, 2006 04:06 PM

I think we should demand full justification for the VS.NET code editor.

Haacked on June 23, 2006 04:16 PM

We have standardized on 80 columns because we spend a lot of time reviewing code revision changes side by side using a difference viewer. Viewing differences side by side becomes much easier when the code is written in a narrow column.

Big B on June 23, 2006 04:21 PM

"approaching significance"?? That sounds exactly like "insignificant."

Scott Johnson on June 23, 2006 04:25 PM

> I think we should demand full justification for the VS.NET code editor.

I'm trying really hard to imagine what that would look like.. ;)

> Viewing differences side by side becomes much easier when the code is written in a narrow column.

This is a great point. It's one of my pet peeves, too. Code that spreads out too much horizontally is smelly:

http://www.codinghorror.com/blog/archives/000486.html

The more I think about this, the more I think that code is such a specialized form of writing that none of the typical reading rules apply.

Keeping your lines short is still an important guideline. But arbitrarily refusing to allow any code to extend past column 80 is unnecessarily inflexible.

[updated post to reflect newly discovered SURL study on line length]

Jeff Atwood on June 23, 2006 04:33 PM

Keeping one's code within a 78 column limit does have the advantage of being rather a lot kinder to those on 80 column terminals when emailing it about than having lines of code either run off the edge of the screen or line break in an ugly manner. Though I suppose this is of only trivial importance unless you're attempting to work with rather oldschool programmers.

Richard on June 23, 2006 05:45 PM

I like your blog because it uses the full width of the browser for the text.

Too bad that bloglines makes you scroll sidewise because other elements of the feed such as pictures increase the virtual width and render some words outside of the view pane.

piyo on June 23, 2006 06:28 PM

Even I can't believe this high percentage, but I just checked how my main project written in Ruby gets by in line size:

# Root directory of the project to scan
d = '/home/dewd/workspace/pilar'

# Histogram: line length => number of lines with that length
sizes = Hash.new(0)

Dir["#{d}/**/*.rb"].each { |f|
  # Skip blank lines; note that line.size still counts the trailing newline
  ts = IO.readlines(f).delete_if { |line| line.strip.size == 0 }.map { |line| line.size }
  ts.each { |sz| sizes[sz] += 1 }
}

# Tally all lines, and those shorter than seventy characters
total = 0
less_than_seventy_chars = 0
sizes.each { |size, n_occurrences|
  total += n_occurrences
  less_than_seventy_chars += n_occurrences if size < 70
}

perc = (less_than_seventy_chars * 100) / total.to_f

puts '====== Result:'
p 'Percentage of lines with less than seventy chars: %f' % perc
p 'Total of lines: %d' % total
p 'Number of lines with less than seventy chars: %d' % less_than_seventy_chars


====== Result:
"Percentage of lines with less than seventy chars: 95.753574"
"Total of lines: 33016"
"Number of lines with less than seventy chars: 31614"

Nx on June 23, 2006 06:34 PM

I'm surprised that you're even comparing text for reading with text for code. I've never seen anyone concern themselves about how _fast_ you can read a line of code. Speed is nothing. Comprehension is everything. And comprehension is much more a matter of what's going on in the line, not how long it is. I would posit that long lines of code are almost always less comprehensible than shorter ones because there's more than one thing going on in the line -- nested function calls, ANDed conditions, whatever. Not always, but mostly.

I must say I'm surprised that justified text tests better than ragged text, mostly because the automatic justification algorithms in browsers and in word-processing software can result in some pretty horrible-looking text. (One way typesetters get justification is to split a lot of words so that there are not yawning fields of whitespace in lines, but no automatic algorithm I'm familiar with tackles the tricky job of splitting words.) I notice this particularly, of course, when reading some sort of technical material that's full of method names or URLs or whatever -- ie, words that are long and un-splittable. Those totally hork up justification.

Incidentally, I must respectfully disagree with piyo -- I really dislike text that goes all the way out to the margins. But, you know, different gaps for different chaps and all that.

mike on June 23, 2006 06:41 PM
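
The whitespace problem mike describes is easy to reproduce. What follows is only a naive greedy justifier, written as a sketch: it does no hyphenation, it is not the algorithm any particular browser or word processor uses, and the names `justify_line` and `justify` (and the sample sentence) are made up for illustration.

```ruby
# Pad one line's words out to exactly `width` columns by spreading the
# spare spaces across the gaps, leftmost gaps first.
def justify_line(words, width)
  return words.join if words.size < 2
  gaps = words.size - 1
  spare = width - words.sum(&:size)
  base, extra = spare.divmod(gaps)
  words.each_with_index.map { |word, i|
    i < gaps ? word + ' ' * (base + (i < extra ? 1 : 0)) : word
  }.join
end

# Greedy fill: put as many words on each line as fit, then justify every
# line except the last. A word longer than `width` sits alone and overflows.
def justify(text, width)
  lines = [[]]
  text.split.each do |word|
    current = lines.last
    needed = current.sum(&:size) + current.size + word.size
    if current.empty? || needed <= width
      current << word
    else
      lines << [word]
    end
  end
  last = lines.pop
  lines.map { |ws| justify_line(ws, width) } + [last.join(' ')]
end

# Hypothetical sentence containing one long, un-splittable identifier
puts justify('Call the GetDefaultAuthenticationProviderFactory method first', 40)
```

With a 40 column measure, the long identifier cannot share a line with its neighbors, so the first line ends up as two short words separated by one enormous gap.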

My preference isn't for a given line length but for the rule "Try very hard not to allow a line of code to extend past the right edge of the editor viewport, avoiding horizontal scrolling."

I find that comprehension drops drastically when I can't see the entire statement at once. Plus, more than once I've been bitten in languages that don't require a statement terminator (VB, etc.) by something off the right edge that I didn't realize was there.

I'll manually wrap statements as they approach the right edge. Granted, the number of columns in the viewport will vary based on screen resolution, etc. Other than the "kids" whose eyes can handle 1600x1200 on a 15-inch LCD, most of the people I work with have about the same number of columns to work with.

Mick on June 23, 2006 08:08 PM
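
Mick's rule can be approximated mechanically. The sketch below assumes a fixed viewport width (100 columns is an arbitrary choice) and a hypothetical helper named `lines_past_edge`; a real viewport varies with font, resolution and window size.

```ruby
# Hypothetical helper: return [line_number, length] pairs for every line
# that would run past a viewport `width` columns wide.
# Trailing newlines are not counted. Tabs are counted as one column each,
# which understates the displayed width of tab-indented code.
def lines_past_edge(lines, width = 100)
  lines.each_with_index
       .map { |line, i| [i + 1, line.chomp.size] }
       .select { |_, len| len > width }
end

source = ["short line\n", 'x' * 120 + "\n", 'y' * 100 + "\n"]
lines_past_edge(source).each do |number, len|
  puts "line #{number}: #{len} columns"
end
```

To check a real file, pass it the result of `File.readlines(path)`. A line of exactly `width` characters is treated as fitting.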

I personally feel that the longer the line is, the harder it is to read and comprehend, but that may just be because shorter lines are generally less important than longer lines. As far as my own code goes, I like having lines of different lengths scattered through the code because it allows me to scan the file more quickly, looking for a set of lines that resemble the block of code I was working on. The mind can match shapes faster than text.

Andrew Ray on June 23, 2006 08:37 PM

Long lines are definitely a problem when you have two files open side by side. Anyway I think code might have more in common with musical notation than the English language.

Chris L on June 23, 2006 09:41 PM

There's a pretty big and basic difference between the length of code lines and those of regular text: long code lines are usually long because they start at a deep indentation level, meaning they have a large amount of whitespace on the left. For the purpose of reading comprehension this whitespace doesn't count. So if the indentation level is six tabs at four spaces, your 100 character line is actually shorter than 80 characters.

Another point is recent popularity of very long names for individual code units, as in Java and .NET. A 132 character line in C++ might be incomprehensible because it contains a dozen different statements and operations, but a 132 character line in Java might well contain just a single method call with elaborate namespace and class names. Should we arbitrarily break up single statements because they're long, even though they're perfectly readable?

For these reasons, absolute character limits on code lines without regarding the concrete circumstances are pointless.

Chris Nahr on June 24, 2006 12:13 AM
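
Chris Nahr's indentation arithmetic can be made concrete with a small sketch. The helper names `effective_length` and `displayed_width` are hypothetical, and the four column tab width is simply the figure used in the comment above.

```ruby
# Hypothetical helper: length of a line of code once leading indentation
# and the trailing newline are ignored.
def effective_length(line)
  line.chomp.lstrip.size
end

# Displayed width of a line, counting each leading tab as `tab_width`
# columns. This is a simplification: it ignores tab stops, so it is only
# exact when the indentation does not put spaces before tabs.
def displayed_width(line, tab_width = 4)
  body = line.chomp
  indent = body[/\A[ \t]*/]
  indent_cols = indent.count("\t") * tab_width + indent.count(' ')
  indent_cols + body.size - indent.size
end

# Six tabs of indentation followed by 76 characters of code
line = "\t" * 6 + 'x' * 76 + "\n"
puts displayed_width(line)   # 100
puts effective_length(line)  # 76
```

A line that occupies 100 columns on screen carries only 76 characters of actual code here, which is the gap a flat character limit does not see.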

Some programming languages preach indentation with spaces rather than tabs. E.g., Ruby (2 spaces) and Python (4 spaces).

Nx on June 24, 2006 03:32 AM

> One way typesetters get justification is to split a lot of words so that there are not yawning fields of whitespace in lines, but no automatic algorithm I'm familiar with tackles the tricky job of splitting words.

Knuth put a lot of work into developing one for English for TeX. Properly rendered fully justified text (as TeX has always done it, and as other document processors are only now starting to do), gives the advantages of both left-justified text and conventional fully justified text.

Keith Gaughan on June 24, 2006 11:58 AM

I suppose it all depends on how trusting you are. If you can be sure that the code to the right of col 79 doesn't contain surprises or bugs, then fine - just scroll down and watch the structure unfold. (Yeah, right!)

Having to scroll left-right is a big no-no. Painful is not a strong enough word. If it's such a good thing to do, why aren't Home and End called PageLeft and PageRight, eh???, eh???!

Seriously though, what we really need is syntax-aware source control. Then you could share code with people whose views were very different to your own. Modern IDEs and refactoring tools are starting to take us in that direction. (Hands up if you use Ctrl-E,D in VS2005)

Or maybe we could just agree to check in in a canonical format. 80 columns, anyone?

Dominic Cronin on June 25, 2006 01:29 AM

When convenient, I format the code in parallel columns anyway, for example:
(short statement) (or throw exception),
(short statement) (short comment)
both of which ensure that other details do not obscure the normal flow of control.

Michael Roberts on June 25, 2006 12:32 PM

What I noticed while reading this article is the huge impact of ClearType on readability.
The screenshots of the four examples are not rendered using ClearType, whereas the rest of the page is, on my computer. To me it seems that the impact of ClearType is much bigger than the impact of the column layout.

Kristof Verbiest on June 26, 2006 01:55 AM

My way of dealing with long lines of code is to use a proportional-spaced font (I use Comic Sans MS). You can view about 120 characters in the space that 80 characters of fixed-pitch would take. It takes a bit of getting used to with many years of "code must be fixed-pitch" drilled in our heads, but it does work quite well.

James Curran on June 26, 2006 07:40 AM

I like the way the new .NET interface allows you to turn on the word wrapping *and* display the wrap icon. This allows you to take advantage of the longer line lengths without having to worry about manually breaking the line for readability, especially on the long horizontals.

You have to have a little consideration for others reading your code, but the ability for them to simply resize the code window to their preferred width and have it wrap is a nice touch.

Granted, if your line is 400 characters and it wraps across 5 lines, it's not doing anyone any favors.

Duncan Bachen on June 27, 2006 08:15 AM

It is an established typographic fact that short lines are easier to read and therefore "comprehension" must also rise with reading speed -- in other words the faster you take it in the sooner you can act upon the information.

Justification is a highly complex issue in typography and it is also well understood that bad justification is unhelpful to the reader. This is where good hyphenation comes in; you cannot discuss justification in any context without also taking hyphenation on board.

Justified text with or without hyphenation has no other typographic function, other than to save space on pages, hence keep production costs down. That is why it was invented. However, if you are going to justify then it must be done well and must be accompanied by some equally good hyphenation algorithm. So many typesetting and page layout apps these days take the lazy approach, using less-than-optimal algorithms for both line spacing and hyphenation -- and it shows.

Craiggybear on September 26, 2006 06:20 AM

you rule :)

Georgia on October 28, 2006 01:54 AM

for plain reading, i prefer long and justified lines.. justified because that makes the lines more predictable to me, and long because i can skip more unimportant words and more easily (which is fully automatic). :)
i bet this sounds strange but it works, i'm a very fast reader (in my native language at least).

cmon_ on June 4, 2007 03:43 PM

I dislike very long code lines because they are a complete waste of screen estate. Most lines are quite short, and I prefer to have multiple windows 80 chars wide each. Visual Studio (last time I checked) does not exactly encourage that on big screens, but I'm an emacs/vi guy anyway.

Andreas Krey on June 5, 2007 04:37 AM

Research into the design of instructional text has been going on for decades and the Wichita result should not be taken at face value.

This story got a lot of coverage, which is fine, but the results are contradicted by lots of other research.

The most important thing is to realize that there is no optimal line length or justification - it depends entirely on what you're trying to do. Best practice depends on the age and ability of the reader, whether you're designing a news report, the warning label for a medication, a road sign, the instructions for an exam, a novel or, indeed, code. Having said that, non-justified (left only), medium line length works well for most things.

Graham Stalker-Wilde on August 10, 2007 07:45 AM











Content (c) 2008 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.