Christopher Seiwald's Seven Pillars of Pretty Code argues that code that looks good, works good:
It should not be possible to discern previous changes to a file without seeing the previous revisions. Nothing obscures the essential visual cues more than a shift in style. This practice should be applied as wide as possible: absolutely within functions, generally within a file, and if you're lucky across the system.
Just as with books and magazines, code should be narrow to focus the gaze. As I mention in "Overcome Indentation," the left edge of the code holds the structure and the right side holds the detail, and big long lines mix zones of structure and detail, confusing the reader.
This principle should apply within functions, and in a larger sense, among separate code blocks. A reader can only avoid a total reading if a cursory inspection can reveal the whole block's nature.
Comments should rephrase what happens in the block, not be a literal translation into English. Even if your code is inscrutable and your comments jibberish, the reader can at least attempt to triangulate on the actual purpose.
Use short names (like i, x) for variables with a short, local scope or ubiquitous names. Use medium length for member names. Use longer names only for global scope (like distant or OS interfaces). The tighter the scope, the shorter the name. Long, descriptive names may help the first time reader but hinder him every time thereafter.
Nothing speeds the reader along better than seeing a pattern. These similar looking pieces of code should be lined up one after the other. Grouping reduces the number of entities the reader has to grasp, a critical approach to simplifying the apparent complexity of code.
You must fight indentation to safeguard this property. Code which moves too quickly from left to right (and back again) mixes major control flow with minor detail. Forcibly align the main flow of control down the left side, with one level of indentation for if/while/for/do/switch statements. Use break, continue, return, even 'goto' to coerce the code into left-side alignment. Rearrange conditionals so that the block with the quickest exit comes first, and then return (or break, or continue) so that the other leg can continue at the same indentation level.
It's an entirely reasonable set of advice. Until you realize that Christopher is using his very own jam/make.c as a shining example of the pillars of pretty code.
And make.c is extremely ugly to me.
I don't know if I'd ever consider C code pretty, per se. Maybe it has a nice personality. But pretty? No way. And I'm sure Ruby, Lisp, and Smalltalk aficionados feel the same way about C# code. I know most C# developers can barely force themselves to even look at VB.NET code.
Clearly, pretty code is in the eye of the beholder.
Besides the fact that it's C, I have a few other problems with Mr. Seiwald's make.c code:
The code formatting, as promised, is pretty. But the emphasis on formatting does make me wonder-- rather than focusing on the external, cosmetic attributes of the code, shouldn't we be searching for something prettier on the inside? Such as a higher level language?
Posted by Jeff Atwood View blog reactions
« PC Pinball Sims Is Writing More Important Than Programming? »
I've never been a fan of block aligned variables in C either. I remember working on some code that had this and I was almost afraid to touch it. Any changes involved hitting space an awful lot to get things to align. It may look OK when you're done, but in the process of editing, it's annoying.
And I second the notion that make.c seems ugly. But then again, I do work with Lisp almost exclusively.
Geoff Wozniak on June 19, 2006 06:20 PMTo each, their own. Arguing about which language to use is completely irrelevant to the point at hand and is likely to spark a religious war. People who are Java programmers will try to solve every problem in Java. People who are C programmers will tend to do the same with C. People who are C# programmers will tend to do the same with C#. Perl programmers will do the same with Perl.
In my opinion, it's always better to use an inferior tool that you are very well acquainted with than it is to use the best tool for the job when you haven't the faintest idea what to do with it!
I have some comments about make.c and his seven pillars, but I'll save those for a follow-up comment
Steve Bush on June 19, 2006 06:28 PMComments on the pillars:
1. Unless the original style is virutally non-existant! I've worked with plenty of code where I ended up reformatting the entire file before even beggining to deal with it.
6. Better yet: two or more blocks of code that do the same or similar things should be ONE block of code, possibly with some flags to distinguish the behaviour. It's better to invest the extra time to deal with eliminating extraneous code than to deal with every time that the alternate code becomes out of date.
7. However, code structure oftentimes dictates the overall behavior. if indentation is called for, use it.
On the make.c:
Never ever use /* */ as comment blocks! Unless you have a decent editor (not always an option), these type of comment blocks makes it virtually impossible to easily comment out a block of code for testing (barring adding things like "if (0) {" which can quickly cause problems)
make0 should be broken up into separate functions.
Steve Bush on June 19, 2006 06:38 PM"Don't even think of deciphering this."
Not something I'd like to see in a style exemplar.
jogloran on June 19, 2006 06:46 PMThe biggest problem with make.c is the humongous function. Break that function up, and the need for comments (which quickly tend to go out-of-date) is greatly reduced.
As for the pillar that states that code that does identical things should look the same -- well, that's true if a single function should look like itself.
Mark Wilden on June 19, 2006 06:49 PM>Shouldn't we be searching for something prettier on the inside? Such as a higher level language?
Although a higher level language might be a solution to some problems, I don't believe it will make code fundamentally prettier. People do horrible things with English after all :P
> I don't believe it will make code fundamentally prettier
But isn't pretty formatting a relative constant across all languages? We can line up variable names in FORTRAN just as well as we can in Ruby.
Jeff Atwood on June 19, 2006 10:42 PMI couldn't agree more, especially point #1. Endless reformatting of '{' indentation styles waste a lot of time and makes doing 'diffs' on different versions of a file a nightmare.
Swallow your pride for that one file you are editing, stick with the same style, and apply your own or company's superior style on all other files.
I am absolutely religious about formatting and coding guidelines. So much even that I publish my own guidelines document for various languages including C#, VB, JavaScript, HTML etc.
See http://dotnetjunkies.com/WebLog/jritmeijer/archive/2005/09/08/132429.aspx
Jeroen Ritmeijer on June 20, 2006 12:46 AMHeh, I always forget block comments are /* */, I always use /* //*/. That way you just have to add an extra slash in front of the first one to uncomment, and remove it to recomment.
But it's a pain if you try to comment block comments. I've never quite understood why you can't nest them.
I like #1 a lot: I worked for a while recently on a system that was over 10 years old and still being extended. I'd submit that one of the reasons for its longevity was its consistent design and style. I didn't particularly like either, but it was easy to drop in at any point in the code and figure out what was going on.
Much of the rest I'm less certain about. I don't much like commented blocks of code, for example: why not just refactor out a new routine? Ditto megadentation: just extract the innards. More than once if really necessary.
Oh, and I work in C#, VB and Ruby. Admittedly I'd probably stick with Ruby given the choice, but I'm quite happy in the others.
Mike Woodhouse on June 20, 2006 02:27 AMstatic TARGETS *
make0sort( TARGETS *chain )
This is my biggest pet hate of some (not many) C programmers programming style!
WHY do you hit return after the data type in a function declaration?? There seems to be no reason.
Anyway rant over.....concerning number 1 on the list. I agree this is a good idea, but what if the style is terrible? Am I supposed to follow the previous developer, like a lemming over a cliff?......
Paul on June 20, 2006 04:19 AMIf you put the name of the function on a new line, you can search for the function definition (and nothing else) using a regular expression. For example, ^make.
Personally, I thought the jam source code was very pretty...
Tom Seddon on June 20, 2006 05:14 AM1) Keep your Lisp bigotry to yourself.
2) If you use Lisp, you almost certainly use Emacs. If you use Emacs, you will find that it has a world-class auto-indentation system. I don't know why anyone would be pushing "space".
3) The line "I want to indent my code myself" is idiotic. You have MUCH better things to be doing than aligning code manually. Suck it up, accept the small problems with auto-indentation, and write code.
> static TARGETS *
> make0sort( TARGETS *chain )
> This is my biggest pet hate of some (not many) C > programmers programming style!
> WHY do you hit return after the data type in a function declaration?? There seems to be no reason.
Why? It's obvious. Here's an example
UserInformation* myLongFunction(const char *username, const char *group, const Date& since, const LongStruct& db, const vector& skipList, int ignoreCase);
void myShortFunction();
int myShorterFunction();
When you put the return type on a seperate line, the reader only need look at column 1 to find the name of the function.
> If you put the name of the function on a new line,
> you can search for the function definition (and
> nothing else) using a regular expression. For
> example, ^make.
Only if the function name is in the first column. If it's indented within a class block, or has classname:: before it, or superclassname:: before it, it becomes a lot harder to find than scanning a screenful of 'grep -n make *' and looking for the thing that looks like a definition or declaration.
jamesK on June 20, 2006 08:05 AM"Reduce, reduce, reduce. Remove anything that will distract the reader."
Use variable names like i and j.
Surely thats heresey.
I would suggest even the most locally scoped variable name should provide some information about its reason for being.
Chris Brooksbank on June 20, 2006 08:16 AMHi Chris-
"Use variable names like i and j.
Surely thats heresey."
I agree to disagree. I always use 1 letter variable names for any looping constructs. This immediately tells me that this variable is a throwaway local that is being used by a loop. This is a good convention and it saves on typing.
Is this better than naming the variable loopcountertoparsestring? This variable name is definately more descriptive than i or j. But consider:
arrayofitems[i]
vs.
arrayofitems[loopcountertoparsestring]
I find the first easier to work with. Maybe its because that is they way I always have written it.
Jon Raynor
About this:
Use short names (like i, x) for variables with a short, local scope or ubiquitous names. Use medium length for member names. Use longer names only for global scope (like distant or OS interfaces). The tighter the scope, the shorter the name. Long, descriptive names may help the first time reader but hinder him every time thereafter.
I'm sorry, but that's ridiculous. The human languages we use simply don't permit that on any real sort of level. Sure you can name locally scoped variables that have obvious uses i and j. But beyond that, you either start using abbreviations that are just as hard to understand at a glance as long names are to read, or you use descriptive names regardless of length, and future maintainers can understand it.
I'm just curious - why is there this obsession with keeping typing to a minimum? With all the debugging you have to do to get code working correctly, the amount of typing required for a slightly longer variable name seems negligible to me. Plus, I find that it can often help to slow down when programming. I come up with the best solutions when I walk away from the computer. I just don't buy the argument presented in that point.
Darrin on June 20, 2006 09:46 AMI'm surprised nobody seems to have commented on the more obvious problems in the example C code.
1) One-line loops and conditionals don't use {}s, which I guarantee will eventually bite you on the ass. And not in the good way.
2) 2- and 3-line comments for one line of code (and a debug statement). That's not commenting, that's obfuscation. Don't make me pay attention to stuff that doesn't help. If the block you need to comment merits a multi-line comment, make it a method. Oh, right, C: function.
3) "if(" isn't a keyword. "if" is. Keywords deserve their own space. That's why they're keywords.
4) Several people have commented that make0 is too long: damn straight. That shouldn't have made it past an initial code review. Even if I was reviewing my own code.
5) Wow, that whole FATE chunk sure is a PITA.
6) "Don't even think of deciphering this." And that's _with_ the over-commenting? *giggle* "Insert c in front of s." Sure seems like some trivial variable naming could have made things a lot clearer.
C, while relatively ugly compared to some languages, can still be beautiful in its simplicity and bare-metal mentality.
Over-commenting, or comments of suspect usefulness, make things worse.
Dave
The fact that you have to define a small variable name means you're likely using a crappy abstraction. Perhaps it makes sense in C, which is an abstracted assembly language, but in any other language you shouldn't need to do this.
This of it as a code smell.
wfefdwar4rf2QW on June 20, 2006 10:23 AMThanks for the post - it's had me thinking all morning about coding standards in my more front-end-heavy environment. I'm so thankful my first boss *INSISTED* we read and outline "Code Complete" (http://www.amazon.com/gp/product/0735619670/ )!
My thoughts: http://www.smartminion.com/blog/?bmonth=6/1/2006#entry20060620133417
john d brown on June 20, 2006 10:37 AMNobody has mentioned that "i" and "j" are hard to tell apart when a) read quickly b) appearing on a printed page where the toner was running out or c) appearing small on a screen with only modest resolution. Why introduce even the possibility of such easily preventable errors? Similarly, "1" and "l" are often hard to tell apart and should be avoided. Text should be easy and clear so that we can concentrate on the logical problem at hand.
barry on June 20, 2006 11:35 AMI do like his focus on maintainable code and his oft repeated manta that *others* will have an easier time editing your code if you follow a set of rules. With applications growing so large these days the ability to quickly find your way around a large body of code is critical.
I disagree about shorter variable names, I'd rather have descriptive vs. ISecManCon and with most modern IDE's you really only have to type your variable name once.
All in all a good list. I'd go as far with #1 to say you shouldn't deviate from the languages standard. It's distracting to see C# code capped like old C code or Delphi code capped like Java, etc.
Shawn Oster on June 20, 2006 11:38 AM> Nobody has mentioned that "i" and "j" are hard to tell apart
I don't know if the comparison between i -> j and 1 -> l is a fair one.
> I'm just curious - why is there this obsession with keeping typing to a minimum?
It's an obsession with keeping *READING* to a minimum. Half of what we write will go unread anyway.. all other things being equal, shorter is better. Think of it as Strunk and White for your code.
> I disagree about shorter variable names
The specific guidance he gives is this:
The more important a variable is, the better its name should be.
I agree completely. Short variable names mean "this isn't very important, so don't spend a lot of time trying to process it." So when you see variable names "i", "ds", or "ts" you can assume they're not important-- they're probably used for ephemeral, temporary tasks.
Jeff Atwood on June 20, 2006 12:53 PM#1 is only really feasible if you've got nice code to start with. The work I've been doing lately my code has looked *to me* like islands of sanity in the morass of terrible programming practice that makes up the rest of the code.
Ben Moxon on June 20, 2006 01:14 PMI liked your article and the discussion in the comments. I too wasn't that impressed with the code (although, of course, it could have been much better).
My major criticisms:
- too much commenting, and in multi-line blocks.
I *hate* the following kind of things.
/*
* gets the attenuation
*/
Almost all C compilers let you use //, and if not, then put it on one line dammit.
- related -- putting the parameters each on their own, indented line. I found he was wasteful of screen real estate.
- *HUGE* indentation increment
- too frequent line breaks on booleans
- Mixing of blocking (with {}) conditional, and multi-lining or not
- in reponse to others, I think i & j are very rarely good ideas. I think iLight, iCol, iRow work very well, don't require much typing, and reduce the likelihood that variables will be re-used for different purposes.
- Not distinguising NULL, 0, and FALSE
- Not so keen on his spacing (none after keywords, one after \# and before include
I thought I was a harsh critic of code, but MAN, what a bunch of code nazis you guys are! I don't like C much because it's impossible to write ANY code that involves string parsing or dynamic memory management and not have it look ugly. But IMHO this code (probably because it avoids string handling / dynamic allocation) is pretty good for C code.
When I look at code, my #1 criteria is no more than "can I read it, understand it, and if need be, safely modify it?" I had no problem reading and understanding this code, even if it uses a coding style I see rather infrequently. The only thing that bothered me (momentarily) was the use of "p" for "parent" and "t" for "target", but now that I think about it, it's rather elegant. These two variable names are crucial for understanding the code, and the fact that they're given such short names makes them stand out whenever they are used.
And as far as maintainability goes, I think he did a very good job. After reading the code, I don't have much fear that changes made in one place would need to be made in many places, nor do I fear that the overall design is creaking and liable to fall apart with the next feature addition.
Most of the things mentioned here are total nitpicks that don't affect readability or maintainability in the slightest. I feel as though a lot of commentators here are just regurgitating old shopworn aphorisms in an attempt to feel smarter than the original coder.
"There are functions named make and make0. Ouch."
What's so difficult to understand about this? What better way is there to illustrate the relationship between the two functions? One is obviously a more public function which uses the internal, oddly named function to do most of the heavy lifting. What would you have named it? make_guts, maybe? make_pass1? As long as make0 never calls back into make (which is true here), I'm fine with the naming scheme.
"Better yet: two or more blocks of code that do the same or similar things should be ONE block of code, possibly with some flags to distinguish the behaviour."
I agree with you. And with Seiwald. Repetition is bad. Parallelism is good.
(or would you have me say: Repetition_Or_Parallelism_Is_Good_Or_Bad(bool fRepetition))?
"The biggest problem with make.c is the humongous function."
Not really. It's straight line code. Chopping it up into a half-dozen mini-functions wouldn't add a bit of clarity; it'd just force you to yo-yo around to see what's happening.
Now, if you have a 50-line if or while statement, THAT's a real problem. Ditto code that sets a flag at the top and checks it 100 lines later. "Write short functions" is good advice which prevents these REAL sins from being committed. But since he's not doing that here, I can't complain.
(If it really bothers you, just put empty braces around each commented "step", and visualize them as anonymous functions which, for readability's sake, have been inlined).
"Never ever use /* */ as comment blocks! Unless you have a decent editor (not always an option), these type of comment blocks makes it virtually impossible to easily comment out a block of code for testing."
Steve, meet #if 0. #if 0, meet Steve.
(Actually, I feel that commenting-out code for testing purposes may be a suboptimal practice, but I'll elide that discussion for now).
"This is my biggest pet hate of some (not many) C programmers programming style! WHY do you hit return after the data type in a function declaration??"
Why not? Is it provably any better or any worse than any other arbitrary style? I don't use this style myself, but I don't yell at the people who do.
"Over-commenting, or comments of suspect usefulness, make things worse."
Agreed. But if you think this is over-commenting, you should try reading some of the code *I* see on a daily basis. (I love it when people document the fact, for example, that you can safely call "delete" with a null pointer). In my view this coder uses comments effectively to show the overall design and structure of the program, and not to document details which are more obvious in C than they are in English.
"Keywords deserve their own space."
Of all arbitrary rules to make. I personally put the space outside the parenthesis, but he's an inside-whitespace man. Ain't no harm in it. Some people don't put spaces around the parens at all. Does it honestly make it harder to read, or does it just "irk" you because you're not used to it? For me, I can read anything as long as they put whitespace around the operators (like && and ||).
"I found he was wasteful of screen real estate."
What's the problem? Is there some global whitespace shortage crisis going on that I should know about? Use all the screen real estate you want -- they'll make more.
(Yes, at times there's a benefit at keeping a complicated algorithm readable in one screen of code ... but that's not going on here, is it? The important thing is that the coder doesn't force you to scroll around to find the declaration of a variable (usually they're in the tightest scope already), to find the antecedent of an "else" clause or to find the terminating condition of a while loop).
"Not really. It's straight line code. Chopping it up into a half-dozen mini-functions wouldn't add a bit of clarity; it'd just force you to yo-yo around to see what's happening."
BS. It's very unusual that you have to understand everything that's happening in a chunk of code all at once. Breaking up functional units allows you to focus only on the parts you actually care about.
Editors/IDEs with configurable folding can do much the same thing, but I'd still rather have it broken into individual components.
I'd rather read code well-abstracted code with useful function names and break things down into small units. I don't _want_ to wade through a hundred lines that have nothing to do with what I'm interested in.
While it's a PITA to keep harping on this small code chunk, there are plenty of functional units that could easily be stripped out.
Name them something useful and the useless comments go away and algorithmic comments can be made in function headers, where they belong, waiting to be extracted and formatted.
And I'll pick on whatever whitespace conventions I want ;) We should talk about tabs vs. spaces now :D
Dave
I took this article to mean that he was saying if you format things consistently and in a way that makes the code easy to read, then the result is that it is easier to read and maintain, not that it works better. Based on the fact that you were able to find some things (presumably pretty quickly) in his make.c that you disliked, his formatting of the file must have made it relatively easy. [I am imagining you didn't spend a great deal of time looking through his file.]
Joshua Volz on June 20, 2006 03:45 PM> I thought I was a harsh critic of code, but MAN, what a bunch of code nazis you guys are!
> his formatting of the file must have made it relatively easy
One of the points of this post was to highlight the meaninglessness of formatting choices.
Sure, there are ways to arrange code pathologically so nobody can understand it, eg:
http://www.codinghorror.com/blog/archives/000291.html
But assuming that everyone has made *reasonable* formatting choices, how do you make code easier to understand?
You switch to a better language. In other words, beauty is more than skin (formatting) deep.
Jeff Atwood on June 20, 2006 04:06 PM"Breaking up functional units allows you to focus only on the parts you actually care about ... I don't _want_ to wade through a hundred lines that have nothing to do with what I'm interested in."
He DID break his code up into functional units.
Didn't you see rule number three? "Break code into logical blocks so that each does a single thing or single kind of thing."
Do you see any unnecessary dependencies or code blocks that extend longer than ten lines (the distance between one of his commented sections and another)?
More to the point, do you find that you actually have to wade through each subsection? Because I don't. With his clearly commented subsections, I find it extremely easy to skip over the levels of detail I don't want to get involved in. For me, actually, it's MUCH easier than seperate subroutines, given that I don't have to jump around in the file -- intellisense editor or not.
Maybe you're under the impression that the ONLY way to seperate code into logical blocks is to split them into single-use subroutines. But it's not true. Anonymous blocks are just as effective at conveying the cohesion of function steps -- if not more so.
Alyosha` on June 20, 2006 04:10 PMRe: Alyosha's comments
"... [snip] I suggested not using /* */ ..."
"Steve, meet #if 0. #if 0, meet Steve."
Although I have certainly used #if 0, the advantage of this:
/*
...some code to comment out
*/
versus this:
#if 0
...some code to comment out
#endif
is in terms of visualization. Most of the editors I have used can easily color code the comment block. Not so much with the #if 0 ... #endif block. Makes it far easier to clean up after you're done, though again... using the technique and tools you are used to is better than using a superior tool you haven't mastered.
In addition to commenting out code for testing, I very often use the blocks for commenting out sections of code that are works in progress or code segments that are likely to be deleted, but until I decide to do so, it's useful to be able to restore the lines quickly.
(yes, yes... I know that you can use source code control systems for doing the same thing, but I usually try to avoid messing up the code repository with my own personal brain spewings)
"But assuming that everyone has made *reasonable* formatting choices, how do you make code easier to understand?
You switch to a better language. In other words, beauty is more than skin (formatting) deep."
Agreed on this point. C is way too verbose for me to consider doing serious development in it. C++, if properly used, can be a much more terse language -- but more often than not, it's misused and in practice is only marginally better than C++. C#, Java, Python, perl ... all of these are wonderfully expressive languages that I enjoy writing (if not running). I think, if I got to study it long enough, I would find Erlang to be the same way.
But I think there's a danger in swinging too far the other way on the terseness specturm. Lisp-like languages make it possible to "reprogram" your language, and make it orders of magnitude much more expressive. The practical problem is that in doing so, the rules of usage and the total vocabulary you need to have memorized in order to be fluent expand exponentially. Way out at the extreme end, APL is a great example of a totally expressive, but utterly unreadable language to anyone who hasn't spent five years studying it.
Alyosha` on June 20, 2006 05:15 PMUgly or not, Props to Christopher for putting it out there. I feel newer coders know about what comments should be but all they have seen is text book code which often sacrafices security or performance in order to illustrate one point. Real-world code is a good learning space for n00bies to find style.
On the topic of C, don't forget the famous unix comment (attributed to dmr IIRC):
/* You are not expected to understand this */
It is sometimes misunderstood as being elitist, but in fact is a helpful comment. It means: "Don't spend your time on this bit, it's odd but take it on faith."
James Green on June 20, 2006 08:05 PMPerhaps I'm just an oddball, but that make.c example does look, to my eyes, rather beautiful.
Also, don't ever use "chtulhu" as a variable name, you'll misspell it every time you write it.
"Switching to a better language" is an option, unless you happen to be writing a utility (like Jam) that needs to compile and run on dozens of platforms with minimal external requirements.
Ubiquity trumps all in this case, and ubiquity currently means vanilla, ANSI-standard C.
Tim Lesher on June 21, 2006 07:19 AM> As for the pillar that states that code that does identical things should look the same --
> well, that's true if a single function should look like itself.
Not always possible. Especially not in C, or other strongly typed languages without generics/templates. When your functions all do the same thing, but with different data types, then it's often unavoidable to have a bunch of functions that look almost identical.
My recent work has, unfortunately, been in PL/SQL, an "object enhanced" scripting language for Oracle databases. I can't tell you how many times I have thought to myself "Good gosh, this is going to take hours, and it's going to be huge and repetitive. If only I had some sort of generics or dynamic typing available, it would be a breeze."
At least in C, you can sometimes use macros, if the code will essentially be _exactly_ the same. PL/SQL doesn't even have that. Of course, there's always Pro*C, or Java, and such things. But the company I'm contracted to doesn't allow us to use them.
WaterBreath on June 21, 2006 10:25 AM"Only if the function name is in the first column. If it's indented within a class block, or has classname:: before it, or superclassname:: before it, it becomes a lot harder to find than scanning a"
Uh, we are talking C here not C++ or something else? While I dislike putting function names in column 0 in C code myself, I code with friends who love it and I can tolerate it - the justificaiton is that it is "easy to find declaration in code with 50 source files" etc.
"Maybe you're under the impression that the ONLY way to seperate code into logical blocks is to split them into single-use subroutines. But it's not true. Anonymous blocks are just as effective at conveying the cohesion of function steps -- if not more so."
Not so. Breaking code up with whitespace relies on programmer judgement to keep the pieces working independantly. When you break the code into subroutines the compiler enforces this. It also forces you to declare the parameters that a block of code requires, it makes reuse easier if it should become necessary and it simplifies changing the order of execution. Most importantly, if it were broken into subroutines, it would be possible to tell at a glance what steps were executed when that function was called instead of forcing the reader to scroll down through several screens of code culling out the meaningful comments.
what I've found amusing for the last 20 years, almost, is this insistence, by people who should know better, on 80 column (or less) line limits. That was invented by COBOL (which only has ~66 characters available after you subtract the line number from 73) around 1960. It's just silly. I remember how wonderful it was when the VT-220 came along and I could use 132 character lines. If nothing else, formatting a report both in code and as output was a piece of cake.
Beyond the character world, we have wide screen taking over, and folks still think that 80 columns is King. There's more horizontal real estate, and always was. Use it well.
And as to scanning narrow: not everybody does; I've had the Great Books set and can't read them because those narrow double columns drive my eyes over the edge.
Wider is better (with props to Pontiac?).
Buggy Fun Bunny on June 22, 2006 01:56 PM"Not so. Breaking code up with whitespace relies on programmer judgement to keep the pieces working independantly."
All coding requires programmer judgment; there's no avoiding it. You rely on programmer judgment to tell when a function is too long, I rely on programmer judgment to tell when a function has too many long-scoped variables.
"When you break the code into subroutines the compiler enforces this."
You can restrict variable scope just as easily by declaring them in another level of scope (aka an anonymous block). I do this occassionally.
"It also forces you to declare the parameters that a block of code requires."
And the benefit is ...? A "clever" programmer is just going to declare functions which take a dozen parameters. And you're back to programmer judgment again.
"It makes reuse easier if it should become necessary and it simplifies changing the order of execution".
YAGNI, and at any rate, these transformations take what, 5 seconds to do with NOTEPAD? OTOH, the clutter of reading code filled with single-use functions impairs readability: the ability to find important methods, functions which are general-purpose -- code which I really might want to reuse.
"Most importantly, if it were broken into subroutines, it would be possible to tell at a glance what steps were executed when that function was called instead of forcing the reader to scroll down through several screens of code culling out the meaningful comments."
Really? So you're saying:
#1: in the example code, you found it difficult to skim for the comments, and
#2: you'd rather see a function called addDependentsIncludesToDirectDependencies(target) than a comment such as /* Step 3c: add dependents' includes to our direct dependencies */.
Can't say I agree with you on either count. The scrolling wasn't onerous at all. I could grep for comments pretty easily (even without syntax coloring, it wouldn't be hard). And I don't enjoy squeezing comments into 32 alphanumericunderscore characters. I also don't like (as mentioned before) having to jump around if I need to drill down a level of detail.
Alyosha` on June 22, 2006 02:59 PMWhen this topic first appeared, I figured it would be wise for a C developer like myself to stay out of any possible ensuing religious wars on software conventions. However, after a few days I realized I didn't have to join in the religious wars -- I discovered that Alyosha` presented many arguments I agree with. Obviously, Alyosha` & I share many similar beliefs on coding convention & organization, including pragmatic function size :
"And though I may be in the minority, I both agree & disagree with your above statements & concepts.
As I stated above, "my rule is that a function should be as short as atomically reasonable -- even if that means 2-5 lines for some functions or 10s/100s lines for others.
Each function should do what it requires with few if zero side effects. Whether a function requires two lines of code or 1000, functions should NOT be arbitrarily short JUST for the sake of a cliche."
-- http://discuss.joelonsoftware.com/default.asp?design.4.202810.13#discussTopic203616
> "Steve, meet #if 0. #if 0, meet Steve."
>
> Although I have certainly used #if 0, the advantage of this:
> /*
> ...some code to comment out
> */
>
> versus this:
> #if 0
> ...some code to comment out
> #endif
>
> is in terms of visualization. Most of the editors I have used can easily color code the comment block.
> Not so much with the #if 0 ... #endif block.
> Makes it far easier to clean up after you're done, though ...
> -- Steve Bush
I personally was trained to use the "#if 0/#endif" method for software maintenance reasons. So naturally I found Alyosha`s "meet Mr. #if 0" comment very funny.
However, Steve Bush correctly points out the primary advantage for using the "/* */" method is for IDE comment-coloring visualization. He also correctly points out one important advantage for using the "#if 0/#endif" method -- it is "far easier to clean up", i.e. enable/disable desired code blocks from compilation.
However, he fails to address the parallel problem using the "/* */" method. That is, it is often difficult to enable/disable code using "/* */" if the code block(s) themselves contain "/* */" comments. This is because ANSI C does NOT allow nested comment blocks. Even a careful developer could easily intend to comment out a code block but not realize that only some lines of the code block were commented out while some trailing lines are NOT commented out because of a nested comment block.
In fact, MISRA (Motor Industry Software Reliability Association) & similar FAA & FDA standards prohibit enabling/disabling code blocks using "/* */" for this very reason :
"Where it is required for sections of source code not to be compiled then this should be achieved by use of conditional compilation (e.g. #if or #ifdef constructs with a comment). Using start and end comment markers for this purpose is dangerous because C does not support nested comments, and any comments already existing in the section of code would change the effect."
-- MISRA-C 2004 Rule 2.4
"All coding requires programmer judgment; there's no avoiding it."
Sure, but if you can get the compiler to check something for you, you should take advantage of that.
"And the benefit is ...? A "clever" programmer is just going to declare functions which take a dozen parameters."
The benefit is that it says "this block needs to have this information". It also shows the types of the variables near where they are used, which makes it easier for a reader to understand. The fact that it can be done badly does not negate these benefits.
"the clutter of reading code filled with single-use functions impairs readability: the ability to find important methods, functions which are general-purpose -- code which I really might want to reuse."
The implication there is that you know when you write the code what code you will want to reuse. But in my experience its common to discover new uses for a block of code that once seemed pretty specific. Having the code already residing in a function with an appropriate name not only makes it more convenient to resuse, but it allows people who are unfamiliar with the code to find reusable pieces easier.
"I could grep for comments pretty easily (even without syntax coloring, it wouldn't be hard). And I don't enjoy squeezing comments into 32 alphanumericunderscore characters. I also don't like (as mentioned before) having to jump around if I need to drill down a level of detail."
On my screen that function was over 6 pages long. Thats quiet a bit of jumping just to get a high level overview of it. Modern editors make drilling into functions as trivial as scrolling down a page, so I actually find it easier to navigate if its broken into seperate routines - I can jump quickly to the piece I'm interested in. Its like the table of contents in a book. Its not hidding details from you, its helping you find the right place to look for the details you want.
"The implication there is that you know when you write the code what code you will want to reuse."
Oftentimes I do, but sometimes I don't. Still, I tend to abstract out common code the second or third time I need to use it ... not immediately on the hunch I "might need it later". After all, it's not any easier to find code that's been buried in a mountain of single-purpose functions, as opposed to a snippet from a major function -- and it might be harder, in the case that I prematurely factored it out wrong and that bit of "common code" that I really need involves stitching together two or three and half of these single-use functions.
"Thats quiet a bit of jumping just to get a high level overview of it."
Did you do a lot of jumping around? I didn't. There weren't any big loops or ifs, the variable types and what they were meant to represent were pretty clear from immediate context ... I mostly scrolled, which I agree with you is pretty trivial. :-)
I think on these points we're going round and round here ... all I can say is, I didn't have a problem reading and getting a high-level understanding of what the "big long function" does. The things that you found difficult, I didn't. I don't know what to chalk that up to. I'm pretty sure I'm not an easy person to please ... I can tell you I have seen a lot of functions in my experience that were this long, that were MUCH harder to read, and that I DID think had serious problems -- and I'm very vocal about them when I see them.
Alyosha` on June 22, 2006 11:01 PM"I think on these points we're going round and round here ... "
Agreed. Its not that that function was particularly hard to read, or that I haven't seen much worse, its that in my view it would be *more* readable and maintainable if it were broken into seperate functions. In my experience, A relatively sane function like that, when monkeyed with by a half dozen other coders of varying abilities, can easily morph into a real monster. But like with most stylistic arguments, I fear we're doomed to not agree on this.
What I find interesting is how backward non-OOPs are. You dealing in a non organic paradigm. C is meant for low level programming and almost nothing else. Move on to managed code thank you. Being able to set up objects, build on others objects, and being able to isolate functionality (remove redundant functionality patterns) is more productive.
When I looked at the code sample I thought geez was this written in the 80s? C is not a modern language.
Please CC to joshua.r.thomas@gmail.com
Josh on June 26, 2006 12:43 PM| Content (c) 2008 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved. |