I often encounter code like this:
if (rowCount > rowIdx)
{
if (drc[rowIdx].Table.Columns.Contains("avalId"))
{
do
{
if (Attributes[attrVal.AttributeClassId] == null)
{
// do stuff
}
else
{
if (!(Attributes[attrVal.AttributeClassId] is ArrayList))
{
// do stuff
}
else
{
if (!isChecking)
{
// do stuff
}
else
{
// do stuff
}
}
}
rowIdx++;
}
while(rowIdx < rowCount && GetIdAsInt32(drc[rowIdx]) == Id);
}
else
rowIdx++;
}
return rowIdx;
The excessive nesting of conditional clauses pushes the code out into an arrow formation:
if
if
if
if
do something
endif
endif
endif
endif
|
|
And you know you're definitely in trouble when the code you're reading is regularly exceeding the right margin on a typical 1280x1024 display. This is the Arrow Anti-Pattern in action.
One of my primary refactoring tasks is "flattening" arrow code like this. Those sharp, pointy barbs are dangerous! Arrow code has a high cyclomatic complexity value-- a measure of how many distinct paths there are through code:
Studies show a correlation between a program's cyclomatic complexity and its error frequency. A low cyclomatic complexity contributes to a program's understandability and indicates it is amenable to modification at lower risk than a more complex program. A module's cyclomatic complexity is also a strong indicator of its testability.
Where appropriate, I flatten that arrow code by doing the following:
if (SomeNecessaryCondition) {
// function body code
}
.. works better as a guard clause:
if (!SomeNecessaryCondition)
{
throw new RequiredConditionMissingException;
}
// function body code
do
{
ValidateRowAttribute(drc[rowIdx]);
rowIdx++;
}
while(rowIdx < rowCount && GetIdAsInt32(drc[rowIdx]) == Id);
if (Attributes[attrVal.AttributeClassId] is ArrayList)
{
// do stuff
}
else
{
// do stuff
}
The goal is to have code that scrolls vertically a lot.. but not so much horizontally.
also many times it is possible to combine the "ifs" into a single if:
if (something) {
if (somethingelse) {
if (somemore) {...
into
if (something || somethingelse || somemore) {
...
and some times is even better to create a function to validate those, so it would be
if (DataIsValid(something, somethingelse, somemore)) {...
I have to tell you, number 4 chafed a little bit. I have always avoided multiple returns from a function like the plague.
That being said, I jumped into McConnell (Code Complete 2) right after reading your post today and found what he had to say: "Use a return when it enhances readability: In certain routines, once you know the answer, you want to return it to the calling routine immediately. If the routine is defined in such a way that it doesn't require any further cleanup once it detects an error, not returning immediately means that you have to write more code."
McConnel's example:
Comparison Compare(int value1, int value2)
{
if ( value1 value2 )
return Comparison_GreaterThan;
else
return Comparison_Equal;
}
but he also says this:
"Minimize the number of returns in each routine: It's harder to understand a routine when, reading it at the bottom, you're unaware of the possibility that it returned somewhere abouve. For that reason, use returns judiciously--only when they improve readability. "
Now, I have to say, I'm a fairly green developer, but this somehow went against a hard coded rule I had. One function --> one return.
Long story short (too late), I learned something today. My little rule was wrong and needed to be rewritten. I hadn't got that far in the book yet, so hearing it from you AND the gospel (McConnell) will have me trying to change my ways. Thanks!
kludger on January 10, 2006 7:07 PMSo THAT'S what the preview button is for...
Should have been this:
McConnell's example:
Comparison Compare(int value1, int value2)
{
if ( value1 LT value2 )
return Comparison_LessThan;
else if ( value1 GT value2 )
return Comparison_GreaterThan;
else
return Comparison_Equal;
}
I love to see things like this posted on the web. When I was back in school, it was always a major deal if you didn't follow the strict CS rule of only having one return point for your function.
I use to argue that it didn't do you any good if it made your code harder to read, but I would always get into trouble anyway.
It's interesting to see the difference between coding theory and coding practice.
Ryan Smith on January 10, 2006 7:15 PM> also many times it is possible to combine the "ifs" into a single if:
Yes, although that is a bit of a false economy; the code is "flatter" visually but the if blocks are still present; they were just consolidated into one line. As you pointed out, the best solution is to consolidate the logic into an external function with a proper name, thus reducing the overall per-function cyclomatic complexity.
> I have to tell you, number 4 chafed a little bit. I have always avoided multiple returns from a function like the plague.
I'm glad to hear this got you thinking about it!
The two Code Complete guidelines you cite aren't mutually exclusive..
1) favor things that increase readability
2) favor things that reduce the amount of code you have to write
That said, a perfectly flat function with 53 returns is just as pathological as a 3,000 character wide arrow function with a single return. The difference between the two is where our "mad development skillz" come in.
Jeff Atwood on January 10, 2006 7:19 PMI think the goal is more readable code (which just happens to be served by a vertical layout more than a horizontal layout). I prefer a function in which I don't have to scroll at all.
I like your point about using guard clauses. That sort of fits into the Fowler philosophy of failing fast (http://www.martinfowler.com/ieeeSoftware/failFast.pdf).
Get all the failure conditions out of the way at the top of the method so the body can focus on the real meat.
As for 1 return point, I think that's a bit outdated with code that now has structured exception handling. If you think about it, you have several "return points".
if(!SomethingNeeded)
throw new CustomException("message");
That there is a potential return point.
Though I do agree that keeping it to a minimum is important, there are certain places where a return or a break can be expected that make the overall code more readable.
Haacked on January 10, 2006 8:05 PMKludger, your rule is actually only advice on how to achieve a goal. The goal is "understandable code".
Using one return value is one way to get understandable code. Having methods short enough to fit on a screen is another. Do whatever you need to do to make the code more understandable, and if that means you do early exit from a function, big deal.
Robert Watkins on January 10, 2006 8:40 PMHi,
first the thing that has to be said since - like ages ago: Your blog rocks. In the year I'm now subscribed to your RSS feed I never even once encountered a posting that did not at least remotely interest me. Well written, nice topics. Grats!
Second: I so agree with you. Code with deeply stacked IF's is terribly hard to read and understand.
Some experience with exactly this problem can be found in a new blog entry of mine inspired by you:
http://www.gnegg.ch/archives/flattening_arrow_code.html
Philip
As a practical application that combines the guard clause with the "return soonest" philosophy, one ASP.NET pattern I find myself using it a lot is in code like this:
Sub Button1_Click(sender As Object, e As EventArgs)
If Not Page.IsValid Then
Return
End If
' Actual handler code
End Sub
... or the like.
mike on January 11, 2006 3:24 AMHi Jeff,
Correct my if I'm wrong but your suggested refactorings, whilst making the code read better, won't actually reduce cyclomatic complexity?
Josh Twist on January 11, 2006 3:52 AMWhen refactoring one must be careful to preserve semantics. Nested if clauses are conjunctions, not disjunctions:
if (something) {
if (somethingelse) {
if (somemore) {...
actually combines into
if (something && somethingelse && somemore) {
All the clauses must be true for the inner block to execute.
Introducing exceptions also change the contract of the function. There were no exceptions before. Now you need to write catch handlers in all calling code. You need to consider the cost of throwing an exception is roughly 10000x the cost of executing an if statement.
A better refactoring would be to extract the inner if clauses into a separate function, thus breaking off the tip of the arrow:
if (drc[rowIdx].Table.Columns.Contains("avalId"))
{
do
{
BrokenOffBit( Attributes, attrVal, isChecking, rowIdx );
rowIdx++;
}
while(rowIdx < rowCount && GetIdAsInt32(drc[rowIdx]) == Id);
}
else
rowIdx++;
I originally stormed into the coding world throwing guard clauses and multiple function returns out left and right to cut down on those darn curly braces.
Then I would be forced into debugging a problem. And suddenly those exit points made the ol standby of println() retVal pretty tough to pull off.
So my "rule" looks a bit like this: If the function fits on a less than a dozen lines (and they do, 95% of the time) then multiple returns are A-OK. Once you get into some of the ganglier methods, you can have one or two guard clauses, but everything after than has to have a single return value so you can debug easier.
While I'm rambling about guard clauses, has anyone used those plus some coding guidelines about logging every public method call? Where do you put the log statement? If you enter the method and nothing happens, then it's a bit misleading to write it to the log on the first line of the method. But if you try to figure out where the log statement goes, you spend extra time figuring out where to actually log that something happened. I guess I could write more statements, but at some point you end up logging every line of code.
Anyway, great entry!
Steve Jackson on January 11, 2006 8:52 AMCall me archaic, but combining if's can be bad in many cases.
1) It often makes the code less readable. If the multiple items in the if are not logically grouped, then following that logic can be fuzzy sometimes.
2) Not all languages support short-circutting. VB Classic and VBA are prime examples. Combining IF's can lead to bugs and slower code in those cases.
Eric D. Burdo on January 11, 2006 9:07 AMRegarding the multiple exit points, I agree completely.
I inherited a partially written application from someone. In looking through the code, I found a function that went something like this (Delphi code):
strRet := 'Fred';
if (somecondition) then
strRet := 'Something';
if strRet = 'Fred' then
if (someothercondition) then
strRet := 'Something else';
if strRet = 'Fred' then
// and so forth for 30 additional conditionals
Result := strRet;
I asked the original developer why he'd done it this way. The response? "There should be only one exit point."
Needless to say, all of the tests against 'Fred' soon disappeared, and there certainly are more exit points.
Ken White on January 11, 2006 9:59 AMCodeRush/Refactor make the process of working with these kinds of blocks easier, and refactoring them quickly. Structural Highlighting (CodeRush) draws lines between the start and end of the block, which makes them more clear. With Refactor, you can use the "Replace Nested Conditional with Guard Clause" refactoring to easily flatten your code for better readability, or the "Extract Method" refactoring to move part of the block to its own method. If you're interested, check it out at http://www.devexpress.com/products/NET/Coderush/. Definitely check out the "CodeRush Training" link from that page, and download the trial to see it for yourself.
Disclaimer: I'm just a happy CodeRush/Refactor user, and have no commercial interest in either.
> your suggested refactorings, whilst making the code read better, won't actually reduce cyclomatic complexity
#1, #2, and #4 definitely will reduce cyclomatic complexity-- less IFs are nested, therefore there are fewer total paths through the code. The total number of IF statements might be the same, but they are no longer nested or they've been moved to another function (and no longer nested, at least not in the original function).
The suggestion to combine multiple IF statements on the same line, ala
if (something || somethingelse || somemore) {
Won't reduce CC at all. But that's not on my list..
> has anyone used those plus some coding guidelines about logging every public method call?
Why not just dump the stack trace at any given time rather than doing explicit logging? The stack trace will give you the current method for sure..
Jeff Atwood on January 11, 2006 12:45 PM> You need to consider the cost of throwing an exception is roughly 10000x the cost of executing an if statement.
Sigh... The good old premature optimization. As well, in some languages, throwing exceptions is actually not expensive at all.
foobar on January 11, 2006 1:31 PMfoobar -
Making a good policy on how to use exceptions is not a premature optimization. If you are exposing functionality that can only be reached by catching an exception then (on most platforms) you have a bad design. I'm not claiming devs shouldn't use exceptions, but they shouldn't fall into the same Parse vs TryParse trap the .Net runtime fell into before clr 2.0.
Are exceptions control structures? My answer is "no." Maybe I'm just an old fuddy-duddy but throwing exceptions just to guard the execution of code hides the responsibility evaluate and react to conditions but doesn't make it any easier to satisfy.
Terrier on January 11, 2006 3:18 PM> If you are exposing functionality that can only be reached by catching an exception then (on most platforms) you have a bad design.
This is a wholly different issue than not using exceptions because they're "10000x" slower than an if statement. I suppose that means that .NET is "10000x" slower than some other language - look at all the types that throw exceptions! They should have given me an HRESULT instead. What were they thinking?
foobar on January 11, 2006 7:43 PMThere was a lot of good information in this post, however I too, like some of the other posters, like to only return from a function in one spot.
With a well written nested if statement, it is extremely easy to deallocate memory and/or do other cleanup at the correct times. With "return" statements sprinkled through a function, it gets hard to ensure you don't leak memory. Especially if you edit a previously written function, you will have to find all the return statements and figure out whether memory should be freed there.
In languages where memory mangement is done for you (i.e. "It drops out of scope and the system frees it for me") this is somewhat less of a problem, but then tends to make programmers lazy and forget times where the DO have to clean up after themselves.
G-Man on January 11, 2006 9:03 PM> Why not just dump the stack trace at any given time rather than doing explicit logging? The stack trace will give you the current method for sure..
Because that requires debugging symbols to be deployed with your assemblies, which isn't necesarily a bad thing IMO, but it's a constraint nonetheless. I posted my opinions about deploying pdb's here: http://jaysonknight.com/blog/archive/2005/11/09/2900.aspx
jayson knight on January 12, 2006 4:18 AMIn general, if the functions are small and compact, you can avoid the type of nesting shown.
I think, if you can, try for one return point in a function. This will enable you to put 2 debug statements per function, one at the start marking the arguments passed in and one at the end marking the return value back to the caller.
If you passing objects as parameters, this becomes more difficult, but if the functions are small and compact and doing a small amount of work, this works out quite well.
Additionally, I think debug statements that are used in this fashion gives a developer a good roadmap of what the application is doing without being to0 obtrusive or flooding the code with many debug statements. If there are many debug statements inside of function, perhaps the function needs to be refactored into smaller parts to follow this coding methodology.
I usually allow the deployment team to turn on or off the debug statement with a configuration setting. That way, if something goes wrong, we can turn on debugging to see what is going on without having symbols or a debug build or stack traces. If things really become hairy, then we get out the big guns and get dump files, etc.
Another benefit is that I can turn on debugging locally and look at the trace file and verify all the unit tests ran properly by verifying the output from the debug listener if I'm adding in new code (methods, functions, etc.)
Jon Raynor on January 12, 2006 1:29 PMHey Jeff,
Love this post and the discussions it has brought up. What are your thoughts, in an IF statement, on testing the scenario that is likely to happen most of the time? Does that matter for complexity sake or for readability? I mean this way you can avoid going through the rest of the code most of the time but what if it lessens the readability or checking for a negative versus a positive. Below is a very generic scenario but I think you know what I mean.
Thanks
if(!newsletter.subscribe)
{
// code to unsubscribe
// occurs 90% of the time
}
else
{
// code to subscribe
// occurs 10% of the time
}
return;
Erik Lane on January 17, 2006 10:56 AM> What are your thoughts, in an IF statement, on testing the scenario that is likely to happen most of the time? Does that matter for complexity sake or for readability?
Can you implement this check as a guard clause rather than an if..else statement? If so, do that.
I feel very strongly that non-negative if clauses are easier to read, but it's just a guideline. If there's some extremely compelling reason to do it the other way, then use your discretion.
Jeff Atwood on January 17, 2006 12:48 PMFantastic entry. Seriously.
Scott Hanselman on February 4, 2006 9:19 PMDo not attribute to malice what can easily be explianed through incompetence. -Napoleon Bonaparte
brian on January 10, 2007 12:26 PMJeff, I happen to agree about positive if clauses - out of curiosity, do you ever find that sometimes it is even beneficial to do something like this?
if (condition) {
// Do nothing
} else {
... real code here ...
}
Assertions is a lot better solution than refactoring the same code through the same function. Assertions, with tests cases, can guarantee you that your function parameters are good. You then don't have to code all those nasty 'if' and 'else' to test the validity of your parameters. As your parameters have always a good value, you can code functionality straight.
Jrme Radix on October 26, 2007 10:48 PMBigger Screens!!!
Someday we'll all have huge screens where the entire source code for any major project will all fit on one screen. Have some imagination!
How many acres of screens for the source code to Windows Vista? What new hardware technology do we need to see the whole thing just by moving our eyes and head?
JF
Re: Eber Irigoyen
>> also many times it is possible to combine the "ifs" into a single if:
>>
>> if (something) {
>> if (somethingelse) {
>> if (somemore) {...
>>
>> into
>> if (something || somethingelse || somemore) {
>> ...
Good point, but you should put logical AND (&&) instead of OR (||).
Thank goodness! The experience of practice defeats the supposition of theory.
I've written some HUGE procedures (not recommended), A Expression Evaluator routine that creates the Stack. For some reason when I first started, I was insistent on that "single exit" BS. I soon realized that it was making my code the very opposite of what the method was supposed to do. FAR less readable. My first few iterations were created in a flurry of programming inspiration, and where thus devoid of comments. So When I went back to the code later on I couldn't figure out why execution remained in the function so long after the return value was decided. Very little clean-up code was necessary, which is to say it would all be done anyway by VB. Needless to say, I reverted to the both more readable and faster "get the heck out of there as fast as possible" idiom. No problems since then.
BASeCamper on August 28, 2008 3:49 PMThis arrow pattern fixed can be also done like:
: (function1)
do something
;
: function1
condition1 IF EXIT THEN
condition2 IF EXIT THEN
condition3 IF EXIT THEN
(function1)
;
but you don't like the IF EXIT THEN all the time, so you do like that instead:
: ?-EXIT IF R> DROP THEN ;
or it can be like:
: function1
condition1
condition2 AND
condition3 AND
IF (function1) THEN
;
but it is a bit slow if there are a lot of million of conditions to check.
Can someone explain why having multiple return points in a function that you're debugging is bad?
If you want to echo the value that it's going to return, just use "echo blah()" from outside the function, instead of "echo $blah" from inside?
If it's "easier to read", you should probably work on your commenting, no?
Is there something I'm missing?
Schmoo on October 31, 2008 4:13 AMOne point to make here:
Be careful with those gotos.
A return is effectively:
set_return_pointer(value);
goto where_i_was;
And has one of the problems of gotos: you can't see where the goto/return comes from until you find the goto/return statement. The improvement on gotos is that you can't goto the middle of the code block, only come from it.
So, although I see that multiple returns can make for simple code, so do gotos, so the result is similar. If you keep your function simple (as you really, really should) then the gotos will be clear, and you don't have to worry too much.
Phil H on December 18, 2008 6:01 AMSchmoo: sometimes you might want to print internal state of the function that isn't returned; temporary calculations and such. It helps if your function isn't one-to-one.
Louis Wilson on May 29, 2009 10:39 PMJeff, I completely agree with the approach. Infact I have been religiously encouraging guys around me about this for ages.
- Validate the application state in which a particular task cannot be done. All such validations if they fail, they will raise an error or if logically the task cannot be done, just return.
void doSomething(...) {
if( !validation_condition1 )
return; /* possibly raise an error */
if( !validation_condition2 )
return; /* possibly raise an error */
if( !validation_condition3 )
return; /* possibly raise an error */
/* do your job without further checks */
doJob...
}
> if(!newsletter.subscribe)
I personally am sick of seeing such ! written in code, why cannot a developer write explicit comparisons with true or false... here I had to think for a while what the statement is trying to do, then I found out that it is actually
if (newsletter.subscribe == false)
This is one of my pet peeve.
Ajit on June 4, 2009 6:06 AM>> if(!newsletter.subscribe)
>I personally am sick of seeing such ! written in code, why cannot a >developer write explicit comparisons with true or false...
! can be read as NOT
so it translates to 'if NOT newsletter.subscribe'. Read like that it is pretty easy. If false comparison had to be used I'd prefer it on the left.
if ( false == newsletter.subscribe )
On the right it's too late.
P Byrne on June 9, 2009 2:37 AMCannot disagree with point 4 strongly enough.
The number of times I have had to fix (other peoples) code when they have added a return which then skips code which must be run every time or someone has code which must be run every time but has missed that there is a return somewhere in the method. Even when there is only half adozen lines or so of code. Even when the coder is experienced.
The single exit point is there to reduce complexity. I believe it enhances readibility especially if combined with design by contract principles (see Bertrand Meyer's work on this). As can be suggested by design by contract, if the method is called in an incorrect state then contract is broken and exceptions can/should be thrown.
Of course, you can argue that this can be checked for by TDD, however, I think that throwing out decades of software engineering practices for a more agile one is being too black and white. As an industry we need to integrate the best practices each (and many other sources) have to offer.
Additionally, if you need to jump out of a method mid-stream to make it clearer then I would assert that the method is not clear enough in the first place and needs refactoring.
BTW, I am completely amused that the Captcha words I am about to enter are 'abstain' and 'backlash'. Surely I will get some backlash for my belief that we should all abstain from implementing point 4!!!
Gary Varga on July 17, 2009 12:59 AMAgree with most of your post but have an issue with rule #3.
=====================================================
if (Attributes[attrVal.AttributeClassId] is ArrayList)
{
// if this is many lines of code ...
}
else
{
// ... the exceptional circumstance can be easy
// to miss
}
=====================================================
For this reason, I put the exceptions to the rule up front - which is pretty much like your rule #1, except I'm not talking about exceptions in this case - just exceptional circumstances (like boundary conditions)
=====================================================
if (Attributes[attrVal.AttributeClassId] is NOT ArrayList)
{
// deal with the exceptional circumstance
// (nothing to do with exceptions)
}
else
{
// deal with the standard circumstance
}
=====================================================
| Content (c) 2009 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved. |