March 19, 2009
I've been a longtime fan of Eric Lippert's blog. And one of my favorite (albeit short-lived) post series was his Five Dollar Words for Programmers. Although I've sometimes been accused of being too wordy, I find that learning the right word to describe something you're doing is a small step on the road towards understanding and eventual mastery.
Why are these words worth five dollars? They're uncommon words that have a unique and specialized meaning in software development. They are a bit off the beaten path. Words you don't hear often, but also words that provide the thrill of discovery, that "aha" moment as you realize a certain programming concept you knew only through experimentation and intuition has a name.
Eric provides examples of a few great five dollar programming words on his blog.
There are two closely related definitions for idempotent. A value is "idempotent under function foo" if the result of doing foo to the value results in the value right back again.
A function is "idempotent" if the result of doing it twice (feeding the output of the first call into the second call) is exactly the same as the result of doing it once. (Or, in other words, every output of the function is idempotent under it.)
This isn't just academic. Eric notes that idempotence is used all the time in caching functions that create the object being requested. Calling the function two or a thousand times returns the same result as calling it once.
Imagine for instance that you were trying to describe how to get from one point in an empty room to another. A perfectly valid way to do so would be to say how many steps to go north or south, and then how many steps to go northeast or southwest. This hockey-stick navigation system is totally workable, but it feels weird because north and northeast are not orthogonal -- you can't change your position by moving northeast without also at the same time changing how far north you are. With an orthogonal system -- say, the traditional north-south/east-west system -- you can specify how far north to go without worrying about taking the east-west movement into account at all.
Nonorthogonal systems are hard to manipulate because it's hard to tweak isolated parts. Consider my fish tank for example. The pH, hardness, oxidation potential, dissolved oxygen content, salinity and conductivity of the water are very nonorthogonal; changing one tends to have an effect on the others, making it sometimes tricky to get the right balance. Even things like changing the light levels can change the bacteria and algae growth cycles causing chemical changes in the water.
Orthogonality is a powerful concept that applies at every level of coding, from the architecture astronaut to the lowest level code monkey. If modifying item #1 results in unexpected behavior in item #2, you have a major problem -- that's a form of unwanted coupling. Dave Thomas illustrates with a clever helicopter analogy:
It sounds fairly simple. You can use the pedals to point the helicopter where you want it to go. You can use the collective to move up and down. Unfortunately, though, because of the aerodynamics and gyroscopic effects of the blades, all these controls are related. So one small change, such as lowering the collective, causes the helicopter to dip and turn to one side. You have to counteract every change you make with corresponding opposing forces on the other controls. However, by doing that, you introduce more changes to the original control. So you're constantly dancing on all the controls to keep the helicopter stable.
That's kind of similar to code. We've all worked on systems where you make one small change over here, and another problem pops out over there. So you go over there and fix it, but two more problems pop out somewhere else. You constantly push them back -- like that Whack-a-Mole game -- and you just never finish. If the system is not orthogonal, if the pieces interact with each other more than necessary, then you'll always get that kind of distributed bug fixing.
Immutability is a bit more broad, but the commonly accepted definition is based on the fact that String objects in Java, C#, and Python are immutable.
There's nothing you can do to the number one that changes it. You cannot paint it purple, make it even or get it angry. It's the number one, it is eternal, implacable and unchanging. Attempting to do something to it -- say, adding three to it -- doesn't change the number one at all. Rather, it produces an entirely different and also immutable number. If you cast it to a double, you don't change the integer one; rather, you get a brand new double.
Strings, numbers and the null value are all truly immutable.
Try to imagine your strings painstakingly carved out of enormous blocks of granite. Because they are -- they're immutable! It may seem illogical that every time you modify a string, the original is kept as-is and an entirely new string is created. But this is done for two very good technical reasons. Understanding immutability is essential to grok string performance in those languages.
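A quick way to see this for yourself, sketched in Python (the variable names are arbitrary): "modifying" a string really builds a new one, and trying to change a character in place is refused outright.

```python
original = "granite"
alias = original              # a second name for the same string object

modified = original.upper()   # builds a brand new string
print(original)               # granite
print(modified)               # GRANITE
print(alias is original)      # True

try:
    original[0] = "G"         # strings do not support item assignment
except TypeError as err:
    print("TypeError:", err)
```

The original string is untouched after `upper()`, which is exactly the carved-in-granite behavior described above.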
I don't pretend that these three words are particularly unique or new, just a tiny bit off the beaten path. They were, however, new to me at one time, and discovering them marked a small milestone in my own evolution as a programmer.
What's your favorite five dollar programming word? And how did it help you reach that particular "aha" moment in your code? (Links to references / definitions greatly appreciated in comments -- perhaps we can all discover at least one new five dollar programming word today. Remember, learn four and you'll earn a cool twenty bucks worth of knowledge!)
Posted by Jeff Atwood
Well, it's a two word phrase but you have to love 'Cyclomatic complexity'.
Oh yeah, transitive closure. Understanding that one is important. Luckily it's pretty simple, and easily explained with a quick whiteboard diagram. It comes from graph theory which many may dismiss as a froofy subject (academic astronauts, anyone?), but you'd be amazed at how applicable it is.
As a simple example, imagine you are trying to remove unused code from your application. Conceptually, all you need to do is compute the transitive closure of functions called starting at main(), and remove everything that is NOT in that set. FxCop does this to generate warnings about unused code, and C++ linkers do as well for removing code from compilation units (i.o.w., if x.obj has 10 functions, but when app.exe links to x.obj there are 3 that app.exe doesn't use, then those 3 functions are discarded from the final executable).
Garbage collection is also based on this. Compute the transitive closures of reachable objects from all your roots (e.g. local variables at the top of every thread's callstack, and from all static fields), and anything that isn't in that set is eligible for collection. Those objects are not reachable.
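For anyone who prefers code to whiteboards, here is a rough Python sketch of the idea. The call graph and function names are invented for the example, and real linkers and garbage collectors are far more involved; this only shows the "everything reachable from the roots" computation.

```python
from collections import deque

def reachable(graph, roots):
    """Return every node reachable from `roots` by following edges."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

# Hypothetical call graph: each function maps to the functions it calls.
calls = {
    "main": ["parse", "run"],
    "parse": ["tokenize"],
    "run": ["parse"],
    "tokenize": [],
    "legacy_export": ["tokenize"],  # never called from main
    "debug_dump": ["legacy_export"],
}

live = reachable(calls, ["main"])
dead = set(calls) - live
print(sorted(live))  # ['main', 'parse', 'run', 'tokenize']
print(sorted(dead))  # ['debug_dump', 'legacy_export']
```

Swap "functions" for "objects" and "calls" for "references" and the same traversal describes the mark phase of a tracing collector.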
Thanks for the shout-out once again Jeff.
A favourite five-dollar word not mentioned already that is slightly relevant to programmers is boustrophedonic. Most dot-matrix and inkjet printers are boustrophedonic -- that is, they alternate going left-to-right and right-to-left.
Chaos aka Cohesion aka High Granularity
I think Cyclomatic is a pretty good $5 word. As in Cyclomatic Complexity.
Funky and hokey, two very important terms that have been overlooked.
the first time i saw this, i thought my c++ compiler was yanking my chain. But i had done something terribly wrong with virtual functions and i learned a lot that day about multiple inheritance
also: authenticate and authorize. seems trivial now but back then i didn't understand that there was a difference.
Mine was 'normalise' (a long time ago, thanks!). The light that went on at that moment lit up the entire world of databases-as-something-other-than-a-big-file.
I don't think the second two are $5 words, they're pretty common. Idempotent is a great one, though. Another one I like is reify: to make something abstract concrete or real. So in programming, you might have something abstract like the intermediate state of a search algorithm. When you make an object that perfectly encapsulates that state, you've reified it.
Normalisation would be one for me. It expresses the Don't Repeat Yourself ethic in terms of minimising the number of copies of data, and instead articulating relationships between the data.
So these are five dollar words...now you know..
Mine is Orthogonality, i first read about it in the Pragmatic Programmer book ;)
I do love things connected to maths!
Canonical - the canonical method recognized, authoritative, authorized, accepted, sanctioned, approved, established, orthodox. antonym unorthodox.
the only aha moment i had was when I found the definition for what it meant - in the context it's always used in: examples of doing things the right way.
Polymorphism! Especially parametric polymorphism---a fertile source of reuse.
My favourite five dollar word has to be 'sucky sucky'
Really? I thought blogging was supposed to replace dead tree venues for attaining knowledge. I think that idea is close to being resoundingly disproven. Is it ironic that you're using (or actually quoting someone else, as always) the phrase "Five dollar word", which is really just another anti-intellectual epithet? I don't know anymore.
Mine are the four words that make up the ACID acronym - Atomicity, Consistency, Isolation, Durability.
I'm with BobF - it had to be Objects ... I remember being completely clueless and struggling through Basic, C, Turbo Pascal, C++ ... and then discovering Object magazine at the local newsstand. It took me 2 years of reading before I really got it.
My suggestion is deprecation, deprecate or any of the variations. Useful, but try to use it in a meeting; they could have used obsolete and made it easier for managers.
I think this is a good $5 word along the lines of idempotent:
*Inverse*. Two functions f and g are inverses if g(f(x))=x. Common example in programming is a pair of functions decode and encode or decrypt and encrypt.
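A small Python sketch of such a pair, using the standard library's `base64` module. The wrapper names `encode` and `decode` are just for illustration.

```python
import base64

def encode(text):
    """Turn a string into its Base64 text form."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def decode(blob):
    """Undo encode(): recover the original string."""
    return base64.b64decode(blob.encode("ascii")).decode("utf-8")

message = "five dollar words"
print(decode(encode(message)) == message)  # True
```

Note the direction: `decode(encode(x)) == x` holds for every string, but `encode(decode(y)) == y` only makes sense when `y` is valid Base64 to begin with.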
My favourite one is SEMANTIC. It pops up everywhere and makes me really think about what I want to do with the code I'm writing
Two of them. They are even pretty common.
Indirection and abstraction.
These are so fundamental, so powerful, so common.
But there's a common level of programmer who doesn't know either.. and those are doomed to remain scriptmonkeys forever.
I'd have to go with Mission Critical (yes, I know it's two words) or, even more impressively, Enterprise.
A linguist might be interested in comparing the frequency of words used in .ppt files versus words in a standard (e.g., newspaper) corpus. I suspect words like refactor, instantiation, and use-cases would be severely over-represented. Once (non-technical) managers adopt technical words, they lose their original meanings, and may be used to explain anything.
BTW, now you're a Dad: Noone Asks My Baby's Leakage Average.
I actually used the words domain and range yesterday, discussing a web service that accepted and returned xml in a certain format. A web service is a *function*.
monotonic - values must be monotonically increasing
hysteresis/equilibrium - feedback loop to manage data queue depth
reentrant - not quite the same as thread-safe
I’m not sure if I understand the first definition of idempotent. Are you saying that if foo is a function, then x is idempotent under foo if
x == foo(x)?
After some brief searching I’ve not been able to find any other source that defines idempotent this way. (Not saying that this isn’t a valid definition, but it doesn’t seem to be standard).
Is it possible this definition is being confused with that of an idempotent element of a binary operation bar,
x == bar(x,x)?
your interpretation of idempotent conflicts with the quote you used to describe it.
the quote says: f(x) = f(f(x))
you equated idempotence with the value of a function depending only on its inputs (functional).
My answer would have to be functor in the mathematical sense. Simple, straightforward, two laws, millions of instances of them.
Reference (or pointer) - The thing you're changing is not a real value, it's actually the value of some other thing, which could also be a reference to something else.
@Mark 'Mission Critical' and 'Enterprise' are more like $500,000 or $5M words because nothing that costs 'only' 5 bucks can be worthy of the 'Enterprise' ;).
Referential Transparency (a 10 dollar phrase?). It means that an expression can be replaced with its value and the program won't change. Purely functional languages are referentially transparent since there are no side effects and anytime you call a function you'll get the same result. Wikipedia has some examples.
For me it's race condition. From Wikipedia:
A race condition or race hazard is a flaw in a system or process whereby the output and/or result of the process is unexpectedly and critically dependent on the sequence or timing of other events. The term originates with the idea of two signals racing each other to influence the output first.
Favorite words -
Structured - as in structured programming
Function - as in not a subroutine
Can anyone give an example of a context in which orthogonal could NOT, without loss of meaning, be replaced with independent ?
Seriously? You link to another of your posts about him instead of to his blog? Oh the hilarity when you mention him again and link to this post.
–verb (used with object), -cat·ed, -cat·ing.
1. to implant by *repeated statement* or admonition; teach persistently and earnestly (usually followed by upon or in):
to inculcate [coding standards] in the young.
You'll find that many classic Don Box writings will use these $5 words, many times, often in the same sentence.
float getShippingValueAfterCalculatingRateThatInfluencesProductPlusSomeOtherRandomWords(Product product);
See, rather keep the method name a bit smaller, still communicating intent, and put the rest in the code comments.
And I consider us bloggers deipnosophists of the wild wild web ;-)
Coupling as in loose coupling. I've been avoiding tight coupling ever since I first heard the expression years ago and it's saved me many, many times.
Meh. Words are useless unless they're obvious to *everyone* what they mean. And Idempotent is certainly not obvious.
So in a comment you'd have to write:
//This function is idempotent (i.e. it returns the same value if you run it multiple times)
(or whatever it means)
which means that this word is more of a barrier than anything.
For me the word was Inherit. That opened a whole new world for me.
As for Idempotent, consider this sequence:
x = foo(bar)
y = foo(bar)
Now, if x and y are equal, foo() is idempotent.
I think there might also be something with global state in the definition, but I can't work out exactly how that can fit in. (Taking the example of a cache, if no global state changing across the call is a requirement to be idempotent, then a function that might fetch a result from cache or fetch from remote server is never idempotent, because it'd implicitly be updating the cache with the fetched value if it was not in cache.)
To me, 'refactoring' fits the bill. It seems like such an invented word. Re-Factoring. When was it factored in the first place? The code was never FACTORED - it was WRITTEN.
I feel like i´m in my math class!!
Idempotent is used all over the HTTP spec. It's the biggest difference between an HTTP GET and an HTTP POST request; a GET request should be idempotent, but a POST request might not be. If you follow this, you will usually not break the back button.
That's the definition of Deterministic, not Idempotent.
f(x) = f(f(x))
One that comes to mind is invariant.
Involution, least fixed point and coinduction are nice words.
The example of caching functions wrt idempotence is not very useful, because what you call functions here are really procedures with a side effect. In mathematics (and functional programming), function means something else entirely. Therefore, calling this constant behavior idempotence is a bit misleading, and not very useful IMO.
A better example of idempotence would be:
fₐ: x ↦ x % a
where % is the modulo. Example: 25 mod 10 is 5, and 5 mod 10 is 5.
Another non mathematical example would be a function FQDN which would return the canonical, fully-qualified domain name, given a name. Or one that returns the capitalized version of the input string.
That is NOT idempotent. Idempotent is from the database world and just means that you have the same results IN A DATABASE if you repeat the operation. The standard example is a withdrawal from a bank account (or credit card charge, etc.) You only want the balance to be debited once even if the action is submitted multiple times. That is, the CONSEQUENCES of the action are the same, not the return value of the function. It's easy to see how you could be confused if you don't have the right context for the description.
You can generalize this to identical consequences in any stateful system. That's where the HTTP definition comes in. They're saying that you should get the same results no matter how many times you do a GET, e.g., if you're submitting a request for your bank account balance. If you want to change something, e.g., you're transferring money between accounts, you should use a POST.
x = f(x) is identity.
If I'm paying $5 for a word, my word is 'poontang'...
Anyway seems like a lot of people pretty uptight about what was a reasonably entertaining read. If it's not any value to you, you might find your time better spent elsewhere instead.
A simple example of idempotent
As Chris questions if the following complies with the meaning of the word:
f(x) = f(f(x))
The answer is: Yes. That is idempotent.
Consider the following implementation.
int f(int someValue) {
    return someValue * 0;  // always 0, so f(f(x)) == f(x)
}
It does not matter if you call it recursively or not. It does not matter if it is a rainy day, you always get the same results.
- A function that returns the value of Pi.
- Save a unique record in a table e.g Save bank transaction #2343. Does not matter how many times you save the record, it will result in that record being unique in the table.
You want your code to be idempotent when you want to prevent duplication or conditional behavior.
Like preventing me from posting this comment multiple times by simply pressing the 'submit' button frenetically. You want the 'submit' button to be idempotent.
For me, although quite recent, it would be all the words associated with Functional Programming:
Currying, closure, immutability
although closures are not inherent to FP, it took me quite a while to grasp the concept, which is pretty simple when it finally 'clicks', hence the AHA moment. I'm not a .net guy but i remember they were added to C# quite some time ago.
Immutability is a big one. I looked at the FP crowd as if they were crazy for a long time, i mean, what's the use of an immutable variable.
This one is beginning to click for me, i'm in the middle of an AHA moment :)
These aren't quite as fancy as some of the examples above, but I always thought Verification and Validation were fun ones since it's easy to switch the meanings or forget the difference. (Vaguely: Verification is "Does it do what the requirements say?" and Validation is "Does it do what the user wanted?")
I just took my Sun Certified Java Programmer exam. Covariant, cohesiveness and coupling stood out quite prominently in the study guide for the examination.
Eric Lippert's blog entry is a little confusing because it gives one of the mathematical definitions of an idempotent function without explaining how that relates to the typical computer science definition of the term (i.e. from Wikipedia: "In computer science, the term idempotent is used to describe methods or subroutine calls that can safely be called multiple times").
To make things clearer, think of it this way. Any operation you call in a computer program operates on an environment (the state of the program, i.e. values of local, global, and system variables). If you think of an operation as a function f, and the current state of the environment as x, then any operation can be written as x1 = f(x), where x1 is the new state of the environment after the call. For idempotent operations, calling a method once may change the environment somehow, x1 = f(x), but calling it a second time should produce the same result, x1 = f(f(x)) and thus f(x) = f(f(x)). To give an example, suppose we have an operation that adds two parameters, and we assign the result to a third variable:
AddValues(x, y): x + y
And we call this operation twice:
z = AddValues(x, y)
z = AddValues(x, y)
it's idempotent, because the environment is the same after the first call as it is after the second call. Now, suppose we have an operation that adds values to a global accumulator.
AddToAcc(x): acc = acc + x
And we call it twice:
AddToAcc(x)
AddToAcc(x)
it's not idempotent, because the environment is different after the first call than it is after the second call. In other words f(x) != f(f(x))
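The environment view can be made concrete with a small Python sketch. Here the environment is modeled as a plain dictionary, and each operation takes an environment and returns the new one; the function names and the sample values are assumptions made for the example.

```python
def add_values(env):
    """z = x + y: rewrites z from values the call itself does not change."""
    new_env = dict(env)
    new_env["z"] = env["x"] + env["y"]
    return new_env

def add_to_acc(env):
    """acc = acc + x: each call builds on the previous value of acc."""
    new_env = dict(env)
    new_env["acc"] = env["acc"] + env["x"]
    return new_env

env = {"x": 2, "y": 3, "z": 0, "acc": 0}

print(add_values(env) == add_values(add_values(env)))  # True
print(add_to_acc(env) == add_to_acc(add_to_acc(env)))  # False
```

Comparing f(env) against f(f(env)) is a direct check of f(x) = f(f(x)) with x standing for the whole program state.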
Another blogger's post cut-n-pasted into your own blog. Stop.
I once tried to use the word concatenate in a history paper. My prof wrote in the margin something about not making up words.
People, most of these (canonical, coupling, invariant) have the same definition in other disciplines, so they're not software specific. Try learning something outside of your programming bubble!
2) Polymorphism - Basic Object Oriented Programming
3) Pointer/Reference - 'Nuff Said
4) Buggered - Technical term for describing the state of any given development project :)
It's interesting that all three of your $5 words have to do with resistance to change.
Immutable: can't change, period;
Idempotent: can't change after the first iteration (for a given input);
Orthogonal: can't be changed by the other quantity.
Knowing what is and isn't subject to change is of enormous benefit to the programmer. Const, final, invariant, value objects, and the like make our jobs easier because once established, we know what they've got. On the other hand, if everything were constant, programming would be impossible. Interesting software happens on the shore; one foot on solid dry immutable sand and one in wet uncontrolled chaos.
Like Serge, mine is Orthogonality which I picked up reading Pragmatic Programmer.
Ha: A colleague of mine says orthogonal all the time.
Words from set theory also come up frequently: Superset, subset, union, intersection.
If you're refactoring two functions to share some code, you find the intersection of code between them. If you are combining two products together into one, it will have the union of the two feature sets.
Concatenation - I know it's not a new word, and probably pretty common in everyday programmer-speak. But it's always seemed like an awfully complex (and hard to say) word for such a simple idea.
I have 2 words: re-entrant and reify.
(a) What the hell are you talking about?
(b) Why don't you educate us with your educated suggestions?
ephemeral - enduring a very short time;
The classic use is an ephemeral port, which is what you get when you don't specify a port to use for outbound network connections. The OS assigns an ephemeral port to use.
I would like to pile on here, in the idempotent-versus-deterministic train of thought: ABS() is both idempotent and deterministic; SQRT is deterministic but not idempotent. In the grand scheme of (caching) things, idempotent is rather unimportant and uninteresting IMO; deterministic is critical. Eric Lippert's examples of idempotency in http://blogs.msdn.com/ericlippert/archive/2005/10/26/483900.aspx don't really talk about function return values, but discuss side effects. Idempotent means "return the same output value for nested calls", not "I'm only going to do this once per execution context". That's not deterministic either: deterministic means returning the same value given the same input. Even extended to non-return values, doing something once is vastly different from doing the same thing. Five Dollar Words rarely help, especially when they are incorrectly used. And now I've used a whole buncha fifty-cent words to confuse everyone :)
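The ABS-versus-SQRT distinction is easy to check in a few lines of Python. The helper `is_idempotent_on` and the sample values are invented for this sketch, and testing a handful of inputs only illustrates the property; it doesn't prove it.

```python
import math

def is_idempotent_on(f, samples):
    """True if f(f(x)) == f(x) for every sample value."""
    return all(f(f(x)) == f(x) for x in samples)

samples = [0.0, 1.0, 4.0, 16.0, 81.0]

print(is_idempotent_on(abs, samples + [-3.0]))  # True
print(is_idempotent_on(math.sqrt, samples))     # False

# Deterministic: the same input gives the same output on every call.
print(math.sqrt(16.0) == math.sqrt(16.0))       # True
```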
this blog is such a waste of mental powers. i'm already regretting the two posts i made. i always regret reading this blog when i do, because the topics are nothing more than rehashing old topics.
this isn't blogging.
Carl has described those there, succinctly.
Idempotent is often misused, or used, poorly, to say "I use big words." Here's a good example of two functions that can be run once, twice, as many times as you want, and you'll get the same results the first time and the fiftieth:
f(x) = 1 * x (Note that applying f fifty times gives the same as f(x).)
g(x) = 0 * x
Note that g is the most degenerate (and trivial) example; f is fairly trivial.
Or'ing in the bit 0x1 is idempotent; it'll stay on, no matter how many times you OR in that bit. (exclusive-OR of that bit is, well, not idempotent.)
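The OR-versus-XOR point, sketched in Python with made-up helper names `set_flag` and `toggle_flag`:

```python
FLAG = 0x1

def set_flag(bits):
    """OR the flag in: once it is on, it stays on."""
    return bits | FLAG

def toggle_flag(bits):
    """XOR the flag: every call flips it."""
    return bits ^ FLAG

value = 0b1010
print(set_flag(value) == set_flag(set_flag(value)))           # True
print(toggle_flag(value) == toggle_flag(toggle_flag(value)))  # False
print(toggle_flag(toggle_flag(value)) == value)               # True
```

The last line shows what XOR gives you instead: applying it twice returns the original value, which is the "involution" mentioned earlier in the thread.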
My terms are often phrases:
(1) one-to-one correspondence, a.k.a. bijection. (The concept underlies a number of hash and database needs.)
(2) mutually exclusive. (Simple, but is all around us. Orthogonal means absolutely unrelated, and mutually exclusive includes "so intertwined in effects that you cannot have both", which is nearly opposite to orthogonal in some ways.)
(3) relatively prime (integers with no common factors, are relatively prime. 7 and 22 are relatively prime, so that you can do things with the two numbers as a pair that you might normally do only with a prime number and any-ol' other number, such as generating all the possible buckets in a hash table in one cycle through the list.) Prime numbers and relatively-prime pairs can get you into the part of Math called group theory, which is what guarantees the working of about every hash algorithm in existence.
(4) typeface versus font family. (Not only does it distinguish the people who do layout and the old prepress work, but leads you to start putting together documents that stick within 1-2 font families -- as opposed to just grabbing pretty typefaces -- and end up with a more professional look.)
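To illustrate item (3), here is a rough Python sketch of the "one cycle through every bucket" trick. The function name `visit_order` and the table size of 22 are assumptions for the example; the point is that a stride relatively prime to the table size touches every bucket exactly once, while a stride that shares a factor with it does not.

```python
from math import gcd

def visit_order(table_size, stride, start=0):
    """Bucket indices visited by stepping `stride` at a time, mod size."""
    return [(start + i * stride) % table_size for i in range(table_size)]

print(gcd(7, 22))                    # 1
print(len(set(visit_order(22, 7))))  # 22

print(gcd(4, 22))                    # 2
print(len(set(visit_order(22, 4))))  # 11
```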
@[d3m0n], jargon is useful within a subgroup. It allows precise meaning to be transmitted with little effort (the little effort is the key). Compare the phrase "Those two things are the same." to "Those two things are isomorphic." Both phrases have roughly the same meaning, but isomorphic tells the person who knows what it means how they are the same. Idempotent is similar in its power to transmit information.
Jargon is one of the reasons you generally require training to enter a field. Common English by itself is too imprecise for deep discussions.
Your nick and attitude imply that you consider yourself elite but as usual the comment betrays a lack of experience.
A word like idempotent not only carries deep meaning, but is an interesting enough word to make someone WANT to understand it. Not to mention you shouldn't need to put idempotent in comments; you need only understand and adhere to the principle when writing your code. If you are still documenting what your code does instead of letting it express itself and then documenting the why, you have a long way to go.
I bet you despise the term enterprise, too...
Refactoring is called refactoring because you are breaking the code down to its individual components (like factoring 12 down to 2, 2, and 3), rearranging these components (3, 2, and 2), and then expanding them back into a program that produces the same answer. Ensuring that the same answer occurs during refactoring is why testing is so important to refactoring. I am annoyed by people who use refactoring to mean rewriting (cleaning up the code, but also changing the answer).
I don't know why, but when I first started coding it took me awhile to grok implicit vs explicit. Lots and lots of other coding puzzle pieces clicked together for me when the difference finally penetrated my cranium.
Refactoring. Although now it seems like the most common programming word around. When I first heard the word refactor it made so much sense. There was a word for what I had been doing all this time.
One of my favorites, that I got from Godel, Escher, and Bach, is isomorphic. Two sets are isomorphic if there is a mapping between them that is both one-to-one and onto. For instance, S-expressions are isomorphic to XML. When I realized this, XML finally started making sense to me. Prior to the realization, I had thought of XML as a handy, if verbose, file format that was readable by humans. After the realization, I started to see all of the ways XML could be used: data structures, DSLs, etc.
Nomenclature (Nothing to do with gnomes)
Basically using the big fancy words that experts use instead of layman's terms.
e.g. "My two maxillary incisors" instead of "My two front teeth"
I actually am not a big fan of five dollar words.
Not because they aren't useful, but because I find they are used less often to describe programming terms, and more often to try and confuse the muggles among us.
Once you get used to using more technical words to describe what you're doing, it becomes difficult to explain technical concepts to people who don't understand the nomenclature.
@Frank Wilhoit - "Can anyone give an example of a context in which orthogonal could NOT, without loss of meaning, be replaced with independent?"
How about when discussing orthogonal vectors? If I remember my math correctly, orthogonal roughly translates as perpendicular, or at 90 degrees to each other; there's probably a bit more to it than that, but I don't recall the details.
Deprecated is big on my list, simply due to the fact that it is constantly being substituted by 'Depreciated', which is incorrect, but is also easy to see why.
Depreciate: to decline in value
Which is what you intend, right? Your old code no longer applies since it has been replaced with newer, more efficient code. In this context, it speaks to the quality of the code.
Meanwhile, Deprecate reads:
Deprecate: to express earnest disapproval of
Notice that this definition is not a reflection of the code quality, but in fact, your stance on that code. It may be the greatest code in the world, or the worst; this is irrelevant. The point of the term is that you, the developer, are making a statement of how that code should be handled by other programmers.
Wikipedia clears it up:
'software features that are superseded and should be avoided'
They may still have some functionality that is appropriate for your application, but you are making a statement about how other developers should treat it...as opposed to describing the actual code itself (and its value).
Wow, that was a huge nerd post for this early hour of the day.
I am by no means a functional programmer, but understanding a lot of its concepts and history has been highly enlightening. It was definitely an 'ah-huh!' moment when I realised how state was maintained across a functional program and how everything is immutable.
@[d3m0n], I forgot to add that they also act as shibboleths (another great five dollar word). Many see shibboleths as a bad thing because they are often used in an exclusionary way, but I try to use shibboleths in a good way. I use them to see if someone is already a member of my subgroup. If they aren't then I know that I have to expand my jargon terms into more accessible forms. The shibboleth isn't bad in and of itself, it is how you react to the person passing/failing the test that is good or bad.