I <3 Steve McConnell*
Coding Horror
programming and human factors
by Jeff Atwood

November 15, 2008

Your Favorite NP-Complete Cheat

Have you ever heard a software engineer refer to a problem as "NP-complete"? That's fancy computer science jargon shorthand for "incredibly hard":

The most notable characteristic of NP-complete problems is that no fast solution to them is known; that is, the time required to solve the problem using any currently known algorithm increases very quickly as the size of the problem grows. As a result, the time required to solve even moderately large versions of many of these problems easily reaches into the billions or trillions of years, using any amount of computing power available today. As a consequence, determining whether or not it is possible to solve these problems quickly is one of the principal unsolved problems in Computer Science today.

While a method for computing the solutions to NP-complete problems using a reasonable amount of time remains undiscovered, computer scientists and programmers still frequently encounter NP-complete problems. An expert programmer should be able to recognize an NP-complete problem so that he or she does not unknowingly waste time trying to solve a problem which so far has eluded generations of computer scientists.

You do want to be an expert programmer, don't you? Of course you do!

NP-complete problems are like hardcore pornography. Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it. Just this once, I'll refrain from my usual practice of inserting images to illustrate my point.

(Update: I was shooting for a poetic allusion to the P=NP problem here but based on the comments this is confusing and arguably incorrect. So I'll redact this sentence. Instead, I point you to this P=NP poll (pdf); read the comments from CS professors (including Knuth) to get an idea of how realistic this might be.)

Instead, I'll recommend a book Anthony Scian recommended to me: Computers and Intractability: A Guide to the Theory of NP-Completeness.

Computers and Intractability: A Guide to the Theory of NP-Completeness

Like all the software engineering books I recommend, this book has a timeless quality. It was originally published in 1979, a shining testament to smart people attacking truly difficult problems in computer science: "I can't find an efficient algorithm, but neither can all these famous people."

So how many problems are NP-complete? Lots.

Even if you're a layman, you might have experienced NP-Completeness in the form of Minesweeper, as Ian Stewart explains. But for programmers, I'd argue the most well known NP-completeness problem is the travelling salesman problem.

Given a number of cities and the costs of travelling from any city to any other city, what is the least-cost round-trip route that visits each city exactly once and then returns to the starting city?

The brute-force solution -- trying every possible permutation between the cities -- might work for a very small network of cities, but this quickly becomes untenable. Even if we were to use theoretical CPUs our children might own, or our children's children. What's worse, every other algorithm we come up with to find an optimal path for the salesman has the same problem. That's the common characteristic of NP-complete problems: they are exercises in heuristics and approximation, as illustrated by this xkcd cartoon:

xkcd: Travelling Salesman Problem

What do expert programmers do when faced by an intractable problem? They cheat. And so should you! Indeed, some of the modern approximations for the Travelling Salesman Problem are remarkably effective.

Various approximation algorithms, which quickly yield good solutions with high probability, have been devised. Modern methods can find solutions for extremely large problems (millions of cities) within a reasonable time, with a high probability of being just 2-3% away from the optimal solution.

Unfortunately, not all NP-complete problems have good approximations. But for those that do, I have to wonder: if we can get so close to an optimal solution by cheating, does it really matter if there's no known algorithm to produce the optimal solution? If I've learned nothing else from NP-complete problems, I've learned this: sometimes coming up with clever cheats can be more interesting than searching in vain for the perfect solution.

Consider the First Fit Decreasing algorithm for the NP-complete Bin Packing problem . It's not perfect, but it's incredibly simple and fast. The algorithm is so simple, in fact, it is regularly demonstrated at time management seminars. Oh, and it guarantees that you will get within 22% of the perfect solution every time. Not bad for a lousy cheat.

So what's your favorite NP-complete cheat?

Posted by Jeff Atwood    View blog reactions
« Stop Me If You Think You've Seen This Word Before
We Are Typists First, Programmers Second »
Comments

I would have to say 3-SAT. Simply because it sits at the root of most NP proofs and nearly any other NP problem can be reduced to it.

Good book BTW. Its still used in most universities for teaching advanced algorithms and algorithm theory.

Blair on November 15, 2008 6:39 AM

I say this often and I think it applies here rather well: The best solution to any problem is avoiding having that problem (not avoiding solving it) in the first place.

Calvin Spealman on November 15, 2008 6:55 AM

Regarding not being satisfied by approximate solutions, I like to point out that the original problem was itself an approximation. If a salesman really wants to visit a bunch of cities, there are lots of little things that go into determining what really is the best route that are not part of the simply-stated computer science problem -- avoiding roads with construction, not driving through major cities at rush hour, etc. -- and so your solution that's within 2% of optimal may be just as good as the exact solution at solving the real problem.

John on November 15, 2008 6:57 AM

approaches that use evolution in one way or another are pretty damned good. for the traveling salesman problem you could use a generic algorithm, for example. they're really fun programs to write too!

nanreh on November 15, 2008 7:19 AM

Up Up Down Down Left Right Left Right B A Select Start

Matt on November 15, 2008 7:26 AM

AAAARGH! So... many... inaccuracies... must... bleach... brain...
By the ancient Hynerian gods, Jeff, please never ever write about computer science again. Seriously.

Yes, maybe you got your main point across, but you could just have easily explain it all to laymen without making useless mistakes on the way. Well, *someone* could have gotten it right. It is evident that you couldn't... Or didn't even try.

The two most annoying are probably calling approximations "cheats", and your obvious lack of research into what "NP-complete" means (or even, what it IS).

Approximation algorithms and their bounds don't just appear out of thin air. They are more often than not invented, researched, proven correct, and upper-bounded by the same "fancy computer scientists" studying "intractable" problems. They are often quite a bit harder to reason about than the obvious (but slow) optimal solutions. Calling them "cheats" is like calling dynamic programming a cheat "because it just uses more memory".

Regarding NP-complete, it is not only a "fancy computer science jargon shorthand for incredibly hard". In the same vein, "gravity" is not a fancy physics jargon for "things fall down". There's a meaning behind the term, there's theory. Thank you for trivializing complexity and computability.
"Nobody can define what makes a problem NP-complete, exactly,"? Perhaps *you* can't. After all, it's just "fancy jargon", isn't it?
For those reading my rant this far, here's a quick informal definition from the National Institute of Standards and Technology:

A problem is in the NP-complete class if answers can be verified quickly (polynomial time in length of input), and quick (polynomial time again) algorithm to solve this problem can be used to solve all other NP problems quickly (this is called reduction).

You can also find it in the top of wikipedia's page on NP-complete here http://en.wikipedia.org/wiki/Np_complete .

M on November 15, 2008 7:33 AM

When doing the traveling salesman problem; do you include layovers as being a city that he/she "visits"? :P

Suroot on November 15, 2008 7:41 AM

Like M said (but less abrasively), there's a concrete definition of what NP-Complete means. It's mathy, but that just means it's a CS thing instead of a programming thing.

I don't mind calling approximations cheats, whatever gets you through the day.

Bill Weiss on November 15, 2008 7:51 AM

NP-Complete is perhaps the most non-intuitive name for a very very hard problem!

Wouldn't an optimal solution fall under the realm of premature optimisation? Shouldn't 2-3% off optimal be fine unless profiling determined otherwise?

It amazes me that the realm of computers, somewhat uniquely, is subject to the constraints of pure efficiency. For thousands of years we have done things the slow way then when we finally have a method of solving a problem 1 million times faster, we're expected to do it 1.001 times faster. Baffles the mind.

Take for instance, traffic lights. The most horribly inefficient system in existance today - affecting millions of people and I'd argue costing millions of dollars annually in inefficiency. Work on solving that problem before worrying about saving that poor salesman $100 in gas money.

`Josh on November 15, 2008 7:55 AM

3D Software Rendering

Hoffmann on November 15, 2008 8:33 AM

+1 for @M and @Bill Weiss: There are well-accepted terms of art for these; please don't call them cheats. Also, you failed to communicate the distinction between an approximation algorithm and a heuristic. The latter is an algorithm that works well in practice but isn't guaranteed to do so, while an approximation algorithm has a specific bound on how nearly optimal its answers are guaranteed to be. (There are also randomized approximation algorithms, which provably meet a bound from optimality with a specific probability, often dependent on specific characteristics of the input.)

However, I'll agree with the notion that every software engineer should understand NP-completeness at least well enough to recognize the possibility that he/she is encountering it.

Joe Ganley on November 15, 2008 8:42 AM

John:
"I like to point out that the original problem was itself an approximation. If a salesman really wants to visit a bunch of cities, there are lots of little things that go into determining what really is the best route [...]-- avoiding roads with construction, not driving through major cities at rush hour, etc. ..."

Thats why people invented weighted graphs.


Josh:
"NP-Complete is perhaps the most non-intuitive name for a very very hard problem!

Wouldn't an optimal solution fall under the realm of premature optimisation? Shouldn't 2-3% off optimal be fine unless profiling determined otherwise?"


It might not be intuitive, but it is rather logical when you know the theory behind it.

What the hell are you talking about? Profiling refers to running time, 2-3% off refers to optimality. If you already know that your problem is NP-complete and your instance sufficiently large, you won't have to do profiling (which will be rather unsuccessful anyway). But if you don't get that TSP is just an analogy to describe a kind of graph problem, which can come up in totally different contexts, you're a lost cause anyway.


Jeff, I have to concede with M here. Maybe you managed to give a somewhat perverted introduction to the topic.

But this one makes me cringe: "Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it."

Do you really think that? Do you think that thousands of papers and hundreds of books have been written about a topic, which is not clearly defined and that in theoretical computer science?

I'm the first to admit, that it might be difficult, if not impossible to define NP-completeness in a blog post and I would have been really amazed if you had pulled that off. But, blatantly lying and saying it can't be defined was the worst thing you could have done.

H on November 15, 2008 9:11 AM

With respect, M, Jeff did not say that there is no definition of the term "NP-Complete". He clearly knows what the term means, and even links to the Wikipedia article you cite. What he said is there is no way to define the characteristics of a problem that make them NP complete. That's a much more interesting problem, and he's right. We don't have an a priori way to tell which problems are going to be NP Complete, though we can recognize when a specific problem is. Its worth reading Jeff's wording carefully, as he's usually quite precise.

There's really no need to be so aggressive and patronizing. Especially if you aren't correct.

Dan

Dan on November 15, 2008 9:15 AM

"Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it."

I think what Jeff means is that nobody can define what makes a particular problem belong to NP, but not P. Of course, we don't know if any problems actually do satisfy this, but some (like the traveling goofball) seem to.

Eam on November 15, 2008 9:22 AM

Traveling Salesman is not NP-Complete, the decision variant is.
http://en.wikipedia.org/wiki/Travelling_salesman_problem#NP-hardness

I sortof expected little mistakes like that, though. Algorithmic complexity can be very counter-intuitive. For example, if you have to decide colorability on a planar graph:
2-coloring: P
3-coloring: NP-Complete
4-coloring: constant time [4-color theorem]

Strilanc on November 15, 2008 10:02 AM

That might be what Jeff wanted to say, however there is a strong definition of NP-completeness (a language L is part of NP and all other languages in NP are polynomial time reducible to L). As all problems in P are part of NP, this remains the same even if P=NP. In that case it won't be very interesting, as all problems in P, except always accepting and never accepting are P-complete, but it is still a valid definition.

H on November 15, 2008 10:16 AM

Strilanc,
if you start with Jeff not mentioning that he talks about decision problems, you might as well mention that he didn't state what type of reduction he is talking about.

H on November 15, 2008 10:25 AM

@dan:
I've set it before. Jeff's MO is the "google expert" MO (he calls it "just-in-time learning"). It may be fine if you're just trying to solve this or that problem and get on with your work. But if you're gonna talk about something, you had better do the research. And Jeff seldom does.

Obviously Jeff puts in considerable time in *writing* his blog posts. I appreciate that. Is it that hard to spend more minutes for *researching* them?
Or here's an idea! Don't talk about things that you clearly DON'T know about. Reading Wikipedia does not make one a domain expert, and is definitely not enough to start explaining this to others.

Every single time Jeff talks about something I know about, he make many small but not insignificant errors, betraying his ignorance on the subjects. Sometimes Jeff makes gross errors. The kind of errors that anyone who has a little bit of knowledge in that area would know.
It makes me wonder how much of what Jeff says about stuff I don't know about, is a mistake.

Also, trivializing the work of hundreds of computer scientists is extremely condescending. I actually avoided calling it what it is, in my previous post.
How can Jeff, with his 30 minutes of reading Wikipedia, claim that "no one can define what makes a problem NP-complete"?
And calling something 'fancy computer science jargon shorthand for "incredibly hard"' is not only condescending, it's down right insulting, when it comes from someone who very obviously don't know what their talking about.

My point? Such articles (read back a bit, this is not an isoladed incident) are superficial and shallow, poorly researched pieces.
I don't think it was Jeff's intention, but they promote and ENCOURAGE the notion that programmers don't need to study CS at all, that it's just a bunch of B.S.
I mean, why study intractable problems when we can "cheat"?

And like I said, it's insulting. I don't insult Jeff's work as a web programmer. I know NOTHING about web programming. I don't write about web programming in such manner.

Dan, "We don't have an a priori way to tell which problems are going to be NP Complete, though we can recognize when a specific problem is"
That's a great sentence. Very clear. But that's totally not what Jeff said. Again, I encourage you to read back through the archives, on posts dealing with subjects bordering on CS. Jeff is often wrong on the basic details, and you can read the comments and see him being corrected time and again.

M on November 15, 2008 10:30 AM

Wow, these comments have gotten pretty nasty.

I think what's coming out more than anything is evidence of the growing animosity between computer science (theory) and software development (practice).

@M: It's insulting? How many computer scientists have I heard in the past week imply this analogy:
computer science : mechanical engineer :: programmer : auto mechanic
I've known brilliant head-in-the-clouds computer scientists (PhDs and all) that write code at a rate of 5 lines per month (with comments).

Until both sides start respecting each other as completely distinct yet dependent professions, this kind of thing will be par for the course.

Anonyous Coward on November 15, 2008 11:06 AM

To all of the current posts detractors, if you dislike this site so much, why on earth do you even bother to read it? Does it make you feel superior to tear down and trash someone elses observations and comments?

On a somewhat unrelated note, I consider the "orange" CAPTCHA system that Jeff setup was brilliant, especially after reading his explanation of it.

I feel there is an IQ test somewhere regarding peoples complaints against it...

Davide on November 15, 2008 11:22 AM

"Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it."

Is this a joke?

Peter Davis on November 15, 2008 12:25 PM

Accusing Atwood of not explaining NP-completeness well enough here is like expecting your friendly college science professor to explain the mathematical and cosmological underpinnings of E=MC2 in 1000 words or less.

Or when the kindergarten teacher doesn't explain the Greco-Roman-Arabic derivation of the letters of the alphabet to his four-year-olds, do you start ranting and raving in his face, spittle flying, cursing his stupidity?

The piece was (obviously) an attempt to take a very difficult subject and make it palatable. I'd say it succeeded. It's a pale and weak sort of ego that would find anything offensive here.

Just my .02.

James Devlin on November 15, 2008 12:27 PM

I was wondering about the not having approximations for all NP-complete problems. The thing that is getting me is I thought that part of the definition of being in the NP-complete set is that a problem in the set is isomorphic with every other problem in the set. ( That is why proving that p=np or the inverse for one problem proves it for all of the problems. ) Given that why is it that we can not us a approximation for one on any of the others? Am I really confused about something here?

-Jonathan

Jonathan Moore on November 15, 2008 12:28 PM

<a href="http://en.wikipedia.org/wiki/Graph_theory">http://en.wikipedia.org/wiki/Graph_theory</a>

James Devlin on November 15, 2008 12:33 PM

My favorite NP-Complete problem is bin-packing. Maybe shape separability.

Dan Lecocq on November 15, 2008 12:35 PM

I, for one, rather enjoy Jeff's comments about computing. I work as a project manager for a small software company whose customers are not technical by any stretch of the imagination. Being able to communicate computer science ideas to the lay person, and make it enjoyable and understandable is not a simple task. Sometimes you have to dumb it down and stretch some ideas to make it work. So go bleach your brain, crawl back into your cave, and read your IEEE spec documents. Damn, can't we have more well rounded people in this industry???

Jon on November 15, 2008 12:37 PM

Why do you make some many laymen sort of posts about how we should all be learning more about computer science (when many of probably have CS degrees and have known about this since sophomore year of undergrad days) but you think learning C is useless?

Just wondering.

Charles on November 15, 2008 12:44 PM

To Jon who is railing about "more well rounded people in this industry:" You realize this is a technical blog right? Sure there may be non-technical "project managers" reading this but its kind of pointless to have a laymen sort of discussion with a manager of a K-Mart about Approximation Algorithms. Why would you even do that? You're just showing your ignorance by perpetuating some stereotype about technical people going back to their caves.

Charles on November 15, 2008 12:47 PM

@M Wow. You really know how to "take the bait" in every sense of the phrase!

Articles like these are exactly _why_ I read Coding Horror. I get a brief introduction to a subject I will find interesting, that I ought to be aware of the existence of, or both. I also get a little humour, and not so much as a trace of condescension about my lack of abilities or knowledge.

It's never occurred to me, *ever*, to just take Jeff's word for anything. Why would I? No subject that I find interesting is likely to be completely exposed by a few paragraphs, however accurate or well written, especially when a substantial amount of the writing is dedicated as it is to sarcasm, memes, or humourous links. Single-sourcing my learning is unlikely to be helpful either, so I never count on a single source, no matter the level of expertise involved.

Now I have a starting point for any research I care to conduct, a few terms that I can now spell properly, and a useful working definition of the subject at hand. That's right, not a correct definition; it doesn't need to be absolutely correct to be partially useful. I can also count on Jeff's unstoppable sense of self-promotion to require him to keep an eye out for humourous or light updates in future posts , so he can self-link. I don't have to keep a scan out for developments if I don't want a deep understanding of the subject.

Just to be clear, it is cheating, if you are the kind of person who believes that every program you write must be totally optimized right out of the gate. Where would Jeff get the idea that a substantial number of his readers are like that, I wonder?

Dustman on November 15, 2008 1:05 PM

Jeff rocks. His blog is wonderful, informative, and entertaining.

M is an anonymous coward.

Keep it up Jeff.

Swami Atma on November 15, 2008 1:37 PM

@Dustman: Thanks, you just saved me a LOT of time writing to make the same points but not as well. I'm a CS grad, and still got something useful from the post: a book recommendation. Thanks, Jeff.

A. L. Flanagan on November 15, 2008 2:02 PM

@Charles: Ok, so the cave comment was knee-jerk and stupid, I admit it. My only point was that the industry _needs_ technical people who can communicate to and work with non-techies. This to me is a skill that is hard to find among your run of the mill software engineer.

By the way, I have a degree in C.S. and worked as an engineer for 6 years prior to my current position, so I am not "ignorant" to computer science principles. "You're just showing your ignorance by perpetuating some stereotype" that all project managers are non-technical :)

Jon on November 15, 2008 2:08 PM

"Wouldn't an optimal solution fall under the realm of premature optimisation? Shouldn't 2-3% off optimal be fine unless profiling determined otherwise?"

You're confusing micro-optimisations with algorithmic complexity. Selecting the right algorithm is a very different thing from obscuring your code by trying to be clever and fast.

[ICR] on November 15, 2008 2:30 PM

Please revise your comment "NP-complete problems are like hardcore pornography. Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it."

That's plainly wrong. It's quite common that people misunderstand the the details of complexity theory, overestimate their grasp of the topic, and spread their misconceptions. I'm surprised that even some of my most talented developer friends have this trait. And I certainly had it all wrong until grad school.

People who DO


Jason on November 15, 2008 2:43 PM

..oops hit enter too soon.

People who DO know the topic CAN explain it in accessible ways, and DO NOT utter blatant falsities. Some examples of this is:

"This problem, like many others, is NP-complete. Sigh, everything good in life is NP-complete" (meaning, many important problems that we need and would love to solve are, unfortunately, NP-complete. It is ABSOLUTELY TRUE that NP-completeness crops up more than we'd like... and sometimes in unexpected places)

"P ?= NP issue is analogous to why ordinary people can't compose like Mozart, but can still appreciate his work for the genius that it is." (listening and appreciating music is like being a verifier, you can check that a solution to a hard problem is correct, but you can't produce the solutions yourself. An ordinary person could brute force it by continuously composing until something came out that was magnificient... but then that could take a very very long time).

THESE comments are constructive... AND correct.

Jason on November 15, 2008 2:54 PM

@Jonathan: The transformations that convert an instance of one NP-complete problem into another aren't necessarily (and typically aren't) approximation-preserving. Similarly, a heuristic that works well for problem A often won't work well on an instance of problem B transformed into an instance of problem A.

Joe Ganley on November 15, 2008 2:56 PM

Dan --

I agree that M is being a bit of an asshole, but I startled a bit when I read that NP-complete was a "know it when I see it" situation. It's a precise statement of fact, which is contrary to fact. In other words, it's WRONG.

I really don't see what you're trying to say. The characteristics of a problem that make it NP complete are completely and unambiguously defined -- the definition is the same as the definition of NP-complete. What sort of characteristics are you looking for, by way of analogy?

Ens on November 15, 2008 2:57 PM

Problem: people.

Solution: hammer.

dnm on November 15, 2008 3:16 PM

I think Jeff's writings regarding hardware are his best.

You would be kind of naive to not think that there is a psychological plan to Jeff's posts: regular (which may limit his research), self linking, Amazon links, designed to get fan-boys typing and pulse rate up.

Jeff is wrong or incomplete or addressing issues he's not completely familiar with. If you have read this blog for a number of years then you know in your heart that it has become somewhat "unclean".

Steve on November 15, 2008 3:32 PM

For me, spreading ignorance (or bad information due to ignorance) is an issue.
If you are gonna talk about subject X, make sure you know subject X, well enough to talk about it. At least, make sure that you are not making gross errors about subject X. Is it really too much to ask?

It bothers me on both the practical level and as a matter of principle.

I've been working as a professional programmer for years, and I've encountered many without basic scientific background. And that's FINE! Not everyone needs it, or wants it. But then these people read a blog post, and it sounds right, so they believe it. After all, they lack the knowledge to figure out which parts are true and which are don't. That's why they are reading the blog!
And they learn stuff that is WRONG. And then they are going to implement this stuff, and argue about it, and generally BELIEVE that anything having to do with CS is either unpractical or is easily enough learned in 30 minutes of reading.
And then I have to work with these people, and manage them! They have no grasp of how much they simply don't KNOW. And at some point, their knowledge simply ain't gonna cut it. And they are going to argue with me, or write me off as a user of "fancy computer science jargon". I mean, it's just register allocation, right? How hard could it be? It's only BSP tree optimization, let's just check all the options!

So I am trying to combat this ignorance for practical, selfish reasons. Programmers need to understand the problems they work on. They need to understand when they are out of their depths, and its time to hit the books, or call someone who knows. Or reject that project, or bump up the cost and time estimate. At least map out the areas of your understanding, so you'll know when you're on treacherous ground.

The other reason is a matter of principle. Ignorance is pretty bad, and I reject mediocrity for its own sake.
In less fancy terms, if you are writing technical posts, get the technical details RIGHT! Isn't that a given? But apparently I am an asshole for pointing it out :)

For the person who asked, the reasons I read Jeff's posts are two:
a) Jeff often finds interesting links. I sometimes only skim through what he writes, looking for the links.
b) On occasion like this, I have a chance to make my contribution to education. Maybe help some of the people who would otherwise be misinformed, by pointing out mistakes. People that some day I might work with. I'd rather their eyes don't glaze over when I mention fancy computer science jargon.

@Swami: calling me an "anonymous coward" is fine, and perhaps far from the truth, but is also an "ad hominem argument".

M on November 15, 2008 3:37 PM

Want to have some fun with similar problems? Go give Project Euler a shot: http://projecteuler.net/

Some of the last ones will drive you crazy! >.<

Mitch on November 15, 2008 3:48 PM

I don't know if this is considered an NP-Complete problem, but the algorithm I'm still still trying to find a solution to is comparing a data list problem.

Given a table(s) with no primary key or true consistency
---------------------------------------------
Thomas Smith 123 Any St Suite 1 Dayton 45424
Tom Smith 123 Any St Apt 1 Huber Heights 45424
Roberta Smith 123 Any St Apt 2 Huber Heights 45424
Bobby Smith 123 Any St Apt 2 Dayton 45424
ABC Realty 123 Any St Apt 1 Dayton 45424
---------------------------------------------

Try to de-duplicate and/or assign a consistent reference to the data programmatically has always been a nightmare.

The other problem I've run into is capitalization of proper names. Easy to do for a human, but tell a computer to do it.

Jim P. on November 15, 2008 4:07 PM

Sorry, I guess I should have put a smiley face next to this comment to convey my (poor) humour:

"Wouldn't an optimal solution fall under the realm of premature optimisation? Shouldn't 2-3% off optimal be fine unless profiling determined otherwise? [hehe]"

Better?

@M
"But then these people read a blog post, and it sounds right, so they believe it. After all, they lack the knowledge to figure out which parts are true and which are don't. That's why they are reading the blog!"

I'm a non-scientific person, of the type you allude to. I don't read this blog as an insight to all things CS. I read this blog for its entertainment value and as an introduction to areas I may otherwise not have come across. Is Jeff responsible for mine and thousands of others learning habits? It is MY job to research areas that I may be implementing.

Further, Jeff has ALWAYS stated not to take his word verbatim. He uses the 'shock and awe' method. If people are in a position where they need to analyse NP-Complete problems, do you think their sole reference would be here?

Once I got to uni I was angry at my high school teachers for teaching me 'incorrect' definitions. Once I hit third year uni, I was thankful to my high school teachers for not burdening me with unnecessay complexity until it was required.

Finally - a blog post exists for the author to speak about whatever they want to. That people subscribe is their business. I don't believe there is any responsibility that comes with being a successful blogger. You have no right to tell Jeff he may not speak about something. The comments are here to point out inconsistencies. And if the quality of the post is in question (or that of the entire blog) people will stop reading. This, Jeff has also explained previously.

In summary, M - get off your high horse and write a blog of your own, then direct those that require greater understanding to yours.

`Josh on November 15, 2008 5:28 PM

> Please revise your comment "NP-complete problems are like hardcore pornography. Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it." That's plainly wrong.

How is it wrong?

--
http://max.cs.kzoo.edu/~kschultz/CS510/ClassPresentations/NPCartoons.html

The theory of NP-completeness provides many straightforward techniques for proving that a given problem is "just as hard" as a large number of other problems that are widely recognize as being difficult and that have been confounding the experts for years. Armed with these techniques, you might be able to prove that the bandersnatch problem is NP-complete and march into your boss's office and announce: I can't find an efficient algorithm, but neither can all these famous people.

--

I guess I make a distinction between "reducing problem to another known NP-complete problem" and "proving a problem is NP-complete". Apparently we can't *prove* anything is NP-complete, we can only throw a wall of experts at it and see them all fail.

So, a NP-complete problem is not mathematically proven, it's one that no experts can solve, eg, "I'll know it when I see it."

Or more specifically "we'll know it when lots of experts have seen it, tried, and failed."

Is this wrong?

Jeff Atwood on November 15, 2008 5:43 PM

I write video games and I 'cheat' all the time. As long as the results look the same the only thing important to me is how fast can I get those results?

NP - Complete is a mathematical term. I think it should stand for Nobel Prize Completely in the bag! Shame mathematicians can't get one. Instead they have to be happy with a Fields medal (if they’re under 40)

I'm not really bothered about nit picking myself and I believe that absolute answers are seldom necessary. That's the difference between the philosophy of Maths (pure) and realistic Maths (applied). No one cares what the billionth digit of PI is. It would just be fun to be able to know what it is in an instant if we wanted to.

I suspect that some questions themselves are actually flawed. It's a bit like the answer to life the Universe and Everything being 42. Unless the question is understood the answer means nothing.

There is no absolute answer to PI. There never can be in decimal terms. Approximations are what survival is all about.

With the travelling salesman; who says there is 'the quickest route' when there might be 100 routes of exactly the same length. Also where does the salesman start? That is not mentioned in the original question, also the layout of the towns, what if they were all in a straight line? Start at one end and work to the other etc... So I believe the question is flawed in this sense.

I think the words 'cunning approximation' might pacify the opinions of the masses instead of 'cheat'

Like the article though.
:)

Can I also mention that I dislike the idea of the Turing test? Why should a computer try to pass itself off as a human? This is a silly test. A computer is a computer and should do what they are good at.

Do we have a test whereby a human has to pass itself off as a computer? No.

Is there a test whereby an apple has to pass itself off as a banana? No.

Alan Turing may have been a genius, but some times they can come out with some pretty stupid stuff, make mistakes and come up with pointless exercises. If a Philosopher wants to spend the rest of their life pointlessly attempting to extrapilate meaning from an inane and misguided question - let them.

:)

Sci-Fi Si on November 15, 2008 5:47 PM

My favourite NP-complete cheat is Ant Colony Optimisation. It's great: intuitive and interesting, if a little difficult to implement.

Dom on November 15, 2008 5:51 PM

Thought I'd just add that there are still at least 21 Nondeterministic Polynomial questions to solve and plenty of money to make if anyone fancies having a go at them

:)

Sci-Fi Si on November 15, 2008 5:54 PM

Jeff Atwood:
------------------------

I guess I make a distinction between "reducing problem to another known NP-complete problem" and "proving a problem is NP-complete". Apparently we can't *prove* anything is NP-complete, we can only throw a wall of experts at it and see them all fail.

So, a NP-complete problem is not mathematically proven, it's one that no experts can solve, eg, "I'll know it when I see it."

Or more specifically "we'll know it when lots of experts have seen it, tried, and failed."

Is this wrong?
----------

No Jeff, with all due respect, this is completely wrong. There is no mathematical problem called "The NP-Complete Problem." There are certain problems are NP-Complete. You are confusing this with the "P=NP Problem."

You JUST wrote that "we *can't* prove that a problem is NP-Complete" that is completely false. We *very well* know how to prove a problem is NP-Complete. You really need to go through some introductory texts first. It is not like "Well, it's like porn; I know it when I see it."

The issue is with P=NP. Also, it's not "cheating" and its not as if Computer Scientists are trying in vain to find optimal solutions. That's just silly. Saying that "no experts can solve it so I know it when I see it" reveals a fundamental lack of understanding in the theory. If you want to talk about what is "practically done" or something like that then you can say something about how in many cases we proceed as if P != NP; however there is no proof of *that*. However, we can very well prove that problems are NP-Complete.

The bit you quoted:

"The theory of NP-completeness provides many straightforward techniques for proving that a given problem is "just as hard" as a large number of other problems that are widely recognize as being difficult and that have been confounding the experts for years."


All that is saying is that you may find some new problems which are NP-Complete but it is also very likely that those problems can be restated in the form of a "well known" NP-Complete problem (probably one of Karps 21 NP-Complete problems).

Charles on November 15, 2008 6:10 PM

All you who are making out the whole issue of NP-Complete and NP versus P to be just jibber jabber and "practical" answers are what people really care about don't really understand the issues involved. I mean, surely, you must realize if you certain problems (like the RSA problem) are able to be solved efficiently then most cryptographic algorithms are basically useless right? How's that for practical?.... ::shrugs::

Charles on November 15, 2008 6:13 PM

Jeff you have one thing wrong there. You can prove NP-completeness, thats what Cook's Theorem comes down to. What you can't prove is that there is no way to solve an NP-complete problem in polynomial time. We assume there isn't, because a lot of intelligent people have been looking very hard and haven't found one. Now if someone did, every NP-complete problem would still be NP complete, they just wouldnt be "difficult" anymore.

H on November 15, 2008 6:21 PM

Wow. Just wow.
Jeff, why did you post about a subject about which you know so little? Infact, why did you even link to a book you didn't read? I don't understand how you can be in a position to recommend it at all.

Daniel on November 15, 2008 6:56 PM

If you look carefully you will see that I'm an apple.

However, there is no way of working out how long it will take you to see that I am an apple.

I will say between 3 days to 17.2 years depending on how many degrees of Kevin Bacon there are between us

Banana on November 15, 2008 7:06 PM

Jeff:

As many people have pointed out, you're completely wrong with saying nobody knows what NP-complete is. NP-completeness is perfectly clear and actually has a definition (just look on Wikipedia and you'll see it).

P is the class of problems that can be solved in polynomial type. For example, mergesort is O(nlogn) which is polynomial, therefore mergesort is in P.
NP means non-deterministic polynomial time, which means that given a problem and a solution to that problem, you can verify whether the solution is correct for the problem in polynomial time (which means if we have some kind of quantum computer that is capable of doing all possibilities simultaneously, then we can solve any NP problem in polynomial time).

An NP-complete problem is a problem that is more difficult or as difficult as any problem in NP. We have proven many problems are NP-complete - look at SAT and reducibility on Wikipedia and you can see for yourself.

The problem (as some people have pointed out) is proving that P is the same set as NP. Computer scientists haven't been able to prove this right or wrong. Don't get this confused with people not knowing what NP-complete is.

Also keep in mind that NP-complete isn't as hard as it gets. There are complexity classes that are harder than NP like NEXPTIME.

Rob on November 15, 2008 7:54 PM

I don't think Jeff ever claimed to be an expert in the complexity field, or whatever field of CS this falls under... So try to be a little nicer in your smarter-than-thou critiques.

the porn quote---
NP-complete problems are like hardcore pornography. Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it.
----
I don't think Jeff (correct me if wrong) is talking to people who have studied the subject... he's talking to the SOFTWARE ENGINEERS (not all software writers are computer scientists, you egocentric SOBs). He is talking to the people who replied thinking the traveling salesman problem is about a salesman.


A software engineer does not need to be able to memorize a list of infinite manifestations of NP-Complete problems (TSP being an optimal network path vs an actual salesman, vs google maps vs how much wood could a wood chuck chuck). A software engineer need only to have a 'spider-sense' about these problems and the know-how to recognize or prove that his/her problem is NP-Complete.


I think it's funny that this article is under such fire...
Jeff may not be able to write a scholarly article on NP-Completeness... but he knows it when he sees it. Isn't that the point of this?

Chees0rz on November 15, 2008 8:19 PM

@Chees0rz:

[quote]
I don't think Jeff (correct me if wrong) is talking to people who have studied the subject... he's talking to the SOFTWARE ENGINEERS (not all software writers are computer scientists, you egocentric SOBs). He is talking to the people who replied thinking the traveling salesman problem is about a salesman.
[/quote]

Perhaps your definition of a software engineer differs from mine. I'd expect any decent SE to have a fundamental grasp of algorithmic complexity -- to the point that they can discuss problems with different complexity classes and tell me why a given problem is very hard. I'd also expect a software engineer to know that the TSP has very little to do with salesmen.

I suggest that anyone who cannot deal with (or at least be able to learn) such theory is in the wrong profession. Computers are hard. Programming is hard. In order to be proficient programmers we need to understand the nature of that hardness lest we make stupid mistakes that cause our software to fail spectacularly.

I don't think any of this is unrealistic. In Australia at least, Software Engineering is just as rigorous in these areas as Computer Science and nothing being discussed in this article or the posts that follow it are beyond undergraduate level education.

Daniel on November 15, 2008 9:40 PM

For those who try to defend the author, we are not blaming him because he is being brief. We are blaming him because he is WRONG.

NP Completeness is a term that can be easily defined, but he is taking a wrong approach. NP is NOT about a problem being hard. It is about the fast verification of a given solution. So in short, NP completeness only has two aspects:
1) fast verification of a given solution
2) If you can solve one NPC problem quickly, you can solve them all quickly.

I don't think this is that much more complicated that what the author said, but this is the correct explanation.

Also, saying that NPC problems can be spotted by expert programmer is completely and utterly false. One excellent example is primality check. For many many years people believe that to test whether a given integer is prime is NPC, WRONG. With the discovery of AKS primality check in 2002, we know that actually primality check is actually in P. Even the best mathematicans/ programmers wouldn't have known before that.

Freddy on November 15, 2008 10:14 PM

This is my first post to this discussion, since I just saw the post. Well, I am really curious why people are 'defending' Jeff.

Nobody here is trying to show off.

The information presented in the blog post is plain wrong, and therefore misleading. I am not going to discuss the relevant topics here (i.e., NP-completeness and approximation algorithms - called 'cheat's in the blog post). This is not a matter of attack or defense, but maybe an opportunity for Jeff to do some self-criticism.

At various posts, I disagree with Jeff, but most of the posts are on very subjective topics, and since there is no objective metric, argument is time spent for nothing. But disagreement is not bad, I am sure in some of the topics Jeff's are more accurate than my thoughts, and in some vice versa.

The worrying part about the post is not only its inaccuracy, but its reflection on the credibility of the author himself. At most times, when I am ignorant on a topic he posts on such as 3D cards, I took what he said for granted - not totally granted, but with high confidence I should say. This is not true from today on, because today he posted on a topic that I am very knowledgeable about, and the information is wrong, which forces me automatically to degrade his credibility on the other posts from today on.

But I hope that the comments on this blog post will at least force Jeff to reconsider his posting strategy.

Jeff: I think you are a good writer, and I don't have to tell you this, because you make your life from blogging which says enough. In any case, I think you should stick to the following basic principles:

* Do not post about something you don't know as if you have an idea about it: Posting about something you don't know is fine unless you make it absolutely clear you don't know it, want to learn more about it, or asking the reader's for assistance and comments.

* Do not give references to text you haven't read unless you make it clear you haven't read them: If you had read the first 2 chapters of Garey & Johnson, you'd not have written this post.

I suggest that you retract this post, or the misinformation propagated will be your fault. I think one of the readers can make a guest post about the topic, which will be informative to the people who don't know the topic.

tuncay on November 15, 2008 11:19 PM

How much wood would a woodchuck chuck, if a...

Reviculous on November 15, 2008 11:51 PM

Is NP-problems like hardcore pornography? I never thought there were a substitute for that, thanks for now I'll be solving Travelling Salesman problem instead... hope it will get me the same pleasure..

Jack on November 16, 2008 12:28 AM

Genetic algorithm. It has worked well for TSP, thus I've always kept a good GA library (with some timeout stoppers for example no improvement in solution during X seconds, or no exceeding X seconds of total computation ) in order to solve NP-Probs when they have arisen.

Jok on November 16, 2008 2:36 AM

When reading this post, it surprised me how it depends on the circumstances when you're first heard the term "NP-complete". Jeff calls it a shorthand for "incredibly hard", but I always remembered it as "easy". How come?
At university, I wondered whether I should study computational linguistics and attended a first lecture about syntax theories to get a glance about what that subject is like. The problem is to syntactically parse a sentence in natural language using a computer. In that lecture, if a syntax theory is "NP-complete", that meant "good, quickly solvable in polynomial time." "Not NP-complete" meant "bad, takes exponential time to solve". But the worst case (in that lecture some version of a Chomsky syntax theory) was "plainly unimplementable".
My interest for computational linguistics originated in my enthusiasm for text adventures (today called "interactive fiction") many years before (on a Commodore 128), so I'd call "well written textadventures with bewitching graphics" (Magnetic Scrolls) as my favorite NP-complete cheat.

tarnold on November 16, 2008 2:58 AM

While I and some of the other critics here have done a pretty good job covering where Jeff went wrong, I decided to give this a go myself: http://joeganley.com/2008/11/np-completeness-i.html ... I plan to write further installments in the coming weeks on attacking NP-hard problems: heuristics, approximation algorithms, exact algorithms, restricted problem inputs, and general optimization algorithms (like simulated annealing).

Joe Ganley on November 16, 2008 4:46 AM

We know exactly what NP-Complete means: it is defined as the set of decision problems that can be solved in polynomial time on a nondeterministic Turing machine.

P can be solved by a deterministic Turing machine using a polynomial amount of computation time, or polynomial time.

We know exactly how to show that a problem is NP-Complete: to show that with polynomial time on a deterministic Turing machine, we can convert solving that problem into solving all NP-complete problems. Reducing a problem is another word for encoding it, and it's NP-Complete if the encoding work is small.

<b>If you wanted to, you could write a program to convert any np-complete problem to some program that would run in polynomial time on a non-deterministic Turing machine, and your program would run in polynomial time if you did it correctly.</b>

The definitive definition paper
http://www.cs.toronto.edu/~sacook/homepage/1971.pdf.gz

I was in a class where we did this, making the transition from 3-SAT to a nondeterministic Turing machine notation. It was boring.

John the Statistician on November 16, 2008 4:51 AM

Can't you all see Jeff is just trying to provoke you? After all he's quoting a cartoon as evidence in his comment.

I think this is a good strategy...
1) pretend you really don't get the point
2) have a buch of foolproof explanations attached to your blog-post for free.

Well, have fun.

Alex on November 16, 2008 5:02 AM

@Alex: just because Jeff has no regard for mathematical truth doesn't mean that it isn't worth our time if we do. (I don't mean this negatively, necessarily. I think it's a natural outcome of combining a practical, only-when-I-need it outlook with the strong opinions weakly held position. Just because it's bullshit in the Harry Frankfurt sense, doesn't mean it isn't a fine way to go about things.)

John the Statistician on November 16, 2008 5:10 AM

Writing the optimal blog post would have taken a very very long time.

Nice approximation, Jeff.

zing on November 16, 2008 5:25 AM

Poor old Jeff. He tries, he really does.

Well after the beating he's taking by various over nerds I feel I ought to make him feel better.

Ok here goes.

I had absolutely no idea what you were talking about, and even less about what you're detractors are talking about.

You're not the only guy who doesn't understand stuff Jeff

PixelGnome on November 16, 2008 5:31 AM

I think what just happened here is called market segmentation.
Some part of Jeff's readers have a course in computability theory under their belt, so they are rightfully angry at the informal definition of a very formal entity.
The other part, whose lives were full without using the term "NP-complete", may have gotten a glimpse of this interesting notion.
Kudos for this, Jeff.
Here's Dorit Hochbaum's book about approximations (not hacks) for NP-hard problems:
http://www.ieor.berkeley.edu/~hochbaum/html/book-aanp.html
(I guess this is more for the first crowd...)

Yuval on November 16, 2008 6:11 AM

Can someone solve this?

a=b
so, a^2 = a*b
so, a^2 - b^2 = (a*b) - b^2
factor: (a+b)(a-b) = b*(a-b)
cancel: (a+b) = b
so, 2b = b
so, 2 = 1

No Googling!

Steve on November 16, 2008 6:50 AM

division by zero exception

Alex on November 16, 2008 7:05 AM

Umm, you definitely CAN define that a problem is NP hard or complete. It's called a proof. If you can prove that a problem is NP*, you can thus prove to yourself (or others) that the actual solution is intractable. So you can stop trying to find one, and focus on
a.) redefining the problem
b.) using approximation algorithms

Chris on November 16, 2008 7:23 AM

Travelling SalesPerson is one of my favorite examples of why you should spend some time refining the requirements of you problem when you run into something that might be NP-Complete. Not only is the TSP problem easy to approximate, only the general case of TSP is NP-Complete. TSP when all of the points are co-planar (a pretty common case on the planet Earth) is *not* NP-Complete.

Sean Reilly on November 16, 2008 9:57 AM

Jeff Atwood: "I guess I make a distinction between "reducing problem to another known NP-complete problem" and "proving a problem is NP-complete". Apparently we can't *prove* anything is NP-complete, we can only throw a wall of experts at it and see them all fail."

This is incorrect - the Cook-Levin theorem (http://en.wikipedia.org/wiki/Cook%27s_theorem) proves that the problem 3-SAT is NP-Complete. My understanding of the history is that Cook published in 1971; Levin's work was done around the same time, but not published until a bit later, and it didn't become so well known in the West, as Levin is Russian. For that reason, the result is often known as Cook's theorem.

Since that time, thousands of problems have been proved NP-complete, usually using a problem known to be NP-Complete, like 3-SAT, and a reduction. Garey and Johnson contains several hundred such problems, and many more are known today.

What is not yet known is how to prove that NP-Complete problem cannot be solved in polynomial time. If that could proved, it would solve the famous P vs NP problem.

(That aside, thanks for your blog, which I greatly enjoy.)

Michael Nielsen on November 16, 2008 9:57 AM

I very seldom see much acknowledgement that there are *lots* of complexity classes far far higher than NP (and thus problems much much harder than NP-complete ones), and that in practice the limits of worst-case tractibility are substantially lower than NP: algorithms which run slower than O(n^2) are seldom tolerable in practice.

In many cases there's no such thing as an "approximate" solution: you can approximate a "best" path, but figuring out whether or not there is *any* path satisfying some set of conditions is all or nothing. In those cases, the best approach is to set aside the worst case and focus on realistic cases. There are plenty of very hard problems (NExpTime and higher) which can be solved in time close to O(n log n) on the kinds of data that tend to be encountered in practice.

If I were a manager and programmer came to me saying "this problem is in such a high complexity class it would be impossible to solve!" I'd encourage the programmer to identify patterns in the type of data we expect that could be exploited for optimization. A formal definition of this "tractable fragment" of the original problem may sometimes be difficult to codify, but that's beside the point if you can solve the problems you encounter. It's easy to phrase any easy problem as a subset of a much harder problem; the trick is to identify the truly necessary bits of the problem and see how hard that part is on its own.

The criticism I'd level at high-complexity research is that academics can dwell on the really hard problems (because they're interesting) and skip quickly over simplifications (because they're trivial and not worthy of study).

Rob Shearer on November 16, 2008 10:35 AM

Having scanned over the other comments, it seems that a lot of people are complaining about the article again. Well, I don’t. It might not be the most accurate description but it brings the point across (and I just like the analogy). Plus, some of the comments are sooo much worse. Consider the very first one: “… because … nearly any other NP problem can be reduced to [3-SAT].” This reveals a complete misundestanding of NP-completeness and the concept of reduction.

With that out of the way, I want to answer your question. I’m undecided between three very different cheats. The first uses cutting planes to literally “cut away” parts of the (mathematical formulation of the) problem to make it easier. I like it of its practical importance. Take TSP. Nowadays, this problem has very good solvers that use this approach (often combined with branch-and-bound to yield branch-and-cut), combined with raw computation power, to tackle even extremely large problem instances. For instance, CONCORDE has found the exact solution for the 24,978 cities of Sweden.

The second approach, Lagrangean relaxation just has to have a place in the list because of its elegant mathematical derivation. By applying a Lagrangean multiplier (that can be found easily) to one term of the problem, the overall runtime of the problem can be drastically decreased.

The last cheat is ant colony optimization. The underlying analogy is just fascinating. Also, implementing this solution is very easy and just plain fun. With the right sets of parameters, this heuristic often leads to very good results. Unfortunately, there’s no guarantee for good results which is simply unacceptable for many problems.

konrad on November 16, 2008 10:55 AM

Hey Now Jeff,
Those cartoons are really funny.
Coding Horror Fan,
Catto

Catto on November 16, 2008 11:13 AM

-----
So, a NP-complete problem is not mathematically proven, it's one that no experts can solve, eg, "I'll know it when I see it."
-----

This is so wrong it hurts.

To add something constructive to the Jambalaya, I should note that not all "high complexity - won't solve in a million-billion years for a large problem" problems are NP-Complete...

... there are problems that are just as hard, or harder, and don't have the decency of being NP-Complete, but still require the use of approximations and heuristics.

Eric Londaits on November 16, 2008 11:56 AM

Why can't I vote-up these answers? :p

Dean on November 16, 2008 2:41 PM

NP = Not Possible [in linear time]

BTW, Jeff, I don't know if you would read my comment, but since your captcha is always orange, and I never see any spam in any of your blog, how did you manage that?


Kevin Le on November 16, 2008 3:30 PM

@Steve,

You asked:

>>Can someone solve this?

>>a=b
>>so, a^2 = a*b
>>so, a^2 - b^2 = (a*b) - b^2
>>factor: (a+b)(a-b) = b*(a-b)
>>cancel: (a+b) = b
>>so, 2b = b
>>so, 2 = 1

There are many variations to this type of puzzles. I knew or rather learned from 7th grade that all of them either involve (1) dividing by 0 or (2) Take a square root and ignore the fact the it's rather the absolute values that are equal.

Kevin Le on November 16, 2008 3:36 PM

-----
NP = Not Possible [in linear time]
-----

Not quite... but in line with the blog post.

NP = Nondeterministic polynomial

it means that it can be solved with a polynomial order algorithm in a Non-deterministic Turing Machine. A non-deterministic Turing Machine is a Turing Machine which can take many paths of conditional execution at the same time.

Eric Londaits on November 16, 2008 4:09 PM

My favorite NP-problem and hack must be the Quake 3 engine's sqrt() function.

http://www.codemaestro.com/reviews/9

Martijn Laarman on November 16, 2008 4:13 PM

Jeff: "Unfortunately, not all NP-complete problems have good approximations."

The very definition of NP-complete shows this to be false; that a problem is NP-complete means all problems in NP can be reduced to this problem. If you have an algorithm for one NP-complete problem, you have an algorithm for all of NP.

porges on November 16, 2008 4:21 PM

I'll answer the original question:

- Simulated Annealing
- Great Deluge
- Genetic Algorithms

Steve Moyer on November 16, 2008 5:13 PM

Hey Jeff,
Cool cartoon.

No, I do not take what you read as gospel, anyone who takes a single opinion from the interweb, and embarks upon a course of action based upon that, deserves whatever they get.

I read this blog because I find it entertaining, and sometimes it prompts me to go and learn something new, or revisit something that I had long since forgotten.

I am not going to defend you as a mathematician or computer scientist, as that is not the point of this blog. Had you posted this in a scientific journal, then it may well be appropriate for others to take such vocal umbrage at your lack of academic rigour.

Having inherited a few systems from mathematicians and academics, and suffered the horrors of making them work in the real world, that is not a perspective that I seek unless I have to.

This is edutainment, has never pretended to be anything else, and there is nothing wrong with that.

Like I said, cool cartoon. Works for a code monkey like me.

seanb on November 16, 2008 6:23 PM

It's not about rigor. It's completely wrong. You are dumber as a result of having read it.

seeuo on November 16, 2008 8:32 PM

Is it just me, or has this blog turned into ads for books lately?

Grom on November 16, 2008 9:26 PM

I think we're missing the main definition in this post:

"NP-complete problems are like hardcore pornography. Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it."

So, we must define "hardcore pornography" first, and then discuss if it's or not NP-Complete, isn't it?

alberto on November 16, 2008 11:22 PM

These approximations to NP-Complete problem solutions, i'd be surprised if the quality of the approximation didn't degrade significantly with the input size - to the point perhaps where they're not useful anymore.

I would expect this but i don't remember seeing it confirmed anywhere. I mean, otherwise we'd have a probabilistic polynomial-time algorithm for solving NP-Complete problems (and i don't think we do because last i checked, 5 secs ago on wikipedia, the relationship between BPP and NP was another unsolved problem). In fact i think i remember seeing that likely BPP and P are equivalent (something about derandomization), so since probably P != NP, BPP != NP and we can't have a good approximating algorithm (for all input sizes).

Meaning, these approximating algorithms are no good for large input sizes, but please correct me if i'm completely wrong.

Bobby on November 16, 2008 11:26 PM

Not that I think some of the criticisms of this article aren't warranted, but to those of you who keep screaming about the formal definition of NP:

While Jeff's definition is certainly flawed in some respects, citing the formal definition of NP would be next to useless in the context of this blog. I challenge you to explain the practical implications of NP's formal definition to a lay-programmer in a way that you think is more "correct" than simply saying "known algorithms for finding optimal solutions to NP problems have extremely high computational complexity."

I mean, saying here that NP is about "quick verification of solutions and reducibility" is the same as telling a layperson that e=mc^2 means "energy equals mass times speed of light squared." No *practical* insight about the matter at hand is given by either definition without a lot of heavily mathematical and scientific pipelaying.

People usually appreciate having things explained to them in a way that doesn't make them feel stupid. I make such explanations all the time in my work. However, when I describe things to people using oversimplified abstractions, I make it a point to tell them that it is not *strictly correct* to think of it this way, and that the *truth* of the matter is much more complicated. But most of the time they are satisifed with, and can make much practical use of the simpler version plus caveat.

I don't think most of the serious commenters that are being identified as "defending" Jeff are defending his factual errors here as much as they are recoiling to the somewhat hostile and excessive pedantry being displayed. Hard and exact science is very important and (to me) fascinating. But it is not useful to shove *unnessecarily* large chunks of background theory down the throats of people who just want to get their work done, when some slighly innacurate but much simpler abstraction will almost always suffice for them.

Also, I disagree that your average software developer can't be expected to spot NP problems. In practice they are fairly easily spotted because almost ALL of them encountered in the wild are more or less isomorphic to some other known NP problem, like the Knapsack or TSP. Any good developer should be familiar with the relatively short list of "infamous" NP problems, and *should* be able to tell if the problem at his feet is just a variation on one of those.

Kyle S on November 17, 2008 12:21 AM

As an example, writing a non trivial web site that displays exactly the same in all major browser is NP-complete, and any semi experienced web developer have learned its way to the 3% from optimal solution.

jwickers on November 17, 2008 2:00 AM

I came up with a generic solution for the TSP a couple of years ago. Here's a hint... fractals. And lagrange polynomials. I leave the rest as an exercise for the reader.

Peter Maxwell on November 17, 2008 3:14 AM

My favorite algorithm is the Ant Colony Optimization, actually a solution to the traveling sales man. It is in fact a brute force algorithm and still it's better than real brute force. It starts as pure brute force but already after several iterations some solutions will turn out as being so sub-optimal, that the algorithm will hardly consider them anymore and thus the algorithm finds a solution much faster than real brute force. Real brute force would try all solutions, even if a solution seems to be very suboptimal at first, it might turn out to be an optimal solution in the long run. This is however very unlikely when usang ACO. Solutions that look sub-optimal in the very beginning are usually sub-optimal. If some ants are already back home while others are still on the way, the route used by the other ones is definitely worse than the one of the ants already back home.

Mecki on November 17, 2008 4:06 AM

Jeff is actually right here, and the CS types are barking up the wrong tree. This discussion is about the _software engineering_ term NP-complete, not the mathematical term of the same name. As he says:

--
Have you ever heard a software engineer refer to a problem as "NP-complete"? That's fancy computer science jargon shorthand for "incredibly hard":
--

Lots of other fields have similar confusion caused by a shared use of a word or phrase - for example, both religious studies and architecture use _postmodern_, for completely different things that some people tie themselves into knots trying to justify as essentially the same.

It might avoid future similar confusion if developers were to use the more mathematically correct term _superpolynomial_ when they want to say 'really hard, so we shouldn't do it' without sounding lazy. The only downside is other developers won't understand it they way they would the less mathematically correct term.

soru on November 17, 2008 4:08 AM

> sometimes coming up with clever cheats can be more interesting than searching in vain for the perfect solution

Word.

Smart: understanding and recognising an NP-Complete problem.

Gets things done: uses a suitable approximation.

Paul D. Waite on November 17, 2008 5:11 AM

"Unfortunately, not all NP-complete problems have good approximations."

Err, wait a minute. I thought one of the characteristics that made a particular problem NP-Complete as opposed to NP-Hard was that all other NP-Hard problems could be reduced to it. Therefore, wouldn't it stand that if you have a good approximation for the one NP-Complete problem, you'd have a good approximation for them all?

I have to admit I fell into the trap of refering to the TSP as NP-Complete, but only because it's the one that we're always taught.

"Jeff is actually right here, and the CS types are barking up the wrong tree. This discussion is about the _software engineering_ term NP-complete, not the mathematical term of the same name"

No, Jeff is not right here. NP-Completeness is a shining example of where the theory and the practice meet. It defines the limitations on what can be done with a computer. It's more than knowing that a problem is hard, it's knowing why it's hard, and what you can do to work around it. The software engineering "definition" is incredibly misleading and simplistic, as it ignores the fundamental point that all NP-Hard problems can be reduced to an NP-Complete problem. That's very powerful, because that means that if you're trying to come up with a way to get a result from something you know to be NP-Hard, you can reduce it to an NP-Complete problem and approximate that.

Delmania on November 17, 2008 5:21 AM

I can see both sides of the argument here. Jeff has at least brought to light the idea of NP-completeness to many people who otherwise wouldn't have even heard of the concept. Ok, there are inaccuracies in his post but hopefully those that are interested will follow it up. However, that is where the problem lies. Many people are not that great at thinking for themselves and don't check out the facts that they have been given are actually correct. I see this every day. They take a snippet from a blog and think it's gospel. The problem seems therefore to be not Jeff's post per say, but those that don't expand and "complete" their knowledge of the subject. Anyway, Jeff's post inspired me to write a post on NP-completeness in a little more depth over at own blog (http://www.equivalence.co.uk/), hopefully this may be of use to Jeff and others.

Gregg on November 17, 2008 6:00 AM

Jeff, I think you went too far this time.

This is not the "We have no idea about IT, but we want to look knowledgable" blog. This is "programming and human factors" blog.

Couldn't you write the post without all those childish mistakes?

NP-complete is not another name for "very hard". No.

Imagine, there are NP and NP-complete classes of problems - and both of them are "very hard". Know the difference between them? If not, please do not write about computer science.

Jakub Anderwald on November 17, 2008 6:34 AM

In cryptography, getting the right answer is all that's important, and getting an approximation is next to useless. Take an example of creating a parallel cryptographic hash function (like md5 or sha-1) where you hash small chunks, but then add all the outputs as large integers, and hash this sum one more time. This is essentially the subset sum problem. There are many ways (through dynamic programming and such) to find approximate solutions, but you haven't compromised the cryptography unless you find an exact match. Getting it within 2-3% of the correct solution only reduces your search space from 2^80, in the case of sha-1, down to 2^74 or so, which is still an intractable problem.

Matt Ball on November 17, 2008 6:37 AM

It's actually NP-hard, not NP-complete. There's a distinct and important difference between the two. It doesn't look like you bothered to even read the Wikipedia article you linked.

Jason on November 17, 2008 7:00 AM

Interesting to see you bring up the bin-packing problem. I used variants of the known implementations to create an output image that tried to have the least possible area for a given image set.

Theres source code on my site with it in 12/12/07, I was smitten with finding variants of different sizes and outputs. Really fun stuff.

Jake Heidt on November 17, 2008 7:08 AM

great post jeff. but the alt text on the xkcd img should really be fixed to read as on the actual cartoon page.

Sean on November 17, 2008 7:35 AM

'as it ignores the fundamental point that all NP-Hard problems can be reduced to an NP-Complete problem'

Actually, not true: many 'Np-hard' (s/w eng definition) are, in strict mathematical terms, exponential, or just polynomial with impractical coefficients. Outside military cryptography, there is rarely a pragmatic difference between 'would take a CPU-week' and 'if you turned the Sun into computronium, would still take longer than the heat death of the universe'.

Either is too long to wait for a web page to refresh.

To a zoologist it's Bos Taurus, to a farmer it's a cow. The more precise terms are sometimes useful, but if you want to talk to a farmer, you don't have to use their language, but you'd better understand it.

soru on November 17, 2008 8:10 AM

By the amount of scrolling I had to get to the bottom of the page, can I safely assume that writing a blog post about NP-Complete that doesn't freak people out is NP-Complete?

Neil C. Obremski on November 17, 2008 8:48 AM

Waitaminute...! A reference to a traveling salesman, and not one crack about a farmer's daughter?

Incredible.
Improbable.
Infinitely improbable.

Jon on November 17, 2008 9:03 AM

It's not cheating it's engineering. That's the diff betwixt science and engineering - otherwise we'd never get anything done.

iFubesi on November 17, 2008 9:33 AM

Remove money and no salesmen are needed.

Silvercode on November 17, 2008 9:42 AM

I'm beginning to think that pagination on Stack Overflow is an NP-complete problem, because it's still broken after all this time!

See <a href="http://stackoverflow.com/questions/63241/what-is-the-strangest-programming-language-you-have-used">http://stackoverflow.com/questions/63241/what-is-the-strangest-programming-language-you-have-used</a> and notice the two page fours. And yes, I have logged this on UserVoice: <a href="http://stackoverflow.uservoice.com/pages/general/suggestions/65993">http://stackoverflow.uservoice.com/pages/general/suggestions/65993</a>

John Topley on November 17, 2008 9:49 AM

> Apparently we can't *prove* anything is NP-complete

What I think you mean to say is: We can't *prove* NP-complete problems are "hard".

"NP-complete" is not a synonym for difficult. Some people think it is, and you may hear it used that way, but that does not mean you should encourage it. We already have a perfectly cromulent word for "difficult", and that word is "difficult".

Michael G on November 17, 2008 10:41 AM

I prefer genetic algorithms for NP-hard problems, that are transformable into a genetic algorithm (i.e. TSP).

MTZ on November 17, 2008 12:10 PM

NP-complete?

That's fancy computer science jargon shorthand for "incredibly time-consuming", or "incredibly hard to solve quickly", or more specifically "incredibly time-consuming to solve, and pretty quick to verify the solution"

Matt V on November 17, 2008 12:24 PM

Oh please.

His definition is sufficient (and sarcastically enjoyable) for those who are trying to solve these problems in the real world.

The differences are irrelevant to dealing with them in the real world. They are really hard... too hard to solve in a reasonable amount of time.

That's all I need to know. For the rest, who cares?

Some people take these things too seriously. For the intended audience of this blog (read: developers, not computer scientists), this is an excellent definition.

Practicality on November 17, 2008 12:54 PM

"That's fancy computer science jargon shorthand for "incredibly time-consuming", or "incredibly hard to solve quickly", or more specifically "incredibly time-consuming to solve, and pretty quick to verify the solution"

You got it backwards. For most NP-Hard problems, devising a solution is easy. It's verifying the solution is correct that is difficult.

"His definition is sufficient (and sarcastically enjoyable) for those who are trying to solve these problems in the real world. "

The ever wonderful "real world" argument tha plagues software development. It's not sufficient. I honestly don't really care what the software engineering defintion is, it's incorrect and misleading. The reduction componenet of the definition is crucial because it can help you devise approximations to solve the problem. After all, "it's too hard" doesn't always work.

Delmania on November 17, 2008 1:02 PM

As someone with an advanced degree in CS, I don't think his explanation was all that bad.

Compare it with Wikipedia's definition, which is effectively infinitely recursive:

"a class of problems having two properties:
* Any given solution to the problem can be verified quickly (in polynomial time); the set of problems with this property is called NP.
* If the problem can be solved quickly (in polynomial time), then so can every problem in NP."

Not nearly as useful to the layman, IMHO. I do believe there's a better definition involving a combination of Turing machines and set theory, but if you don't already know about Turing machines that wouldn't be particularly useful to you either, without a twelve paragraph exposition on the subject.

T.E.D. on November 17, 2008 1:15 PM

How about having a new classification, for those us for whom mathematical proofs don't play such a large part in our working lives, so as not to offend our more erudite colleagues:
FK Hard - you'll know it when you see it

seanb on November 17, 2008 1:23 PM

Sigh. So, to those who want to fight over the correct definition. What do you do for a living?

I would really like to know if anyone who is actual "front line" developer (not a researcher or comp sci teacher) has ever found the technical difference to be useful.

I have been in the field for quite a while now, and most everything you are taught in CS is absolutely useless.

In addition, I would like to know if any of you have been out of college for more than two years.

Prove me wrong. Really. I would like to know if there is such a case where proving how hard a problem is would actually be useful in a business environment. (Not a theoretical case, a real one, that actually happened).

Practicality on November 17, 2008 1:27 PM

Is this blog NP Complete?

OG on November 17, 2008 1:55 PM

Practicality --

Embedded system to automate paint designs on walls efficiently. I won't bore you with the finer details (and there were many), but it was trivially reproducible to a travelling salesman problem. As such, I was able to go through the existing body of work and determine which well-known algorithms had sufficiently low complexity to work given my constraints (which translate to constant-factors) before trying to program every bloody one of them in assembly and then watching to see if it moved right away. It was critical and saved countless hours. And I can only do it because we know with true rigour that the TSP is NP-complete.

Now I do mostly systems programming in C++ on desktop computers in performance-critical applications working for, let's keep this somewhat anonymous, one of the top-5 software development companies as ranked by revenue. NEXPTIME isn't an important term here, but a basic idea of complexity analysis that's a little more sophisticated than "I know it when I see it" is and has been absolutely necessary. More sophisticated and more not completely wrong.

You're right though, I'm not long out of school -- ~1.5 years. However, I didn't take CS, I took Engineering (not Software or Computer or Electrical Engineering), and while we briefly covered big-O notation analysis we didn't really deal with P=NP. I picked this up partly on my own time and partly on the job.

I don't mind if you don't know these facts, particularly if it isn't in the least bit relevant to what *you* do. But I do mind complete falsehoods being paraded around as facts. Even that wasn't so bad, but I'm aggravated to see people "defending" the error (including Jeff!? I sure hope he's just confused and asking for clarification instead of claiming it's not wrong). It's an error. We all make errors, even popular bloggers. That doesn't make it not an error.

Ens on November 17, 2008 1:57 PM

@Ens

Neat. Thank you for the informative story. Believe it or not, your real life explanation of how you solved that problem would fascinate me, but that's what I am looking for :).

However, I would argue that the fact you looked it up yourself would validate that we don't need to know this information ahead of time to enter the field.

Also, I would agree that, technically, Jeff made an error. He should say most of us can't define it (or don't care to), rather than nobody can.

However, I do think this a minor and picky error, when the intent of the post was to say that when you encounter such a problem, you need to find a workaround.

I think the harsh insults directed at Jeff for making a technical error of "very few" versus "nobody" are far worse than making the error itself.

Practicality on November 17, 2008 2:15 PM

@Practicality: I've been out of school for 14 years, and spent the first 10 of those years working on problems (many of them NP-hard) of optimizing the layout of computer chips. Making real software that is sold to and used by chip designers at semiconductor companies like Intel, AMD, nVidia, and the like. You doubtless own many pieces of electronics that were designed using software that I wrote. (I'm just making the point that I am neither a researcher nor an academic, but a working software engineer.) Being able to determine which of the problems I encounter is NP-hard is critical to doing this well. Often, trying to determine that a problem is NP-hard when it isn't provides the critical insight to solving it. Only a really sorry engineer writes a heuristic for a problem that can in fact be solved efficiently, just because he doesn't know the difference between "NP-hard" and "I can't easily figure out how to solve this."

I don't know what kind of work you do, but I use much of what I learned in CS every day, and have done so in every job I've ever held.

Joe Ganley on November 17, 2008 5:01 PM

It seems like all the people defending Jeff here are also trying to defend themselves for not knowing a basic CS term.

I really, really hope that I don't have to do any significant amount of work with any of the people who are saying it's of no practical significance "in the real world".

I've been out of college more than 2 years, and I use a very large proportion of the CS that I learned in my job. I'd leave the industry if I was just cranking out boilerplate code all day - and pretty much any time you get ahead of boilerplate there's usually something helpful from CS.

While you can look up stuff and pick it up if you go a) you'll pick it up faster if you have the basics of CS under your belt to start with and b) you'll be able to make better choices various data structures and algorithms throughout the less challenging work that improve your code.

There is no fight over the correct definition - there is one correct definition and Jeff wasn't really close. I think people might be complaining because they've mentally filed this blog in the 'reliable information' and they're begrudging the mental work to re-file it elsewhere.

My personal reason for adding my voice to the fray is that I think having a highly visible, incorrect and misleading piece of information out in the wild is really going to hurt those that are going to try to pick these things up on their own.

Dave on November 17, 2008 5:06 PM

Jesus people, he's not writing a university text-book here, it's a blog! Lay off a bit, or at the very minimum be more constructive with your criticism.

I don't think introducing someone to a new, interesting subject is a bad thing in any way. And maybe even a little mis-information (not that it is even there!) will weed out the ones not willing to research the topic on their own.

I also find it amusing that people will actually spend the time to read the post and then berate the author. If you don't like what he has to say, the solution is simple, DON'T READ IT! It's not an NP-Complete problem!

Kory on November 17, 2008 6:40 PM

I think saying a CS degree has no practical value is a big stretch. But, then again, having a CS degree doesn't necessarily make you a good programmer either!

dismountSoapbox();

Kory on November 17, 2008 6:46 PM

"NP-complete problems are like hardcore pornography".

Jeff, I really think you need to get out more.

Kramii on November 18, 2008 2:05 AM

Reminds me of this:
http://xkcd.com/386/

MyKey_ on November 18, 2008 3:42 AM

> I think what Jeff means is that nobody can define what makes a particular problem belong to NP, but not P.

That is what I was trying to say. Poorly, obviously. I have updated the post.

I still say the definition of NP-complete is awfully.. *conveniently* .. recursive, as T.E.D. mentioned.

I also agree with this particular sentence in the Wikipedia (http://en.wikipedia.org/wiki/NP-hard) article: "The NP-family naming system is confusing".

And how.

Jeff Atwood on November 18, 2008 5:35 AM

@T.E.D.:

The definition on Wikipedia is not recursive. I think the confusion may arise because "NP" and "NP-Complete" mean different things, and the definition of "NP-Complete" depends on the definition of "NP".

A problem is *NP* if (roughly speaking) it's quick to check whether a proposed solution is correct. Note that this includes both "easy" and "hard" problems.

This definition of NP is not recursive. You can decide whether a problem is NP independent of other problems.

A problem is *NP-Complete* if it's both *NP* and if finding an "easy" algorithm to solve it would automatically give you an easy algorithm to all *NP* problems.

This definition of NP-Complete depends on the definition of NP, but it does not depend on itself.

Weeble on November 18, 2008 5:53 AM

Another reason it sounds recursive is that generally the way you prove a problem is NP-Complete is by showing that it's NP and then reducing another NP-Complete problem to it. So how do you prove that anything is NP-Complete in the first place? The answer is that this is only the easy way. The hard way is to prove that *every* problem in NP can be reduced to your problem. That's what the Cook-Levin theorem did for the SAT problem. Once you know *one* NP-Complete problem, it becomes easier to prove other problems are NP-Complete.

http://en.wikipedia.org/wiki/Cook–Levin_theorem

Weeble on November 18, 2008 6:01 AM

> A problem is NP-complete if it is both [quick to check whether a proposed solution is correct] and finding an [efficient] algorithm to solve it would automatically give you an [efficent] algorithm to solve *all* [quick to check whether a proposed solution is correct] problems.

Isn't this.. er.. still.. partically recursive? We know it because solving one would solve all the others, which in turn solves the others, which in turn solves the others.. ad infinitum?

I have to say I find this topic *incredibly* confusing, even after reading all the above comments (many of which were quite helpful, thank you) and the relevant wikipedia articles.

Jeff Atwood on November 18, 2008 6:19 AM

"Isn't this.. er.. still.. partically recursive?"

The definition involves infinite stuff (the set of all NP problems), but it is not recursive. The definition of NP stands alone, and the definition of NP-Complete depends only on the definition of NP.

The presence of infinite stuff isn't as big a deal as you'd think. You can prove that adding together any two odd numbers will give you an even number, even though there are an infinite number of odd numbers. I don't really understand the Cook-Levin theorem myself, but I'm happy to know that it shows that all of the infinite possible problems (including all the ones nobody has thought of yet) in NP can be converted to an instance of the SAT problem. That might seem surprising, but it's not all that different to proving something about all the infinite possible odd numbers. Then, with that in hand, you can go on to show that other stuff is NP-Complete, because if you can convert SAT to your problem, you know you can convert *any* problem in NP to SAT and then convert it to your problem.

Weeble on November 18, 2008 6:51 AM

Ok, not enough time to read all the comments that don't have the answer to the question, "what's your favorite cheat?" Mine: refactoring a hard problem such that I can avoid the <part I want to avoid> as much as possible. I imagine that there is a CS term for it, but I majored in English :) Clearly a lot of the commenters are smart--probably smarter and certainly more educated (in this field, anyway) than I; how does the flamage help anyone "get things done"?

That One Guy on November 18, 2008 7:34 AM

Im sorry Jeff but your update is still somewhat wrong. You should abandon the thought NP-complete == hard-problem . It should be more like NP-comple => we think it's a hard problem.

I'll try to give an understandable and maybe partly informal definition of NP-complete below, bear with me (or ignore it, if you think it's not interesting).

NP-completeness is defined by two criteria:

1. The problem is in NP (aka the problem can be solved in polynomial time by a non-deterministic Turing machine. (Before I get flamed for mentioning Turing Machines: Regarding some previous posts by Jeff I think he knows what I'm talking about.)
Note this is trivially true for all problems in P (those which can be solved in polynomial time by a deterministic Turing machine). So P is a part of NP.

2. The problem L is one of the "hardest problems in NP".
Let's say we have some magic box, which accepts the input for our problem L and gives us the correct output, taking only a fixed amount of time, independent of the input. Now to prove, that our problem L is "the hardest", we have to show that we can solve all other problems in NP in polynomial time using our magic box (This is known as Reduction).
We do this by giving an algorithm, which transforms the input for other problems into input for our problem. Now showing something for ALL problems sounds difficult, but it can be done. (If you want to know how look up the Cook-Levin Theorem).

If we have one problem L', which is known to be "the hardest" it is enough to show that our problem L is equally hard. This means we only have to give an algorithm which solves L in polynomial time using a magic box which solves L'.


So, the Cook-Levin Theorem shows that we can transform every problem in NP into SAT. Going on from there we show that we can transform SAT into other problems, like Traveling Salesman, Bin Packing, etc. take your pick.

We think that we can't solve this problems in polynomial time on a non-deterministic Turing Machine (or on your average computer), but we can't proof that and if someone would come up with a way to solve one of these problems in polynomial time, we could solve all of them in polynomial time (we could just apply the algorithms, that we defined when we proved their NP-completeness).

However, even if someone comes up with such a solution (which is highly unlikely) the definition of NP-complete I gave above still applies to all of the problems, so they remain NP-complete. We only revise the deduction NP => difficult.

H on November 18, 2008 10:00 AM

Wow. Jeff Atwood, master troll!

I think I'm going to add Jeff's explanation of "NP-complete" to Conservapedia. http://www.conservapedia.com/Public-key_encryption

It's a pretty good imitation of a "pop science" explanation of CS terms, but with glaring errors that stand out to anyone with an undergrad CS education. ("Nobody really knows what NP-completeness means... Problems are only NP-complete because we can't prove they aren't.")

Maybe Jeff can explain one-way hash functions next. ("A one-way hash function is a way of implementing a read-only hash table, similar to Ada's 'const final' attributes.") Or branch prediction. ("Branch prediction is the ability of a programming language to conditionalize certain instructions on flags in the hardware's ISR register.") Or garbage collection. ("Languages such as Java are only considered garbage-collected because we can't prove they aren't.")

Anonymous Cowherd on November 18, 2008 10:54 AM

(BTW, please don't "fix" the Conservapedia articles. The "errors" are intentional parody.)

Anonymous Cowherd on November 18, 2008 10:55 AM

> please don't "fix" the Conservapedia articles.

Well, except if *I* "fix" them, they'll be correct, no?

I've redacted the offensive sentence from the post, with explanation.

In all honesty, despite reading all the above comments and the wikipedia entries, I still don't quite understand the definition of NP-complete other than as "we'll know it when lots of experts have seen it, tried, and failed."

There has to be a simpler definition out there somewhere. I talked to Joel Spolsky about this and he had a pretty good analogy that might work; I urged him to write a blog post about it.

Jeff Atwood on November 18, 2008 3:28 PM

Jeff -

NP-complete is actually pretty simple, so I bet whatever sources you have been reading have just over complicated it.

'NP' is a class of problems - that is a set of problems that all share a particular property. Here, it's the property that they can all be solved in polynomial time by a Turing machine (or equivalent computer) that can - every time there's a choice point in the algorithm to solve it - magically guess the right decision to make.

So rather than have your Turing machine deterministically make the same decision at every choice point, our new machine somehow knows which is the right way to go.

That's what the NP stands for - 'non-deterministic polynomial'.

Imagine you're trying to solve the Travelling Salesman problem by brute force. Enumeration of all those graphs takes at least O(2!) time. However, our non-deterministic Turing machine magically guesses which is the right way through the search space, and can construct the circuit in polynomial time - by choosing the right edge (again, magically) at every choice point.

Now, the class of problems that can be solved in deterministic polynomial time is called P, and P is a subset of NP since every problem in P can also be solved by our non-deterministic Turing machine in polynomial time. What we don't know is if there is a deterministic polynomial time solution to all the problems in NP - if there was, then P=NP and bingo, you've won a million dollars.

To help us find out whether P=NP, computer scientists have discovered a kind of problem that is the *hardest problem in NP*. By hardest, we mean that any algorithm to solve the hardest problem will take as least as long to run as any other algorithm to solve any other problem in NP.

We call these problems 'NP-complete'. If we could find an efficient deterministic polynomial time algorithm to solve an NP-complete problem, we'd have shown that *every* problem in NP has an efficient solution, because remember by definition NP-complete problems are the hardest in NP.

(Sidenote: we show that a problem is 'harder' than another problem by showing that the easier problem can be reduced in polynomial time to an instance of the harder one. Then a solution to the easier one is found by reducing it to the harder problem and then solving the harder one, so at worst we can solve in the time it takes to solve the harder one. If you think about this carefully, you'll see this is exactly what we probably want 'harder' to mean.)

So that's all NP-complete problems are: the hardest problems that can be solved by a non-deterministic Turing machine in polynomial time. We don't know if a deterministic polynomial time solution to them exists - it may do, but I doubt it - but that doesn't change the fact that we can recognise them, prove them to be NP-complete, and formally identify them much more easily than we can hardcore porn :)

Henry on November 18, 2008 3:48 PM

Well, that just begs the question - why the blog post in the first place? And why recommend a book that you haven't read and/or didn't understand? Not cheap shots here, I actually want to understand why you took that credibility risk with the people who know this stuff - which it turns out is a lot of programmers.

> In all honesty, despite reading all the above comments and the wikipedia entries, I still don't quite understand the definition of NP-complete other than as "we'll know it when lots of experts have seen it, tried, and failed."

Have you really read all of the above comments? Not just scanned over them but read them? You only know if a problem is NP-complete if the experts have seen it, tried, and proved that it's NP-complete. There's a precise definition and a set of criteria to demonstrate that something is NP-complete.

I'll have a crack at some explanations, I don't know how much help they'll be. I'm sure the book you linked to probably explains it better.

A Turing machine is a simple model of a computer. In its usual form its only output is to accept or reject and input, so instead of asking it what a + b is, you normally ask it if a + b = c - ie it solves decision problems.

A problem is in P (Polynomial time) if for all instances of a problem with size n (n = 100 if we have 100 objects to sort, for instance) the Turing machine takes a number of steps to come to a decision that is a polynomial in n - eg there's an uppper bound of 2 * n^5 + 4 *n + 5.

A lot of the time when we talk about the size we're talking about the number of bits - adding n-bit numbers is in P - O(n) - and so is ultiplying n-bit n-bit numbers - O(n^2) the simple way, although FFT methods do better.

This is where I back up for a second and talk about non-determinism. A regular expression is just a non-deterministic finite automaton or state machine. It has a state, and the next state is determined by the current state and the next piece of the input which is read.

Say we're scanning a bit string rather than a character string, and we're after strings that have a 1 as their tenth last bit. Using a regular state machine we're going to need about 2^10 states to handle this.

This is where we add non-determinism. We allow the machine to go to several different states based on the current state and the next piece of the input. Once we've read a 1 we're in a state with two paths - one leads back to the state we're in, and one leads to a string of ten states which all transition to each other on all inputs and end with an accepting state.

Rather than having the machine encode 2^n states we have the machine split into 2^n in different multiverses and we take a big OR of the results. The machine took n "guesses" that the current piece of input was the tenth last, and we were able to save some space describing the machine.

As a bonus you can easily implement a NFA engine with only a polynomial size increase in space / time for the processing over the deterministic equivalent, so they're both in P but I digress.

So we're processing the input of a Turing machine. Now we add non-determinism, so at every step the machine can split - so we have 2^n virtual-multiverse machines, if any of them say "yes" to the current decision problem then the answer is a yes. If the running time for each machine is bound above by a polynomial in n, we have a problem in NP.

That's why there's a polynomial size check (or certificate) of all NP problems - it's just an encoding of the n guess that were made to get to the accepting state. You can take the original non-deterministic turing machine and at each place where you would normally guess the certificate will tell you which path to take and guide you to the accepting state. It's like factoring the product of two large primes is hard but it's easy to check if you have one of them already via the mod operation.

The NP-complete problems are assumed to be the hardest of the NP problems. It all starts with SAT3 - just take it as a given for now that SAT3 is the first in the list of NP-complete problems.

Take a candidate for a second NP-complete problem. Using a polynomial number of copies (A) of this candidate problem and a polynomial sized amount of Turing machine glue code (B) try to show that you can solve SAT3 (or something else on the list). If you can, the candidate problem is NP-complete and you can add it the list.

So if solving NP problems takes exponential time, we have A * an exponential in n + B, and it takes a really long time to solve SAT3 and every problem between your problem and SAT3.

But if one of them can be done in polynomial time, we have A * a polynomial in n + B, and it takes polynomial time to solve SAT3 and every problem between your problem and SAT3.

You may have spotted that so far this means solving SAT3 in polynomial time doesn't do much for you. NPC is defined as the problems that are in NP and reduce to all other problems in NPC, and so for SAT3 to keep its crown you also build the tree in the other direction.

It becomes a crazy tangled web but it's clear that if you can solve one of the problems in polynomial time you can solve all of them in polynomial time just by stepping through the web from the P-time problem towards the problem you're after.

Dave on November 18, 2008 4:43 PM

Dave has it almost completely right above, except for a couple of details. The NP-complete problems aren't *assumed* to be the hardest problems in NP - they are proven to be, and the proof involves SAT instead of SAT3 (SAT3 is just SAT restricted to no more than 3 literals per clause).

This is what Cook's Theorem is all about, and it's why NP-completeness isn't as recursive as it looks. What Cook did was to show that given *any* decision problem in NP, we could take a description of the non-deterministic Turing machine that solves it, and produce in polynomial time an equivalent SAT problem that had a solution if and only if the NTM returned true along one of the paths in its original computation. So if we had a hypothetical polynomial-time algorithm to solve SAT, we could use it to solve every decision problem in NP in polynomial time - just take the description of the NTM that solves the original problem, produce the equivalent SAT problem, and turn our hypothetical SAT solver loose on that.

That's what showed that there was such a thing as NP-completeness in the first place, and showed that SAT was one such problem. All the other NP-complete problems have been shown to be NP-complete by showing that "If we could solve problem X in polynomial time, then we could solve this other problem in polynomial time ... and therefore, we could solve SAT in polynomial time."

Terminology note: a decision problem is one with a boolean true/false result. E.g., "does the following instance of the TSP have a solution with cost <= B?" Technically, NP (and therefore all the NP-complete problems) consists only of decision problems (by definition). The optimization problems that we are usually really interested in solving can be NP-hard, rather than NP-complete. A problem is NP-hard if a polynomial-time solution to it would imply a polynomial-time solution to an NP-complete problem. Thus "What is the lowest-cost solution to the following instance of the TSP?" is NP-hard, not NP-complete, but it should be pretty obvious that if we could solve it in polynomial time, we could also answer the related decision problem, which is NP-complete.

Dave W. on November 18, 2008 5:18 PM

> I actually want to understand why you took that credibility risk with the people who know this stuff - which it turns out is a lot of programmers

I wrote about it for the same reason I write about anything, I'm trying to understand it. Sometimes you're successful, other times not so much. Life is a credibility risk; what are you gonna do? Not say anything until you're guaranteed to be perfectly right every time?

> It becomes a crazy tangled web

Well, that part is clear.

Jeff Atwood on November 18, 2008 5:38 PM

@Jeff

You've mentioned the explorative nature of your posts before and it just slipped my mind - sorry about that.

The blog post does read like you're telling people about something that you're pretty familiar with, which might be why the people familiar with NP / NPC are exploding on here.

For instance, I read the "expert programmer" line like you were in the pool of people who understood NP / NPC and were asking the other folk to jump in and test the water. Could just be me though.

I guess it's not about credibility, it's about being able to tell when you're expounding and when you're exploring. And I guess that's really a problem occurring in the space between you and readers - lots of stuff is missing when you're communicating digitally. Maybe throw in a hint next time to make it clear to those of us that are struggling to tell the two apart? :)

Good luck with the exploration :)

@Dave W - you're spot on.

Although I didn't meant that it _is_ assumed, I was just asking Jeff to trust me on it since it was already an overlong post :) It took enough restraint to avoid mentioning the different kinds of reductions.

Of course, now that you've mentioned it I'm kicking myself for not just saying "you can emulate a NTM with a carefully configured instance of SAT".

Dave on November 18, 2008 6:55 PM

@Henry: great explanation. Rather than repeating the self-referential jargon that other explanations used, you did a pretty good job of demonstrating what the concepts mean. If you could have avoided the terms 'Turing machine' and 'polynomial', it might have been better, but I (and presumably most) knew those anyway.

travis on November 18, 2008 9:53 PM

I stopped reading the blog a while a go. Joel explains my reasoning nicely in his latest post:
"This is not the way to move science forward. On Sunday Dave Winer [partially] defined "great blogging" as "people talking about things they know about, not just expressing opinions about things they are not experts in (nothing wrong with that, of course)." Can we get some more of that, please? Thanks."

reader on November 19, 2008 12:29 AM

Okay, Jeff; I admit I haven't read all of these posts. But I'm going to try and clear up some of your confusion without using too many words. I aimed for 200, but I'm afraid I ended up needing 300:


We need to separate algorithms and problems. We write algorithms to solve problems, and they scale in a certain way. Although this is a simplification, let's label an algorithm with a 'P' if the scaling is good enough, and 'NP' if it isn't.

It's helpful to know things about the problems we're trying to solve, rather than the algorithms we use to solve them. So we'll say that all the problems which have a well-scaling algorithm are "in P". And the ones which have a poor-scaling algorithm are "in NP".

That means that lots of simple problems are "in NP" too, because we can write bad algorithms to solve easy problems. It would be good to know which problems in NP are the really tricky ones, but we don't just want to say "it's the ones we haven't found a good algorithm for". After all, I could come up with a problem (call it X) that I think needs a super-amazing algorithm. I tell the world that the best algorithm I could come up with to solve X scales badly, and so I think that X is a really tough problem. But tomorrow, maybe somebody cleverer than me invents an algorithm which solves X and is in P. So this isn't a very good definition of hard problems.

All the same, there are lots of problems in NP that nobody knows a good algorithm for. So if I could _prove_ that X is a certain sort of problem: one where a good algorithm to solve X could _also_ be used, in some roundabout way, to give a good algorithm for _every_ other problem in NP. Well now people might be a bit more convinced that X is a genuinely tricky problem. And in this case we call X NP-Complete.

Tom on November 19, 2008 4:03 AM

porges and Delmania: no, the existence of reductions from any NP-complete problem to any other doesn't mean that if one has good approximations then they all do. It can happen that there's a polynomial-time reduction from problem A to problem B but that it doesn't give you a good approximation for problem A even if problem B has one. (Crudely: suppose you've got something that turns solutions to B into solutions to A, but that turns "almost-solutions with error x" to B into "almost-solutions with error rapidly_growing_function(x)" to A. Then approximate solutions don't turn into approximate solutions even though exact solutions turn into exact solutions.)

It would be nice if people didn't use "NP" to mean "hard", just like it would be nice if people didn't use "exponential" to mean "big", but it probably isn't going to happen any time soon.

g on November 19, 2008 5:05 AM

Hi Jeff,

Please let me try a real simple explanation of the difference between P and NP... I hope I'm not repeating anyone, I haven't read all the posts. I won't bother repeating definitions I already saw above, just explain the difference (and why it isn't a recursive definition)

For a problem, M:

M is in P if, and only if, there exists an algorithm that solves M in polynomial time. In all cases, that means we know the algorithm and prove it's polynomial.

M is NOT in P if we have proven, absolutely proven, that there is NO algorithm for it that runs in polynomial time. This is, of course, much harder to prove, since you can't do with with an example - you have to prove it for all possible algorithms.

That would seem to cover all problems, but for quite a few problems, the answer is 'we don't know'. That's because there's no known polynomial algorithm for them, but also, there's no proof that there one doesn't exist.

This annoys computer scientists to no end.

In studying these problems which are unknown, it turns out that for a lot of them, you can devise a simple algorithm to tests a given solution, but not one to find a solution. The formal definition for them, given above in a lot of posts, is NP problems. Due to its definition, NP problems also include all P problems.

To complete things, it also turns out that quite a few of those NP problems are NP-HARD, which means that if you solve them in polynomial time, you can also solve all the rest in polynomial time.

A problem that is both NP-HARD and in NP is called NP complete.

So there you have it, a non-recursive and hopefully simple explanation.

Or

Or on November 19, 2008 5:20 AM

I don't know what's all the deal with NP, personally I found P-Space to be a much more interesting complexity class. Specially all that refer to the inclusion of the polynomial hierarchy into P-Space and the co-NP classes.

schizoid on November 19, 2008 7:29 AM

I just wanted to say thanks to both Jeff and all the folks who commented who know more about this subject than I do. I appreciate that Jeff brought the subject to my attention because as an amateur software developer this is a topic way out of my league. Furthermore, the follow-up comments that corrected Jeff's writing were very enlightening and given me a deeper appreciation for the topic and piqued my curiosity. Good read!

Gareth Conner on November 19, 2008 7:36 AM

The only thing good about this post was the xkcd strip and it was not even yours.

however, coding horror will still remain in my google reader subscription.

i just hope jeff learned something from this fiasco.

K on November 19, 2008 8:32 AM

> I stopped reading the blog a while a go.

Hmm. The mystery of the non-reading reader. Another np-complete problem, I think. :)

Jeff Atwood on November 19, 2008 9:10 AM

Well, I for one am sorry to hear that you redacted, "NP-complete problems are like hardcore pornography". What an analogy! And it really got me wondering about "simulated annealing" ;-)

Kramii on November 19, 2008 9:19 AM

Your average developer's need to understand this material is inversely proportional to their department's budget for contract work.

Robert C. Barth on November 19, 2008 10:04 AM

Jeff: The reason that NP completeness is not a matter of "we'll know it when we see it" is that there's a precise definition for NP completeness, and it's possible to formally prove that something is NP complete. What isn't known is if the class of problems known as NP is equivalent to the class of problems known as P. That means, we don't know if the existence of a polynomial time check of a result of a computation implies that there is an algorithm for solving the problem in polynomial time. We still can tell if something is in NP (just find an algorithm for verifying a solution, and prove that it runs in polynomial time). And NP-complete is just a subset of NP, the subset consisting of those problems that if you solved them in polynomial time, it means you could solve any other NP-complete problem in polynomial time. So, there is a precise definition of what's NP-complete, and you can prove if something is NP-complete. All we don't know is whether this implies that the problem can be solved in polynomial time, or if it implies that the problem cannot be solved in polynomial time.

Sorry, I haven't read every comment in the thread to tell if I'm being redundant; I just wanted to make sure it's clear that NP-completeness is well understood, and you can prove plenty of things about it, we just don't know whether or not there's a way to find polynomial time solutions to those problems.

Brian on November 19, 2008 10:49 AM

I think this has actually turned out to be a very constructive post. Aside from all the offense and anger of course.

Practicality on November 19, 2008 11:21 AM

But now I have to read the book to get the rest of the story. Ahhh! Now I see the point! ;)

Practicality on November 19, 2008 11:41 AM

My favorite NP-complete problem is picking which clothes I'm going to wear on any given day.

PBEAN on November 19, 2008 12:22 PM

I have never commented before and most probably will not again. There are two serious problems with the post, one of which has been fixed partly. But the second one hasn't.

1. Nobody can define what makes a problem NP-complete, exactly ....

It has an exact and precise definition, as pointed before. In your redaction you link to a poll about P != NP amongst computer scientists. How can this imply there is a confusion regarding the definition of NPC? Please take down the link as it is very misleading.

2. Your recommendation of Garey-Johnson - "this book has a timeless quality". Have you read the book? Clearly you haven't. Otherwise you wouldn't be confused by the definition of NPC. It is obvious, you are strongly recommending a book you haven't read at all. This completely destroys you credibility as an author. You must preface the recommendation with the text - "Disclaimer: I HAVE NOT READ THIS BOOK".


Since, you have claimed in your comments to still be confused by the definition of NPC, I will try and explain it in my own words, because it is quite simple.

Let U = set of problems.
Let P = set of problems that can be solved in polytime wrt size of the input
Let NP = set of problems whose solution can be verified in polytime wrt size of the input

Obviously P <= NP <= U

For example sorting an array of numbers
1. belongs to U
2. belongs to P, we can do this in nlogn
3. belongs to NP, given a array, we can check if it is sorted in polytime. In other words, Given an unsorted array, we sort it using a black box routine and verify if it is sorted in polytime. Let us say we do not know of any algo to sort numbers in polytime. This does not prevent the problem for belonging in NP. The problem may or may not belong to P. We don't know.

Another example, determine if a number is Prime
Before 2002
1. Belongs to U
2. Dont know if it belongs to P. No algo known
3. Dont know if it belongs to NP (I think). Simply stating that a number is prime, isn't enough, we need a proof of the fact, that can be verified in polytime. For a sorted array, it is simply the array laid out in sorted fashion.

After 2002 AKS polytime primality test, we KNOW that it belongs to P and hence NP.


Another example, does a graph have a Hamiltonian cycle (Google this).
1. Belongs to U
2. Don't know if it belongs to P
3. Belongs to NP. Given a cycle in the graph, we can verify if it covers all vertices of the graph.


Cooks theorem showed that if the 3-SAT problem, which belongs to NP also belongs to P, then ALL OF NP BELONGS TO P i.e. P == NP. Note that he did not prove that 3-SAT is in P. Just that if it were then, P == NP. Now many CS scientists are busy trying to prove that this implies a contradiction, i.e. P != NP, with noone succeeding so far. That's where the cartoon comes from.

Now we get to the definition of NPC - NP Complete.
After Cooks paper, Richard Karp proved that 21 problems that belong to NP are "atleast as hard" as 3-SAT, including the Hamiltonian cycle problem.

"at least as hard" means
1. Suppose you have a polytime solution for the Hamiltonian cycle. Then you can easily (in polytime) transform the 3-SAT problem into a Hamiltonian cycle problem. In other words, if you solve the Hamiltonian problem in polytime, you would have solved the 3-SAT problem in polytime as well. Like, if you can multiply complex numbers, you can also multiply real numbers.

Hence a polytime solution for Hamiltonian => polytime solution for 3-SAT => P == NP

Now, NP-Complete problems are those problems in NP that are "at least as hard as 3-SAT", i.e. if even one of these problems were to belong to P, it would imply that all of P == NP. Starting with Karp's 21 problems, thousands of such problems are now known.

However note that the definition of NPC depends on convertibility of 3-SAT to it, and no-where on P == NP. A problem is NPC if the 3-SAT problem can be transformed into it in polytime.

CSGradnowSoftEng on November 19, 2008 5:49 PM

Lastly

P <= NP <= NPC < U

CSGradnowSoftEng on November 19, 2008 5:52 PM

Please scratch the previous post, I meant

NPC &lt;= NP &lt; U

&&

P &lt;= NP &lt; U

CSGradnowSoftEng on November 19, 2008 5:57 PM

Nice summary here:

<a href="http://www.reddit.com/r/programming/comments/7ed4s/joel_on_software_calls_self_out/c06frjt">http://www.reddit.com/r/programming/comments/7ed4s/joel_on_software_calls_self_out/c06frjt</a>

on November 19, 2008 6:21 PM

Please check the approach from Chris Voudouris and Edward Tsang on GLS(Guided Local Search). My group is developing an optimization for this kind of problem for a well known governmental institution.But it can give you guys a good start.;-)

Andre.A on November 20, 2008 3:52 AM

The traveling salesman problem was solved back in the 80's by using a neural net. It's still slow but it could be solved in minutes instead of years. It is all a matter of using the correct hardware instead of software.
Many problems have much faster and even better answers if fuzzy logic is used instead of strictly formula based programming. Formula based programing is NOT a silver bullet for all problems. What happened to programming being about logic? Too much CS and not enough common sense.

Mark on November 20, 2008 6:05 AM

@Jeff
A mistake in a blog post is no big deal ordinarily. But here
1) The mistake is fundamental. When the entire post is about complexity theory, I think you should at least understand the basics before you write about it, even if you are only trying to explain it to "laymen" as some have suggested (btw I am not a computer scientist)
2) You recommend a book you have not read.


If people are still confused about the "recursive" nature of the whole thing, here's how you prove a problem is NP complete:

You start with any one problem that is NP complete. This first one is the hardest, but this was done a long time ago (Cook-Levin theorem). They proved (independently) that SAT is NP-complete.

Now, given a problem you try to "reduce" it to a problem that you already know to be NP-complete. Here, we say problem A "reduces" to problem B if, given a solution to B, I can find a solution to A quickly (polynomial time).
Oh and if you cannot reduce it, then you cannot conclude anything (unless you PROVE that it cannot be reduced which is totally different) I don't know where you got the "tried and failed" idea.

So this is "recursive" only to the extent that in order to prove 'A' NP-Complete you usually use a B that is already known to be NP-Complete. "Usually" is the keyword here. You need not rely on a known problem, and Cook & Levin did not. However, once ANY one problem is known to be NP-complete, we can "recurse" for the rest, which is much easier than proving from "scratch".

Hope this helps.

hellfeuer on November 20, 2008 6:07 AM

Mark,

I have read this exact paper and no it does not solve the TSP problem optimally. It only provides an approximation. There is no known kind of hardware that can solve this problem in polytime, not even quantum computers.

hellfeuer,

There is a mistake in your description. You need to reduce B to A to prove NP-completeness of A (given that B is NP Complete), and not the other way around.

CSGradnowSoftEng on November 20, 2008 4:35 PM

Jeff Ouch -

"I still don't quite understand the definition of NP-complete other than as "we'll know it when lots of experts have seen it, tried, and failed." "

An expert failing to solve a problem does not make a problem NPC. Anyone, expert or otherwise, providing a polytime reduction of a known NPC problem like 3-SAT or Hamiltonian cycle to the said problem makes the problem NP-Complete.

CSGradnowSoftEng on November 20, 2008 4:56 PM

@Mark

Cook has a paper on the science-fiction like results if P = NP. If it does turn out that way then I think everyone will "know" in the same way that people "know" that E = MC^2.

I don't think people are going to bluff their way to a decision about P ?= NP without understanding the reasons that it's difficult.

Not enough CS or common sense.

on November 20, 2008 6:30 PM

> Your recommendation of Garey-Johnson - "this book has a timeless quality"

I meant that it was published in 1979, and the concepts behind it are still relevant and useful. Like, say, Peopleware. Compare to: "Learn Visual Basic 5.0 in 24 Hours", or "Microsoft WPF for Dummies". My reading list is full of books with this quality.

> It is obvious, you are strongly recommending a book you haven't read at all.

Here's what I said: "I'll recommend a book Anthony Scian recommended to me." Anthony's email to me indicated the book was a classic, I liked the topic (NP-completeness), and my initial research was intriguing -- if obviously incomplete. I don't have a problem with books being recommended in this "friend of a friend" fashion and I stated as such in the post.

Jeff Atwood on November 21, 2008 4:15 AM

@reader
"I stopped reading the blog a while a go. Joel explains ..."
It's quite curious: how could you know of this post if you stopped reading the blog?!?

milkyway on November 21, 2008 7:14 AM

"Cook has a paper on the science-fiction like results if P = NP"

One of the consequences is very startling. Given the mathematical proof of a statement we can read it and check it for correctness in reasonable time. i.e. What is difficult is actually finding the proof which can be incredibly hard. Finding the proof of a mathematical theorem is in NP by definition. Now if we were to prove that P == NP, then it could possibly put all mathematicians out of business. This means that given a provable mathematical statement, a machine can find a proof for it in polynomial time. If the order of the polynomial isnt too high, it would mean that Maths would end up like Chess, with mathematicians being unable to match machines in theorem proving. This possibility is either startling or very obvious, because human beings themselves are arguably nothing more than Turing Machines.

CSGradnowSoftEng on November 21, 2008 5:45 PM

Jeff is such a dumb bitch when it comes to anything like real CS, but what do you expect from someone too stupid to learn C?

JeffIsDumb on November 21, 2008 6:46 PM

Jeff,
The recursive definition bottoms out at 3SAT:

"SAT was the first known NP-complete problem, as proved by Stephen Cook in 1971 (see Cook's theorem for the proof)."

http://en.wikipedia.org/wiki/3-SAT#3-satisfiability.

Steve Steiner on November 21, 2008 10:04 PM

>>> Finding the proof of a mathematical theorem is in NP by definition

This is only the case for theorems that have reasonably short proofs (polynomial in the length of the theorem).

Your point about the startling consequences of P=NP is still good despite this nitpick: even an algorithm that could efficiently find proofs for theorems as long as they were at most 100 pages long, would still completely revolutionize mathematics.

Gareth Rees on November 22, 2008 2:35 AM

Cooks paper isn't just about proofs either - he talks about how you'd be able to do some pretty interesting things with music, art, etc...

D on November 24, 2008 12:08 AM

http://www.experts-exchange.com/Q_20908880.html

After solving this problem, I gave a presentation of it to a local Comp Sci Department. One of the attendees had read a paper on the solving of such NP-hard problems with clever data representation. It was news to me.

Aikimark on November 24, 2008 3:39 PM

See www.timescube.com

stranger on November 24, 2008 9:51 PM

Wow Jeff, based on some of the incredibly scathing comments on this blog post, I've come to the conclusion that there are a lot dicks who read your blog ... Granted, there were some people who had the good sense to provide constructive criticism, but some of these folks need to chill out and realize that a blog is just a blog, not a CS textbook ...

-Ash

Ash on November 25, 2008 11:05 AM

To all of the "PHd"s ripping into Jeff, you need to understand that the following is what your average corporate programmer got from this article:

mwop mwop bibble dingdo efi hookle wimby "NP COMPLETE" runky dfew oogle bing "INCREDIBLY HARD" toodle joply dfe "CAN BE SOLVED ACCEPTABLY USING A CHEAT" ioph fi zingzizzy jolbick "GOOD BOOK: COMPUTERS AND INTRACTABILIY" qopple redmind yobless imbo lunk.

Then they (we) get back to that finishing that Web based ASP.NET registration form.

Whether you like it or not, most readers could not give a #^$#* about your "verifiably correct" proofs, they are working programmers who are just trying to make a living.

AJ on November 27, 2008 5:50 PM

Pretty funny that Aikimark posts a link to experts-exchange.com

I wonder if his question appears on stackoverflow?

glenn jackman on November 28, 2008 9:24 AM

@AJ

You should probably start to question your assumption that most readers of coding horror don't know or don't care about that NP Complete means - otherwise there wouldn't have been as many comments calling Jeff on talking about something he didn't really no about.

I can imagine that might change shortly - the blog post and Jeffs handling of the explosion on here seem to indicate the target audience may in fact be those doing Web based ASP.NET registration forms - and at least some of the people responding in these comments have probably already moved on.

And it's not like NP is something you need a Phd to learn about - it is / should be a standard part of a programmers education - whether at a learning institute or through self education. If I was a gardener and did gardening all day eventually I'd want to know things about the plants. Otherwise I'd just be a guy that digs holes and calls himself a gardener.

D on November 28, 2008 8:05 PM

WOW! So a lot of people read this, I wonder if Jeff writes the way he does (which I like a lot) just to piss off the people who's only job is to correct the Internet.

To you guys, the protectors of knowledge on the Internet I have to say, great job here.

You're next mission Wikipedia... good luck.

AnnMex on December 3, 2008 10:59 AM

NP-Complete http://xkcd.com/287/

Dennis on December 6, 2008 10:40 PM

<html>
<b>A<b>
</html>

Vineet on December 8, 2008 1:34 AM

comments tl;dr

I don't know if this has been mentioned yet but there are problems harder than NP-complete.

mr. fappy on December 10, 2008 9:40 AM

Answering the original question, my favourite NP-complete cheats are Metaheuristics: evolutionary algorithms, simulated annealing, particle swarm optimization, ant colony optimization, estimation of distribution algorithms... what poetic names :-)

Those algorithms have the particularity to be stochastics, and they can't even guarantee that you will have an approximation of the optimum. But they do work for hard problems, and they sometimes are the only solution you can use...

nojhan on December 15, 2008 2:47 AM

Now you know why this is "The World's Most Dangerous Programming Blog"
;)

Practicality on December 19, 2008 6:57 AM

Is it warmer in the country or in the summer?

Answer the result.

No-Prob Compleet on January 21, 2009 2:27 PM

If NP-completeness generates some much traffic, I look forward to the post about Undecidable Problems.

Undecided on January 21, 2009 11:36 PM

After solving this problem, I gave a presentation of it to a local Comp Sci Department. One of the attendees had read a paper on the solving of such NP-hard problems with clever data representation. It was news to me.
http://proektbaza.ru/

Olof on January 29, 2009 1:50 PM

For your next trick I strongly recommend trying to explain the Halting Problem. That should provide no end of 'interesting' feedback.

Oh yes, I apologise for having a PhD in CS, but I suppose we all have our crosses to bear!

Rik on April 18, 2009 4:06 PM

You should probably start to question your assumption that most readers of coding horror don't know or don't care about that NP Complete means - otherwise there wouldn't have been as many comments calling Jeff on talking about something he didn't really no about.
http://basetour.ru

Zigmunt on May 13, 2009 1:00 PM

Oftentimes, I'll find myself trying to find the solution to some random problem (or even a whole family), only to get stumped; monte carlo algorithims have often times bailed me out; being able to trade time and accuracy with a single variable can be very handy, especially since many problems rapidy approach specific values as the sample size increases. Thank god for logarithms.

Ghost2 on May 15, 2009 9:01 AM

Wow, many of the commentors here could easily be award-winning douchebags...

No wonder people hate nerds with such a passion. Many of you lack anything even similar to tact or social grace. It makes me feel better that, for all your arrogance, you'd die in the wild with the rest of the sheeple if it came down to it...

Learn to balance yourselves, people. Otherwise you'll turn out like M.

Jacob on May 15, 2009 11:55 AM

Jacob: Pot, meet kettle.

Joe Chung on June 1, 2009 10:34 AM

#include
#include

int A[]={
267, 493, 869, 961, 1000, 1153, 1246, 1598, 1766, 1922
};

int T = 5842;

#define len(a) (sizeof(a)/sizeof(*(a)))

int N=len(A);

int Ls[len(A)];

int compar(const void *a, const void *b){
return *(int *)a-*(int *)b;
}

int exist(int S,int T){
int R;
for( R=S-1; R>=0 && Ls[R] >= T; R-=1 ){
if( A[R] > T ){ continue; }
if( A[R] == T ){
printf("Solution: %d",T);
return T;
}
if( exist(R,T-A[R]) ){
printf(" + %d",A[R]);
return T;
}
}
return 0;
}

int main(int argc, char *argv[]){
int I;
if( argc==2 ){ T = atoi(argv[1]); }
qsort(A, N, sizeof(*A), &compar);
Ls[0]=A[0];
for( I=0;++I<N;Ls[I]=A[I]+Ls[I-1] ){}

int k;


if( !exist(N,T) ){
printf("Solution not exist\n");

}else{
printf("\n");
}


}

Cap'n Foobar on June 30, 2009 7:54 PM

#include
#include

int A[]={
267, 493, 869, 961, 1000, 1153, 1246, 1598, 1766, 1922
};

int T = 5842;

#define len(a) (sizeof(a)/sizeof(*(a)))

int N=len(A);

int Ls[len(A)];

int compar(const void *a, const void *b){
return *(int *)a-*(int *)b;
}

int exist(int S,int T){
int R;
for( R=S-1; R>=0 && Ls[R] >= T; R-=1 ){
if( A[R] > T ){ continue; }
if( A[R] == T ){
printf("Solution: %d",T);
return T;
}
if( exist(R,T-A[R]) ){
printf(" + %d",A[R]);
return T;
}
}
return 0;
}

int main(int argc, char *argv[]){
int I;
if( argc==2 ){ T = atoi(argv[1]); }
qsort(A, N, sizeof(*A), &compar);
Ls[0]=A[0];
for( I=0;++I<N;Ls[I]=A[I]+Ls[I-1] ){}

int k;


if( !exist(N,T) ){
printf("Solution not exist\n");

}else{
printf("\n");
}


}

Cap'n Foobar on June 30, 2009 7:55 PM

Morons. If you blaim the blog of being bad, then I challenge you to write a better explanation we can scrutinize.

M,
explain NP completeness here, write a few words and see if you can do it better. If you can not, then stop being a prick. This is just a blog, nothing serious. People wont die if there are some inaccurieces. I dont complain if you use bad english or misspell words.

Kebabbert on August 4, 2009 3:27 AM
Content (c) 2009 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.