I <3 Steve McConnell*
Coding Horror
programming and human factors
by Jeff Atwood

November 15, 2008

Your Favorite NP-Complete Cheat

Have you ever heard a software engineer refer to a problem as "NP-complete"? That's fancy computer science jargon shorthand for "incredibly hard":

The most notable characteristic of NP-complete problems is that no fast solution to them is known; that is, the time required to solve the problem using any currently known algorithm increases very quickly as the size of the problem grows. As a result, the time required to solve even moderately large versions of many of these problems easily reaches into the billions or trillions of years, using any amount of computing power available today. As a consequence, determining whether or not it is possible to solve these problems quickly is one of the principal unsolved problems in Computer Science today.

While a method for computing the solutions to NP-complete problems using a reasonable amount of time remains undiscovered, computer scientists and programmers still frequently encounter NP-complete problems. An expert programmer should be able to recognize an NP-complete problem so that he or she does not unknowingly waste time trying to solve a problem which so far has eluded generations of computer scientists.

You do want to be an expert programmer, don't you? Of course you do!

NP-complete problems are like hardcore pornography. Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it. Just this once, I'll refrain from my usual practice of inserting images to illustrate my point.

(Update: I was shooting for a poetic allusion to the P=NP problem here but based on the comments this is confusing and arguably incorrect. So I'll redact this sentence. Instead, I point you to this P=NP poll (pdf); read the comments from CS professors (including Knuth) to get an idea of how realistic this might be.)

Instead, I'll recommend a book Anthony Scian recommended to me: Computers and Intractability: A Guide to the Theory of NP-Completeness.

Computers and Intractability: A Guide to the Theory of NP-Completeness

Like all the software engineering books I recommend, this book has a timeless quality. It was originally published in 1979, a shining testament to smart people attacking truly difficult problems in computer science: "I can't find an efficient algorithm, but neither can all these famous people."

So how many problems are NP-complete? Lots.

Even if you're a layman, you might have experienced NP-Completeness in the form of Minesweeper, as Ian Stewart explains. But for programmers, I'd argue the most well known NP-completeness problem is the travelling salesman problem.

Given a number of cities and the costs of travelling from any city to any other city, what is the least-cost round-trip route that visits each city exactly once and then returns to the starting city?

The brute-force solution -- trying every possible permutation between the cities -- might work for a very small network of cities, but this quickly becomes untenable. Even if we were to use theoretical CPUs our children might own, or our children's children. What's worse, every other algorithm we come up with to find an optimal path for the salesman has the same problem. That's the common characteristic of NP-complete problems: they are exercises in heuristics and approximation, as illustrated by this xkcd cartoon:

xkcd: Travelling Salesman Problem

What do expert programmers do when faced by an intractable problem? They cheat. And so should you! Indeed, some of the modern approximations for the Travelling Salesman Problem are remarkably effective.

Various approximation algorithms, which quickly yield good solutions with high probability, have been devised. Modern methods can find solutions for extremely large problems (millions of cities) within a reasonable time, with a high probability of being just 2-3% away from the optimal solution.

Unfortunately, not all NP-complete problems have good approximations. But for those that do, I have to wonder: if we can get so close to an optimal solution by cheating, does it really matter if there's no known algorithm to produce the optimal solution? If I've learned nothing else from NP-complete problems, I've learned this: sometimes coming up with clever cheats can be more interesting than searching in vain for the perfect solution.

Consider the First Fit Decreasing algorithm for the NP-complete Bin Packing problem . It's not perfect, but it's incredibly simple and fast. The algorithm is so simple, in fact, it is regularly demonstrated at time management seminars. Oh, and it guarantees that you will get within 22% of the perfect solution every time. Not bad for a lousy cheat.

So what's your favorite NP-complete cheat?

[advertisement] Peer code review without meetings, paperwork, or stopwatches? No wonder Code Collaborator won the Jolt Award.

Posted by Jeff Atwood    View blog reactions

 

« Stop Me If You Think You've Seen This Word Before We Are Typists First, Programmers Second »

 

Comments

I would have to say 3-SAT. Simply because it sits at the root of most NP proofs and nearly any other NP problem can be reduced to it.

Good book BTW. Its still used in most universities for teaching advanced algorithms and algorithm theory.

Blair on November 15, 2008 06:39 AM

I say this often and I think it applies here rather well: The best solution to any problem is avoiding having that problem (not avoiding solving it) in the first place.

Calvin Spealman on November 15, 2008 06:55 AM

Regarding not being satisfied by approximate solutions, I like to point out that the original problem was itself an approximation. If a salesman really wants to visit a bunch of cities, there are lots of little things that go into determining what really is the best route that are not part of the simply-stated computer science problem -- avoiding roads with construction, not driving through major cities at rush hour, etc. -- and so your solution that's within 2% of optimal may be just as good as the exact solution at solving the real problem.

John on November 15, 2008 06:57 AM

approaches that use evolution in one way or another are pretty damned good. for the traveling salesman problem you could use a generic algorithm, for example. they're really fun programs to write too!

nanreh on November 15, 2008 07:19 AM

Up Up Down Down Left Right Left Right B A Select Start

Matt on November 15, 2008 07:26 AM

AAAARGH! So... many... inaccuracies... must... bleach... brain...
By the ancient Hynerian gods, Jeff, please never ever write about computer science again. Seriously.

Yes, maybe you got your main point across, but you could just have easily explain it all to laymen without making useless mistakes on the way. Well, *someone* could have gotten it right. It is evident that you couldn't... Or didn't even try.

The two most annoying are probably calling approximations "cheats", and your obvious lack of research into what "NP-complete" means (or even, what it IS).

Approximation algorithms and their bounds don't just appear out of thin air. They are more often than not invented, researched, proven correct, and upper-bounded by the same "fancy computer scientists" studying "intractable" problems. They are often quite a bit harder to reason about than the obvious (but slow) optimal solutions. Calling them "cheats" is like calling dynamic programming a cheat "because it just uses more memory".

Regarding NP-complete, it is not only a "fancy computer science jargon shorthand for incredibly hard". In the same vein, "gravity" is not a fancy physics jargon for "things fall down". There's a meaning behind the term, there's theory. Thank you for trivializing complexity and computability.
"Nobody can define what makes a problem NP-complete, exactly,"? Perhaps *you* can't. After all, it's just "fancy jargon", isn't it?
For those reading my rant this far, here's a quick informal definition from the National Institute of Standards and Technology:

A problem is in the NP-complete class if answers can be verified quickly (polynomial time in length of input), and quick (polynomial time again) algorithm to solve this problem can be used to solve all other NP problems quickly (this is called reduction).

You can also find it in the top of wikipedia's page on NP-complete here http://en.wikipedia.org/wiki/Np_complete .

M on November 15, 2008 07:33 AM

When doing the traveling salesman problem; do you include layovers as being a city that he/she "visits"? :P

Suroot on November 15, 2008 07:41 AM

Like M said (but less abrasively), there's a concrete definition of what NP-Complete means. It's mathy, but that just means it's a CS thing instead of a programming thing.

I don't mind calling approximations cheats, whatever gets you through the day.

Bill Weiss on November 15, 2008 07:51 AM

NP-Complete is perhaps the most non-intuitive name for a very very hard problem!

Wouldn't an optimal solution fall under the realm of premature optimisation? Shouldn't 2-3% off optimal be fine unless profiling determined otherwise?

It amazes me that the realm of computers, somewhat uniquely, is subject to the constraints of pure efficiency. For thousands of years we have done things the slow way then when we finally have a method of solving a problem 1 million times faster, we're expected to do it 1.001 times faster. Baffles the mind.

Take for instance, traffic lights. The most horribly inefficient system in existance today - affecting millions of people and I'd argue costing millions of dollars annually in inefficiency. Work on solving that problem before worrying about saving that poor salesman $100 in gas money.

`Josh on November 15, 2008 07:55 AM

3D Software Rendering

Hoffmann on November 15, 2008 08:33 AM

+1 for @M and @Bill Weiss: There are well-accepted terms of art for these; please don't call them cheats. Also, you failed to communicate the distinction between an approximation algorithm and a heuristic. The latter is an algorithm that works well in practice but isn't guaranteed to do so, while an approximation algorithm has a specific bound on how nearly optimal its answers are guaranteed to be. (There are also randomized approximation algorithms, which provably meet a bound from optimality with a specific probability, often dependent on specific characteristics of the input.)

However, I'll agree with the notion that every software engineer should understand NP-completeness at least well enough to recognize the possibility that he/she is encountering it.

Joe Ganley on November 15, 2008 08:42 AM

John:
"I like to point out that the original problem was itself an approximation. If a salesman really wants to visit a bunch of cities, there are lots of little things that go into determining what really is the best route [...]-- avoiding roads with construction, not driving through major cities at rush hour, etc. ..."

Thats why people invented weighted graphs.


Josh:
"NP-Complete is perhaps the most non-intuitive name for a very very hard problem!

Wouldn't an optimal solution fall under the realm of premature optimisation? Shouldn't 2-3% off optimal be fine unless profiling determined otherwise?"


It might not be intuitive, but it is rather logical when you know the theory behind it.

What the hell are you talking about? Profiling refers to running time, 2-3% off refers to optimality. If you already know that your problem is NP-complete and your instance sufficiently large, you won't have to do profiling (which will be rather unsuccessful anyway). But if you don't get that TSP is just an analogy to describe a kind of graph problem, which can come up in totally different contexts, you're a lost cause anyway.


Jeff, I have to concede with M here. Maybe you managed to give a somewhat perverted introduction to the topic.

But this one makes me cringe: "Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it."

Do you really think that? Do you think that thousands of papers and hundreds of books have been written about a topic, which is not clearly defined and that in theoretical computer science?

I'm the first to admit, that it might be difficult, if not impossible to define NP-completeness in a blog post and I would have been really amazed if you had pulled that off. But, blatantly lying and saying it can't be defined was the worst thing you could have done.

H on November 15, 2008 09:11 AM

With respect, M, Jeff did not say that there is no definition of the term "NP-Complete". He clearly knows what the term means, and even links to the Wikipedia article you cite. What he said is there is no way to define the characteristics of a problem that make them NP complete. That's a much more interesting problem, and he's right. We don't have an a priori way to tell which problems are going to be NP Complete, though we can recognize when a specific problem is. Its worth reading Jeff's wording carefully, as he's usually quite precise.

There's really no need to be so aggressive and patronizing. Especially if you aren't correct.

Dan

Dan on November 15, 2008 09:15 AM

"Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it."

I think what Jeff means is that nobody can define what makes a particular problem belong to NP, but not P. Of course, we don't know if any problems actually do satisfy this, but some (like the traveling goofball) seem to.

Eam on November 15, 2008 09:22 AM

Traveling Salesman is not NP-Complete, the decision variant is.
http://en.wikipedia.org/wiki/Travelling_salesman_problem#NP-hardness

I sortof expected little mistakes like that, though. Algorithmic complexity can be very counter-intuitive. For example, if you have to decide colorability on a planar graph:
2-coloring: P
3-coloring: NP-Complete
4-coloring: constant time [4-color theorem]

Strilanc on November 15, 2008 10:02 AM

That might be what Jeff wanted to say, however there is a strong definition of NP-completeness (a language L is part of NP and all other languages in NP are polynomial time reducible to L). As all problems in P are part of NP, this remains the same even if P=NP. In that case it won't be very interesting, as all problems in P, except always accepting and never accepting are P-complete, but it is still a valid definition.

H on November 15, 2008 10:16 AM

Strilanc,
if you start with Jeff not mentioning that he talks about decision problems, you might as well mention that he didn't state what type of reduction he is talking about.

H on November 15, 2008 10:25 AM

@dan:
I've set it before. Jeff's MO is the "google expert" MO (he calls it "just-in-time learning"). It may be fine if you're just trying to solve this or that problem and get on with your work. But if you're gonna talk about something, you had better do the research. And Jeff seldom does.

Obviously Jeff puts in considerable time in *writing* his blog posts. I appreciate that. Is it that hard to spend more minutes for *researching* them?
Or here's an idea! Don't talk about things that you clearly DON'T know about. Reading Wikipedia does not make one a domain expert, and is definitely not enough to start explaining this to others.

Every single time Jeff talks about something I know about, he make many small but not insignificant errors, betraying his ignorance on the subjects. Sometimes Jeff makes gross errors. The kind of errors that anyone who has a little bit of knowledge in that area would know.
It makes me wonder how much of what Jeff says about stuff I don't know about, is a mistake.

Also, trivializing the work of hundreds of computer scientists is extremely condescending. I actually avoided calling it what it is, in my previous post.
How can Jeff, with his 30 minutes of reading Wikipedia, claim that "no one can define what makes a problem NP-complete"?
And calling something 'fancy computer science jargon shorthand for "incredibly hard"' is not only condescending, it's down right insulting, when it comes from someone who very obviously don't know what their talking about.

My point? Such articles (read back a bit, this is not an isoladed incident) are superficial and shallow, poorly researched pieces.
I don't think it was Jeff's intention, but they promote and ENCOURAGE the notion that programmers don't need to study CS at all, that it's just a bunch of B.S.
I mean, why study intractable problems when we can "cheat"?

And like I said, it's insulting. I don't insult Jeff's work as a web programmer. I know NOTHING about web programming. I don't write about web programming in such manner.

Dan, "We don't have an a priori way to tell which problems are going to be NP Complete, though we can recognize when a specific problem is"
That's a great sentence. Very clear. But that's totally not what Jeff said. Again, I encourage you to read back through the archives, on posts dealing with subjects bordering on CS. Jeff is often wrong on the basic details, and you can read the comments and see him being corrected time and again.

M on November 15, 2008 10:30 AM

Wow, these comments have gotten pretty nasty.

I think what's coming out more than anything is evidence of the growing animosity between computer science (theory) and software development (practice).

@M: It's insulting? How many computer scientists have I heard in the past week imply this analogy:
computer science : mechanical engineer :: programmer : auto mechanic
I've known brilliant head-in-the-clouds computer scientists (PhDs and all) that write code at a rate of 5 lines per month (with comments).

Until both sides start respecting each other as completely distinct yet dependent professions, this kind of thing will be par for the course.

Anonyous Coward on November 15, 2008 11:06 AM

To all of the current posts detractors, if you dislike this site so much, why on earth do you even bother to read it? Does it make you feel superior to tear down and trash someone elses observations and comments?

On a somewhat unrelated note, I consider the "orange" CAPTCHA system that Jeff setup was brilliant, especially after reading his explanation of it.

I feel there is an IQ test somewhere regarding peoples complaints against it...

Davide on November 15, 2008 11:22 AM

"Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it."

Is this a joke?

Peter Davis on November 15, 2008 12:25 PM

Accusing Atwood of not explaining NP-completeness well enough here is like expecting your friendly college science professor to explain the mathematical and cosmological underpinnings of E=MC2 in 1000 words or less.

Or when the kindergarten teacher doesn't explain the Greco-Roman-Arabic derivation of the letters of the alphabet to his four-year-olds, do you start ranting and raving in his face, spittle flying, cursing his stupidity?

The piece was (obviously) an attempt to take a very difficult subject and make it palatable. I'd say it succeeded. It's a pale and weak sort of ego that would find anything offensive here.

Just my .02.

James Devlin on November 15, 2008 12:27 PM

I was wondering about the not having approximations for all NP-complete problems. The thing that is getting me is I thought that part of the definition of being in the NP-complete set is that a problem in the set is isomorphic with every other problem in the set. ( That is why proving that p=np or the inverse for one problem proves it for all of the problems. ) Given that why is it that we can not us a approximation for one on any of the others? Am I really confused about something here?

-Jonathan

Jonathan Moore on November 15, 2008 12:28 PM

My favorite NP-Complete problem is bin-packing. Maybe shape separability.

Dan Lecocq on November 15, 2008 12:35 PM

I, for one, rather enjoy Jeff's comments about computing. I work as a project manager for a small software company whose customers are not technical by any stretch of the imagination. Being able to communicate computer science ideas to the lay person, and make it enjoyable and understandable is not a simple task. Sometimes you have to dumb it down and stretch some ideas to make it work. So go bleach your brain, crawl back into your cave, and read your IEEE spec documents. Damn, can't we have more well rounded people in this industry???

Jon on November 15, 2008 12:37 PM

Why do you make some many laymen sort of posts about how we should all be learning more about computer science (when many of probably have CS degrees and have known about this since sophomore year of undergrad days) but you think learning C is useless?

Just wondering.

Charles on November 15, 2008 12:44 PM

To Jon who is railing about "more well rounded people in this industry:" You realize this is a technical blog right? Sure there may be non-technical "project managers" reading this but its kind of pointless to have a laymen sort of discussion with a manager of a K-Mart about Approximation Algorithms. Why would you even do that? You're just showing your ignorance by perpetuating some stereotype about technical people going back to their caves.

Charles on November 15, 2008 12:47 PM

@M Wow. You really know how to "take the bait" in every sense of the phrase!

Articles like these are exactly _why_ I read Coding Horror. I get a brief introduction to a subject I will find interesting, that I ought to be aware of the existence of, or both. I also get a little humour, and not so much as a trace of condescension about my lack of abilities or knowledge.

It's never occurred to me, *ever*, to just take Jeff's word for anything. Why would I? No subject that I find interesting is likely to be completely exposed by a few paragraphs, however accurate or well written, especially when a substantial amount of the writing is dedicated as it is to sarcasm, memes, or humourous links. Single-sourcing my learning is unlikely to be helpful either, so I never count on a single source, no matter the level of expertise involved.

Now I have a starting point for any research I care to conduct, a few terms that I can now spell properly, and a useful working definition of the subject at hand. That's right, not a correct definition; it doesn't need to be absolutely correct to be partially useful. I can also count on Jeff's unstoppable sense of self-promotion to require him to keep an eye out for humourous or light updates in future posts , so he can self-link. I don't have to keep a scan out for developments if I don't want a deep understanding of the subject.

Just to be clear, it is cheating, if you are the kind of person who believes that every program you write must be totally optimized right out of the gate. Where would Jeff get the idea that a substantial number of his readers are like that, I wonder?

Dustman on November 15, 2008 01:05 PM

Jeff rocks. His blog is wonderful, informative, and entertaining.

M is an anonymous coward.

Keep it up Jeff.

Swami Atma on November 15, 2008 01:37 PM

@Dustman: Thanks, you just saved me a LOT of time writing to make the same points but not as well. I'm a CS grad, and still got something useful from the post: a book recommendation. Thanks, Jeff.

A. L. Flanagan on November 15, 2008 02:02 PM

@Charles: Ok, so the cave comment was knee-jerk and stupid, I admit it. My only point was that the industry _needs_ technical people who can communicate to and work with non-techies. This to me is a skill that is hard to find among your run of the mill software engineer.

By the way, I have a degree in C.S. and worked as an engineer for 6 years prior to my current position, so I am not "ignorant" to computer science principles. "You're just showing your ignorance by perpetuating some stereotype" that all project managers are non-technical :)

Jon on November 15, 2008 02:08 PM

"Wouldn't an optimal solution fall under the realm of premature optimisation? Shouldn't 2-3% off optimal be fine unless profiling determined otherwise?"

You're confusing micro-optimisations with algorithmic complexity. Selecting the right algorithm is a very different thing from obscuring your code by trying to be clever and fast.

[ICR] on November 15, 2008 02:30 PM

Please revise your comment "NP-complete problems are like hardcore pornography. Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it."

That's plainly wrong. It's quite common that people misunderstand the the details of complexity theory, overestimate their grasp of the topic, and spread their misconceptions. I'm surprised that even some of my most talented developer friends have this trait. And I certainly had it all wrong until grad school.

People who DO


Jason on November 15, 2008 02:43 PM

..oops hit enter too soon.

People who DO know the topic CAN explain it in accessible ways, and DO NOT utter blatant falsities. Some examples of this is:

"This problem, like many others, is NP-complete. Sigh, everything good in life is NP-complete" (meaning, many important problems that we need and would love to solve are, unfortunately, NP-complete. It is ABSOLUTELY TRUE that NP-completeness crops up more than we'd like... and sometimes in unexpected places)

"P ?= NP issue is analogous to why ordinary people can't compose like Mozart, but can still appreciate his work for the genius that it is." (listening and appreciating music is like being a verifier, you can check that a solution to a hard problem is correct, but you can't produce the solutions yourself. An ordinary person could brute force it by continuously composing until something came out that was magnificient... but then that could take a very very long time).

THESE comments are constructive... AND correct.

Jason on November 15, 2008 02:54 PM

@Jonathan: The transformations that convert an instance of one NP-complete problem into another aren't necessarily (and typically aren't) approximation-preserving. Similarly, a heuristic that works well for problem A often won't work well on an instance of problem B transformed into an instance of problem A.

Joe Ganley on November 15, 2008 02:56 PM

Dan --

I agree that M is being a bit of an asshole, but I startled a bit when I read that NP-complete was a "know it when I see it" situation. It's a precise statement of fact, which is contrary to fact. In other words, it's WRONG.

I really don't see what you're trying to say. The characteristics of a problem that make it NP complete are completely and unambiguously defined -- the definition is the same as the definition of NP-complete. What sort of characteristics are you looking for, by way of analogy?

Ens on November 15, 2008 02:57 PM

Problem: people.

Solution: hammer.

dnm on November 15, 2008 03:16 PM

I think Jeff's writings regarding hardware are his best.

You would be kind of naive to not think that there is a psychological plan to Jeff's posts: regular (which may limit his research), self linking, Amazon links, designed to get fan-boys typing and pulse rate up.

Jeff is wrong or incomplete or addressing issues he's not completely familiar with. If you have read this blog for a number of years then you know in your heart that it has become somewhat "unclean".

Steve on November 15, 2008 03:32 PM

For me, spreading ignorance (or bad information due to ignorance) is an issue.
If you are gonna talk about subject X, make sure you know subject X, well enough to talk about it. At least, make sure that you are not making gross errors about subject X. Is it really too much to ask?

It bothers me on both the practical level and as a matter of principle.

I've been working as a professional programmer for years, and I've encountered many without basic scientific background. And that's FINE! Not everyone needs it, or wants it. But then these people read a blog post, and it sounds right, so they believe it. After all, they lack the knowledge to figure out which parts are true and which are don't. That's why they are reading the blog!
And they learn stuff that is WRONG. And then they are going to implement this stuff, and argue about it, and generally BELIEVE that anything having to do with CS is either unpractical or is easily enough learned in 30 minutes of reading.
And then I have to work with these people, and manage them! They have no grasp of how much they simply don't KNOW. And at some point, their knowledge simply ain't gonna cut it. And they are going to argue with me, or write me off as a user of "fancy computer science jargon". I mean, it's just register allocation, right? How hard could it be? It's only BSP tree optimization, let's just check all the options!

So I am trying to combat this ignorance for practical, selfish reasons. Programmers need to understand the problems they work on. They need to understand when they are out of their depths, and its time to hit the books, or call someone who knows. Or reject that project, or bump up the cost and time estimate. At least map out the areas of your understanding, so you'll know when you're on treacherous ground.

The other reason is a matter of principle. Ignorance is pretty bad, and I reject mediocrity for its own sake.
In less fancy terms, if you are writing technical posts, get the technical details RIGHT! Isn't that a given? But apparently I am an asshole for pointing it out :)

For the person who asked, the reasons I read Jeff's posts are two:
a) Jeff often finds interesting links. I sometimes only skim through what he writes, looking for the links.
b) On occasion like this, I have a chance to make my contribution to education. Maybe help some of the people who would otherwise be misinformed, by pointing out mistakes. People that some day I might work with. I'd rather their eyes don't glaze over when I mention fancy computer science jargon.

@Swami: calling me an "anonymous coward" is fine, and perhaps far from the truth, but is also an "ad hominem argument".

M on November 15, 2008 03:37 PM

Want to have some fun with similar problems? Go give Project Euler a shot: http://projecteuler.net/

Some of the last ones will drive you crazy! >.<

Mitch on November 15, 2008 03:48 PM

I don't know if this is considered an NP-Complete problem, but the algorithm I'm still still trying to find a solution to is comparing a data list problem.

Given a table(s) with no primary key or true consistency
---------------------------------------------
Thomas Smith 123 Any St Suite 1 Dayton 45424
Tom Smith 123 Any St Apt 1 Huber Heights 45424
Roberta Smith 123 Any St Apt 2 Huber Heights 45424
Bobby Smith 123 Any St Apt 2 Dayton 45424
ABC Realty 123 Any St Apt 1 Dayton 45424
---------------------------------------------

Try to de-duplicate and/or assign a consistent reference to the data programmatically has always been a nightmare.

The other problem I've run into is capitalization of proper names. Easy to do for a human, but tell a computer to do it.

Jim P. on November 15, 2008 04:07 PM

Sorry, I guess I should have put a smiley face next to this comment to convey my (poor) humour:

"Wouldn't an optimal solution fall under the realm of premature optimisation? Shouldn't 2-3% off optimal be fine unless profiling determined otherwise? [hehe]"

Better?

@M
"But then these people read a blog post, and it sounds right, so they believe it. After all, they lack the knowledge to figure out which parts are true and which are don't. That's why they are reading the blog!"

I'm a non-scientific person, of the type you allude to. I don't read this blog as an insight to all things CS. I read this blog for its entertainment value and as an introduction to areas I may otherwise not have come across. Is Jeff responsible for mine and thousands of others learning habits? It is MY job to research areas that I may be implementing.

Further, Jeff has ALWAYS stated not to take his word verbatim. He uses the 'shock and awe' method. If people are in a position where they need to analyse NP-Complete problems, do you think their sole reference would be here?

Once I got to uni I was angry at my high school teachers for teaching me 'incorrect' definitions. Once I hit third year uni, I was thankful to my high school teachers for not burdening me with unnecessay complexity until it was required.

Finally - a blog post exists for the author to speak about whatever they want to. That people subscribe is their business. I don't believe there is any responsibility that comes with being a successful blogger. You have no right to tell Jeff he may not speak about something. The comments are here to point out inconsistencies. And if the quality of the post is in question (or that of the entire blog) people will stop reading. This, Jeff has also explained previously.

In summary, M - get off your high horse and write a blog of your own, then direct those that require greater understanding to yours.

`Josh on November 15, 2008 05:28 PM

> Please revise your comment "NP-complete problems are like hardcore pornography. Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it." That's plainly wrong.

How is it wrong?

--
http://max.cs.kzoo.edu/~kschultz/CS510/ClassPresentations/NPCartoons.html

The theory of NP-completeness provides many straightforward techniques for proving that a given problem is "just as hard" as a large number of other problems that are widely recognize as being difficult and that have been confounding the experts for years. Armed with these techniques, you might be able to prove that the bandersnatch problem is NP-complete and march into your boss's office and announce: I can't find an efficient algorithm, but neither can all these famous people.

--

I guess I make a distinction between "reducing problem to another known NP-complete problem" and "proving a problem is NP-complete". Apparently we can't *prove* anything is NP-complete, we can only throw a wall of experts at it and see them all fail.

So, a NP-complete problem is not mathematically proven, it's one that no experts can solve, eg, "I'll know it when I see it."

Or more specifically "we'll know it when lots of experts have seen it, tried, and failed."

Is this wrong?

Jeff Atwood on November 15, 2008 05:43 PM

I write video games and I 'cheat' all the time. As long as the results look the same the only thing important to me is how fast can I get those results?

NP - Complete is a mathematical term. I think it should stand for Nobel Prize Completely in the bag! Shame mathematicians can't get one. Instead they have to be happy with a Fields medal (if they’re under 40)

I'm not really bothered about nit picking myself and I believe that absolute answers are seldom necessary. That's the difference between the philosophy of Maths (pure) and realistic Maths (applied). No one cares what the billionth digit of PI is. It would just be fun to be able to know what it is in an instant if we wanted to.

I suspect that some questions themselves are actually flawed. It's a bit like the answer to life the Universe and Everything being 42. Unless the question is understood the answer means nothing.

There is no absolute answer to PI. There never can be in decimal terms. Approximations are what survival is all about.

With the travelling salesman; who says there is 'the quickest route' when there might be 100 routes of exactly the same length. Also where does the salesman start? That is not mentioned in the original question, also the layout of the towns, what if they were all in a straight line? Start at one end and work to the other etc... So I believe the question is flawed in this sense.

I think the words 'cunning approximation' might pacify the opinions of the masses instead of 'cheat'

Like the article though.
:)

Can I also mention that I dislike the idea of the Turing test? Why should a computer try to pass itself off as a human? This is a silly test. A computer is a computer and should do what they are good at.

Do we have a test whereby a human has to pass itself off as a computer? No.

Is there a test whereby an apple has to pass itself off as a banana? No.

Alan Turing may have been a genius, but some times they can come out with some pretty stupid stuff, make mistakes and come up with pointless exercises. If a Philosopher wants to spend the rest of their life pointlessly attempting to extrapilate meaning from an inane and misguided question - let them.

:)

Sci-Fi Si on November 15, 2008 05:47 PM

My favourite NP-complete cheat is Ant Colony Optimisation. It's great: intuitive and interesting, if a little difficult to implement.

Dom on November 15, 2008 05:51 PM

Thought I'd just add that there are still at least 21 Nondeterministic Polynomial questions to solve and plenty of money to make if anyone fancies having a go at them

:)

Sci-Fi Si on November 15, 2008 05:54 PM

Jeff Atwood:
------------------------

I guess I make a distinction between "reducing problem to another known NP-complete problem" and "proving a problem is NP-complete". Apparently we can't *prove* anything is NP-complete, we can only throw a wall of experts at it and see them all fail.

So, a NP-complete problem is not mathematically proven, it's one that no experts can solve, eg, "I'll know it when I see it."

Or more specifically "we'll know it when lots of experts have seen it, tried, and failed."

Is this wrong?
----------

No Jeff, with all due respect, this is completely wrong. There is no mathematical problem called "The NP-Complete Problem." There are certain problems are NP-Complete. You are confusing this with the "P=NP Problem."

You JUST wrote that "we *can't* prove that a problem is NP-Complete" that is completely false. We *very well* know how to prove a problem is NP-Complete. You really need to go through some introductory texts first. It is not like "Well, it's like porn; I know it when I see it."

The issue is with P=NP. Also, it's not "cheating" and its not as if Computer Scientists are trying in vain to find optimal solutions. That's just silly. Saying that "no experts can solve it so I know it when I see it" reveals a fundamental lack of understanding in the theory. If you want to talk about what is "practically done" or something like that then you can say something about how in many cases we proceed as if P != NP; however there is no proof of *that*. However, we can very well prove that problems are NP-Complete.

The bit you quoted:

"The theory of NP-completeness provides many straightforward techniques for proving that a given problem is "just as hard" as a large number of other problems that are widely recognize as being difficult and that have been confounding the experts for years."


All that is saying is that you may find some new problems which are NP-Complete but it is also very likely that those problems can be restated in the form of a "well known" NP-Complete problem (probably one of Karps 21 NP-Complete problems).

Charles on November 15, 2008 06:10 PM

All you who are making out the whole issue of NP-Complete and NP versus P to be just jibber jabber and "practical" answers are what people really care about don't really understand the issues involved. I mean, surely, you must realize if you certain problems (like the RSA problem) are able to be solved efficiently then most cryptographic algorithms are basically useless right? How's that for practical?.... ::shrugs::

Charles on November 15, 2008 06:13 PM

Jeff you have one thing wrong there. You can prove NP-completeness, thats what Cook's Theorem comes down to. What you can't prove is that there is no way to solve an NP-complete problem in polynomial time. We assume there isn't, because a lot of intelligent people have been looking very hard and haven't found one. Now if someone did, every NP-complete problem would still be NP complete, they just wouldnt be "difficult" anymore.

H on November 15, 2008 06:21 PM

Wow. Just wow.
Jeff, why did you post about a subject about which you know so little? Infact, why did you even link to a book you didn't read? I don't understand how you can be in a position to recommend it at all.

Daniel on November 15, 2008 06:56 PM

If you look carefully you will see that I'm an apple.

However, there is no way of working out how long it will take you to see that I am an apple.

I will say between 3 days to 17.2 years depending on how many degrees of Kevin Bacon there are between us

Banana on November 15, 2008 07:06 PM

Jeff:

As many people have pointed out, you're completely wrong with saying nobody knows what NP-complete is. NP-completeness is perfectly clear and actually has a definition (just look on Wikipedia and you'll see it).

P is the class of problems that can be solved in polynomial type. For example, mergesort is O(nlogn) which is polynomial, therefore mergesort is in P.
NP means non-deterministic polynomial time, which means that given a problem and a solution to that problem, you can verify whether the solution is correct for the problem in polynomial time (which means if we have some kind of quantum computer that is capable of doing all possibilities simultaneously, then we can solve any NP problem in polynomial time).

An NP-complete problem is a problem that is more difficult or as difficult as any problem in NP. We have proven many problems are NP-complete - look at SAT and reducibility on Wikipedia and you can see for yourself.

The problem (as some people have pointed out) is proving that P is the same set as NP. Computer scientists haven't been able to prove this right or wrong. Don't get this confused with people not knowing what NP-complete is.

Also keep in mind that NP-complete isn't as hard as it gets. There are complexity classes that are harder than NP like NEXPTIME.

Rob on November 15, 2008 07:54 PM

I don't think Jeff ever claimed to be an expert in the complexity field, or whatever field of CS this falls under... So try to be a little nicer in your smarter-than-thou critiques.

the porn quote---
NP-complete problems are like hardcore pornography. Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it.
----
I don't think Jeff (correct me if wrong) is talking to people who have studied the subject... he's talking to the SOFTWARE ENGINEERS (not all software writers are computer scientists, you egocentric SOBs). He is talking to the people who replied thinking the traveling salesman problem is about a salesman.


A software engineer does not need to be able to memorize a list of infinite manifestations of NP-Complete problems (TSP being an optimal network path vs an actual salesman, vs google maps vs how much wood could a wood chuck chuck). A software engineer need only to have a 'spider-sense' about these problems and the know-how to recognize or prove that his/her problem is NP-Complete.


I think it's funny that this article is under such fire...
Jeff may not be able to write a scholarly article on NP-Completeness... but he knows it when he sees it. Isn't that the point of this?

Chees0rz on November 15, 2008 08:19 PM

@Chees0rz:

[quote]
I don't think Jeff (correct me if wrong) is talking to people who have studied the subject... he's talking to the SOFTWARE ENGINEERS (not all software writers are computer scientists, you egocentric SOBs). He is talking to the people who replied thinking the traveling salesman problem is about a salesman.
[/quote]

Perhaps your definition of a software engineer differs from mine. I'd expect any decent SE to have a fundamental grasp of algorithmic complexity -- to the point that they can discuss problems with different complexity classes and tell me why a given problem is very hard. I'd also expect a software engineer to know that the TSP has very little to do with salesmen.

I suggest that anyone who cannot deal with (or at least be able to learn) such theory is in the wrong profession. Computers are hard. Programming is hard. In order to be proficient programmers we need to understand the nature of that hardness lest we make stupid mistakes that cause our software to fail spectacularly.

I don't think any of this is unrealistic. In Australia at least, Software Engineering is just as rigorous in these areas as Computer Science and nothing being discussed in this article or the posts that follow it are beyond undergraduate level education.

Daniel on November 15, 2008 09:40 PM

For those who try to defend the author, we are not blaming him because he is being brief. We are blaming him because he is WRONG.

NP Completeness is a term that can be easily defined, but he is taking a wrong approach. NP is NOT about a problem being hard. It is about the fast verification of a given solution. So in short, NP completeness only has two aspects:
1) fast verification of a given solution
2) If you can solve one NPC problem quickly, you can solve them all quickly.

I don't think this is that much more complicated that what the author said, but this is the correct explanation.

Also, saying that NPC problems can be spotted by expert programmer is completely and utterly false. One excellent example is primality check. For many many years people believe that to test whether a given integer is prime is NPC, WRONG. With the discovery of AKS primality check in 2002, we know that actually primality check is actually in P. Even the best mathematicans/ programmers wouldn't have known before that.

Freddy on November 15, 2008 10:14 PM

This is my first post to this discussion, since I just saw the post. Well, I am really curious why people are 'defending' Jeff.

Nobody here is trying to show off.

The information presented in the blog post is plain wrong, and therefore misleading. I am not going to discuss the relevant topics here (i.e., NP-completeness and approximation algorithms - called 'cheat's in the blog post). This is not a matter of attack or defense, but maybe an opportunity for Jeff to do some self-criticism.

At various posts, I disagree with Jeff, but most of the posts are on very subjective topics, and since there is no objective metric, argument is time spent for nothing. But disagreement is not bad, I am sure in some of the topics Jeff's are more accurate than my thoughts, and in some vice versa.

The worrying part about the post is not only its inaccuracy, but its reflection on the credibility of the author himself. At most times, when I am ignorant on a topic he posts on such as 3D cards, I took what he said for granted - not totally granted, but with high confidence I should say. This is not true from today on, because today he posted on a topic that I am very knowledgeable about, and the information is wrong, which forces me automatically to degrade his credibility on the other posts from today on.

But I hope that the comments on this blog post will at least force Jeff to reconsider his posting strategy.

Jeff: I think you are a good writer, and I don't have to tell you this, because you make your life from blogging which says enough. In any case, I think you should stick to the following basic principles:

* Do not post about something you don't know as if you have an idea about it: Posting about something you don't know is fine unless you make it absolutely clear you don't know it, want to learn more about it, or asking the reader's for assistance and comments.

* Do not give references to text you haven't read unless you make it clear you haven't read them: If you had read the first 2 chapters of Garey & Johnson, you'd not have written this post.

I suggest that you retract this post, or the misinformation propagated will be your fault. I think one of the readers can make a guest post about the topic, which will be informative to the people who don't know the topic.

tuncay on November 15, 2008 11:19 PM

How much wood would a woodchuck chuck, if a...

Reviculous on November 15, 2008 11:51 PM

Is NP-problems like hardcore pornography? I never thought there were a substitute for that, thanks for now I'll be solving Travelling Salesman problem instead... hope it will get me the same pleasure..

Jack on November 16, 2008 12:28 AM

Genetic algorithm. It has worked well for TSP, thus I've always kept a good GA library (with some timeout stoppers for example no improvement in solution during X seconds, or no exceeding X seconds of total computation ) in order to solve NP-Probs when they have arisen.

Jok on November 16, 2008 02:36 AM

When reading this post, it surprised me how it depends on the circumstances when you're first heard the term "NP-complete". Jeff calls it a shorthand for "incredibly hard", but I always remembered it as "easy". How come?
At university, I wondered whether I should study computational linguistics and attended a first lecture about syntax theories to get a glance about what that subject is like. The problem is to syntactically parse a sentence in natural language using a computer. In that lecture, if a syntax theory is "NP-complete", that meant "good, quickly solvable in polynomial time." "Not NP-complete" meant "bad, takes exponential time to solve". But the worst case (in that lecture some version of a Chomsky syntax theory) was "plainly unimplementable".
My interest for computational linguistics originated in my enthusiasm for text adventures (today called "interactive fiction") many years before (on a Commodore 128), so I'd call "well written textadventures with bewitching graphics" (Magnetic Scrolls) as my favorite NP-complete cheat.

tarnold on November 16, 2008 02:58 AM

While I and some of the other critics here have done a pretty good job covering where Jeff went wrong, I decided to give this a go myself: http://joeganley.com/2008/11/np-completeness-i.html ... I plan to write further installments in the coming weeks on attacking NP-hard problems: heuristics, approximation algorithms, exact algorithms, restricted problem inputs, and general optimization algorithms (like simulated annealing).

Joe Ganley on November 16, 2008 04:46 AM

We know exactly what NP-Complete means: it is defined as the set of decision problems that can be solved in polynomial time on a nondeterministic Turing machine.

P can be solved by a deterministic Turing machine using a polynomial amount of computation time, or polynomial time.

We know exactly how to show that a problem is NP-Complete: to show that with polynomial time on a deterministic Turing machine, we can convert solving that problem into solving all NP-complete problems. Reducing a problem is another word for encoding it, and it's NP-Complete if the encoding work is small.

<b>If you wanted to, you could write a program to convert any np-complete problem to some program that would run in polynomial time on a non-deterministic Turing machine, and your program would run in polynomial time if you did it correctly.</b>

The definitive definition paper
http://www.cs.toronto.edu/~sacook/homepage/1971.pdf.gz

I was in a class where we did this, making the transition from 3-SAT to a nondeterministic Turing machine notation. It was boring.

John the Statistician on November 16, 2008 04:51 AM

Can't you all see Jeff is just trying to provoke you? After all he's quoting a cartoon as evidence in his comment.

I think this is a good strategy...
1) pretend you really don't get the point
2) have a buch of foolproof explanations attached to your blog-post for free.

Well, have fun.

Alex on November 16, 2008 05:02 AM

@Alex: just because Jeff has no regard for mathematical truth doesn't mean that it isn't worth our time if we do. (I don't mean this negatively, necessarily. I think it's a natural outcome of combining a practical, only-when-I-need it outlook with the strong opinions weakly held position. Just because it's bullshit in the Harry Frankfurt sense, doesn't mean it isn't a fine way to go about things.)

John the Statistician on November 16, 2008 05:10 AM

Writing the optimal blog post would have taken a very very long time.

Nice approximation, Jeff.

zing on November 16, 2008 05:25 AM

Poor old Jeff. He tries, he really does.

Well after the beating he's taking by various over nerds I feel I ought to make him feel better.

Ok here goes.

I had absolutely no idea what you were talking about, and even less about what you're detractors are talking about.

You're not the only guy who doesn't understand stuff Jeff

PixelGnome on November 16, 2008 05:31 AM

I think what just happened here is called market segmentation.
Some part of Jeff's readers have a course in computability theory under their belt, so they are rightfully angry at the informal definition of a very formal entity.
The other part, whose lives were full without using the term "NP-complete", may have gotten a glimpse of this interesting notion.
Kudos for this, Jeff.
Here's Dorit Hochbaum's book about approximations (not hacks) for NP-hard problems:
http://www.ieor.berkeley.edu/~hochbaum/html/book-aanp.html
(I guess this is more for the first crowd...)

Yuval on November 16, 2008 06:11 AM

Can someone solve this?

a=b
so, a^2 = a*b
so, a^2 - b^2 = (a*b) - b^2
factor: (a+b)(a-b) = b*(a-b)
cancel: (a+b) = b
so, 2b = b
so, 2 = 1

No Googling!

Steve on November 16, 2008 06:50 AM

division by zero exception

Alex on November 16, 2008 07:05 AM

Umm, you definitely CAN define that a problem is NP hard or complete. It's called a proof. If you can prove that a problem is NP*, you can thus prove to yourself (or others) that the actual solution is intractable. So you can stop trying to find one, and focus on
a.) redefining the problem
b.) using approximation algorithms

Chris on November 16, 2008 07:23 AM

Travelling SalesPerson is one of my favorite examples of why you should spend some time refining the requirements of you problem when you run into something that might be NP-Complete. Not only is the TSP problem easy to approximate, only the general case of TSP is NP-Complete. TSP when all of the points are co-planar (a pretty common case on the planet Earth) is *not* NP-Complete.

Sean Reilly on November 16, 2008 09:57 AM

Jeff Atwood: "I guess I make a distinction between "reducing problem to another known NP-complete problem" and "proving a problem is NP-complete". Apparently we can't *prove* anything is NP-complete, we can only throw a wall of experts at it and see them all fail."

This is incorrect - the Cook-Levin theorem (http://en.wikipedia.org/wiki/Cook%27s_theorem) proves that the problem 3-SAT is NP-Complete. My understanding of the history is that Cook published in 1971; Levin's work was done around the same time, but not published until a bit later, and it didn't become so well known in the West, as Levin is Russian. For that reason, the result is often known as Cook's theorem.

Since that time, thousands of problems have been proved NP-complete, usually using a problem known to be NP-Complete, like 3-SAT, and a reduction. Garey and Johnson contains several hundred such problems, and many more are known today.

What is not yet known is how to prove that NP-Complete problem cannot be solved in polynomial time. If that could proved, it would solve the famous P vs NP problem.

(That aside, thanks for your blog, which I greatly enjoy.)

Michael Nielsen on November 16, 2008 09:57 AM

I very seldom see much acknowledgement that there are *lots* of complexity classes far far higher than NP (and thus problems much much harder than NP-complete ones), and that in practice the limits of worst-case tractibility are substantially lower than NP: algorithms which run slower than O(n^2) are seldom tolerable in practice.

In many cases there's no such thing as an "approximate" solution: you can approximate a "best" path, but figuring out whether or not there is *any* path satisfying some set of conditions is all or nothing. In those cases, the best approach is to set aside the worst case and focus on realistic cases. There are plenty of very hard problems (NExpTime and higher) which can be solved in time close to O(n log n) on the kinds of data that tend to be encountered in practice.

If I were a manager and programmer came to me saying "this problem is in such a high complexity class it would be impossible to solve!" I'd encourage the programmer to identify patterns in the type of data we expect that could be exploited for optimization. A formal definition of this "tractable fragment" of the original problem may sometimes be difficult to codify, but that's beside the point if you can solve the problems you encounter. It's easy to phrase any easy problem as a subset of a much harder problem; the trick is to identify the truly necessary bits of the problem and see how hard that part is on its own.

The criticism I'd level at high-complexity research is that academics can dwell on the really hard problems (because they're interesting) and skip quickly over simplifications (because they're trivial and not worthy of study).

Rob Shearer on November 16, 2008 10:35 AM

Having scanned over the other comments, it seems that a lot of people are complaining about the article again. Well, I don’t. It might not be the most accurate description but it brings the point across (and I just like the analogy). Plus, some of the comments are sooo much worse. Consider the very first one: “… because … nearly any other NP problem can be reduced to [3-SAT].” This reveals a complete misundestanding of NP-completeness and the concept of reduction.

With that out of the way, I want to answer your question. I’m undecided between three very different cheats. The first uses cutting planes to literally “cut away” parts of the (mathematical formulation of the) problem to make it easier. I like it of its practical importance. Take TSP. Nowadays, this problem has very good solvers that use this approach (often combined with branch-and-bound to yield branch-and-cut), combined with raw computation power, to tackle even extremely large problem instances. For instance, CONCORDE has found the exact solution for the 24,978 cities of Sweden.

The second approach, Lagrangean relaxation just has to have a place in the list because of its elegant mathematical derivation. By applying a Lagrangean multiplier (that can be found easily) to one term of the problem, the overall runtime of the problem can be drastically decreased.

The last cheat is ant colony optimization. The underlying analogy is just fascinating. Also, implementing this solution is very easy and just plain fun. With the right sets of parameters, this heuristic often leads to very good results. Unfortunately, there’s no guarantee for good results which is simply unacceptable for many problems.

konrad on November 16, 2008 10:55 AM

Hey Now Jeff,
Those cartoons are really funny.
Coding Horror Fan,
Catto

Catto on November 16, 2008 11:13 AM

-----
So, a NP-complete problem is not mathematically proven, it's one that no experts can solve, eg, "I'll know it when I see it."
-----

This is so wrong it hurts.

To add something constructive to the Jambalaya, I should note that not all "high complexity - won't solve in a million-billion years for a large problem" problems are NP-Complete...

... there are problems that are just as hard, or harder, and don't have the decency of being NP-Complete, but still require the use of approximations and heuristics.

Eric Londaits on November 16, 2008 11:56 AM

Why can't I vote-up these answers? :p

Dean on November 16, 2008 02:41 PM

NP = Not Possible [in linear time]

BTW, Jeff, I don't know if you would read my comment, but since your captcha is always orange, and I never see any spam in any of your blog, how did you manage that?


Kevin Le on November 16, 2008 03:30 PM

@Steve,

You asked:

>>Can someone solve this?

>>a=b
>>so, a^2 = a*b
>>so, a^2 - b^2 = (a*b) - b^2
>>factor: (a+b)(a-b) = b*(a-b)
>>cancel: (a+b) = b
>>so, 2b = b
>>so, 2 = 1

There are many variations to this type of puzzles. I knew or rather learned from 7th grade that all of them either involve (1) dividing by 0 or (2) Take a square root and ignore the fact the it's rather the absolute values that are equal.

Kevin Le on November 16, 2008 03:36 PM

-----
NP = Not Possible [in linear time]
-----

Not quite... but in line with the blog post.

NP = Nondeterministic polynomial

it means that it can be solved with a polynomial order algorithm in a Non-deterministic Turing Machine. A non-deterministic Turing Machine is a Turing Machine which can take many paths of conditional execution at the same time.

Eric Londaits on November 16, 2008 04:09 PM

My favorite NP-problem and hack must be the Quake 3 engine's sqrt() function.

http://www.codemaestro.com/reviews/9

Martijn Laarman on November 16, 2008 04:13 PM

Jeff: "Unfortunately, not all NP-complete problems have good approximations."

The very definition of NP-complete shows this to be false; that a problem is NP-complete means all problems in NP can be reduced to this problem. If you have an algorithm for one NP-complete problem, you have an algorithm for all of NP.

porges on November 16, 2008 04:21 PM

I'll answer the original question:

- Simulated Annealing
- Great Deluge
- Genetic Algorithms

Steve Moyer on November 16, 2008 05:13 PM

Hey Jeff,
Cool cartoon.

No, I do not take what you read as gospel, anyone who takes a single opinion from the interweb, and embarks upon a course of action based upon that, deserves whatever they get.

I read this blog because I find it entertaining, and sometimes it prompts me to go and learn something new, or revisit something that I had long since forgotten.

I am not going to defend you as a mathematician or computer scientist, as that is not the point of this blog. Had you posted this in a scientific journal, then it may well be appropriate for others to take such vocal umbrage at your lack of academic rigour.

Having inherited a few systems from mathematicians and academics, and suffered the horrors of making them work in the real world, that is not a perspective that I seek unless I have to.

This is edutainment, has never pretended to be anything else, and there is nothing wrong with that.

Like I said, cool cartoon. Works for a code monkey like me.

seanb on November 16, 2008 06:23 PM

It's not about rigor. It's completely wrong. You are dumber as a result of having read it.

seeuo on November 16, 2008 08:32 PM

Is it just me, or has this blog turned into ads for books lately?

Grom on November 16, 2008 09:26 PM

I think we're missing the main definition in this post:

"NP-complete problems are like hardcore pornography. Nobody can define what makes a problem NP-complete, exactly, but you'll know it when you see it."

So, we must define "hardcore pornography" first, and then discuss if it's or not NP-Complete, isn't it?

alberto on November 16, 2008 11:22 PM

These approximations to NP-Complete problem solutions, i'd be surprised if the quality of the approximation didn't degrade significantly with the input size - to the point perhaps where they're not useful anymore.

I would expect this but i don't remember seeing it confirmed anywhere. I mean, otherwise we'd have a probabilistic polynomial-time algorithm for solving NP-Complete problems (and i don't think we do because last i checked, 5 secs ago on wikipedia, the relationship between BPP and NP was another unsolved problem). In fact i think i remember seeing that likely BPP and P are equivalent (something about derandomization), so since probably P != NP, BPP != NP and we can't have a good approximating algorithm (for all input sizes).

Meaning, these approximating algorithms are no good for large input sizes, but please correct me if i'm completely wrong.

Bobby on November 16, 2008 11:26 PM

Not that I think some of the criticisms of this article aren't warranted, but to those of you who keep screaming about the formal definition of NP:

While Jeff's definition is certainly flawed in some respects, citing the formal definition of NP would be next to useless in the context of this blog. I challenge you to explain the practical implications of NP's formal definition to a lay-programmer in a way that you think is more "correct" than simply saying "known algorithms for finding optimal solutions to NP problems have extremely high computational complexity."

I mean, saying here that NP is about "quick verification of solutions and reducibility" is the same as telling a layperson that e=mc^2 means "energy equals mass times speed of light squared." No *practical* insight about the matter at hand is given by either definition without a lot of heavily mathematical and scientific pipelaying.

People usually appreciate having things explained to them in a way that doesn't make them feel stupid. I make such explanations all the time in my work. However, when I describe things to people using oversimplified abstractions, I make it a point to tell them that it is not *strictly correct* to think of it this way, and that the *truth* of the matter is much more complicated. But most of the time they are satisifed with, and can make much practical use of the simpler version plus caveat.

I don't think most of the serious commenters that are being identified as "defending" Jeff are defending his factual errors here as much as they are recoiling to the somewhat hostile and excessive pedantry being displayed. Hard and exact science is very important and (to me) fascinating. But it is not useful to shove *unnessecarily* large chunks of background theory down the throats of people who just want to get their work done, when some slighly innacurate but much simpler abstraction will almost always suffice for them.

Also, I disagree that your average software developer can't be expected to spot NP problems. In practice they are fairly easily spotted because almost ALL of them encountered in the wild are more or less isomorphic to some other known NP problem, like the Knapsack or TSP. Any good developer should be familiar with the relatively short list of "infamous" NP problems, and *should* be able to tell if the problem at his feet is just a variation on one of those.

Kyle S on November 17, 2008 12:21 AM

As an example, writing a non trivial web site that displays exactly the same in all major browser is NP-complete, and any semi experienced web developer have learned its way to the 3% from optimal solution.

jwickers on November 17, 2008 02:00 AM

I came up with a generic solution for the TSP a couple of years ago. Here's a hint... fractals. And lagrange polynomials. I leave the rest as an exercise for the reader.

Peter Maxwell on November 17, 2008 03:14 AM

My favorite algorithm is the Ant Colony Optimization, actually a solution to the traveling sales man. It is in fact a brute force algorithm and still it's better than real brute force. It starts as pure brute force but already after several iterations some solutions will turn out as being so sub-optimal, that the algorithm will hardly consider them anymore and thus the algorithm finds a solution much faster than real brute force. Real brute force would try all solutions, even if a solution seems to be very suboptimal at first, it might turn out to be an optimal solution in the long run. This is however very unlikely when usang ACO. Solutions that look sub-optimal in the very beginning are usually sub-optimal. If some ants are already back home while others are still on the way, the route used by the other ones is definitely worse than the one of the ants already back home.

Mecki on November 17, 2008 04:06 AM

Jeff is actually right here, and the CS types are barking up the wrong tree. This discussion is about the _software engineering_ term NP-complete, not the mathematical term of the same name. As he says:

--
Have you ever heard a software engineer refer to a problem as "NP-complete"? That's fancy computer science jargon shorthand for "incredibly hard":
--

Lots of other fields have similar confusion caused by a shared use of a word or phrase - for example, both religious studies and architecture use _postmodern_, for completely different things that some people tie themselves into knots trying to justify as essentially the same.

It might avoid future similar confusion if developers were to use the more mathematically correct term _superpolynomial_ when they want to say 'really hard, so we shouldn't do it' without sounding lazy. The only downside is other developers won't understand it they way they would the less mathematically correct term.

soru on November 17, 2008 04:08 AM

> sometimes coming up with clever cheats can be more interesting than searching in vain for the perfect solution

Word.

Smart: understanding and recognising an NP-Complete problem.

Gets things done: uses a suitable approximation.

Paul D. Waite on November 17, 2008 05:11 AM

"Unfortunately, not all NP-complete problems have good approximations."

Err, wait a minute. I thought one of the characteristics that made a particular problem NP-Complete as opposed to NP-Hard was that all other NP-Hard problems could be reduced to it. Therefore, wouldn't it stand that if you have a good approximation for the one NP-Complete problem, you'd have a good approximation for them all?

I have to admit I fell into the trap of refering to the TSP as NP-Complete, but only because it's the one that we're always taught.

"Jeff is actually right here, and the CS types are barking up the wrong tree. This discussion is about the _software engineering_ term NP-complete, not the mathematical term of the same name"

No, Jeff is not right here. NP-Completeness is a shining example of where the theory and the practice meet. It defines the limitations on what can be done with a computer. It's more than knowing that a problem is hard, it's knowing why it's hard, and what you can do to work around it. The software engineering "definition" is incredibly misleading and simplistic, as it ignores the fundamental point that all NP-Hard problems can be reduced to an NP-Complete problem. That's very powerful, because that means that if you're trying to come up with a way to get a result from something you know to be NP-Hard, you can reduce it to an NP-Complete problem and approximate that.

Delmania on November 17, 2008 05:21 AM

I can see both sides of the argument here. Jeff has at least brought to light the idea of NP-completeness to many people who otherwise wouldn't have even heard of the concept. Ok, there are inaccuracies in his post but hopefully those that are interested will follow it up. However, that is where the problem lies. Many people are not that great at thinking for themselves and don't check out the facts that they have been given are actually correct. I see this every day. They take a snippet from a blog and think it's gospel. The problem seems therefore to be not Jeff's post per say, but those that don't expand and "complete" their knowledge of the subject. Anyway, Jeff's post inspired me to write a post on NP-completeness in a little more depth over at own blog (http://www.equivalence.co.uk/), hopefully this may be of use to Jeff and others.

Gregg on November 17, 2008 06:00 AM

Jeff, I think you went too far this time.

This is not the "We have no idea about IT, but we want to look knowledgable" blog. This is "programming and human factors" blog.

Couldn't you write the post without all those childish mistakes?

NP-complete is not another name for "very hard". No.

Imagine, there are NP and NP-complete classes of problems - and both of them are "very hard". Know the difference between them? If not, please do not write about computer science.

Jakub Anderwald on November 17, 2008 06:34 AM

In cryptography, getting the right answer is all that's important, and getting an approximation is next to useless. Take an example of creating a parallel cryptographic hash function (like md5 or sha-1) where you hash small chunks, but then add all the outputs as large integers, and hash this sum one more time. This is essentially the subset sum problem. There are many ways (through dynamic programming and such) to find approximate solutions, but you haven't compromised the cryptography unless you find an exact match. Getting it within 2-3% of the correct solution only reduces your search space from 2^80, in the case of sha-1, down to 2^74 or so, which is still an intractable problem.

Matt Ball on November 17, 2008 06:37 AM

It's actually NP-hard, not NP-complete. There's a distinct and important difference between the two. It doesn't look like you bothered to even read the Wikipedia article you linked.

Jason on November 17, 2008 07:00 AM

Interesting to see you bring up the bin-packing problem. I used variants of the known implementations to create an output image that tried to have the least possible area for a given image set.

Theres source code on my site with it in 12/12/07, I was smitten with finding variants of different sizes and outputs. Really fun stuff.

Jake Heidt on November 17, 2008 07:08 AM

great post jeff. but the alt text on the xkcd img should really be fixed to read as on the actual cartoon page.

Sean on November 17, 2008 07:35 AM

'as it ignores the fundamental point that all NP-Hard problems can be reduced to an NP-Complete problem'

Actually, not true: many 'Np-hard' (s/w eng definition) are, in strict mathematical terms, exponential, or just polynomial with impractical coefficients. Outside military cryptography, there is rarely a pragmatic difference between 'would take a CPU-week' and 'if you turned the Sun into computronium, would still take longer than the heat death of the universe'.

Either is too long to wait for a web page to refresh.

To a zoologist it's Bos Taurus, to a farmer it's a cow. The more precise terms are sometimes useful, but if you want to talk to a farmer, you don't have to use their language, but you'd better understand it.

soru on November 17, 2008 08:10 AM

By the amount of scrolling I had to get to the bottom of the page, can I safely assume that writing a blog post about NP-Complete that doesn't freak people out is NP-Complete?

Neil C. Obremski on November 17, 2008 08:48 AM

Waitaminute...! A reference to a traveling salesman, and not one crack about a farmer's daughter?

Incredible.
Improbable.
Infinitely improbable.

Jon on November 17, 2008 09:03 AM

It's not cheating it's engineering. That's the diff betwixt science and engineering - otherwise we'd never get anything done.

iFubesi on November 17, 2008 09:33 AM

Remove money and no salesmen are needed.

Silvercode on November 17, 2008 09:42 AM

I'm beginning to think that pagination on Stack Overflow is an NP-complete problem, because it's still broken after all this time!

See <a href="http://stackoverflow.com/questions/63241/what-is-the-strangest-programming-language-you-have-used">http://stackoverflow.com/questions/63241/what-is-the-strangest-programming-language-you-have-used</a>; and notice the two page fours. And yes, I have logged this on UserVoice: <a href="http://stackoverflow.uservoice.com/pages/general/suggestions/65993">http://stackoverflow.uservoice.com/pages/general/suggestions/65993</a>;

John Topley on November 17, 2008 09:49 AM

> Apparently we can't *prove* anything is NP-complete

What I think you mean to say is: We can't *prove* NP-complete problems are "hard".

"NP-complete" is not a synonym for difficult. Some people think it is, and you may hear it used that way, but that does not mean you should encourage it. We already have a perfectly cromulent word for "difficult", and that word is "difficult".

Michael G on November 17, 2008 10:41 AM

I prefer genetic algorithms for NP-hard problems, that are transformable into a genetic algorithm (i.e. TSP).

MTZ on November 17, 2008 12:10 PM

NP-complete?

That's fancy computer science jargon shorthand for "incredibly time-consuming", or "incredibly hard to solve quickly", or more specifically "incredibly time-consuming to solve, and pretty quick to verify the solution"

Matt V on November 17, 2008 12:24 PM

Oh please.

His definition is sufficient (and sarcastically enjoyable) for those who are trying to solve these problems in the real world.

The differences are irrelevant to dealing with them in the real world. They are really hard... too hard to solve in a reasonable amount of time.

That's all I need to know. For the rest, who cares?

Some people take these things too seriously. For the intended audience of this blog (read: developers, not computer scientists), this is an excellent definition.

Practicality on November 17, 2008 12:54 PM

"That's fancy computer science jargon shorthand for "incredibly time-consuming", or "incredibly hard to solve quickly", or more specifically "incredibly time-consuming to solve, and pretty quick to verify the solution"

You got it backwards. For most NP-Hard problems, devising a solution is easy. It's verifying the solution is correct that is difficult.

"His definition is sufficient (and sarcastically enjoyable) for those who are trying to solve these problems in the real world. "

The ever wonderful "real world" argument tha plagues software development. It's not sufficient. I honestly don't really care what the software engineering defintion is, it's incorrect and misleading. The reduction componenet of the definition is crucial because it can help you devise approximations to solve the problem. After all, "it's too hard" doesn't always work.

Delmania on November 17, 2008 01:02 PM

As someone with an advanced degree in CS, I don't think his explanation was all that bad.

Compare it with Wikipedia's definition, which is effectively infinitely recursive:

"a class of problems having two properties:
* Any given solution to the problem can be verified quickly (in polynomial time); the set of problems with this property is called NP.
* If the problem can be solved quickly (in polynomial time), then so can every problem in NP."

Not nearly as useful to the layman, IMHO. I do believe there's a better definition involving a combination of Turing machines and set theory, but if you don't already know about Turing machines that wouldn't be particularly useful to you either, without a twelve paragraph exposition on the subject.

T.E.D. on November 17, 2008 01:15 PM

How about having a new classification, for those us for whom mathematical proofs don't play such a large part in our working lives, so as not to offend our more erudite colleagues:
FK Hard - you'll know it when you see it

seanb on November 17, 2008 01:23 PM

Sigh. So, to those who want to fight over the correct definition. What do you do for a living?

I would really like to know if anyone who is actual "front line" developer (not a researcher or comp sci teacher) has ever found the technical difference to be useful.

I have been in the field for quite a while now, and most everything you are taught in CS is absolutely useless.

In addition, I would like to know if any of you have been out of college for more than two years.

Prove me wrong. Really. I would like to know if there is such a case where proving how hard a problem is would actually be useful in a business environment. (Not a theoretical case, a real one, that actually happened).

Practicality on November 17, 2008 01:27 PM

Is this blog NP Complete?

OG on November 17, 2008 01:55 PM

Practicality --

Embedded system to automate paint designs on walls efficiently. I won't bore you with the finer details (and there were many), but it was trivially reproducible to a travelling salesman problem. As such, I was able to go through the existing body of work and determine which well-known algorithms had sufficiently low complexity to work given my constraints (which translate to constant-factors) before trying to program every bloody one of them in assembly and then watching to see if it moved right away. It was critical and saved countless hours. And I can only do it because we know with true rigour that the TSP is NP-complete.

Now I do mostly systems programming in C++ on desktop computers in performance-critical applications working for, let's keep this somewhat anonymous, one of the top-5 software development companies as ranked by revenue. NEXPTIME isn't an important term here, but a basic idea of complexity analysis that's a little more sophisticated than "I know it when I see it" is and has been absolutely necessary. More sophisticated and more not completely wrong.

You're right though, I'm not long out of school -- ~1.5 years. However, I didn't take CS, I took Engineering (not Software or Computer or Electrical Engineering), and while we briefly covered big-O notation analysis we didn't really deal with P=NP. I picked this up partly on my own time and partly on the job.

I don't mind if you don't know these facts, particularly if it isn't in the least bit relevant to what *you* do. But I do mind complete falsehoods being paraded around as facts. Even that wasn't so bad, but I'm aggravated to see people "defending" the error (including Jeff!? I sure hope he's just confused and asking for clarification instead of claiming it's not wrong). It's an error. We all make errors, even popular bloggers. That doesn't make it not an error.

Ens on November 17, 2008 01:57 PM

@Ens

Neat. Thank you for the informative story. Believe it or not, your real life explanation of how you solved that problem would fascinate me, but that's what I am looking for :).

However, I would argue that the fact you looked it up yourself would validate that we don't need to know this information ahead of time to enter the field.

Also, I would agree that, technically, Jeff made an error. He should say most of us can't define it (or don't care to), rather than nobody can.

However, I do think this a minor and picky error, when the intent of the post was to say that when you encounter such a problem, you need to find a workaround.

I think the harsh insults directed at Jeff for making a technical error of "very few" versus "nobody" are far worse than making the error itself.

Practicality on November 17, 2008 02:15 PM

@Practicality: I've been out of school for 14 years, and spent the first 10 of those years working on problems (many of them NP-hard) of optimizing the layout of computer chips. Making real software that is sold to and used by chip designers at semiconductor companies like Intel, AMD, nVidia, and the like. You doubtless own many pieces of electronics that were designed using software that I wrote. (I'm just making the point that I am neither a researcher nor an academic, but a working software engineer.) Being able to determine which of the problems I encounter is NP-hard is critical to doing this well. Often, trying to determine that a problem is NP-hard when it isn't provides the critical insight to solving it. Only a really sorry engineer writes a heuristic for a problem that can in fact be solved efficiently, just because he doesn't know the difference between "NP-hard" and "I can't easily figure out how to solve this."

I don't know what kind of work you do, but I use much of what I learned in CS every day, and have done so in every job I've ever held.

Joe Ganley on November 17, 2008 05:01 PM

It seems like all the people defending Jeff here are also trying to defend themselves for not knowing a basic CS term.

I really, really hope that I don't have to do any significant amount of work with any of the people who are saying it's of no practical significance "in the real world".

I've been out of college more than 2 years, and I use a very large proportion of the CS that I learned in my job. I'd leave the industry if I was just cranking out boilerplate code all day - and pretty much any time you get ahead of boilerplate there's usually something helpful from CS.

While you can look up stuff and pick it up if you go a) you'll pick it up faster if you have the basics of CS under your belt to start with and b) you'll be able to make better choices various data structures and algorithms throughout the less challenging work that improve your code.

There is no fight over the correct definition - there is one correct definition and Jeff wasn't really close. I think people might be complaining because they've mentally filed this blog in the 'reliable information' and they're begrudging the mental work to re-file it elsewhere.

My personal reason for adding my voice to the fray is that I think having a highly visible, incorrect and misleading piece of information out in the wild is really going to hurt those that are going to try to pick these things up on their own.

Dave on November 17, 2008 05:06 PM

Jesus people, he's not writing a university text-book here, it's a blog! Lay off a bit, or at the very minimum be more constructive with your criticism.

I don't think introducing someone to a new, interesting subject is a bad thing in any way. And maybe even a little mis-information (not that it is even there!) will weed out the ones not willing to research the topic on their own.

I also find it amusing that people will actually spend the time to read the post and then berate the author. If you don't like what he has to say, the solution is simple, DON'T READ IT! It's not an NP-Complete problem!

Kory on November 17, 2008 06:40 PM

I think saying a CS degree has no practical value is a big stretch. But, then again, having a CS degree doesn't necessarily make you a good programmer either!

dismountSoapbox();

Kory on November 17, 2008 06:46 PM

"NP-complete problems are like hardcore pornography".

Jeff, I really think you need to get out more.

Kramii on November 18, 2008 02:05 AM

Reminds me of this:
http://xkcd.com/386/

MyKey_ on November 18, 2008 03:42 AM

> I think what Jeff means is that nobody can define what makes a particular problem belong to NP, but not P.

That is what I was trying to say. Poorly, obviously. I have updated the post.

I still say the definition of NP-complete is awfully.. *conveniently* .. recursive, as T.E.D. mentioned.

I also agree with this particular sentence in the Wikipedia (http://en.wikipedia.org/wiki/NP-hard) article: "The NP-family naming system is confusing".

And how.

Jeff Atwood on November 18, 2008 05:35 AM

@T.E.D.:

The definition on Wikipedia is not recursive. I think the confusion may arise because "NP" and "NP-Complete" mean different things, and the definition of "NP-Complete" depends on the definition of "NP".

A problem is *NP* if (roughly speaking) it's quick to check whether a proposed solution is correct. Note that this includes both "easy" and "hard" problems.

This definition of NP is not recursive. You can decide whether a problem is NP independent of other problems.

A problem is *NP-Complete* if it's both *NP* and if finding an "easy" algorithm to solve it would automatically give you an easy algorithm to all *NP* problems.

This definition of NP-Complete depends on the definition of NP, but it does not depend on itself.

Weeble on November 18, 2008 05:53 AM

Another reason it sounds recursive is that generally the way you prove a problem is NP-Complete is by showing that it's NP and then reducing another NP-Complete problem to it. So how do you prove that anything is NP-Complete in the first place? The answer is that this is only the easy way. The hard way is to prove that *every* problem in NP can be reduced to your problem. That's what the Cook-Levin theorem did for the SAT problem. Once you know *one* NP-Complete problem, it becomes easier to prove other problems are NP-Complete.

http://en.wikipedia.org/wiki/Cook–Levin_theorem

Weeble on November 18, 2008 06:01 AM

> A problem is NP-complete if it is both [quick to check whether a proposed solution is correct] and finding an [efficient] algorithm to solve it would automatically give you an [efficent] algorithm to solve *all* [quick to check whether a proposed solution is correct] problems.

Isn't this.. er.. still.. partically recursive? We know it because solving one would solve all the others, which in turn solves the others, which in turn solves the others.. ad infinitum?

I have to say I find this topic *incredibly* confusing, even after reading all the above comments (many of which were quite helpful, thank you) and the relevant wikipedia articles.

Jeff Atwood on November 18, 2008 06:19 AM

"Isn't this.. er.. still.. partically recursive?"

The definition involves infinite stuff (the set of all NP problems), but it is not recursive. The definition of NP stands alone, and the definition of NP-Complete depends only on the definition of NP.

The presence of infinite stuff isn't as big a deal as you'd think. You can prove that adding together any two odd numbers will give you an even number, even though there are an infinite number of odd numbers. I don't really understand the Cook-Levin theorem myself, but I'm happy to know that it shows that all of the infinite possible problems (including all the ones nobody has thought of yet) in NP can be converted to an instance of the SAT problem. That might seem surprising, but it's not all that different to proving something about all the infinite possible odd numbers. Then, with that in hand, you can go on to show that other stuff is NP-Complete, because if you can convert SAT to your problem, you know you can convert *any* problem in NP to SAT and then convert it to your problem.

Weeble on November 18, 2008 06:51 AM

Ok, not enough time to read all the comments that don't have the answer to the question, "what's your favorite cheat?" Mine: refactoring a hard problem such that I can avoid the <part I want to avoid> as much as possible. I imagine that there is a CS term for it, but I majored in English :) Clearly a lot of the commenters are smart--probably smarter and certainly more educated (in this field, anyway) than I; how does the flamage help anyone "get things done"?

That One Guy on November 18, 2008 07:34 AM

Im sorry Jeff but your update is still somewhat wrong. You should abandon the thought NP-complete == hard-problem . It should be more like NP-comple => we think it's a hard problem.

I'll try to give an understandable and maybe partly informal definition of NP-complete below, bear with me (or ignore it, if you think it's not interesting).

NP-completeness is defined by two criteria:

1. The problem is in NP (aka the problem can be solved in polynomial time by a non-deterministic Turing machine. (Before I get flamed for mentioning Turing Machines: Regarding some previous posts by Jeff I think he knows what I'm talking about.)
Note this is trivially true for all problems in P (those which can be solved in polynomial time by a deterministic Turing machine). So P is a part of NP.

2. The problem L is one of the "hardest problems in NP".
Let's say we have some magic box, which accepts the input for our problem L and gives us the correct output, taking only a fixed amount of time, independent of the input. Now to prove, that our problem L is "the hardest", we have to show that we can solve all other problems in NP in polynomial time using our magic box (This is known as Reduction).
We do this by giving an algorithm, which transforms the input for other problems into input for our problem. Now showing something for ALL problems sounds difficult, but it can be done. (If you want to know how look up the Cook-Levin Theorem).

If we have one problem L', which is known to be "the hardest" it is enough to show that our problem L is equally hard. This means we only have to give an algorithm which solves L in polynomial time using a magic box which solves L'.


So, the Cook-Levin Theorem shows that we can transform every problem in NP into SAT. Going on from there we show that we can transform SAT into other problems, like Traveling Salesman, Bin Packing, etc. take your pick.

We think that we can't solve this problems in polynomial time on a non-deterministic Turing Machine (or on your average computer), but we can't proof that and if someone would come up with a way to solve one of these problems in polynomial time, we could solve all of them in polynomial time (we could just apply the algorithms, that we defined when we proved their NP-completeness).

However, even if someone comes up with such a solution (which is highly unlikely) the definition of NP-complete I gave above still applies to all of the problems, so they remain NP-complete. We only revise the deduction NP => difficult.

H on November 18, 2008 10:00 AM

Wow. Jeff Atwood, master troll!

I think I'm going to add Jeff's explanation of "NP-complete" to Conservapedia. http://www.conservapedia.com/Public-key_encryption

It's a pretty good imitation of a "pop science" explanation of CS terms, but with glaring errors that stand out to anyone with an undergrad CS education. ("Nobody really knows what NP-completeness means... Problems are only NP-complete because we can't prove they aren't.")

Maybe Jeff can explain one-way hash functions next. ("A one-way hash function is a way of implementing a read-only hash table, similar to Ada's 'const final' attributes.") Or branch prediction. ("Branch prediction is the ability of a programming language to conditionalize certain instructions on flags in the hardware's ISR register.") Or garbage collection. ("Languages such as Java are only considered garbage-collected because we can't prove they aren't.")

Anonymous Cowherd on November 18, 2008 10:54 AM

(BTW, please don't "fix" the Conservapedia articles. The "errors" are intentional parody.)

Anonymous Cowherd on November 18, 2008 10:55 AM

> please don't "fix" the Conservapedia articles.

Well, except if *I* "fix" them, they'll be correct, no?

I've redacted the offensive sentence from the post, with explanation.

In all honesty, despite reading all the above comments and the wikipedia entries, I still don't quite understand the definition of NP-complete other than as "we'll know it when lots of experts have seen it, tried, and failed."

There has to be a simpler definition out there somewhere. I talked to Joel Spolsky about this and he had a pretty good analogy that might work; I urged him to write a blog post about it.

Jeff Atwood on November 18, 2008 03:28 PM

Jeff -

NP-complete is actually pretty simple, so I bet whatever sources you have been reading have just over complicated it.

'NP' is a class of problems - that is a set of problems that all share a particular property. Here, it's the property that they can all be solved in polynomial time by a Turing machine (or equivalent computer) that can - every time there's a choice point in the algorithm to solve it - magically guess the right decision to make.

So rather than have your Turing machine deterministically make the same decision at every choice point, our new machine somehow knows which is the right way to go.

That's what the NP stands for - 'non-deterministic polynomial'.

Imagine you're trying to solve the Travelling Salesman problem by brute force. Enumeration of all those graphs takes at least O(2!) time. However, our non-deterministic Turing machine magically guesses which is the right way through the search space, and can construct the circuit in polynomial time - by choosing the right edge (again, magically) at every choice point.

Now, the class of problems that can be solved in deterministic polynomial time is called P, and P is a subset of NP since every problem in P can also be solved by our non-deterministic Turing machine in polynomial time. What we don't know is if there is a deterministic polynomial time solution to all the problems in NP - if there was, then P=NP and bingo, you've won a million dollars.

To help us find out whether P=NP, computer scientists have discovered a kind of problem that is the *hardest problem in NP*. By hardest, we mean that any algorithm to solve the hardest problem will take as least as long to run as any other algorithm to solve any other problem in NP.

We call these problems 'NP-complete'. If we could find an efficient deterministic polynomial time algorithm to solve an NP-complete problem, we'd have shown that *every* problem in NP has an efficient solution, because remember by definition NP-complete problems are the hardest in NP.

(Sidenote: we show that a problem is 'harder' than another problem by showing that the easier problem can be reduced in polynomial time to an instance of the harder one. Then a solution to the easier one is found by reducing it to the harder problem and then solving the harder one, so at worst we can solve in the time it takes to solve the harder one. If you think about this carefully, you'll see this is exactly what we probably want 'harder' to mean.)

So that's all NP-complete problems are: the hardest problems that can be solved by a non-deterministic Turing machine in polynomial time. We don't know if a deterministic polynomial time solution to them exists - it may do, but I doubt it - but that doesn't change the fact that we can recognise them, prove them to be NP-complete, and formally identify them much more easily than we can hardcore porn :)

Henry on November 18, 2008 03:48 PM

Well, that just begs the question - why the blog post in the first place? And why recommend a book that you haven't read and/or didn't understand? Not cheap shots here, I actually want to understand why you took that credibility risk with the people who know this stuff - which it turns out is a lot of programmers.

> In all honesty, despite reading all the above comments and the wikipedia entries, I still don't quite understand the definition of NP-complete other than as "we'll know it when lots of experts have seen it, tried, and failed."

Have you really read all of the above comments? Not just scanned over them but read them? You only know if a problem is NP-complete if the experts have seen it, tried, and proved that it's NP-complete. There's a precise definition and a set of criteria to demonstrate that something is NP-complete.

I'll have a crack at some explanations, I don't know how much help they'll be. I'm sure the book you linked to probably explains it better.

A Turing machine is a simple model of a computer. In its usual form its only output is to accept or reject and input, so instead of asking it what a + b is, you normally ask it if a + b = c - ie it solves decision problems.

A problem is in P (Polynomial time) if for all instances of a problem with size n (n = 100 if we have 100 objects to sort, for instance) the Turing machine takes a number of steps to come to a decision that is a polynomial in n - eg there's an uppper bound of 2 * n^5 + 4 *n + 5.

A lot of the time when we talk about the size we're talking about the number of bits - adding n-bit numbers is in P - O(n) - and so is ultiplying n-bit n-bit numbers - O(n^2) the simple way, although FFT me