A Modest Proposal for the Copy and Paste School of Code Reuse

April 21, 2009

Is copying and pasting code dangerous? Should control-c and control-v be treated not as essential programming keyboard shortcuts, but as registered weapons?

(yes, I know that in OS X, the keyboard shortcut for cut and paste uses "crazy Prince symbol key" instead of control, like God intended. Any cognitive dissonance you may be experiencing right now is also intentional.)

Here's my position on copy and paste for programmers:

Copy and paste doesn't create bad code. Bad programmers create bad code.

Or, if you prefer, guns don't kill people, people kill people. Just make sure that source code isn't pointed at me when it goes off. There are always risks. When you copy and paste code, vigilance is required to make sure you (or someone you work with) isn't falling into the trap of copy and paste code duplication:

Undoubtedly the most popular reason for creating a routine is to avoid duplicate code. Similar code in two routines is a warning sign. David Parnas says that if you use copy and paste while you're coding, you're probably committing a design error. Instead of copying code, move it into its own routine. Future modifications will be easier because you will need to modify the code in only one location. The code will be more reliable because you will have only one place in which to be sure that the code is correct.

Some programmers agree with Parnas, going so far as to advocate disabling cut and paste entirely. I think that's rather extreme. I use copy and paste while programming all the time, but never in a way that runs counter to Curly's Law.

But pervasive high-speed internet -- and a whole new generation of hyper-connected young programmers weaned on the web -- has changed the dynamics of programming. Copy and paste is no longer a pejorative term, but a simple observation about how a lot of modern coding gets done, like it or not. This new dynamic was codified into law as Bambrick's 8th Rule of Code Reuse:

It's far easier and much less trouble to find and use a bug-ridden, poorly implemented snippet of code written by a 13 year old blogger on the other side of the world than it is to find and use the equivalent piece of code written by your team leader on the other side of a cubicle partition.

(And I think that the copy and paste school of code reuse is flourishing, and will always flourish, even though it gives very suboptimal results.)

Per Mr. Bambrick, copied and pasted code from the internet is good because:

  • Code stored on blogs, forums, and the web in general is very easy to find.
  • You can inspect the code before you use it.
  • Comments on blogs give some small level of feedback that might improve quality.
  • PageRank means that you're more likely to find code that might be higher quality.
  • Code that is easy to read and understand will be copied and pasted more, leading to a sort of viral reproductive dominance.
  • The programmer's ego may drive her to only publish code that she believes is of sufficient quality.

But copied and pasted code from the internet is bad because:

  • If the author improves the code, you're not likely to get those benefits.
  • If you improve the code, you're not likely to pass those improvements back to the author.
  • Code may be blindly copied and pasted without understanding what the code actually does.
  • PageRank doesn't address the quality of the code, or its fitness for your purpose.
  • Code is often 'demo code' and may purposely gloss over important concerns like error handling, SQL injection, encoding, security, etc.

Now, if you're copying entire projects or groups of files, you should be inheriting that code from a project that's already under proper source control. That's just basic software engineering (we hope). But the type of code I'm likely to cut and paste isn't entire projects or files. It's probably a code snippet -- an algorithm, a routine, a page of code, or perhaps a handful of functions. There are several established code snippet sharing services.

Source control is great, but it's massive overkill for, say, this little Objective-C animation snippet:

- (void)fadeOutWindow:(NSWindow*)window{
	float alpha = 1.0;
	[window setAlphaValue:alpha];
	[window makeKeyAndOrderFront:self];
	for (int x = 0; x < 10; x++) {
		alpha -= 0.1;
		[window setAlphaValue:alpha];
		[NSThread sleepForTimeInterval:0.020];
	}
}

To me, the most troubling limitation of copypasta programming is the complete disconnect between the code you've pasted and all the other viral copies of it on the web. It's impossible to locate new versions of the snippet, or fold your features and bugfixes back into the original snippet. Nor can you possibly hope to find all the other nooks and crannies of code all over the world this snippet has crept into.

What I propose is this:

// codesnippet:1c125546-b87c-49ff-8130-a24a3deda659
- (void)fadeOutWindow:(NSWindow*)window{
	// code
}

Attach a one line comment convention with a new GUID to any code snippet you publish on the web. This ties the snippet of code to its author and any subsequent clones. A trivial search for the code snippet GUID would identify every other copy of the snippet on the web:

http://www.google.com/search?q=1c125546-b87c-49ff-8130-a24a3deda659
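
To make the convention concrete, here is a minimal Python sketch of tagging a snippet and building the search link. Only the `// codesnippet:` comment format and the Google search URL come from the examples above; the helper names `tag_snippet` and `search_url` are made up for illustration.

```python
import uuid
from urllib.parse import urlencode

TAG_PREFIX = "// codesnippet:"  # comment format taken from the example above


def tag_snippet(code, snippet_id=None):
    """Prepend a one-line GUID comment to a snippet (hypothetical helper)."""
    snippet_id = snippet_id or str(uuid.uuid4())
    return f"{TAG_PREFIX}{snippet_id}\n{code}"


def search_url(snippet_id):
    """Build a web search URL that should surface every tagged copy."""
    return "http://www.google.com/search?" + urlencode({"q": snippet_id})


tagged = tag_snippet(
    "- (void)fadeOutWindow:(NSWindow*)window{ /* code */ }",
    "1c125546-b87c-49ff-8130-a24a3deda659",
)
print(tagged.splitlines()[0])
print(search_url("1c125546-b87c-49ff-8130-a24a3deda659"))
```

Calling `tag_snippet` without an ID generates a fresh random GUID, which is what an author or a snippet sharing service would do once, at publication time.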

I realize that what I'm proposing, as simple as it is, might still be an onerous requirement for copy-paste programmers. They're too busy copying and pasting to bother with silly conventions! Instead, imagine the centralized code snippet sharing services automatically applying a snippet GUID comment to every snippet they share. If they did, this convention could get real traction virtually overnight. And why not? We're just following the fine software engineering tradition of doing the stupidest thing that could possibly work.

No, it isn't a perfect system, by any means. For one thing, variants and improvements of the code would probably need their own snippet GUID, ideally by adding a second line to indicate the parent snippet they were derived from. And what do you do when you combine snippets with your own code, or merge snippets together? But let's not overthink it, either. This is a simple, easily implementable improvement over what we have now: utter copy-and-paste code chaos.
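
One way the "second line" idea might look, as a rough Python sketch. The `// parent:` line format and the `derive_snippet` helper are assumptions; the post only says a variant should get its own GUID plus a pointer to the snippet it came from.

```python
import re
import uuid

# Matches the one-line GUID comment used in the post's example.
ID_LINE = re.compile(r"^// codesnippet:([0-9a-f-]{36})$")


def derive_snippet(tagged_code, new_id=None):
    """Give a modified snippet its own GUID and record the one it came from.

    The "// parent:" line is an assumed format, not part of the proposal.
    """
    lines = tagged_code.splitlines()
    match = ID_LINE.match(lines[0]) if lines else None
    if match is None:
        raise ValueError("snippet has no codesnippet GUID on its first line")
    parent_id = match.group(1)
    # Keep only the immediate parent; drop any older parent line.
    body = [ln for ln in lines[1:] if not ln.startswith("// parent:")]
    new_id = new_id or str(uuid.uuid4())
    header = [f"// codesnippet:{new_id}", f"// parent:{parent_id}"]
    return "\n".join(header + body)


original = "// codesnippet:1c125546-b87c-49ff-8130-a24a3deda659\nint x = 1;"
print(derive_snippet(original))
```

This sketch keeps only one level of ancestry in the comment. Walking further back would mean searching for the parent GUID and reading its own parent line, which says nothing about the harder question of merged snippets.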

Sometimes, small code requires small solutions.

Posted by Jeff Atwood
149 Comments

Just to go down the overthinking path.
It should be reasonably easy to create a repository where someone could own a code snippet and people could (with annotations like you suggested) copy these snippets.
If a third party modified this snippet for their use for whatever reason, they could then submit this code which could be accepted/rejected by the original author (or current owner) and if rejected could then be submitted as a new snippet related to the original.
Then with this repository, we could build a plugin to our ide of choice to automatically get the latest version of our snippets and apply them if we choose.
I suggest snippets.stackoverflow.com :)

John B on April 23, 2009 5:27 AM

Instead of a GUID, I'd propose reverse dns + comment :

// codesnippet:com.codinghorror/fade-out-window
- (void)fadeOutWindow:(NSWindow*)window{
// code
}

This makes it Google-friendly : you can now search for all snippets containing 'fade', 'window', or originating from codinghorror.

Patrick Geiller on April 23, 2009 5:45 AM

Um guys, an URL (therefore a URI) *is* a GUID.

Yes, when I 'borrow' code that I see to be a general-purpose snippet I also add a link in the func declaration comments detailing where I found it. Usually because documentation and analysis there is a little more in-depth than is needed for my single-use routine.

If php.net goes away one day, then maybe that will no longer help those who come after me, but at least I tried. Anything weird like a random GUID will just be ignored or discarded out-of-hand.

As to the article, my thoughts were that the first 2 strikes *against* code re-use listed above were just a lack of betterness not actual bad points against.
# If the author improves the code, you're not likely to get those benefits.
# If you improve the code, you're not likely to pass those improvements back to the author.

Well no, but adding an intentionally cryptic comment doesn't give you that benefit either.
The other three strikes against are valid - as to the actual quality of the code. But the first two are not.

Drop an URL to the source you copied from. Then move on.
We copy snippets because we don't want the overhead of full-blown libraries and dependencies and extra library files. If it's less than a screenful - and does the job - copy paste away.

dman on April 23, 2009 6:51 AM

Just wanted to agree with proposal for snippetoverflow.com - stackoverflow is a great thing ;)

ivucica on April 23, 2009 6:53 AM

Here's what I wrote about this earlier this month:

Across the unimaginably vast stretches of the public Internet, a tentacled, grotesque being is destabilizing every layer of creation from core routers to half-finished Myspace pages. It's not a malicious virus. It is not a nefarious plan hatched by a cackling mastermind. The greatest threat to the entire continent of information technology is available on every computer with a few swipes and clicks of the mouse. The ubiquitous, self-replicating destructive force which might usher in the end of everything is the tragically thoughtless copy-and-paste.

From http://www.robbyslaughter.com/blog/?2009-04-06

Robby Slaughter on April 23, 2009 7:09 AM

I commented earlier about implementing this capability into my code snippet manager, Snip-It Pro.

I couldn't wait until the weekend, so I ended up implementing everything last night and I just published an updated beta version on my blog. More details and a download link here:
http://www.mtelligent.com/journal/2009/4/23/snip-it-pro-20s-newest-feature-auto-commenting.html

I ended up implementing four types of auto commenting for snippets: The Guid ID that's unique to each locally stored snippet, a Reference Url that can be tagged to a snippet, the user name of who inserted the snippet and the Date the snippet was inserted. Each of these can be toggled as an option, so you can set which ever auto comment types you want.

Thanks again for the idea, Jeff.

David San FIlippo on April 23, 2009 7:43 AM

I devised a method to deal with this problem a couple of years ago.
It consists of a ranking system for programmers, expanding on the available IDE and language features as one's experience progresses.

you can find the details here:
http://subjectively.blogspot.com/2007/07/role-playing-gmes-we-play.html

and while i agree that guns don't kill people, i do think people need some training before they are allowed to use a particular weapon...

Yossi on April 23, 2009 8:17 AM

Copy Paste is only as dangerous as the developer.

I find myself copying pieces of code from other projects where I have already solved something.

Steve on April 23, 2009 11:38 AM

Guns don't kill people. Bullets kill people

Red on April 23, 2009 12:09 PM

Oh boy, copy and paste some stuff found on some site (could be code project) into the product ... these snippets are almost never product quality, I mean even where's the error checking, and get semi blindly pasted into product code. The product failures which result, the aggravations which result, the misdirection of PHBs which results, ... unbelievable, what a complete waste of time and money.

These snippets could be really useful for learning, to give programmers a bit of a how to, but in the end have net negative value, because there are so many more copy/paste idiots in the world than any other kind of programmer. I know, first hand, I have the displeasure of working with a number of them, because they're cheap.


Oldster on April 23, 2009 1:00 PM

Perhaps we need a simple to use, open source, well documented, software library.

Sort of like the Enterprise Library, only easy to use, because when I try to use the Enterprise Library I end up googling the internet and copying and pasting.

Andrew on April 23, 2009 1:20 PM

Quote from the back of an O'Reilly book I received in the mail today:

Safari is designed for people in a hurry to get the answers they need so they can get the job done. You can find what you need in the morning, and put it to work in the afternoon. As simple as cut, paste and program

Are all O'Reilly readers / Safari users bad programmers?

Good post all the same ...

mosaic on April 23, 2009 1:41 PM

Copy paste programming is a real life solution to the problem of sophisticated libraries and frameworks.

It is often easier to copy some code that does what you need, and customize it later, than to use some general library that has many things you won't use. Also - the advantages of using a library are often very minor, and removing dependencies and making your code easier to build are often more important than getting security fixes.

Also - frameworks and libraries impose some design choices on you that are sometimes wrong for your project. Changing them means forking the library, and if you want to receive updates from trunk, you have to maintain your fork.

It's a lot of work, and when all you need is 1% of the functionality of a library, it's too much to use that library.

In languages like python, ruby or perl this problem is mostly solved - solution is called cookbook/cpan/ruby gems.

I think that a lot of appeal of python for me is that it enables easy copy paste oriented programming. And I mean it in a good sense.

Good, complete solution to some problem in c++/java/c#/other such languages often won't fit to one blog post. And if it will, it would be too big to easily copy paste it to your code, and (the most important) - it will be too big to read and understand before copying (I mean - 2 pages of clean code I can read, 20 pages of code with 15 pages of boilerplate in it - not so easily, and at this point using library doesn't seem so bad).

In dynamic languages easy things are easily done in small amounts of LOC. Go figure it - http://code.activestate.com/recipes/tags/

odrzut on April 24, 2009 2:48 AM

Ben (about how Mac copying is better than PC copying):

In Unix world you select text and bang - it's copied, then you middle click somewhere and bang - it's pasted.

Beat that :)

odrzut on April 24, 2009 2:54 AM

I copy, therefore I paste.

Steve on April 24, 2009 3:30 AM

What's wrong with a simple URL to where you found the code?

Alex on April 24, 2009 5:38 AM

Monkeys with guns also kill people.

c on April 24, 2009 5:47 AM

When the title of this blog post was a modest proposal, I have to admit that I was hoping for something along the lines of Jonathan Swift's A Modest Proposal, laced with pointed sarcasm.

Still, not a bad modest proposal. It would certainly be nice for updating the various Java snippets I find when searching for how to do something.

Katie on April 24, 2009 7:27 AM

Simon Wright: Because it's the wrong tool for the job. A hash will change every time the snippet is changed, whereas the GUID will remain constant. This idea relies upon having a reliable, life-long identifier.

On the contrary, an ID that changes if the code it identifies changes is actually the right tool for the job. It's just that Jeff's scheme completely ignored the problem of versioning.

In the real world a majority of people get a snippet and then make some sort of change to it for their specific needs. How long do you think it will be before there are 100 versions of a popular snippet that have been posted on blogs, all with the same GUID?

2+ different snippets with the same GUID just doesn't work. Using a SHA1 along with some other versioning is a better solution.
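
The trade-off in this exchange can be seen in a few lines of Python. This is only a sketch: the combined `sha1:` suffix and the `version_tag` helper are assumptions about what "a SHA1 along with some other versioning" might look like, not something specified in the thread.

```python
import hashlib

SNIPPET_ID = "1c125546-b87c-49ff-8130-a24a3deda659"  # GUID from the post


def content_hash(code):
    """SHA-1 of the snippet text; any edit, even whitespace, changes it."""
    return hashlib.sha1(code.encode("utf-8")).hexdigest()


def version_tag(code, snippet_id=SNIPPET_ID):
    """Combine a stable GUID with a per-version hash (assumed format)."""
    return f"// codesnippet:{snippet_id} sha1:{content_hash(code)}"


v1 = "alpha -= 0.1;"
v2 = "alpha -= 0.05;"
print(version_tag(v1))
print(version_tag(v2))
```

Both tags share the GUID, so a search for it still finds every copy, while the hash part tells modified copies apart. The hash alone cannot say which version came first or who wrote it.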

AJ on April 24, 2009 7:27 AM

i'm blown away by the simplicity. top marks!

It's the simplest thing imaginable and i think it's worth trying it out.

Steve Pavlina (life guru who began as a shareware guru) recommends using a 30 day trial to decide on a potential life-habit.

i recommend that SO quietly injects a guid into each code snippet by adding a //guid comment to each code tag entered.

And ideally the guid comment portion would be colored in a very very soft grey -- so that it is deliberately hard to read. What Tufte calls minimal differentiation or something like that. Your eyes will see it only enough to detect that it is present, and to recognise that it's a guid. But not enough to squint and bother reading it.

Once the entire world has learnt this convention from stack overflow, they would all adopt it for their own blogging.

It would be built into all blog engines and it would be recognized/searchable by all IDE's. (Including the browser-based ide's which dominate our future).

And then, checkmate. SO wins. Something.

Nice article Jeff ;-)

lb

secretGeek on April 24, 2009 8:32 AM

Impressed with this little idea of GUID.
DRY's main lifeline is the Copy Paste.
Not only the code, but also design patterns, ideas, etc.

- maheshexp

maheshexp on April 24, 2009 10:30 AM

I mean no offense and perhaps I do not understand your article, but are you talking about two different kind of copy and pastes here?

The first one you cite from IEEE appears to be about copy and pasting code that you did in many places instead of having a routine that handles it once. So, instead of having a routine that cleans up a string in five different places, copied and pasted each time you needed it, you have it in one place and all five places reference that one place.

The second set of copy and pastes is about copying and pasting from other sources because you don't have it in your code. And, that this could be bad because of x,y,z.

I think the first one is good. The second one has the inherent problems you pointed out and was the intent of your article. Mostly, for me, it is not understanding the code right off, but using it anyway.

In short, I think the first citation has nothing to do with the intent of your article. From that article, Aside from the invention of the computer, the routine is arguably the single greatest invention in computer science.

Sorry if I misunderstand.

john on April 24, 2009 10:43 AM

That's not the simplest thing that could possibly work. It's overengineering. A code snippet found on the web already has a unique identifier. The URL at which you found it.

The poster of the snippet no longer has to jump through the additional hoop of publicizing a GUID along with the code, and when looking for the source, I no longer have to use Google to find it. I can just copy the URL directly into my browser.

jalf on April 25, 2009 5:53 AM

Jeff,

Another problem with copy-and-pasting snippets from the internet is that you think you understand how to use an API, but actually you miss a lot of important information that the original writer may have been aware of.

If you remember my studies with the Java samples, this was often the case. (http://www.cs.cmu.edu/~udekel/papers/udekel_emoose_icpc2009.pdf)

Uri on April 25, 2009 7:50 AM

The article he linked to said the simplest thing that could possibly work. Jeff said the STUPIDEST thing that could possibly work.

AJ: A SHA1 hash would not solve the problem, so it's clearly the wrong tool. The whole point of using a GUID is to find 2+ different snippets with the same GUID. Remember, the goals here were the following:

1. Identify the author (GUID works, SHA1 works if there's no alteration whatsoever, including tabs vs. spaces type formatting, and point 3 coming up...)
2. Find new versions of the snippet. Hash algorithms fail here.
3. Fold new versions back into the original snippet. Not only do hash algorithms fail here, but it also destroys the ability to use SHA1 for point 1, identity of the author.

Now, a URL to the author's site is in fact a type of GUID, though not conformant to the UUID standard.

The real problem here is the problem of GUID collision of ultimately unrelated functions caused by, ironically, copy-pasting. It might work better if there were a GUID cascade, such that every time you pasted a snippet with a GUID, a new GUID was prepended, so you get a sort of version history.

And then, of course, there's the problem of somebody just deleting the GUID.

Ens on April 25, 2009 12:08 PM

I don't think I have ever found something I could use without modification. Maybe, I just write Java in any language, and so I always modify semantics, like indentation and variable name. The situation I always encounter runs something like this: look and look and look, and after hours of looking, write my own thing anyway. Besides, I consider myself a perpetual student anyway and enjoy the challenge of coding from scratch, scary isn't it.

Michael on April 25, 2009 12:23 PM

Why not start with http://stackoverflow.com ? that seems like a right place for that kind of thing. http://gists.github.com has a great version control for snippets and allows you to group multiple files/snippets as one gist.
However, what would really be better is, some way to point to relevant articles on the internet (using google or something else more relevant,intelligent) that might explain what the code snippet does. Encouraging the snippet poster to point to such articles would be a huge plus. So, instead of being
//Magic Code
while (!magic){
//do something cool
}
//fire the missiles

We have:
START OF CODE --------------------------------
----- UUID here ----
//Magic Code
//::description:: The code was taken from blah blah blah... that does blah blah blah and uses QuickSort
while (!magic){
Reflections.xxxx
Hibernate.xxxxxxx()
}
---------------------------------- END OF CODE
Links:
Reflections API
Hibernate
Relevant Discussions:
StackOverflow- How do I make Reflections API go Kazooo?

I guess you get the idea. We probably have most of these things already. We just need some sort of push in this direction so that we can all benefit from it. Of course, no matter how much we like RAD, Rome wasn't built in a day. Yes, even with RoR

Prasanna Gautam on April 25, 2009 12:32 PM

Ever paste several hundred lines of your clipboard to a shell prompt,
by mistake (ouch, especially if it is a production server)?

If what you paste is from the scroll buffer of your terminal,
(for example an ssh session managed by 'screen'), then a shell
prompt that starts with '#' will save you some grief.

yz4 on April 26, 2009 2:39 AM

Shows that computer geeks don't read enough literature. As a 'writer' you should know better than to co-opt the title of 'Modest Proposal' and not even attempt a Swiftian satire. Then again maybe you did, but we just never noticed :)

http://en.wikipedia.org/wiki/A_Modest_Proposal

not so swift on April 27, 2009 4:13 AM

Small error: advocate disabling cut and paste completely - Don't you mean COPY and paste ?

I don't think anyone takes issue with CUT and paste, as it's pretty essential for refactoring.

Alex on April 27, 2009 1:52 PM

Great idea, but it'll never work. Not because it's not a great idea, it is. It won't work because the folks that are making code snippets aren't capable of generating GUIDs. ;)

Scott Hanselman on April 28, 2009 2:10 AM

Jeff-

I like the idea, but there are a couple of kinks to work out. For example, when I search for the sample snippet you posted here (1c125546-b87c-49ff-8130-a24a3deda659) I get a lot of results (between 32 and 68 depending on your search engine of choice), so I am not sure which is the site of authority. However, if I had to pick I would go to:

http://catsandtoasters.com/

(yes that was really in the search results)

-Clarkin

Larry Clarkin on April 28, 2009 5:49 AM

He's proposing a system of manual version control on code ripped from blogs to legitimize 'coding via Google', and you think it's not satire?

AndyL on April 28, 2009 7:41 AM

Assigning GUID's in some form to code snippets for copy/paste reuse is silly and laughable but certainly doesn't go the full 9 yards to make it a Swiftian satire. Judging by all the comments, it's not such an absurd idea after all to some people. Monolithic enterprisey WTF's are being dreamed up as we speak.

So this modest proposal is in no way in the same form as Jonathan Swift's 'A Modest Proposal'. This modest proposal isn't in the same ballpark as suggesting poor people sell their children for food, nor is it a comment on the state of the industry.

not so swift on April 28, 2009 8:54 AM

I think Jeff may have created a variant of Poe's Law,

Without a winking smiley or other blatant display of humor, it is impossible to create a parody of Coding Horror that won't be mistaken for the real thing.

Steve W on April 29, 2009 6:10 AM

Regarding the command-key on the Mac platform. They don't use Ctrl for all their key-commands, for a damn good reason:

One advantage of this scheme, as contrasted with the Microsoft Windows mixed use of the Control and Alt keys, is that the Control key is reserved entirely for its original purpose: entering control characters in terminal applications. (from Wikipedia article on Command key)

amatecha on May 4, 2009 12:15 PM

I tried posting this a while ago, but it seems the message did not go through (or perhaps there's some moderation system in place that I didn't notice).

This explanation of the Command key on Mac systems is written on the Wikipedia article for Command key:

One advantage of this scheme, as contrasted with the Microsoft Windows mixed use of the Control and Alt keys, is that the Control key is reserved entirely for its original purpose: entering control characters in terminal applications. (Indeed, the very first Macintosh lacked a Control key; it was soon added to allow compatible terminal software.)

So, in fact, the adoption of a new unique key makes complete sense, because, as you should already know, the Ctrl key is used for console applications. Makes complete sense!

amatecha on May 4, 2009 1:02 PM

Actually Xerox PARC would be God, Ctrl+zxcv were Xerox PARC norms that Apple copied except the use of the command key instead of Ctrl.

pataphysician on May 5, 2009 12:21 PM

www.numly.com

close but a little more evil :)

seth@mailinator.com on June 7, 2009 2:55 AM

that GUID stuff is ingenious.

wizzard0 on June 25, 2009 10:25 AM

I think that it's nice. What needs to happen is the central system needs to have a list of derived snippets when you view one. I.e.:

1) You copy a snippet, with the GUID.
2) You edit that snippet to add error handling or what have you.
3) Paste the snippet, WITH GUID, back into the snippet site.
4) The site sees the old GUID, creates a link between the two snippets, and replaces it with a new GUID.

Now when you view the page for either GUID, you see the link to the other {parent,derivation} of the snippet you looked up.
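
The four steps can be sketched with an in-memory registry. This is a toy model under stated assumptions: `SnippetRegistry` and its methods are invented names standing in for the hypothetical snippet site, and nothing here reflects an existing service.

```python
import uuid


class SnippetRegistry:
    """In-memory stand-in for the hypothetical snippet site."""

    def __init__(self):
        self.code = {}      # GUID -> snippet text
        self.parent = {}    # derived GUID -> parent GUID
        self.children = {}  # parent GUID -> list of derived GUIDs

    def publish(self, text, pasted_id=None):
        """Store a snippet; if it arrives with a known GUID, link and re-issue."""
        new_id = str(uuid.uuid4())
        self.code[new_id] = text
        self.children[new_id] = []
        if pasted_id in self.code:
            # Step 4: the old GUID becomes the parent, the paste gets a new one.
            self.parent[new_id] = pasted_id
            self.children[pasted_id].append(new_id)
        return new_id

    def related(self, snippet_id):
        """Return (parent, derivations) for the page of a given GUID."""
        return self.parent.get(snippet_id), list(self.children[snippet_id])


site = SnippetRegistry()
original = site.publish("fade out window")
improved = site.publish("fade out window, with error handling", pasted_id=original)
print(site.related(original))  # no parent, one derivation
print(site.related(improved))  # parent is the original, no derivations yet
```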

Daniel Danopia on August 19, 2009 10:08 AM

Instead of a GUID, why not use a SHA1 hash?

Because it's the wrong tool for the job. A hash will change every time the snippet is changed, whereas the GUID will remain constant. This idea relies upon having a reliable, life-long identifier.

Simon Wright on February 6, 2010 11:16 PM

yes, I know that in OS X, the keyboard shortcut
for cut and paste uses crazy Prince symbol key
instead of control, like God intended.

Given that the assignment of Z, X, C, and V as accelerators for undo, cut, copy and paste was chosen by Apple developers and first appeared on the Apple Lisa, I think it's safe to say Apple is God (not FSM, as once hoped).

http://en.wikipedia.org/wiki/Cut_and_paste

(Hmm, has Prince been assigned a unicode code point?)

Simon Wright on February 6, 2010 11:16 PM

What a great idea! Watch as I, a commenter, take it very seriously, like the absolutely sincere and serious idea it is!

anon on February 6, 2010 11:16 PM

I once worked very briefly at a company where copying and pasting were the norm. My team lead would actively copy and paste entire classes from previous projects and send it via *email* to me (no such thing as source control). Then I would just 'plug and play' (modify lines 8, 45, 58-65) and the code would work 'beautifully'.


simon on February 6, 2010 11:16 PM

I really thought you off your rocker until I saw a comment pointing out the title starting with A modest proposal...

For all you taking this seriously, what do you think people are going to do with these GUID's? Go back and update their copypasta code?

anon on February 6, 2010 11:16 PM

Is there any reason to use your own NSTimer when there are the UIView animation blocks?
(I'm aware this is beside the point of the article but interested none the less)
- (void)fadeOutWindow:(NSWindow*)window{
float beginAlpha = 1.0;
float endAlpha = 0.0;

window.alpha = beginAlpha;
[UIView beginAnimations:@"transition" context:nil];
[UIView setAnimationDuration:0.02];
window.alpha = endAlpha;
[UIView commitAnimations];
}

Brandon Tennant on February 6, 2010 11:16 PM

Not so swift is a pretty good handle for that comment.

anon on February 6, 2010 11:16 PM

I guess if you're poor and heartless enough, you might seriously consider selling your children for food, and if you're a shortsighted enough programmer, you might think that putting GUIDs on big slabs of source code and copying them around the web is a good idea.

anon on February 6, 2010 11:16 PM
