A Modest Proposal for the Copy and Paste School of Code Reuse

April 21, 2009

Is copying and pasting code dangerous? Should control-c and control-v be treated not as essential programming keyboard shortcuts, but registered weapons?

keyboard: ctrl-c, ctrl-v

(yes, I know that in OS X, the keyboard shortcut for cut and paste uses "crazy Prince symbol key" instead of control, like God intended. Any cognitive dissonance you may be experiencing right now is also intentional.)

Here's my position on copy and paste for programmers:

Copy and paste doesn't create bad code. Bad programmers create bad code.

Or, if you prefer, guns don't kill people, people kill people. Just make sure that source code isn't pointed at me when it goes off. There are always risks. When you copy and paste code, vigilance is required to make sure you (or someone you work with) isn't falling into the trap of copy and paste code duplication:

Undoubtedly the most popular reason for creating a routine is to avoid duplicate code. Similar code in two routines is a warning sign. David Parnas says that if you use copy and paste while you're coding, you're probably committing a design error. Instead of copying code, move it into its own routine. Future modifications will be easier because you will need to modify the code in only one location. The code will be more reliable because you will have only one place in which to be sure that the code is correct.

Some programmers agree with Parnas, going so far as to advocate disabling cut and paste entirely. I think that's rather extreme. I use copy and paste while programming all the time, but never in a way that runs counter to Curly's Law.

But pervasive high-speed internet -- and a whole new generation of hyper-connected young programmers weaned on the web -- has changed the dynamics of programming. Copy and paste is no longer a pejorative term, but a simple observation about how a lot of modern coding gets done, like it or not. This new dynamic was codified into law as Bambrick's 8th Rule of Code Reuse:

It's far easier and much less trouble to find and use a bug-ridden, poorly implemented snippet of code written by a 13 year old blogger on the other side of the world than it is to find and use the equivalent piece of code written by your team leader on the other side of a cubicle partition.

(And I think that the copy and paste school of code reuse is flourishing, and will always flourish, even though it gives very suboptimal results.)

Per Mr. Bambrick, copy and pasted code from the internet is good because:

  • Code stored on blogs, forums, and the web in general is very easy to find.
  • You can inspect the code before you use it.
  • Comments on blogs give some small level of feedback that might improve quality.
  • Pagerank means that you're more likely to find code that might be higher quality.
  • Code that is easy to read and understand will be copied and pasted more, leading to a sort of viral reproductive dominance.
  • The programmer's ego may drive her to only publish code that she believes is of sufficient quality.

But copy and pasted code from the internet is bad because:

  • If the author improves the code, you're not likely to get those benefits.
  • If you improve the code, you're not likely to pass those improvements back to the author.
  • Code may be blindly copied and pasted without understanding what the code actually does.
  • Pagerank doesn't address the quality of the code, or its fitness for your purpose.
  • Code is often 'demo code' and may purposely gloss over important concerns like error handling, sql injection, encoding, security, etc.

Now, if you're copying entire projects or groups of files, you should be inheriting that code from a project that's already under proper source control. That's just basic software engineering (we hope). But the type of code I'm likely to cut and paste isn't entire projects or files. It's probably a code snippet -- an algorithm, a routine, a page of code, or perhaps a handful of functions. There are several established code snippet sharing services:

Source control is great, but it's massive overkill for, say, this little Objective-C animation snippet:

- (void)fadeOutWindow:(NSWindow*)window{
	float alpha = 1.0;
	[window setAlphaValue:alpha];
	[window makeKeyAndOrderFront:self];
	for (int x = 0; x < 10; x++) {
		alpha -= 0.1;
		[window setAlphaValue:alpha];
		[NSThread sleepForTimeInterval:0.020];
	}
}

To me, the most troubling limitation of copypasta programming is the complete disconnect between the code you've pasted and all the other viral copies of it on the web. It's impossible to locate new versions of the snippet, or fold your features and bugfixes back into the original snippet. Nor can you possibly hope to find all the other nooks and crannies of code all over the world this snippet has crept into.

What I propose is this:

// codesnippet:1c125546-b87c-49ff-8130-a24a3deda659
- (void)fadeOutWindow:(NSWindow*)window{
	// code
}

Attach a one line comment convention with a new GUID to any code snippet you publish on the web. This ties the snippet of code to its author and any subsequent clones. A trivial search for the code snippet GUID would identify every other copy of the snippet on the web:

http://www.google.com/search?q=1c125546-b87c-49ff-8130-a24a3deda659
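
As a rough illustration of how little tooling the convention needs, here is a minimal Python sketch. The function names `new_snippet_tag` and `search_url` are made up for this example, and using a random (version 4) UUID is an assumption; the proposal only asks for "a new GUID".

```python
import uuid
from urllib.parse import urlencode


def new_snippet_tag(prefix="//"):
    """Return a one-line comment carrying a fresh random (version 4) GUID."""
    return f"{prefix} codesnippet:{uuid.uuid4()}"


def search_url(snippet_id):
    """Build a Google search URL for a snippet GUID, in the form shown above."""
    return "http://www.google.com/search?" + urlencode({"q": snippet_id})


if __name__ == "__main__":
    print(new_snippet_tag())  # a different GUID on every run
    print(search_url("1c125546-b87c-49ff-8130-a24a3deda659"))
```

The `prefix` argument is there because the comment marker depends on the snippet's language; `//` matches the Objective-C example.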

I realize that what I'm proposing, as simple as it is, might still be an onerous requirement for copy-paste programmers. They're too busy copying and pasting to bother with silly conventions! Instead, imagine the centralized code snippet sharing services automatically applying a snippet GUID comment to every snippet they share. If they did, this convention could get real traction virtually overnight. And why not? We're just following the fine software engineering tradition of doing the stupidest thing that could possibly work.

No, it isn't a perfect system, by any means. For one thing, variants and improvements of the code would probably need their own snippet GUID, ideally by adding a second line to indicate the parent snippet they were derived from. And what do you do when you combine snippets with your own code, or merge snippets together? But let's not over think it, either. This is a simple, easily implementable improvement over what we have now: utter copy-and-paste code chaos.

Sometimes, small code requires small solutions.

Posted by Jeff Atwood
149 Comments

Jeff: add a concise method for version tracking (including branches), and then I think you've got something good. Without version tracking, I doubt any of the benefits of your system will hold up in the real world.

Trav on April 22, 2009 2:20 AM

I kept waiting for the part about eating script kiddies, but it never came...

I would definitely appreciate a database of random code snippets. There are so many interesting little coding tricks and algorithms, but they're scattered all over the internet and buried in larger applications.

A system sort of like Wonderfl (http://wonderfl.kayac.com/) would be pretty neat. They basically just have a text box on every page where you can randomly fork the code and store modifications. Other people can see the evolution of the apps through forks and revisions.

James on April 22, 2009 2:43 AM

Real programmers bind copy and paste to their extra mouse buttons.

coward on April 22, 2009 2:50 AM

@coward

Real programmers don't own a mouse at all... :)

HB on April 22, 2009 3:00 AM

What if you copy-paste code to move it to a routine of its own?

nc on April 22, 2009 3:03 AM

To nc:
I think that to copy-paste some code to make a routine of it is not bad at all, but I don't think this is what Jeff is talking about.

I think that the most important problem with copy-paste is bad programmers, who paste code without completely understanding what it does. Another problem may be the lack of consistency with the rest of the code, in error tracking policies for example, or to a lesser extent, the lack of code formatting consistency.

To sum up, I think the code snippets floating around are better used to get ideas of how to solve something than to be copy-pasted.

Unknown Programmer on April 22, 2009 3:23 AM

Hopefully you're cutting and pasting, not copying and pasting. Subtle distinction but there is one.

Instead of a GUID, why not use a SHA1 hash that's based on the contents of the code snippet, like what git does?

Joe Chung on April 22, 2009 3:24 AM
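
For readers wondering what the git-style hash suggested above would look like, here is a minimal Python sketch. Git hashes a blob as SHA-1 over the header `blob <size>\0` followed by the content; the function name `git_blob_sha1` and the choice of UTF-8 encoding are assumptions of this example.

```python
import hashlib


def git_blob_sha1(text):
    """Hash text the way git hashes a blob: SHA-1 of 'blob <size>\\0' + content."""
    data = text.encode("utf-8")  # assumption: snippets are stored as UTF-8
    header = f"blob {len(data)}\0".encode("ascii")
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    snippet = "- (void)fadeOutWindow:(NSWindow*)window{\n\t// code\n}\n"
    print("// codesnippet:" + git_blob_sha1(snippet))
```

Unlike a GUID, a content hash changes whenever a single character of the snippet changes, so it identifies exact copies but not edited descendants.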

(yes, I know that in OS X, the keyboard shortcut for cut and paste uses crazy Prince symbol key instead of control, like God intended. Any cognitive dissonance you may be experiencing right now is also intentional.)

Are you saying Apple came before God? (Well, at least the Lisa.)

gassit on April 22, 2009 3:26 AM

I think something like http://gist.github.com/gists is a better approach. That's what I try to do when I grab a snippet.

Ben on April 22, 2009 3:29 AM

Legalizing copying of assignments, huh?

Amarghosh on April 22, 2009 3:36 AM

I agree on gist.github.com. Distributed version control is the way to go when dealing with snippets. Github gists offer version control with a low barrier, and you can easily fork snippets from other users, while the link between those snippets is maintained. Using tools that integrate with your editor, it is very easy to publish snippets.

Willem on April 22, 2009 3:40 AM

I am amazed to see all the Apple stuff here. Is the Objective-C animation example on purpose? Because with Core Animation doing this for you, how many programmers are finding this code snippet and using it? It makes a point about copy and paste programming.

grrr on April 22, 2009 3:42 AM

www.SnippetOverflow.com

Kale Kold on April 22, 2009 3:44 AM

Where the crazy Prince symbol key came from:
http://www.folklore.org/StoryView.py?story=Swedish_Campground.txt

Left Ctrl+X/C/V is just a recipe for RSI btw. Right Ctrl would be better...

Why not just put the URL in the code snippet instead of GUID? Sites should be quite good at keeping URLs valid by now shouldn't they?

dan on April 22, 2009 3:47 AM

I couldn't help but notice that you included the picture of a Mac keyboard and marked the Control key...

Copying and pasting under Mac OS X is done via Command + C/V. Soo... Jeez.

Mr Jay on April 22, 2009 4:00 AM

This is the most retarded idea I've heard in my entire life.

Either that or Atwood is trolling us, AGAIN.

Steve on April 22, 2009 4:07 AM

I like this idea, a lot.

So I expanded on it.

If you go and download the freeware tool made by Whole Tomato, the makers of Visual Assist, an addin for Visual Studio, specifically, this tool: http://www.wholetomato.com/products/sourcelinks/FogBugzBundle.asp

And then add this rule to it (Tools-Options-SourceLinks):

Name: codesnippet
Keywords: codesnippet:
(yes, add the colon)
Value: String
Url/Exe: http://www.google.com/search?q=%s

the rest as default, then the comment becomes double-clickable in Visual Studio and you'll search for it on Google if you do.

Lasse V. Karlsen on April 22, 2009 4:08 AM

@Mr Jay:

Copying and pasting under Mac OS X is done via Command + C/V. Soo... Jeez.

Jeff clearly outlined the irony of the picture. Anyway, the real tragedy is that the Caps Lock key is where God intended the Control key to be.

guns on April 22, 2009 4:12 AM

Interesting idea, i kind of like it. I don't CP all that much, but it would be great to find a GUID i could search for when i find some arcane CP'd piece of code in an application.

I used to put the url to the snippet in a comment next to it, but sadly that solution works only as long as the page exists.

CP web snippets should of course never happen for trivial stuff, but when you are using some arcane library or extension it is often the best way to get a head start in learning how it works. Combined with inevitable deadlines, that means some of the code is bound to stay alive.

Having a GUID next to such code would be perfect to get some context when a bug arises.

Erik on April 22, 2009 4:12 AM

It doesn't necessarily have to be in code you use for production applications, but if you, as an example, answer a question on Stack Overflow, and find a code snippet somewhere that shows what you mean, a guid in both places would tie them together, even if nobody ever copies that code into a live application.

Perhaps Stack Overflow could do something automatically with this? Auto-insert a comment above/below every code snippet? Or would this add too much noise to the code? Would be technical hurdles as well, figuring out how to inject it safely, with respect to comment symbols, etc.

Lasse V. Karlsen on April 22, 2009 4:17 AM
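
The comment-symbol hurdle mentioned above could be handled with a small lookup table. This is only a sketch: the `COMMENT_STYLES` table and the `tag_snippet` function are hypothetical, and a real site would need a far more complete list of languages.

```python
# Hypothetical table: language name -> (comment opener, comment closer).
COMMENT_STYLES = {
    "objective-c": ("// ", ""),
    "python": ("# ", ""),
    "sql": ("-- ", ""),
    "html": ("<!-- ", " -->"),
    "css": ("/* ", " */"),
}


def tag_snippet(code, language, snippet_id):
    """Prepend a codesnippet comment using the language's comment syntax.

    Unknown languages are returned untouched rather than risk injecting
    a line that would break the code.
    """
    style = COMMENT_STYLES.get(language.lower())
    if style is None:
        return code
    opener, closer = style
    return f"{opener}codesnippet:{snippet_id}{closer}\n{code}"


if __name__ == "__main__":
    guid = "1c125546-b87c-49ff-8130-a24a3deda659"
    print(tag_snippet("<p>hello</p>", "HTML", guid))
```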

Oooooooh, are you talking about Brice Richard here?
http://discuss.joelonsoftware.com/default.asp?joel.3.568375.67

Jivlain on April 22, 2009 4:21 AM

Looks like we're gonna have StackOverflow iPhone client pretty soon!

Mehrdad on April 22, 2009 4:29 AM

I can see it now...

A language that incorporates snippets by reference. Pages and pages of code that consists entirely of GUIDs!

Dennis on April 22, 2009 4:39 AM


I haven't ever sweated about copy and paste from the web, though I think Jeff's GUID idea is neat. My big issue is with people who copy and paste inside their own code instead of properly modularising it. I have come to the conclusion that this practice is the single most reliable indicator of a poor programmer that one can encounter.

tragomaskhalos on April 22, 2009 4:45 AM

More often you can't remember exactly how you wrote a routine (function,method) - instead of rewriting the whole thing again, you can copy and paste the whole shabang. I say that much of programming is not about writing new things, instead it's writing old crap over and over again.

Wanko on April 22, 2009 4:56 AM

Guns don't kill people.. Rappers do!

Scrimmers on April 22, 2009 4:57 AM

I'm in agreement with almost everything you said - except the whole cognitive dissonance thing - it would actually be expectation disconformance.

But besides that, I have seen copy and paste make a so-so programmer actually fairly good. They would take components that worked, slot them together, then fart-arse around until the whole thing worked. They aren't a good enough programmer to have written it from the ground up by themselves, but they are good enough to make sure the entire thing works.

In fact - I think most programming seems to be going this way.

Also - I've got to agree with SteveC above, re-editable is much better than re-usable. I don't want to re-invent the wheel, but I don't want the exact same wheel I already have.

Oh - and you listed codeproject, that has quickly become one of my goto places.

Philip on April 22, 2009 5:03 AM

This is a great idea, but I don't think it will work as intended. People who copy and paste code verbatim from the Internet aren't likely to know what a guid is or how to create one for their own code snippets, plus a lot of people don't want you to know they copied their code from the Internet instead of writing it themselves so they'll probably strip the guid comment out anyway. But for people who know what they're doing, this is definitely a keeper.

tkrehbiel on April 22, 2009 5:10 AM

I think you missed the best argument against copy'n paste code vs. usage of team leader abstraction:

If you don't use your team's established abstraction (if there is one), you may miss the chance to have similar things handled the same way automatically, and to get later improvements for free.

Johannes on April 22, 2009 5:11 AM

Maybe when pasting a code snippet from a browser into an IDE, the editor could automatically get the original URL from the browser and add it as a code comment to the pasted text? Not sure if it's technically possible, though...

Also, if the snippet site then adds the snippet GUID to the URL right from the start, you could also find the snippet even when the original site is long gone.

Oliver on April 22, 2009 5:16 AM

Seems like two very different things, copying and pasting within your own code and copying and pasting from an example on the web. The first is generally a no-no, the second doesn't seem so bad to me. I guess the GUID might be useful for large code snippets but generally I only copy small bits of code from the web, which then get modified so much they bear little resemblance to the original, so who cares if the original gets updated?

Doogal on April 22, 2009 5:17 AM

beware of the right-click warrior

whocares on April 22, 2009 5:20 AM

I see stackoverflow code sharing coming.

It would be good if snippets in posts could somehow link out to an overall database of code snippets which are updatable and version controlled.

The GUID sounds good to me. It may need a bit more detail, possibly assigning GUIDs at different levels (i.e. assembly, class, method), and maybe also including the simple stuff (language, version, date, authors/contributors) so it's more searchable.

pete on April 22, 2009 5:25 AM

I'm going to leave the Copy/Paste software engineering discussion alone, and I'd like to throw my own twist on the proposed solution.

Instead of :
// see codesnippet:1c125546-b87c-49ff-8130-a24a3deda659

I propose a URI solution, which includes traditional URLs:
// see http://www.example.com/foo/bar

As well as the URN solutions:
// see urn:uuid:1c125546-b87c-49ff-8130-a24a3deda659

If you are interested, I wrote up some more complete thoughts here:
http://www.fort-awesome.net/blog/2009/04/22/uuid_snippets_and_rfc

Cheers!

Erik Karulf on April 22, 2009 5:25 AM
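
The URN form proposed above does not have to be assembled by hand. As a small sketch, Python's standard `uuid` module exposes it through the `urn` attribute and also accepts it back when parsing; the `// see` comment layout is the one from the comment above.

```python
import uuid

# The GUID from the post, parsed into a UUID object.
snippet_id = uuid.UUID("1c125546-b87c-49ff-8130-a24a3deda659")

# RFC 4122 URN form of the same identifier.
print(f"// see {snippet_id.urn}")

# The constructor accepts the URN form too, so the comment round-trips.
assert uuid.UUID(snippet_id.urn) == snippet_id
```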

Jeff, usually an essay titled A Modest Proposal... is written in the vein of Jonathan Swift's famous one in which he suggested the Irish stave off famine by eating their own babies.

I recognize this is an honestly modest proposal, but you should be aware that most people will be assuming satire when they see the title.


That said, cool idea!

Aidan Ryan on April 22, 2009 5:30 AM

I think you confuse the issue by jumping from copying and pasting internally, and copying and pasting from external sources. The two have entirely different sets of problems.

The main problem with internal copying is that it's usually a sign that you should be reusing the code.

The main problems I see with copying from external sources, are that there's no guarantee of code quality, and that you're using the code without understanding it. Far better would be to examine the code, and then write your own implementation.

"To me, the most troubling limitation of copypasta programming is the complete disconnect between the code you've pasted and all the other viral copies of it on the web. It's impossible to locate new versions of the snippet, or fold your features and bugfixes back into the original snippet."

I don't really see why this should be necessary. Unless you are copying large amounts of functionality, in which case you really ought to be using a library, why would you need to keep searching the internet to find updated versions of the code snippet? You've inspected the code, you've modified it for your own purposes; what advantage could there be in keeping up with every minor variant out there?

Steve W on April 22, 2009 5:35 AM

Copy and paste within a single project is bad, because it is unnecessary. If you see you need the same code again elsewhere in the project, make it its own function (or its own object or method) and just call it twice. Why? Because it makes the code base smaller (less memory usage), because it cuts compile/parsing times, because if the code has a bug, you only fix it in one place (not in the 20 others where you copied it), and so on and so on.

However, copying code from another project is not necessarily bad, if you are allowed to do so and if this is good code. Your UUID idea looks like a great idea to me... but it is incomplete. E.g., what if I copy code and improve it? It is not the same code anymore, should it have the same UUID? Most likely not. But if I just give it a new one, where is the reference to the original code? So if you just copy code, I would say do it like you described

// codesnippet:{uuid}

but if you improve it (it may look like this)

// codesnippet.original:{uuid}
// codesnippet:{new-uuid}

That way you are using Google as a world wide, global SVN (or CVS if you prefer). If someone sees my code, he sees the UUID of this code (and hopefully copies it), so I can find my copied code in his project by UUID. However, he also sees the original UUID. In case he wants to see how the code looked *BEFORE* my improvements, he can just search for this UUID and will find the original code snippet.

If every author modifying this code keeps track of all the UUIDs and always adds a new UUID, you will have a *HISTORY* of code changes. This is nothing more than SVN, just that you use webpages to store the code and Google is your SVN indexing and history tracking service.

I think this is a very interesting idea and you should invest some more time into it. You may come up with a completely new way of tracking code world wide over the Internet, with all changes. Making this idea popular, Google may even create a special service for that. It searches its whole index and builds dependency trees where you can see how code evolved... you probably need to add name and date to the comment for that. In the end, Google can show you who originally created the code and when, who copied it, who modified it, who took a modification of it. Who took two modifications and merged them together again. See where I'm going here?

Mecki on April 22, 2009 5:42 AM
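
The two-line scheme described above is enough to reconstruct a snippet's ancestry mechanically. Here is a minimal Python sketch: `parse_tags`, `history` and the in-memory `index` dictionary are hypothetical stand-ins for whatever a search engine would return, and the all-zero derived UUID is made up for the demonstration.

```python
import re

# Matches both "codesnippet:<uuid>" and "codesnippet.original:<uuid>".
TAG = re.compile(r"codesnippet(\.original)?:([0-9a-fA-F-]{36})")


def parse_tags(source):
    """Return (own_id, parent_id) found in a snippet; either may be None."""
    own = parent = None
    for is_original, snippet_id in TAG.findall(source):
        if is_original:
            parent = snippet_id
        else:
            own = snippet_id
    return own, parent


def history(snippet_id, index):
    """Follow parent links from snippet_id back to the oldest known ancestor.

    index maps a snippet UUID to the source text carrying that UUID.
    """
    chain = []
    while snippet_id is not None and snippet_id not in chain:
        chain.append(snippet_id)
        source = index.get(snippet_id)
        if source is None:
            break  # ancestor not found; the trail ends here
        _, snippet_id = parse_tags(source)
    return chain


if __name__ == "__main__":
    original = "1c125546-b87c-49ff-8130-a24a3deda659"
    derived = "00000000-0000-0000-0000-000000000001"  # made-up UUID
    index = {
        original: f"// codesnippet:{original}\n// code\n",
        derived: f"// codesnippet.original:{original}\n"
                 f"// codesnippet:{derived}\n// improved code\n",
    }
    print(history(derived, index))  # newest first, original last
```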

is it april 1 the second time this year?

laugh on April 22, 2009 5:43 AM

I recognize this is an honestly modest proposal, but you should be aware that most people will be assuming satire when they see the title.

Are you sure this isn't intended as satire? Most of it seems reasonable, but the actual GUID proposal seems a bit far fetched.

Steve W on April 22, 2009 5:50 AM

I know others have said it already, but I think gist is what you're looking for.

http://gist.github.com/

Aren't you just a little sad when you find out someone has already had your great idea?

Leah Culver on April 22, 2009 5:51 AM

What a great idea and a perfect post at the perfect time for me. Just blogged a more detailed response here:
http://www.mtelligent.com/journal/2009/4/22/copy-and-paste-code-reuse-proposal-accepted.html

I have been in beta for the last two weeks of the second version of a product I created to manage code snippets. New features include integration with Snipplr, one of the code sharing sites Jeff mentioned.

I will implement this auto commenting feature in the product this weekend. I will probably need to generate the Guid's internally, making them only searchable through my search interface (or through the xml files they are stored as), but I can also automatically comment out a reference url, another property I allow snippets to be tagged with.

I will try to get this done this weekend.

David San Filippo on April 22, 2009 5:52 AM

Thanks for the GUID! I'd been looking for one all day!
Now I'll just copy paste 1c125546-b87c-49ff-8130-a24a3deda659 onto my code snippets, thanks for the advice!

:)

Robert on April 22, 2009 5:58 AM

This boggles my mind. I cannot think of one single time in the last 15 years that I have copied code from some place and not modified it to suit my needs. So none of my code would *ever* have the same GUID as some snippet on the net.

If I use a good piece of code and I want to add a comment as to where I found it (where there might be a good explanation of what it does for the next programmer to look at my code) then I will add a url to the source in my comments, not some useless GUID.

Dan A on April 22, 2009 6:02 AM

I was once completely blown away by Eclipse when I was working as a maintenance programmer in Java land.

Lots of repeated code in a module that went on for about 200 lines (gah! don't go there!). I selected the code and went to the nice refactoring tool and selected turn into method.

I thought I would have to trawl through the code and pick up all the rest of the repetition. Eclipse refactoring just identified it and rewrote the whole module.

I 'lost' 100 lines of redundant code in about 20 seconds and, when I noticed some bits of the stupidly-long case statement didn't refactor, even found a couple of bugs.

In Ruby land this doesn't happen anywhere near as often because it encourages you to think small.

Francis Fish on April 22, 2009 6:04 AM

Copy-and-paste programming IS DANGEROUS in the hands of careless, non-committed, irresponsible 'chicken' programmers, who don't bother about what they deliver as long as they get an output and are averse to thinking (brain killer). They are, at most, 'involved'.

No measures can regulate this unless there is a fundamental change in the thought process and commitment levels.

For the rest, the 'ham' types, copy and paste is just another good and fast mechanism (not a practice or way of life) to analyze, learn, contemplate, modify/extend and apply something they don't know or have not done yet!

At most, everybody who publishes code for sharing can and should be regulated, rather than the ones who use the code. This way at least wrong code is not spread.

In terms of the analogy: regulate gun manufacturing and easy access to help save lives.

radicalfish on April 22, 2009 6:09 AM

A GUID makes sense if it can be used to traceback to a code snippet - which is regulated.

In fact, a combination of a GUID and an MD5 hash would be good: the GUID can be used to identify, track and categorize snippets, while the hash can be used to track and maintain the variations.

radicalfish on April 22, 2009 6:22 AM

Isn't the simplest thing that could possibly work just putting the URL where you found it in a comment above it? :)

Over here the running joke is that .cpp actually stands for Copy Paste Programming. Thankfully it's getting better...

Benoit Miller on April 22, 2009 6:36 AM

But how often do you copy paste code and not change it?

When I'm doing copy and paste, it's because the methods are similar enough that it'll save me time, but different enough that I'd have to get into some crazy abstractions for relatively simple code to be able to refactor them.

Even in the rare case when I copy something off of the web, I at the very least rename everything to follow our conventions.

Maybe other people do more copy and paste work off the net, but in my experience, cargo cult programming sits at the end of that road. There will be no code in my project that I do not understand, and if it is a) small enough to cut and paste b) simple enough that it is doing exactly what I want without modification c) simple enough that I can understand it without a lot of work, I'll probably just go write it myself.

Also, Apple was the first to do copy and paste, and they did it with the command button. Isn't it easier to use your thumb than your pinky? Also, I'm not sure if this is intentional or not, but when working with a command line it is very nice to not have your copy command and your force-exit program command bound to the same shortcut :) Use the terminal on OS X for a half hour, then its windows cmd.exe equivalent for a while, and I think you'll agree with me :) You could always change the command line shortcut, but I think ctrl-c to exit your current process has been around even longer than cmd-c to copy.

Mike on April 22, 2009 6:44 AM

Didn't we solve this problem once through precompiled libraries? The idea of code components is pretty old by now.

The article is long gone now of course, but MSDN used to have a piece that admitted code reuse failed in C/C++ even though it became one of VB6's biggest successes via COM.

Old Joe on April 22, 2009 6:44 AM

On Sun hardware, keyboards have one button each for cut, copy and paste, so no extra dexterity (which may cause RSI) is needed...

http://www.xahlee.org/emacs/i/kb/sun_keyboard_left.jpg

Fionn on April 22, 2009 6:47 AM

I read an interesting research paper (I could dig up the citation if people are interested) about copy paste programming. They used some techniques to find repeated code in a couple different projects, then tracked what happened to it over the version history of said projects. What they found was that there was actually surprisingly little code that was repeated. Most of the time, the repeated code was quickly factored out into a separate function or the two copies diverged enough that it'd be next to impossible to generalize.

I know I've had similar things in what I've done. I'll need a near copy of some code, but there will be some difference that makes it hard to generalize. (The sort where the only way I see to do it is to have the function take other function pointers as parameters, factor said functionality out into a ridiculous class hierarchy just for that purpose, or pass in some extra flag that just says 'which of these three functions should I call'?. I'm having trouble thinking of concrete examples though.) So I'll copy and paste it. I'll check to see if it works, and *then* I'll see what I can do to take care of the code reuse. Sometimes I figure something out that works well, sometimes not. But if I do try to generalize and it *doesn't* work, I'll know it's a problem with the generalization and not a problem with the foundational code.

Evan on April 22, 2009 6:52 AM

Adding a GUID seems kinda useless to me. You want it to track a snippet. Why not just subscribe to updates on the blog of the author?

Otherwise, just use the snippet databases (or even StackOverflow) to keep track of snippets.

Michel Billard on April 22, 2009 7:01 AM

I love the double meaning of Undoubtedly the most popular reason for
creating a routine - mostly programmers that simply do a routine are the ones that copy and paste (too much)

I love the double meaning of Undoubtedly the most popular reason for
creating a routine - mostly programmers that simply do a routine are the ones that copy and paste (too much)

I love the double meaning of Undoubtedly the most popular reason for
creating a routine - mostly programmers that simply do a routine are the ones that copy and paste (too much)

*smug smile*

Seth on April 22, 2009 7:10 AM

How about instead of just a GUID we add some sort of general meta-data? Include the GUID for uniqueness, but add some tags or other data as well.

If I want to search and find a quick-sort algorithm, I'm not going to search by GUID. I want to search by sort or quick-sort.

Brian on April 22, 2009 7:23 AM

This sounds like a solution in search of a problem. A unique ID for every code snippet? WTF are you talking about? :-)

Andy Roublev on April 22, 2009 7:47 AM

github's gists work well, but your idea is nice too as it does not rely on a single vendor.

Austin Wise on April 22, 2009 7:47 AM

We had a coworker who, instead of copy-pasting from preexisting code into new code, cut-and-pasted from the preexisting code into the new code. Naturally, hilarity ensued.

Doug T on April 22, 2009 7:55 AM

Hey, I can post some copyrighted (or GPL-ed) code with a snip-it on SO anonymously. Don't mention the license. Then screw everyone using that code. I'll know where you got the code - you'll have the link in your code. How can you prove it was me who actually posted the code?

Aardvark on April 22, 2009 8:11 AM

I Like It.

josh on April 22, 2009 8:32 AM

Yeah, agree with the others who have pointed out that they never copy/paste without modifying the snippet immediately afterwards. The GUID idea, while clever, seems very much like overkill for most use cases.

Ryan C on April 22, 2009 8:36 AM

This must be satire.

Having said that, if you are going to post snippets just use gist.github.com and move along.

Anthony Eden on April 22, 2009 8:40 AM

How are you going to learn without Googling and looking at someone else's code? If you crack open the manual or API reference, you get overkill on how each object was designed (written by the object designer), but few samples on how to call them or how a program could link them together.


Let's say you want to make an awesome chicken cacciatore for dinner. Do you :
A) Go to culinary school, learn the most advanced techniques of food preparation, read books on how Emeril cooks, go to the store and read all the nutrition information on each package, use your genius to architect the perfect design of ingredients and preparation?

B) Google chicken cacciatore and look at the first couple of recipes, print one, and try it out.

If you choose B, you are still building the meal yourself, and can decide if you like the approach. You make use of shared community knowledge, and you save a lot of time.

Rob McCauley on April 22, 2009 8:46 AM

Careless copy and paste is a bad idea. Even good programmers do this at times and fall into the category of bad programmers. It is just a way to become a bad programmer. When I am stuck with something I usually stick to the help provided, but at times I do search on Google and use the code. I take care to first understand the code, and also make sure that I include the source in a comment, and if there is an email address, I send a thank-you note to the author (I started this recently, actually). On a recent project, unfortunately, I was taken off the project. Now they have come back to me because they want more changes, and the guy who took over from where I left off refused to work on it further. I was going through the code written and I was shocked. Instead of writing one function for populating different ranges, the same code is repeated for 6 different ranges. The code basically activates a worksheet, picks a range, activates another workbook and pastes the code. So we have six different functions that do the same thing, and making corrections to the code is a nightmare. The worst part is that I cannot run the functions separately. I have to run another function which copies and formats the data; after that function, these six different functions are run step by step. I thought that my design was bad, but after going through this, my confidence is back that I could write neat code. To keep my feet on the ground and not get overconfident, I attribute this to God and give God the credit.

Anand.V.V.N on April 22, 2009 8:50 AM

I always thought Curly's law made more sense with the Single Responsibility Principle (Every object should have one responsibility and do it well / An object should have only one reason to change) as opposed to DRY.

Eagan on April 22, 2009 8:55 AM

I think what a lot of people are missing is that a GUID would cause the snippet to show up in search results, no matter where the code is pasted.

So the internet, in a sense, can have a discussion about a snippet and you can see the most relevant discussions/updates by simply searching the GUID.

Practicality on April 22, 2009 9:09 AM

What about copyright issues on code snippets? Aardvark mentioned GPL-ed code, but _any_ code I write is copyright me, and you can't use it unless I explicitly say you can. I see SO puts every comment under cc-wiki, but many code sites, blogs, newsgroups, etc that I run across don't even have a license. And with our lawyer all up in my face about documenting where every line of our code comes from, that share-alike clause in cc-wiki would probably freak him out. He'd probably want SO blocked at the firewall!

MarcT on April 22, 2009 9:13 AM

@MarcT. You probably should get a new lawyer :)

Practicality on April 22, 2009 9:27 AM

Get your guid needs taken care of here:

http://www.givemeaguid.com/


Will Shaver on April 22, 2009 9:32 AM

This seems like a workable proposal.

Many have commented that we often or always immediately modify a code snippet found on the Internet. You'll only generate a new, additional ID for *your* version if you publish your updated code on a web site.

It might be a solution in search of a problem as Andy Roublev suggests, but at least it's an easy-to-implement solution.

Zack on April 22, 2009 9:36 AM

Nit-pick: CodeProject isn't a snippet site, at least not in the sense that you're describing (yes, they do advertise snippets on their front page...). Indeed, articles that consist of little more than a code snippet have traditionally been discouraged, although more recently they've opened it up a bit with a blog integration service.

Which isn't to say there aren't snippets. CP hosts an extensive forum system, where questions are often answered with code. Think: web-based USENET.

And, IMHO, that's where your suggestion falls apart. I rarely find code solutions on actual snippet sites; by far the most common come from blog posts, newsgroups, forums like CP, and QA sites like SO. None of which *require* any actual separation between text and code, even if they allow it.

Fortunately, it doesn't matter: forget GUIDs, use the URL where you found the code. It's a more direct link back to the source, and holds a better chance of being preserved in an attribution for derivative works as well....

Shog9 on April 22, 2009 9:45 AM

One thing,
"Pagerank means that you're more likely to find code that might be higher quality"
If this were true then googling for Apache docs would get you Apache 2.2 docs instead of 2.0. Googling for Subversion docs would get you svn 1.5 docs instead of 1.4 (or far older). I've been bitten by this when I searched and didn't pay attention to the version in what I found.

I can only assume it's worse when searching for random snippets.

Rob Russell on April 22, 2009 9:50 AM

How would spammers use this to make our lives miserable?

Timothy Lee Russell on April 22, 2009 9:51 AM

I'd rather hunt down the code snippet I need than peppering my source code with gross-looking GUIDs.

Marc on April 22, 2009 9:59 AM

I had the misfortune in a previous job of working with a CPR programmer: Copy, Paste, Rename a few variables and you are done. It was so great trying to debug imaging code inside a database project. (*NOT*) It seemed this guy just renamed variables based on what he was doing at the time, with absolutely no understanding of the intent of the code.

Paul on April 22, 2009 10:11 AM

I can't believe this article. Not a single reference to the legality of pasting code off the web? I never copy code off the web because the copyright is never mine and licensing is usually non-existent. If people start GUIDing all these little code snippets and you paste the GUID into your code, all you have done is made the copyright infringement case against you air-tight.

It's an interesting idea. But it needs a front end where submitters can agree to license out the snippets according to some license, preferably a non-attributed license, viral or not.

jmucchiello on April 22, 2009 10:51 AM

Why not include the original URL instead of the UID? Improvements on the code should appear there.

kikito on April 22, 2009 10:53 AM

Good concept, but for this to really work there needs to be a common repository of these GUIDs and who the original source is, as well as a version history and where those versions are located. While it's good in concept to use Google to find the ID and where it is referred to, the context of the original author is lost, as well as the version history of the code in question.

Imagine if a world-changing algorithm gets published to the web with a GUID, gets copied to thousands of blogs, then the original author finds and corrects a serious bug and reposts. The problem here will be: what is the current version you're searching for on the web? If I make a change to someone else's algorithm, what version do I call it? Am I working with the original or someone else's copy? Did they make any changes?

The other problem with this concept is that it requires that everyone plays by the same rules. A single repository can enforce those rules.

We currently have services for sharing and serving images on demand, why not have one for code snippets?

SKamradt on April 22, 2009 10:55 AM

Bah! The gods (inventors of Unix) decreed long ago that Ctrl-C shall be the INTERRUPT key. Using it for COPY is an abomination.

It was probably accidental, but Apple's decision to use Cmd-C for COPY meant that the proper use of Ctrl-C was easily accommodated when they introduced Unix underpinnings with OS X.

CorkyAgain on April 22, 2009 11:08 AM

If we start using GUIDs for everything we're going to run out of them.

fschwiet on April 22, 2009 11:09 AM

Very nice point. I post on a beginner's VB .NET forum and see all too often coders more interested in a copy-paste-email-to-professor solution. It makes me sad because even when I have to find a snippet on a blog, I won't paste it until I know what it does and how it does it.

It makes me sad that things like Curly's Law are so easy to learn, do so much for your code quality, and aren't taught in any schools but Hard Knocks.

Owen on April 22, 2009 11:10 AM

I read an article in ACM Queue proposing copy/paste as a method for developing code. The editor would keep track of these actions and prompt for multi-editing whenever code that was copied or pasted is edited. Seemed like a good idea.

Brad on April 22, 2009 11:27 AM

I'm noticing a bit of a disconnect here, brought on by language.

Copy and Paste is not bad. It never has been. Duplicate code (in the same project) is PURE EVIL. It is the first and last thing you should be concerned with as you are coding. People use "copy/paste" as a shortcut term for "creating duplicate code within a project", but then others hear that and think "So I can't use cut and paste to move code??"

Bringing in code from another project or the web (be it via copy and paste, a single object or a library) isn't a problem at all.

Cut and paste, by the way, is never an issue since you aren't even creating a copy!

Finally, as for the idea of tracking where you got something from the web--I consider a snippet something to learn from. You slide it into your toolbox and keep it if it's valuable.

Once you've learned the principles, why would you ever have to go back to the source? Your version will almost certainly be adapted and combined with other code anyway--I can't remember ever taking code off the web and just leaving even a single line as-is. (But then I'm not a web programmer; there is a lot more boilerplate there, which is likely an indication of a bad programming language/environment anyway.)

If you can't import a snip of code into your own brain, understand and re-write it, however, don't put it in your code!

Bill K on April 22, 2009 11:27 AM

(yes, I know that in OS X, the keyboard shortcut for cut and paste uses "crazy Prince symbol key" instead of control, like God intended.

So... let's assume you're a real programmer, who uses real servers, with a real command line.

You SSH into your server, do a locate for a particular file, then wish to copy that filename.

On a mac you select that text, hit command-C, and bang it's copied.

On a PC you hit control-C and bang you've terminated some process...

That's if you can remember to right click first and select Mark. Blech.

The Apple convention for the command key to be completely local to the computer makes sense. The PC convention to occasionally send control keys and occasionally capture them makes no sense at all.

As usual Microsoft copies Apple, but misses the point.

Ben on April 22, 2009 11:30 AM

Instead of trying to predict whether it's going to be useful, why not just try it for a while and see whether it starts providing some value? If it does, I'm sure we'll hear about it. That's the Agile way! Try anything and see what sticks! ;)

Allann on April 22, 2009 11:31 AM

I don't DRY so much as DRYMTO (don't repeat yourself more than once). So I will copy and paste one time and if I find a need to do it once more I will refactor. I don't consider 2 instances to be compelling enough to necessitate a method. Having two instances is also a warning sign for me that there is a design issue. Just blindly methodizing it means you might miss the forest for the trees (two trees in this case).

The larger issue is that you can clutter up an object with too many methods that don't have true utility so adding methods willy-nilly is not the best policy - especially if you're being paranoid about it and making methods for things that *might* be repeated some time in the future.

I try to add the obvious stuff and then see how the code grows on its own - refactoring as I go.

Also, there are languages (I'm looking at you, Java) that force you to write a lot of boilerplate code. Why wouldn't you copy/paste that stuff? If that's the idiom of the language then trying to refactor it would just cause a lot of confusion for the poor sap who has to read it later.

Matt Lentzner on April 22, 2009 11:39 AM

Just a side comment, the ergonomics of the command key vs the control key are one of my favorite things about the Mac keyboard. I don't see why it is 'heathen' - I always find it less of a stretch to reach with my thumb vs my pinky for control. YMMV.

Jeff B on April 22, 2009 11:39 AM

Hmmm...

Programming with GUIDs. The worm has certainly turned.

Think of all the tools you would need for this language.

First you will be programming with only GUIDs, but will then realize that there is a simpler way to program by using a GUID-assembler. After a few years of this, somebody will then create a higher-level language that will abstract out all these numbers and letters so that you eventually end up with an English-like programming language called G.

You think formatting wars are a problem. Already people are creating a war about the actual algorithm to be utilized to generate the reference (@Joe Chung).

Is there really a problem with copy and paste? What kind of paradox is this? In order to solve a problem you actually end up with the same problem (See first part of comment).

Joseph Gutierrez on April 22, 2009 11:41 AM

The argument that I always use to force myself to not copy/paste is that writing it improves retention: I'll remember it better even if I transcribe it verbatim over copy/pasting it.

Steve-O on April 22, 2009 12:24 PM

Like any other programmer, I constantly am googling to find solutions for minor problems I'm having. Often, that solution is found in a ten line code snippet. However, in all my years of programming, I cannot recall a time when I pasted that snippet into my program unmodified and left it at that. I use the snippet as a starting point, but then I always end up refactoring the hell out of it, such that it usually ends up hardly resembling the original snippet. So in my case, I don't really see how ad-hoc version control would help me; by the time I'm done with a pasted code snippet, it usually bears little resemblance to the original copy.

EvanM on April 22, 2009 12:31 PM

This GUID thing is never going to work.
As for copy-n-paste, it's a necessary evil I've grown used to tolerating. I try to avoid it wherever possible, considering it, as you say, an indicator of a design error, but it's sometimes better than the alternative, such as:

bool func(const float param) { ... }
bool func(const int param) { ... }

In cases like this, I often resort to copy-paste. The alternative involves templates, and I often see them as a far too generic tool (you know they're still somewhat not quite right in C++). Having those two funcs makes the limitation more explicit, which I consider to be an advantage in the long run.
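One possible middle ground, sketched under the assumption that the two bodies really are identical: keep the two explicit overloads as the public interface and have them forward to a single template helper. The `detail::func_impl` name and its placeholder check are hypothetical stand-ins for the elided `{ ... }` bodies.

```cpp
namespace detail {
// Hypothetical shared body; the real logic is elided in the original.
template <typename T>
bool func_impl(const T param) {
    return param > T(0);  // placeholder check, for illustration only
}
}  // namespace detail

// Only these two overloads are exposed, so the supported types stay explicit.
bool func(const float param) { return detail::func_impl(param); }
bool func(const int param) { return detail::func_impl(param); }
```

The limitation stays visible to callers: passing a `double` is ambiguous between the two overloads and fails to compile, while the body lives in only one place.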

I'd like it if some IDE had special copy support to track those things even inside the same file. I know some do indeed have a snippet management system but I'm still not accustomed to it (it doesn't seem to work as I would like it to).

MaxDZ8 on April 22, 2009 12:33 PM

+1 for gist

+infinity for github in general

taelor on April 22, 2009 12:48 PM

By copying and pasting you quickly find and get what you need. If the code isn't up to the standards and designs you are using, you have to format it first. Also, of course, you should make subroutines instead of pasting, but sometimes you just copy the general block of code that calls some subroutine. For example, you copy a database call and just change the call details, but you get the way that database calls are done in that particular project. There are many ways to call a database, from simple calls to object mappers, and the programming language can vary too. It's hard to remember how the calls are done in which project. Better design makes the process easier, as you don't have to guess what the programming standard and method is this time.

Silvercode on April 22, 2009 12:51 PM

Donald Knuth said this in an interview (http://www.informit.com/articles/article.aspx?p=1193856):
I also must confess to a strong bias against the fashion for reusable code. To me, re-editable code is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you're totally convinced that reusable code is wonderful, I probably won't be able to sway you anyway, but you'll never convince me that reusable code isn't mostly a menace.

Also, there is some neat C-code at
CCAN (Comprehensive C Archive Network): http://ccan.ozlabs.org/

SteveC on April 22, 2009 1:21 PM

Also, Apple was the first to do copy and paste, and they did it with the command button. Isn't it easier to use your thumb than your pinky?

No. ???

Just a side comment, the ergonomics of the command key vs the control key are one of my favorite things about the Mac keyboard. I don't see why it is 'heathen' - I always find it less of a stretch to reach with my thumb vs my pinky for control. YMMV.

Do you use two hands to cut/copy/paste? Because otherwise I have difficulty seeing how that's not a contortion for a person with regular sized hands. And maybe a Mac keybinding subtly trains you to do it two handed and other bindings train you to do it single handed. If I just rest my hands naturally on a keyboard, with my fingers not specifically moved to the home row in typing position, my left pinky is on left ctrl and my left pointer is on C.

Ens on April 22, 2009 1:47 PM

Copy isn't so evil, Paste is the big culprit...

Code examples on the net should be rendered out as Images so you could look at them, use them as a reference, but aren't able to paste them directly into your own code.

Pasting someone else's code incorporates EVERYTHING from the source. Variable naming, object selection, poor algorithms, poor performance.

And the developer doesn't learn much (if anything at all) by doing it.

The final code is the end result of a virtual blueprint that the developer has designed in their head. Grabbing bits and pieces of other people's stuff is certainly going to clutter your own model and/or change its essence.

Do you consider developing software an Art? Or a Process?

Copy/Pasting the Mona Lisa's head is not proper in the former but encouraged if it's the latter and you are baking a cake.

David E. on April 22, 2009 1:57 PM

Are all O'Reilly readers / Safari users bad programmers?

Nope. That would only be all of the O'Reilly readers / Safari users who approach software development as simple as cut, paste and program.

:-)

Oldster on April 23, 2009 2:12 AM

But copy and paste lets you put in crazy characters easily. Like U+202E, which has the ability to reverse all of the text so that it reads from right to left instead of left to right. This would be hard to do without cut and paste.
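If that worries you, pasted code can be checked before it goes in. A minimal sketch, assuming the text is UTF-8 encoded; the `contains_rtl_override` name is made up for this example:

```cpp
#include <string>

// U+202E (RIGHT-TO-LEFT OVERRIDE) encoded as UTF-8 is the bytes E2 80 AE.
bool contains_rtl_override(const std::string& utf8_text) {
    return utf8_text.find("\xE2\x80\xAE") != std::string::npos;
}
```

This only looks for that one character; other bidirectional control characters would need their own byte sequences.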

trouble maker on April 23, 2009 2:55 AM

A good idea, but I want to reiterate that github gists are the real solution here as it allows anyone to edit and improve a snippet and a clear and easy to follow revision history is available for everyone to see.

Of course if github goes down there's your single point of failure.

Max Howell on April 23, 2009 3:13 AM

I'm very leery of copy/pasting code. I don't want code that I don't understand running. That's how bugs get introduced, and I like writing code that doesn't have bugs in it.

Sometimes though, you're left with no choice, when working with undocumented or poorly documented APIs. In which case I hold my nose, and also experiment with the code until I understand what each line is doing.

Bill on April 23, 2009 4:15 AM

you (or someone you work with) isn't falling into the trap

Grammar nazi here - you () AREN'T falling

Regis on April 23, 2009 4:44 AM


The comments to this entry are closed.