July 17, 2007
Tim Berners-Lee on the Principle of Least Power:
Computer Science spent the last forty years making languages which were as powerful as possible. Nowadays we have to appreciate the reasons for picking not the most powerful solution but the least powerful. The less powerful the language, the more you can do with the data stored in that language. If you write it in a simple declarative form, anyone can write a program to analyze it. If, for example, a web page with weather data has RDF describing that data, a user can retrieve it as a table, perhaps average it, plot it, deduce things from it in combination with other information. At the other end of the scale is the weather information portrayed by the cunning Java applet. While this might allow a very cool user interface, it cannot be analyzed at all. The search engine finding the page will have no idea of what the data is or what it is about. The only way to find out what a Java applet means is to set it running in front of a person.
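Berners-Lee's weather example can be sketched in a few lines. Assuming hypothetical weather readings published in a simple declarative form (CSV here, standing in for RDF), anyone can retrieve them as a table and average them with no knowledge of the page that displayed them:

```python
import csv
import io

# Hypothetical weather readings in a simple declarative form.
# Any program can parse this without knowing anything about the
# page that displayed it -- the point of the Principle of Least Power.
RAW = """date,city,temp_c
2007-07-15,London,21.0
2007-07-16,London,19.5
2007-07-17,London,23.5
"""

def average_temp(raw_csv):
    """Retrieve the readings as a table and average the temperatures."""
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    temps = [float(r["temp_c"]) for r in rows]
    return sum(temps) / len(temps)

print(average_temp(RAW))  # plain data is trivially analyzable
```

The same data locked inside a compiled applet would require running the applet and reading the screen.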
If you liked that article, I recommend the rest of Berners-Lee's architectural and philosophical points page. Although the content is quite old in internet time (only two of the articles were written in the last year), it still contains some timeless nuggets of advice and insight from the guy who invented the world wide web.
Posted by Jeff Atwood
I agree with Jeff.
It's a bit like Lego - lots of simple, small, reusable pieces allows you to build all sorts of structures, whereas a house is just a house.
Well, my favourite weather website recently changed its UI from good old HTML to Flash. Now instead of opening a bookmark, I have to go through menus every time I want to know something about the weather. So we get something slower and harder to use, but "prettier" (for some value of pretty). I wouldn't say they gained anything power-wise...
I don't get it. Presenting data in a usable form has nothing to do with the expressive power of a programming language. In fact, I'd say the inverse applies. If you're stuck in a contorted language you're more likely to output some horrid, contorted thing, while if you're in a clean language, it should be straightforward to get a clear output of your data structures.
Putting weather data up in a fancy Java applet instead of as straightforward data doesn't reflect the power of a programming language. It reflects the idiocy of a programmer.
I have read quite a bit lately about the perceived bloat that is occurring with other languages. I find your less is more take to be quite interesting. Thanks for sharing.
Oh, and by the way, I'll never forgive Tim Berners-Lee for not enforcing the simple rule that every tag should be closed. Ugh. We're still paying the price for that one... (Tell me if my hatred is misplaced...)
VIC-20 Basic... now there was a language that knew what a semicolon was really all about...
10 PRINT "STEVE IS SKILL " ;
20 GOTO 10
And then more and more people have to rewrite their beautiful Ajax apps on every new browser release.
That seems to be what Silverlight is addressing right now.
And following a standard seems to be exactly what all browsers (except Mozilla) have NOT been doing since the very beginning.
Jesse McNelis: "This has nothing to do with language, it has to do with data structures.
Programming languages should be powerful, data structures should be accessible."
Yes, exactly. The linked article conflates "power" with "data transparency". There are a LOT of languages which would be called quite un-powerful, but which are also quite data-opaque. For instance, Word 2.0's BASIC language was, quite undeniably, very, very weak. However, if it were displaying a dialog box, do you think any other application would be able to see that data?
In the linked article, the "Principle" is "Powerful languages inhibit information reuse." However, the body of the article then talks about being able to determine what a program will do via code analysis ("powerful" languages, it is argued, will be harder to decipher in this way and will require running ... not sure if I agree with that either.) In other words, the "information reuse" is the algorithm or process, NOT the data.
Fundamentally, I think the article is not allowing for multi-tiered applications at the client (i.e., a web service sends down raw data; the client displays it all prettified, and copy/paste-able). That type of architecture allows the individual user to get at the data, AND an automated tool to get at the data without having to run the prettifier.
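The two-tier idea can be sketched quickly. In this hypothetical example, one "service" function stands in for a web service shipping raw structured data, and both the human-facing renderer and an automated tool consume the same payload:

```python
import json

# Stand-in for a web service that ships raw, structured weather data.
# All names and values here are made up for illustration.
def weather_service():
    return json.dumps({"city": "Oslo", "temp_c": 18.0, "condition": "rain"})

def render_for_humans(raw):
    """The prettifier: formats the raw data for display."""
    d = json.loads(raw)
    return f"{d['city']}: {d['temp_c']}\u00b0C, {d['condition']}"

def analyze(raw):
    """An automated tool gets at the same data without running the prettifier."""
    return json.loads(raw)["temp_c"]

raw = weather_service()
print(render_for_humans(raw))
print(analyze(raw))
```

The display layer can be as cool as you like; the data stays accessible because it travels separately.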
Example: given the raytracer link. Is its data more, or less, accessible (meaning, algorithms, raw data, and produced results) than a similar project in Java or C++? I would guess that the "more powerful" languages might allow for a much more transparent process and results.
IMHO, it all boils down to the simple and well-known law: "Use the right tool for the job." If you are conveying text information, use a text markup language. If you are performing complex mathematical operations, use a language capable of doing those. If you are dealing with complex data structures, a language with more than laughable support for complex data structures would be advised.
Sorry, Mr Berners-Lee.
Tim Berners-Lee (obviously no dummy) seems to be conflating simplicity of syntax and program structure with a lack of power, which simply isn't the case, as many previous commentators have noted.
Maybe this rule should be restated as: "Programs/data should be as simple as possible, and no simpler." (with apologies to Einstein).
@David: not requiring a closing tag is a "feature" of SGML, of which HTML is an application.
I find Atwood's law interesting, though I don't know what you mean. I assume your law is somewhat tongue in cheek, but I'm not sure what you're getting at. Are you saying that people will run everything possible in a browser? Or that programmers are moving to simpler, more permissive languages? Or that programmers like showing off, building space shuttles out of popsicle sticks just to show they can?
One of my biggest complaints with Windows is how it has taken what in the Unix world would be straightforward ASCII text and made it "object oriented". In Unix, everything is a file stream, while in Windows it is an object.
I understand why the Object model is so powerful, except that it means I can't simply parse through the data without the appropriate API and linking the DLL to my code. Even if I can do that, I have to go through dozens of pages of documentation to delve into the information I need. What I thought was just some straightforward data is really pieces of two different objects which are members of a particular sub-class that is a member of two meta-classes.
And, about half the time, the API was designed so that the two pieces of data really have no relationship to each other. So, not only do I have to learn the entire class structure of the model, but I still end up doing a lot of hacking to get what I want.
People make fun of me because I still code so much in Perl -- that's so 20th century! However, you give me some data in straight, plain text, and with Perl, I'll get that information to dance.
Wait a second... Al Gore didn't invent the world wide web?
This has nothing to do with language, it has to do with data structures.
Programming languages should be powerful, data structures should be accessible.
Just wait until you have to go through the contortions needed to code all your business logic in BPEL XML instead of a real programming language.
I can see where the Java applet is counterintuitive to getting and parsing the information. I guess you could say Microsoft WPF could be a solution to this. The GUI is written entirely in XAML, which is XML-based, and can be parsed and analyzed.
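Since XAML is XML under the hood, a generic XML parser can walk a UI definition without ever running it. A minimal sketch, using a hypothetical, heavily simplified fragment (not real WPF output):

```python
import xml.etree.ElementTree as ET

# A hypothetical, simplified XAML-like fragment. Because it is plain XML,
# its contents are visible to any parser -- unlike a compiled applet.
XAML = """
<StackPanel xmlns="hypothetical-ns">
  <TextBlock Name="City">Seattle</TextBlock>
  <TextBlock Name="TempC">17</TextBlock>
</StackPanel>
"""

root = ET.fromstring(XAML)
# Collect the text of every named element; unprefixed attributes
# like Name are not namespaced, so plain .get() works.
values = {el.get("Name"): el.text for el in root.iter() if el.get("Name")}
print(values)
```

A search engine or screen reader could do the same walk, which is exactly what it cannot do with a rendered applet.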
I'm reminded of the Principle of the Least Fun. Which says that the fun of using computers decreases in proportion as computers become more complex and, ha-ha, user-friendly. I.e., WordStar was infinitely more fun than MS Word, the more rudimentary versions of Linux, once mastered, are more fun than Ubuntu. Etc.
David: "One of my biggest complaints with Windows is how it has taken what in the Unix world would be straightforward ASCII text and made it "object oriented". In Unix, everything is a file stream, while in Windows it is an object."
Umm... What? Windows doesn't have a file object. The .NET Framework does, but Windows itself doesn't. The Windows API has the OpenFile() function that returns a file handle, like in Unix. You use that handle with ReadFile(), just like in Unix.
@Jesse McNelis Aaron G.
At first, I was a bit confused by Tim's writing too. In my mind, the word "language" was naturally restricted to refer to "programming language" and he sounded like a dumb guy who didn't know his foot from a PDF document on the ground.
I needed to step back a bit from the programmer's POV and realize that there is a thing called a "Markup Language" (the ML in X/SG/HTML).
He's talking mainly about markup languages, and when you read it that way, it surely does make sense.
I assume [Atwood's Law] is somewhat tongue in cheek, but I'm not sure what you're getting at. Are you saying that people will run everything possible in a browser? Or that programmers are moving to simpler, more permissive languages? Or that programmers like showing off, building space shuttles out of popsicle sticks just to show they can?
Can I select "all of the above"?
If the weather program were written as a ball of JS webapp code rather than a Java applet, that wouldn't necessarily make the weather data any easier to pull out of it.
I see he used a NeXTCube -- what ever happened to this wonderful machine?
Is anyone using this now?
The NeXT OS and APIs were subsumed into Apple's OS X. NeXT applications can generally be ported to OS X with little more than a recompile. Examples include Apple's Mail and TextEdit apps, and OmniGroup's OmniWeb web browser.
I have to agree with some of the other responses. At the least, we haven't been given a good example of why a less powerful language would be better, but rather why it's important to have easily parse-able output.
Feasibly you could even have some server-based Java code generating the web page in a nice, parse-able format, just as if you were using RoR or PHP - and I fail to see how using a simpler language to produce the same text output would be better.
It would be different if analysis of the code itself were being done; but even then, I'd think that more concise, legible code would be easier to analyze. Though I don't see why many people would want to do that (with another language, at least). Extending a program, on the other hand, is something that people would want to do, and something that's fairly prevalent with AJAX; but I fail to see how it's the language's lack of power that allows this.
I don't like that Java applet example. On the one hand you have the raw text data, on the other hand you have a visual image made by Java - and !shock! the image is harder to parse. On the one hand the text is already in the right format, but on the other you have to use complex OCR algorithms.
This is like saying: some raw ASCII text is easy to read as raw ASCII text, but it's harder to parse text that's part of a photo of some rather sloppy handwriting written by a drunken programmer. Maybe I'm suffering from Fridayitis and my brain has decided to stop working.
Browsers ARE following standards. The days of the browser wars are over. They might not be able to immediately make huge leaps and bounds and get the standards perfect straight away, but no one wants to return to the days of proprietary scripts and tags.
This subject reminds me of Occam's (or Ockham's; I checked it on Wikipedia) Razor: "Entities should not be multiplied beyond necessity", or "All things being equal, the simplest solution tends to be the best." Now if one were to compare this rather eloquent bit of logic to the principle of least power, one should dismiss the cluttered logic of the principle and replace it with Occam's. After 30 years of writing code in 5 different languages, Occam's logic has never failed. "ALL THINGS ARE ESSENTIALLY EQUAL" before one starts the journey down that project road.
David: "One of my biggest complaints with Windows is how it has taken what in the Unix world would be straightforward ASCII text and made it 'object oriented'. In Unix, everything is a file stream, while in Windows it is an object."
I've got a dumb question. What is an .so file? What's the difference between ld and ldd?
I'm sorry, that was way too easy. However, I feel like I have to ask the next one, too. How do I connect a stream to GTK+ so I can draw a pretty GUI with it?
Atwood's law is going to come to fruition thanks in large part to the iPhone. :).
"HTML should be a dead markup language. The web has grown far beyond what it should be doing with HTML, but due to backwards compatibility, we instead force HTML to do things it was never intended to do.
So we keep hacking away, mixing markup with logic in a vain attempt to create a rich user experience."
Amen, brother. Amen.
This is how I see it:
I do understand the point this article is trying to put forward, but it very much depends on the problem that we are trying to solve with the software that is being built. In the end, we all design software to solve a certain problem that we have whether it be personal or at work.
I believe that the way in which we represent information is simply a design choice in the design phase and will depend on how "open" the software is and for what other purposes it will be used. The more closed source and "secret" the information needs to be the more complicated or obfuscated we will make it.
If we assume we are dealing with open source software, then I strongly support using formats which are simple to understand and reuse and follow some kind of standard where possible. It makes everyone's lives simpler.
One of the reasons for using a Java applet may well be to try to hide the source as much as possible. I don't think that a website which uses applets really intended other users to use that website as a source of weather data. In the end, a well designed piece of software will keep the GUI separate from the data store/representation, in which case it doesn't matter that a Java applet was used.
I agree with Tom: "Use the right tool for the job."
"HTML is not code, it's markup (data). Dublin Core Metadata and "the content of most databases" is, uh, data. ACLs are still data!
All Tim is doing is lumping code and data into the same category, then pointing at the data and saying it's easier to analyze because it's simpler. No, it's easier to analyze because it's content, not instructions on what to do with the content."
Ugh, Aaron, then what is a well-written LISP program? The whole point of LISP bottom-up programming is that you boost the language itself with higher-order functions and sometimes syntactic macros, up to the point where, when you actually get to write the application logic, what you are actually using is hardly more than an extremely handy domain-specific markup language. It just has a more succinct syntax than XML, but (customer (field 'name) (field 'address)) isn't that much harder to analyse programmatically than its XML counterpart. It's a trivial transformation to XML.
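The claimed transformation really is trivial. A minimal sketch, using nested tuples to stand in for the s-expression from the comment:

```python
# Nested tuples standing in for the s-expression
# (customer (field 'name) (field 'address)) from the comment above.
sexp = ("customer", ("field", "name"), ("field", "address"))

def to_xml(node):
    """Recursively turn (tag, child...) tuples into an XML string."""
    if isinstance(node, str):
        return node  # leaf: plain text content
    tag, *children = node
    inner = "".join(to_xml(c) for c in children)
    return f"<{tag}>{inner}</{tag}>"

print(to_xml(sexp))
# -> <customer><field>name</field><field>address</field></customer>
```

Either notation is equally analyzable by a program; the syntax is the only difference.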
So Tim is wrong, but not for the reason you mentioned. What Tim actually forgot is that power IS simplicity: the power you have at hand when you write your libraries is simplicity when you define your business logic.
The problem with understanding what Jeff is trying to convey is that you really need to be in a position to compare different languages to each other.
Remember that no language is idiot-proof (an idiot can write bad code in any language), although some seem to be genius-proof (I'm no genius, so I'm not sure, but it seems to me that certain ideas in certain languages enforce complexity). But very often people use 'generated' arguments against stuff that they have no personal experience with, which is detrimental to any debate.
This means that, combined with the ability to create objects (including anonymous functions) on the fly for use as arguments or return values, bloat gets cut by orders of magnitude.
That's simple for you.
Also, if static typing and enforced exception catching were as important as they seem, it would never have been possible to build something as large and powerful as Dojo, or indeed Yegge's yet-to-be-released Rhino on Rails.
I have a sneaking suspicion (after eight years as a Java/JEE coal-miner) that those two features (among others in the same camp) just may have been costing us a wallop more complexity than they have actually given us usable 'security' in the application.
@Peter Svensson, I'd agree that Java's type system and exception declarations contribute to the complexity of codebases. But static typing isn't *necessarily* as bad as Java's. It's *good* when the compiler catches real type errors. Java's types make you 1) spell out types all the time (the compiler can't work it out) and 2) make up meaningless types to make common situations work (for example, if I want a list that can contain As and Bs, I need to invent some common superclass that A and B inherit, whereas in dynamically-typed Ruby or better-statically-typed Haskell, I can just put As and Bs into the list).
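The heterogeneous-list point can be sketched in a dynamically typed language. In this hypothetical example, unrelated classes A and B share a list with no invented superclass, as long as each responds to the call sites that use them:

```python
# Two unrelated classes -- no common ancestor (beyond object) is
# declared anywhere, unlike the superclass Java would force you to invent.
class A:
    def describe(self):
        return "I am an A"

class B:
    def describe(self):
        return "I am a B"

# As and Bs mix freely in one list; the only "type" requirement is
# that each element happens to respond to describe().
mixed = [A(), B(), A()]
print([item.describe() for item in mixed])
```

A good statically typed language (the comment names Haskell) gets the same effect with sum types or type classes rather than by inventing a meaningless superclass.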
Of course, all that's tangential to Tim BL's main point to prefer formal languages that are lower down the Chomsky Hierarchy of languages, when deciding on a representation for the web.
HTML is not code, it's markup (data). Dublin Core Metadata and "the content of most databases" is, uh, data. ACLs are still data!
All Tim is doing is lumping code and data into the same category, then pointing at the data and saying it's easier to analyze because it's simpler. No, it's easier to analyze because it's content, not instructions on what to do with the content.
Not sure if I agree with this statement in this blog post. Language power has nothing to do with information transfer, display, or format.
Practically any language can parse, format, retrieve, or forward information. This has nothing to do with whether or not the information is accessible.
I might write a program in VB, Perl, ASP, C, Java, or .NET to create an HTML page that displays the weather. People can then look at the data. HTML is open and the data readily accessible.
I might also write a similar program that displays the weather in a Flash control, or in a Java applet, or maybe an ActiveX control written in VB, etc. In these cases, the information is no longer accessible due to the container it is now in.
In the first instance, the container was the browser and the format was HTML. This is an open means of rendering and displaying the data.
In the second case, the container was proprietary, or additional code was written not to display the raw data but to format and convert it so that it was no longer accessible.
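The first case can be sketched in any language, which is the commenter's point: the choice of language is irrelevant as long as the output is open markup. A hypothetical example with made-up values:

```python
# Hypothetical weather data; any of the languages listed above could
# emit the same HTML, and the data stays visible to whatever reads it.
weather = {"city": "Madrid", "temp_c": 31, "raining": False}

def to_html(w):
    """Render the weather as a plain HTML page."""
    rain = "raining" if w["raining"] else "not raining"
    return (f"<html><body>"
            f"<p>Weather in {w['city']}: {w['temp_c']}&deg;C, {rain}.</p>"
            f"</body></html>")

page = to_html(weather)
print(page)  # the raw numbers are right there in the markup
```

The accessibility lives in the output format, not in the power of the language that produced it.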
In early versions of the web, text and images were about it. Now it is much more complex. People want bells and whistles; you can't just display the temperature and whether or not it is raining, you have to put a 3-D or 2-D map behind it and show the location that way. Also, the user needs to manipulate the map as well. It is much more than just the temperature. The added complexity usually hides the underlying data in some fashion.
Maybe HTML just isn't good enough for the web anymore, after all it is TEXT markup. Maybe a new protocol is needed for the more complex forms of data that are being presented on the web.
HTML should be a dead markup language. The web has grown far beyond what it should be doing with HTML, but due to backwards compatibility, we instead force HTML to do things it was never intended to do.
The problem is that there is no real solid replacement. It's certainly not Flash, and Silverlight will probably not gain the across-the-board support it needs.
So we keep hacking away, mixing markup with logic in a vain attempt to create a rich user experience.
Oh, the pain. There are a lot of seeming missteps in Tim Berners-Lee's logic, but this helps explain the problems with HTML and the XML family of technologies.
The secret to the success of the Internet was e-mail, HTML for documents, and networking. These were all successful, especially the networking part. HTTP made it rather easy to get a resource from the server to the client and TCP/IP made connections work.
HTML's simplicity was a powerful factor back in the day when everyone had to implement it. Being easy to implement is much more important to early adoption than expressiveness, but unfortunately, despite the fact that we're no longer at that stage, TBL is stuck there mentally.
Ok, Atwood's law is starting to accelerate at a geometric rate.
Good call, dude.