March 23, 2012
I suppose What You See Is What You Get has its place, but as an OCD addled programmer, I have a problem with WYSIWYG as a one-size-fits-all solution. Whether it's invisible white space, or invisible formatting tags, it's been my experience that forcing people to work with invisible things they cannot directly control … inevitably backfires. A lot.
I have a distinctly Ghostbusters attitude to this problem.
I need to see these invisible things, so that I can zap them with my proton pack. I mean, er, control them. And more importantly, understand them; perhaps even master them.
I recently had the great privilege of meeting Ted Nelson, who gave me an in-person demo of his ZigZag project and his perpetually in-progress since 1960 Xanadu project, currently known as Xanadu Space. But one thing he mentioned as he gave the demo particularly intrigued me. Being Ted Nelson, of course he went much further than my natural aversion to invisible, hidden markup in content – he insisted that markup and content should never be in the same document. Far more radical.
I want to discuss what I consider one of the worst mistakes of the current software world, embedded markup; which is, regrettably, the heart of such current standards as SGML and HTML. (There are many other embedded markup systems; an interesting one is RTF. But I will concentrate on the SGML-HTML theology because of its claims and fervor.)
There is no one reason this approach is wrong; I believe it is wrong in almost every respect.
I propose a three-layer model:
A content layer to facilitate editing, content linking, and transclusion management.
A structure layer, declarable separately. Users should be able to specify entities, connections and co-presence logic, defined independently of appearance or size or contents; as well as overlay correspondence, links, transclusions, and "hoses" for movable content.
A special-effects-and-primping layer should allow the declaration of ever-so-many fonts, format blocks, fanfares, and whizbangs, and their assignment to what's in the content and structure layers.
It's an interesting, albeit extremely hand-wavy and complex, alternative. I'm unclear how you would keep the structure layer in sync with the content layer if someone is editing the content. I don't even know if there are any real world examples of this three-layer approach in action. (And as usual, feel free to correct me in the comments if I've missed anything!)
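To make the sync question concrete, here is a minimal sketch of one way the three layers could be kept apart, using stand-off spans that point into the content by character offset. This is not Nelson's design or anything from Xanadu; the `Span` class, the role names, and the HTML-ish output are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Span:
    start: int  # offset into the content string (inclusive)
    end: int    # offset into the content string (exclusive)
    role: str   # hypothetical structural role, e.g. "term" or "emphasis"


# Content layer: plain text only, no markup at all.
content = "Embedded markup considered harmful"

# Structure layer: declared separately, refers to content only by offset.
structure = [Span(0, 15, "term"), Span(27, 34, "emphasis")]

# Effects layer: maps structural roles to presentation (assumed HTML tags).
effects = {"term": ("<b>", "</b>"), "emphasis": ("<em>", "</em>")}


def render(content, structure, effects):
    """Combine the three layers into one rendered string."""
    out, pos = [], 0
    for span in sorted(structure, key=lambda s: s.start):
        open_tag, close_tag = effects[span.role]
        out.append(content[pos:span.start])
        out.append(open_tag + content[span.start:span.end] + close_tag)
        pos = span.end
    out.append(content[pos:])
    return "".join(out)


def insert_text(content, structure, at, text):
    """Insert text into the content and shift every affected span."""
    delta = len(text)
    shifted = [
        Span(
            s.start + delta if s.start >= at else s.start,
            s.end + delta if s.end > at else s.end,
            s.role,
        )
        for s in structure
    ]
    return content[:at] + text + content[at:], shifted


print(render(content, structure, effects))
# <b>Embedded markup</b> considered <em>harmful</em>

# An editor that knows about the structure layer keeps the spans aligned.
new_content, new_structure = insert_text(content, structure, 0, "Why ")
print(render(new_content, new_structure, effects))
# Why <b>Embedded markup</b> considered <em>harmful</em>

# An editor that only touches the content silently misaligns every span.
print(render("Why " + content, structure, effects))
```

The last call is the worry in a nutshell: any edit made by a tool that does not know about the structure layer leaves the spans pointing at the wrong characters, so every editor would have to go through something like `insert_text`.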
Instead, what we do have are existing, traditional methods of intermixing content and markup à la HTML or TeX.
When editing, there are two possible interfaces:
WYSIWYG where the markup layer is magically hidden so, at least in theory, the user doesn't ever have to know about markup and can focus entirely on the content. It is an illusion, but it is simple enough when it's working. The downside is that the abstraction – this idea that the markup is truly "invisible" – is rarely achieved in practice and often breaks down for anything except the most basic of documents. But it can be good enough in a lot of circumstances.
Two windows where the markup is fully visible in one window, and shown as a live rendered preview in the other window, updated as you type, either side-by-side or top-and-bottom. Users have a dynamic sandbox where they can experiment and learn how markup and content interact in the real world, rather than having it all swept under the rug. Ultimately, this results in less confusion for intermediate and advanced users. That's why I'm particularly fond of this approach, and it is what we use on Stack Exchange. The downside is that it's a bit more complex, depending on whether or not you use humane markup, and it certainly takes a bit more screen space and thinking to process what's going on.
What I didn't realize is that there's actually a third editing option: keep the markup visible, and switch rapidly back and forth between the markup and rendered view with a single keystroke. That's what the Gliimpse project reveals:
Please watch the video. The nearly instantaneous and smooth transition that Gliimpse demonstrates between markup and preview has to be seen to be appreciated. The effect is a bit like Exposé on the Mac, or Switcher on PC. I'm not sure how I feel about this, mainly because I don't know of any existing IDEs that even attempt to do anything remotely like it.
But I'd sure like to try it. As a software developer, it's incredibly frustrating to me that we have generational improvements in games like Skyrim and Battlefield 3 that render vastly detailed, dynamic worlds at 60 frames per second, yet our source code editors are advancing only in tiny incremental steps, year after year.
Posted by Jeff Atwood
So is the UI slick? Yes, but it suffers from the same problem that almost all other current editors have: I have to save and switch contexts in order to see my results. What a productivity drain.
I know many web developers (including myself) who eschew fancy IDEs in favor of developing in the browser itself: using Firebug or equivalent to 'live preview' changes as they are being typed into the window. No save, no refresh, rather immediate results.
The Gliimpse thing is pretty damn cool. That said: It would be nice if there was an argument for complete separation of content, structure, and effects that wasn't 15 years old. (The linked article of Nelson's is from 1997.) We've all had a lot more experience dealing with that kind of integration; editing markup inline with content was a relatively new thing in 1997 (yes, I realize there are prior examples), but we've all been doing it for over a decade at this point.
It might be true that for certain domains (writing textbooks?) a more rigorous separation might be valuable, but I'm not certain there would be much value in forcibly splitting apart the content and markup of a website, especially when the content is frequently a tiny fraction of the overall number of bytes involved.
I completely buy how important this approach is. Have you checked out Bret Victor's awesome video on visualizing code?
https://vimeo.com/36579366. It totally blew me away.
We obsess over how to teach young kids to code. Stuff like this rocks...
I was just about to share that same video, KellerRinaudo. Good post, Jeff. And I agree with KellerRinaudo, check out that video. It's really inspiring stuff.
I'd been thinking recently about how people often say that HTML vs. CSS is a way to separate content from layout. However, that actually seems like a bad place to try to make that division, because there are a bunch of layout-related things that live in the HTML part. For example, you might have to edit your HTML to add a new class to a div so that you can style that class in the CSS, or reorder the divs in the HTML so that one will 'float' to the correct place.
So, actually, this 3-layer model seems to map fairly well to the reality of how things are separated in most blogging/CMS tools:
1) content: the plain-text/markdown/etc. content that you create per-entry
2) structure: the templates you use to generate each page, usually HTML + a "templating language"
The line between "structure" and "polish" is blurred about the same amount between HTML and CSS as it is in the conceptual definitions of those levels that you give above.
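A rough sketch of that mapping, using only Python's standard-library `string.Template`. The sample entry, the template, and the `post` class name are made up for illustration, and treating CSS as the "polish" layer is an assumption about what the missing third item would be.

```python
from string import Template

# 1) Content: the per-entry text an author writes (hypothetical sample).
entry = {"title": "Hello", "body": "First post."}

# 2) Structure: a page template. Note the class attribute: it lives here,
#    but it exists only so the stylesheet below has something to target.
page = Template('<article class="post"><h1>$title</h1><p>$body</p></article>')

# Polish: a stylesheet keyed on the class name the template exposes.
css = ".post h1 { font-weight: bold; }"

# No HTML escaping is done here; a real CMS would escape the content.
html = page.substitute(entry)
print(html)
# <article class="post"><h1>Hello</h1><p>First post.</p></article>
```

Renaming `post` means touching both the template and the stylesheet, which is the same blur between structure and polish that shows up between HTML and CSS.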
I really liked the Gliimpse screencast and wonder how it could work with or handle more complex rendering stacks, particularly in web development, like a LAMP stack.
That said, I've created a quick AutoHotKey script to play with this idea. Basically the same thought of holding down a key >> preview in a web browser >> let go of key >> back to editing. I hope my hacked together partial-solution helps someone else out there :)
It seems to me that the anchoring of higher levels becomes possible with proper version control. So the answer is in the operating system.
I was at Bell Labs in the 90s and we had it there in the Sablime source control system.
I don't understand how we've sat through decades of OS development and version control is still something only used by programmers (and not even all of them).
You can have something similar to Valetta for LaTeX with Emacs + AUCTeX (Preview-LaTeX part of AUCTeX). You can have live preview (in a separate window) with Emacs + Whizzytex (the problem here is invalid input during writing). You can go from preview to source with source specials enabled.
Jeff, have you looked at XPages from IBM? There you have that split between data (stored in Domino databases, relational databases or other sources), business logic/functionality (using XPages/SSJS) and finally polish (using CSS).
I am not an expert on XPages, I just started developing with it, but there are plenty of resources and blogs about it where you can find out more. http://www.xpages.info is a good start.
I filmed an XPages jumpstart session at the IBM conference Lotusphere that shows XPages development; about 2-3 minutes into the video there is a slide that explains the architecture. Take a look at http://youtu.be/0ViUTfAzoTo if you like.
I too would recommend watching Bret Victor's video.
There's actually a fourth option: hide markup and use visual means (font face, color etc.) to highlight text structure (WYSIWYM). WYSIWYG is not of much help anyway if your content is to be rendered in several formats (e.g. HTML and PDF). If you limit styles to the most essential ones then Markdown or any other markup could be used behind the scenes.
I recently began exploring this approach at http://www.textseditor.com and would appreciate feedback.
Can you ever truly separate markup and content? What if you want to emphasize a word in your content? You would need some sort of syntax to highlight the word. What if your content required a table, one that only made sense when laid out in columns and rows?
This exists for CMS applications as well. The theory is that you only store "content" in the database, and "style" in the presentation. For basic content that works, but falls apart with real world usage.
I used XML and XSL, outputting HTML, in a generated and annotated documentation project.
It seems to me pretty close to the 3 layers split.
I had no tool, though, to easily switch from generated markup to content view.
But the XML contained >90% content. Of course, the XSL can be quite complex, especially when mixing the HTML markup inside.
The Gliimpse project seems promising. Someone might create a Markdown version for Stack Overflow :-) Editing text still rules, but a lot of work can be done to make it better.
I think if the last 50 years have taught us anything, it is that there is no utopia. And I'm far more afraid of the problems with keeping content synchronized with its separated mark-up than I am of having light mark-up mixed into the content. The Ted Nelsons of the world make us think ... truly a good thing. The Tim Berners-Lees of the world give us working systems that often don't live up to the high expectations of the theorists, but they're available today.
Personally, I'm scared of people that try to persuade you to strictly follow one or the other.
Actually, not all things should evolve at the same rate.
Also, a good developer does not always have to see invisible things to control them; when coding or typing, he knows what the output will be.
And there is a fourth interface or option, but it is used commonly in drawing/CAD/painting applications.
As someone who is currently writing a WYSIWYG editor for the web (http://ghosted.it, currently in alpha, only compatible with Chrome/FF), I find this really interesting. I have found that if you want to produce good code, then people do need to have some concept of how the basics of the format work (e.g. CSS floats), but they shouldn't need to know the actual syntax.
Personally I don't see any need to go to a full code view (in theory, with a good enough editor). WYSIWYG controls are more than capable of handling basic headers and if you need/want to use TeX for complex formulas then this should be available in the area it is needed. Custom CSS values should be editable from within a visual interface, and rendered in real-time.
Cross-browser (almost), usable WYSIWYG editing is not impossible, it is just very difficult and time consuming to implement (and I certainly don't have enough time to do it). The problem with offering a code view, is that this means that the WYSIWYG editor has to be able to handle arbitrary code, and not only be able to render it, but provide intuitive editing controls for it (ever tried pasting a complex layout into TinyMCE's code view and then trying to edit it in WYSIWYG mode? Hidden elements galore. Even with a good understanding of HTML and TinyMCE's behaviour I can't work that one out).
IMO it is better for WYSIWYG editors to constrain themselves to what they can do (and producing good code for basic headers/lists/columns they certainly could).
Google should acqhire this, and roll it in to Chrome's View Source and Developer Tools.
That was awesome! But in this and many other videos (including Bret Victor's) I see another interesting trend - we're finally about to step over some kind of threshold in processing power where we can truly SEE what we are doing WHILE we are doing it. I'd say that these approaches don't replace WYSIWYG - they are the next generation of WYSIWYG! Made possible by the vast amount of unused processing power we're starting to get in our computers. I'm excited to find out what else people will come up with for the insanely powerful multi-core CPUs of today and tomorrow!
Too bad none of these things actually work yet in a real IDE!
This article, and most of the comments, all miss an essential point. The presentation is the content.
This white elephant of content and presentation, model and view, etc. is something that's plagued our user interface design for ages. That is why doing proper localization is so difficult, why text editors are so primitive, and why trying to separate out HTML and the "content" almost always results in some ungodly mixing of levels.
I don't have the answers, but as a thought experiment I offer the following question: commonly we see the content feeding into the presentation layer, but what if we reversed that and instead visualized the presentation details feeding into the content? CSS comes to mind, of course, as do some of the less formal markup languages, where the way something is rendered is resolved as a function of the markup symbol, rather than the layout adding content.
Neither of those techniques is ideal. Someone much cleverer than I will, soon I hope, come up with a contextual rendering system, probably tied to a good natural language parser.
Good article! And amen to Bret Victor's talk.
For web stuff I use the side-by-side approach, with LiveJS, works great.
Separating the layers vs. inline markup seems like MVC Frameworks vs. PHP. One is more Theoretically Good; and one is friendlier to a much wider audience. At least in part because there's nearly zero bootstrapping: "highlight and click one of these (font face/bold/color/etc.) buttons" in the word processor is good enough to get started; a fancy Styles library that lets you re-face/bold/color/etc. your document by editing the style set is an Advanced Feature that may be introduced/discovered later. If Styles were the only way to do it, that whole model would need to be learned before it produced anything useful.
I used to work on a project, where I had 3 layers like you said, namely XML, XSL-T and CSS. Unfortunately XSL-T isn't very good for the exact purpose it is created for, because it's a functional language that is supposed to walk through data - even lacking at that. It has no iterations, variables, reliable complex conditional statements or even cross-platform mathematical tools. It was interesting to use it, but in the end it was such a mess to work with it, that the next time I wanted to present data from XML on HTML+CSS I just ended up making a code to code compiler for it in Python...
Context is often forgotten when people think about these things. If you assume a context of plain web content (without floating elements and such), then this approach is great.
If you are thinking about writing a book, then structure is more important so that everything can be presented in a consistent style and the content is removed from the formatting. Here I think something like DocBook, XML-XSLT, etc. is good, given you use a good editor for it.
For code, Bret's approach is excellent as it also encourages better programming (smaller chunks of code with single responsibility etc.)
For more visual stuff like a modern website that may load content dynamically, where bits need to be designed individually to show/hide this content, you need a different approach. Here I think the only solution I have seen work is getting yourself two monitors and using live preview editors, like editing a page in Firebug.
Hence there is no "right" answer. A simple WYSIWYG editor for stuff like comments is "just-right" for its content but totally awkward for writing a book. Vice versa, DocBook or XML/XSLT is overkill for writing comments in a blog. And neither is good enough for designing a dynamic webapp.
Just my 2 cents.
Quite nice as an idea. However, I don't think it approaches solving the content->structure->effect problem. I think Nelson's idea is the correct-yet-unachievable ideal that we should strive to get asymptotically as close to as possible.
The problem is that Nelson's approach, while correct and providing for a living document that scales with the technology, requires the work to be done three times. You can ask me to do the work once, even twice, but three times?
A good example is something that I do on a daily basis: Kuali Rice (kuali.org) presentation. To do it "The Kuali way", I have to create a Java object, a JSP presentation layer, and a data dictionary. Same problem as above: I'm doing the same work three times. In practice, I don't do the data dictionary and simplistically do the JSP (i.e. just enough to make the customer happy).
The correct approach isn't WYSIWYG, it's to present "good enough" structure and effect layers to start with, let content creation occur, and then alter the structure and effect layers as part of the editing process. Perhaps I should just shut up and create that system...
Didn't WordPerfect do this back in the day? It was WYSIWYG, but you could toggle the visibility of markup at the touch of a button. Delete one tag and the other tag would be removed.
Yup; works just like WordPerfect 5.1 from two decades ago.
The rubber meets the road when you design an abstract publishing model that uses these hierarchies. They exist. The question is whose markup, and for what purpose? Is it merely to be read, or is there a semantic meaning that needs to be extracted? Is it poetry? Is it technical documentation? Is it legal material? All of these have special needs that are not easily subsumed under one universal markup. SGML tries but fails everywhere. Yet it is the most successful markup for general use.
I'll throw out the idea that to find a solution for general coding IDEs one has to ask what a bit of code should represent.
Does it represent a UML element?
Does it represent a logic gate?
Does it represent a flow control scheme?
To map a code change to its effects on a program's logical interactions seems like the natural transition.
The separation in three different layers seems interesting but hard to reach.
Here is a project which is not far from this idea :
Actually they still put markups with the plain text but the markups represent an idea rather than a style. So we end up with :
First Layer : Text + markups expressing the role and meaning of the content
Second Layer : Correlation between the markups and the style
WordPerfect. Alt-f3. Great pity that the lemmings all rushed off the appalling MS Word cliff.
WordPerfect appears to be stale and moribund. But even so I remain loyal.