HTML Validation: Does It Matter?

March 5, 2009

The web is, to put it charitably, a rather forgiving place. You can feed web browsers almost any sort of HTML markup or JavaScript code and they'll gamely try to make sense of what you've provided, and render it the best they can. In comparison, most programming languages are almost cruelly unforgiving. If there's a single character out of place, your program probably won't compile, much less run. This makes the HTML + JavaScript environment a rather unique -- and often frustrating -- software development platform.

But it doesn't have to be this way. There are provisions and mechanisms for validating your HTML markup through the official W3C Validator. Playing with the validator underscores how deep that forgiveness by default policy has permeated the greater web. Dennis Forbes recently ran a number of websites through the validator, including this one, with predictably bad results:

FAIL - http://www.reddit.com - 36 errors as XHTML 1.0 Transitional. EDIT: Rechecked Reddit, and now it's a PASS
FAIL - http://www.slashdot.org - 167 errors as HTML 4.01 Strict
FAIL - http://www.digg.com - 32 errors as XHTML 1.0 Transitional
FAIL - http://www.cnn.com - 40 errors as HTML 4.01 Transitional (inferred as no doctype was specified)
FAIL - http://www.microsoft.com - 193 errors as XHTML 1.0 Transitional
FAIL - http://www.google.com - 58 errors as HTML 4.01 Transitional
FAIL - http://www.flickr.com - 34 errors as HTML 4.01 Transitional
FAIL - http://ca.yahoo.com - 276 errors as HTML 4.01 Strict
FAIL - http://www.sourceforge.net - 65 errors as XHTML 1.0 Transitional
FAIL - http://www.joelonsoftware.com - 33 errors as XHTML 1.0 Strict
FAIL - http://www.stackoverflow.com - 58 errors as HTML 4.01 Strict
FAIL - http://www.dzone.com - 165 errors as XHTML 1.0 Transitional
FAIL - http://www.codinghorror.com/blog/ - 51 errors as HTML 4.01 Transitional
PASS - http://www.w3c.org - no errors as XHTML 1.0 Strict
PASS - http://www.linux.com - no errors as XHTML 1.0 Strict
PASS - http://www.wordpress.com - no errors as XHTML 1.0 Transitional

In short, we live in a malformed world. So much so that you begin to question whether validation matters at all. If you see this logo on a site, what does it mean to you? How will it affect your experience on that website? As a developer? As a user?

[Image: W3C validation button]

We just went through the exercise of validating Stack Overflow's HTML. I almost immediately ruled out the idea of validating as XHTML, because I vehemently agree with James Bennett:

The short and sweet reason is simply this: XHTML offers no compelling advantage -- to me -- over HTML, but even if it did it would also offer increased complexity and uncertainty that make it unappealing to me.

The whole HTML validation exercise is questionable, but validating as XHTML is flat-out masochism. Only recommended for those that enjoy pain. Or programmers. I can't always tell the difference.

Anyway, we validated as the much saner HTML 4.01 strict, and even then I'm not sure it was worth the time we spent. So many of these validation rules feel arbitrary and meaningless. And, what's worse, some of them are actively harmful. For example, this is not allowed in HTML strict:

<a href="http://www.example.com/" target="_blank">foo</a>

That's right, target, a perfectly harmless attribute for links that you want to open in a different browser tab/window, is somehow verboten in HTML 4.01 strict. There's an officially supported workaround, but it's only implemented by Opera, so in effect .. there is no workaround.

In order to comply with the HTML 4.01 strict validator, you need to remove that target attribute and replace it with JavaScript that does the same thing. So, immediately I began to wonder: Is anybody validating our JavaScript? What about our CSS? Is anyone validating the DOM manipulations that JavaScript performs on our HTML? Who validates the validator? Why can't I stop thinking about zebras?

Does it really matter if we render stuff this way..

<td width="80">
<br/>

.. versus this way?

<td style="width:80px">
<br>

I mean, who makes up these rules? And for what reason?

I couldn't help feeling that validating as HTML 4.01 strict, at least in our case, was a giant exercise in to-may-to versus to-mah-to, punctuated by aggravating changes that we were forced to make for no practical benefit. (Also, if you have a ton of user-generated content like we do, you can pretty much throw any fantasies of 100% perfect validation right out the window.)

That said, validation does have its charms. There were a few things that the validation process exposed in our HTML markup that were clearly wrong -- an orphaned tag here, and a few inconsistencies in the way we applied tags there. Mark Pilgrim makes the case for validation:

I am not claiming that your page, once validated, will automatically render flawlessly in every browser; it may not. I am also not claiming that there aren't talented designers who can create old-style "Tag Soup" pages that do work flawlessly in every browser; there certainly are. But the validator is an automated tool that can highlight small but important errors that are difficult to track down by hand. If you create valid markup most of the time, you can take advantage of this automation to catch your occasional mistakes. But if your markup is nowhere near valid, you'll be flying blind when something goes wrong. The validator will spit out dozens or even hundreds of errors on your page, and finding the one that is actually causing your problem will be like finding a needle in a haystack.

There is some truth to this. Learning the rules of the validator, even if you don't agree with them, teaches you what the official definition of "valid" is. It grounds your markup in reality. It's sort of like passing your source code through an ultra-strict lint validation program, or setting your compiler to the strictest possible warning level. Knowing the rules and boundaries helps you define what you're doing, and gives you legitimate ammunition for agreeing or disagreeing. You can make an informed choice, instead of a random "I just do this and it works" one.
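
As a rough illustration of the kind of mistake an automated check can surface, here is a minimal Python sketch that flags orphaned end tags, such as a stray </td> with no open cell. It is not the W3C validator and knows nothing about DTDs; the names OrphanedEndTagFinder and orphaned_end_tags are made up for this example, and it relies only on the standard library's html.parser.

```python
from html.parser import HTMLParser

# Elements that never take an end tag in HTML 4.01.
VOID_ELEMENTS = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class OrphanedEndTagFinder(HTMLParser):
    """Collects end tags that have no matching open element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open elements
        self.orphans = []     # end tags with nothing to close

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # <br/> style tags open and close in one step, so the stack is unchanged.
        pass

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS or tag not in self.open_tags:
            self.orphans.append(tag)
            return
        # Pop back to the matching element; elements with optional end tags
        # (p, li, td, ...) are silently closed along the way.
        while self.open_tags.pop() != tag:
            pass


def orphaned_end_tags(markup):
    finder = OrphanedEndTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.orphans


if __name__ == "__main__":
    print(orphaned_end_tags("<div><p>hi</p></td></div>"))
```

A check this small is only useful when the list it prints is short, which is the point Pilgrim makes about keeping markup close to valid.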

After jumping through the HTML validation hoops ourselves, here's my advice:

  1. Validate your HTML. Know what it means to have valid HTML markup. Understand the tooling. More information is always better than less information. Why fly blind?
  2. Nobody cares if your HTML is valid. Except you. If you want to. Don't think for a second that producing perfectly valid HTML is more important than running your website, delivering features that delight your users, or getting the job done.

But the question remains: does HTML Validation really matter? Yes. No. Maybe. It depends. I'll tell you the same thing my father told me: take my advice and do as you please.

Posted by Jeff Atwood
263 Comments

As a web user, it doesn't matter to me if it doesn't validate. I wouldn't know unless I go through the trouble of passing it through a validator. Why would I spend time doing this.
What matters is the site should be functional and easy to use. like stackoverflow. I don't care if SO has 200 errors.

However it probably matters to people with non GUI browsers...

Abdu on March 6, 2009 1:01 AM

Jeff, what has always intrigued me is that programmers are able to create valid RSS and ATOM feeds for their websites, but come up with every reason under the sun why they can't create valid XHTML for their websites.

XHTML is just XML with a couple extra rules about what elements can go inside what other elements and what attributes are allowed - as you've noted above. No biggie. Coding Horror's RSS Feed validates. StackOverflow's RSS feed validates. CNN's feed, etc. etc. etc. What's the problem? My thoughts:

http://iamacamera.org/default.aspx?section=developid=73

Carl Camera on March 6, 2009 1:01 AM

Brilliant!

Janko on March 6, 2009 1:04 AM

Your problem is that... you're doing it wrong.

First - target=... is BAD! If I want it in a new window, I'll do it myself. Don't try to force anything on me, thankyouverymuch.

Second - You're writing a html page. Why don't you write it correctly from the start?
Comparing to programming - do you write the whole program and then add patches until it kind-of works, multiple bugs cancel each other out and memory leaks aren't critical? Or do you write small parts and test them so that the whole program is correct?
You don't have to cleanup the mess if you don't make it in the first place.

Me on March 6, 2009 1:08 AM

Jeff, since you are the self-proclaimed ShamWow (attributed) guy of Coding Correctness, it must sting to see FAIL next to that validator report.

Now you have to re-python the whole thing or the Scrummy world will implode.

BugFree on March 6, 2009 1:14 AM

I may not be the most precocious web developer out there, but I really found the process of converting my own CMS (if you can call it that) to output strict XHTML valid CSS2 pretty satisfying. The real pain in the neck for me is filtering the output of the RSS feeds aggregated on my site so that they, too, are XHTML-strict.

Making your site XHTML-strict is good for you. Think of it like... flossing.

Leviathant on March 6, 2009 1:15 AM

It is not too much to ask that all browsers render valid html the same way: according to spec.
But it is too much to ask that all browsers make the same guess how you want your invalid html to be interpreted.

Only when we write valid html can we expect html to be cross-browser compatible.

ako on March 6, 2009 1:48 AM

I am amazed that those who are so opposed to the target attribute have not found the simple solution:

Keep the tag in the spec, but let the user agent ignore it if that is what the user wants.

That way, people could live in the dark ages if they want, but those of us who understand tabs and windows can benefit from a web author's suggestion that those features would be useful when following a particular link.

Jonathan on March 6, 2009 1:48 AM

The problem with the standards is that they keep changing.

So any attempt to standardize right now may be wasted effort when whatever is done is then undone.

I understand that is the idea of versioning, but to spend 2000 manhours standardizing your HTML only to find out it's no longer the latest-and-greatest when that was your selling point for the project might cause you to lose your job.

Not a wise risk to take.

Of course, for those of you who work for a company where you can waste 2000 hours for no reason, have fun!

Practicality on March 6, 2009 1:50 AM

I would add that following standards on new content is a good idea. But for refactoring old: You have to consider the cost/benefit ratio and make an informed decision.

That goes for all of these QA points. Seriously. The solution is not ALWAYS refactor nor is it NEVER refactor. (replace refactor with standardize for this specific blog post) The solution is to do so when the benefit outweighs the cost, abstain when the benefit is less than the cost, or when analyzing the cost is more expensive than the benefit.

Practicality on March 6, 2009 1:57 AM

@Practicality - the latest and greatest standards are ten years old, plenty of time to learn them properly so that one does not need to spend 2000 hours making something valid.
Making code validate is VERY easy. The important thing is to remember, that valid tag soup is still a tag soup.

Rimantas on March 6, 2009 2:17 AM

if you have a ton of user-generated content like we do, you can pretty much throw any fantasies of 100% perfect validation right out the window.

I think that's true in general, but Wikipedia validates ( http://validator.w3.org/check?uri=http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FFixing_Broken_Windowscharset=%28detect+automatically%29doctype=Inlinegroup=0 ), and it's *all* user generated content. (Not necessarily a counter point, but it's interesting.)

I've been working on a specialized editor for http://simple.wikipedia.org/ ; and Wikipedia's use of XHTML over HTML makes it more straightforward to pull information, like edit-tokens, out of their webpages. So I think XHTML has its advantages for letting people build things that interface with your webpage in ways you wouldn't expect.

Vincent Gable on March 6, 2009 2:35 AM

Brilliant! ( posting THIS article immediately after BikeShedding :)

Breck Carter on March 6, 2009 2:45 AM

BUT why not make things better? I'll wager that if browsers only displayed valid (X)HTML everyone would create valid (X)HTML. So I'm soooo glad XHTML Strict are really anal about validation!

This is technically accurate, in that obviously, everyone would only consist of those people who could still write web-pages. So it'd be a smaller everyone by a factor of hundreds of thousands.
However, it misses the point that the web would be limited to providing for the kind of people who obsess over XHTML. The true power of the web would still be slowly rendering Captain Janeway pr0n, and arguing over whether bang is a silly word to use for the ! character.

Tom on March 6, 2009 3:04 AM

target, a perfectly harmless attribute for links that you want to open in a different browser tab/window

Ahem. As the user, I am the one should dictate where the links open. If I want it to open in a new window, then it shall. You need to ditch the target attribute. Now.

Josh Stodola on March 6, 2009 3:33 AM

If you want Strict - then stick with strict, and don't use a target attribute on your link.

If you want target, use the Frameset DTD. HTML 4 has 3 fully legitimate DTDs you can pick-and-choose from, and just like your favourite fundie, whatever you pick-and-choose is the right one.

Why did you pick Strict for Stackoverflow anyway?

-------
* The HTML 4.01 Strict DTD includes all elements and attributes that have not been deprecated or do not appear in frameset documents. For documents that use this DTD, use this document type declaration:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">

* The HTML 4.01 Transitional DTD includes everything in the strict DTD plus deprecated elements and attributes (most of which concern visual presentation). For documents that use this DTD, use this document type declaration:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

* The HTML 4.01 Frameset DTD includes everything in the transitional DTD plus frames as well. For documents that use this DTD, use this document type declaration:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">
-------

Robert on March 6, 2009 3:34 AM
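
To make the DTD choice above concrete, here is a small Python sketch that reads a page's doctype and reports links carrying a target attribute only when the page declares HTML 4.01 Strict. It is a toy check under stated assumptions, not a DTD-aware validator: the class StrictTargetCheck, the function strict_target_violations, and the sample pages are all invented for illustration, and the flavor detection is a plain substring match on the public identifier.

```python
from html.parser import HTMLParser

STRICT_DOCTYPE = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
                  '"http://www.w3.org/TR/html4/strict.dtd">')
TRANSITIONAL_DOCTYPE = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
                        '"http://www.w3.org/TR/html4/loose.dtd">')
SAMPLE_LINK = '<a href="http://www.example.com/" target="_blank">foo</a>'


class StrictTargetCheck(HTMLParser):
    """Records the declared HTML 4.01 flavor and every <a> that uses target."""

    def __init__(self):
        super().__init__()
        self.flavor = "unknown"
        self.links_with_target = []

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">" of the doctype declaration.
        if "DTD HTML 4.01 Transitional" in decl:
            self.flavor = "transitional"
        elif "DTD HTML 4.01 Frameset" in decl:
            self.flavor = "frameset"
        elif "DTD HTML 4.01//EN" in decl:
            self.flavor = "strict"

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "target" in attributes:
            self.links_with_target.append(attributes.get("href"))


def strict_target_violations(page):
    """Return hrefs of links that use target on a page declared as Strict."""
    checker = StrictTargetCheck()
    checker.feed(page)
    checker.close()
    return checker.links_with_target if checker.flavor == "strict" else []


if __name__ == "__main__":
    print(strict_target_violations(STRICT_DOCTYPE + SAMPLE_LINK))
    print(strict_target_violations(TRANSITIONAL_DOCTYPE + SAMPLE_LINK))
```

The same link is reported under the Strict doctype and passes silently under the Transitional one, which is the whole pick-and-choose argument in miniature.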

You should always leave 1 mistake in your HTML to retain your humility.

Steve on March 6, 2009 3:35 AM

I have to say that HTML validation (especially using the 4.01 Transitional) is not difficult to attain. Pages that don't validate are due either to a misunderstanding of the basics of HTML or to pure laziness.

Let's take for example our beloved codinghorror.com:

6x document type does not allow element LINK here : That comes from a misunderstanding of the way HTML works. Tags such as link and a handful of others (br, area, img, param, hr, input, col, base, meta) don't need to be closed. Actually it is wrong to explicitly close them because they are implicitly closed; closing them manually would be like closing them twice.

12x+ end tag for element INPUT which is not open: Same error as above: misunderstanding of the basics of HTML

1x: end tag for element TD which is not open: Laziness... There is no table anywhere close to that TD...

and 30 more warnings about not encoding & to &amp; in URLs as it should be: Misunderstanding of HTML basics. & is not a regular character in HTML; it is used to reference entities that are declared in the DTDs that apply to the current document, in the form &entityName;. The HTML 4.01 Transitional DTD declares a crapload of entities. Entities are NOT limited to the ones declared in the HTML 4.01 DTD, as you could supposedly attach more DTDs to your document.

Hence using '&' verbatim in your HTML is similar to using an unescaped " in your C#/Java/whatever source code; you should escape it as &amp;.

Seriously, how can we even expect browsers to be somewhat standard compliant if we keep feeding them that kind of crap.


Nicolas Piguet on March 6, 2009 3:44 AM
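
The ampersand point in the comment above is easy to demonstrate. The following Python sketch uses the standard library's html.escape and html.unescape; the URLs and the helper name attribute_safe are invented for the example. It shows the escaped form a validator expects inside an href, and one way a raw & can go wrong when the text after it happens to look like an entity name.

```python
import html


def attribute_safe(url):
    """Escape a URL for use inside a double-quoted HTML attribute (hypothetical helper)."""
    return html.escape(url, quote=True)


RAW_URL = "http://example.com/search?q=html&lang=en"
ESCAPED_URL = attribute_safe(RAW_URL)
LINK = '<a href="' + ESCAPED_URL + '">search</a>'

# A parameter named "copy" collides with the &copy entity when left unescaped.
# html.unescape applies HTML5 character-reference rules without knowing it is
# looking at an attribute value, so treat this as a worst-case illustration.
AMBIGUOUS_URL = "http://example.com/?a=1&copy=2"
DECODED_AMBIGUOUS = html.unescape(AMBIGUOUS_URL)

if __name__ == "__main__":
    print(LINK)
    print(html.unescape(ESCAPED_URL) == RAW_URL)
    print(DECODED_AMBIGUOUS)
```

Escaping costs five characters per ampersand and removes the guesswork; whether any given browser would actually mangle the unescaped URL is a separate question.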

Got halfway through the page when I gave up on reading all of the comments, but (for the first time ever) I feel compelled to say I also think you forgot something very important in your article:

Yes, it's true that most browsers render almost any website - whether it's well or badly coded - but not all browsers render bad code in the same way. How can you make a cross-browser website that way?

I personally spent many, many hours on trying to get websites to look the same on any (major) browser. When you (and browsers) don't commit to a standard (be it HTML or XHTML of any version), how can one then create a website that anyone can see in the same way - namely, like you wanted them (your visitors) to see it?

Frederik on March 6, 2009 4:16 AM

So, immediately I began to wonder: Is anybody validating
our JavaScript? What about our CSS?

Well, the JS interpreter validates JS. There's also JSLint (http://jslint.com/) which performs a stricter form of validation.

Regarding CSS, the W3C have a validator http://jigsaw.w3.org/css-validator/

Personally, I believe it is worth the effort to write HTML which validates as being at least XHTML transitional. This guarantees that the code is valid XML, which means I can use it with XQuery, XPath, XSLT, etc.

Donal on March 6, 2009 4:55 AM
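
Here is a minimal Python sketch of what "valid XML means I can use XML tools" looks like in practice. It uses xml.etree.ElementTree, which supports a limited XPath subset, on two invented fragments: one well-formed and one ordinary tag soup. The fragments, the token field, and the function names are assumptions made for the example, and real XHTML pages would also need namespace handling that is left out here.

```python
import xml.etree.ElementTree as ET

# Same content twice: once as well-formed XML, once as typical HTML tag soup.
WELL_FORMED = '<div><p>Hello<br/></p><input name="token" value="abc123"/></div>'
TAG_SOUP = '<div><p>Hello<br></p><input name="token" value="abc123"></div>'


def extract_token(markup):
    """Pull the value of the input named 'token' using an XPath-style query."""
    root = ET.fromstring(markup)
    node = root.find(".//input[@name='token']")
    return None if node is None else node.get("value")


def try_extract(markup):
    """Same as extract_token, but report XML parse failures instead of raising."""
    try:
        return extract_token(markup)
    except ET.ParseError:
        return "not well-formed XML"


if __name__ == "__main__":
    print(try_extract(WELL_FORMED))
    print(try_extract(TAG_SOUP))
```

The strict parser gives a one-line query on the well-formed fragment and refuses the other outright, which is both the appeal and the cost of the XML route.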

Although I frequently attack standardistas who fixate on minutiae of HTML and CSS, while effectively ignoring software engineering, scalability, design, usability, business workflows, marketing, expedience, and all the other factors that weigh in on real websites, this post is just plain irresponsible.

Web standards exist to rein in the MADNESS that us long-time web people lived through 10 years ago. The amount of lost productivity could only be measured in the billions. Thankfully, due to the standards movement, it's now possible to write flip articles like this. But be careful not to throw the baby out with the bath water. Standards are what bring us to a world of less browser testing, greater accessibility, and ultimately greater productivity. A general vote against validation is a vote for increasing the number of development headaches we'll face in the years to come.

Gabe da Silveira on March 6, 2009 5:24 AM

Looks like I got to the party late. OH well, here goes anyway...

To all you people, maybe even the author, saying, "Who Cares?", remember to bring this attitude up when interviewing. I damn sure wouldn't want you working for my company. You are all the type that wrote those crappy, spaghetti-code VB apps, aren't you?

Back in the day, visual tools could have helped Mom and Pop write valid HTML. Today? I know of no one, outside of web developers, who writes HTML any more. They use blogs, Facebook, etc.

Shitty markup is dead and lazy morons that still produce it should be fired and sent back to the McJobs they are qualified for.

El Guapo on March 6, 2009 5:47 AM

It's a matter of what you think your responsibility is to the wider developer community. If you're within some very small percentage instead of just a small percentage, things improve. As the old saw goes, it's never the same small percentage. Go from 99% to 99.9% and you reduce the possible set of global quirks by at least some factor, maybe not the full order of magnitude, but enough to make a difference in how difficult it is to create browser tech, and that's one more strength for the web to leverage for its growth.

Maybe you don't care personally. Certainly that's been the tone of this blog of late. But it's worth considering.

mgb on March 6, 2009 5:52 AM

Google actually ranks its indexed pages. The more valid the (X)HTML of your pages, the higher they'll appear in a search.

Vordreller on March 6, 2009 5:54 AM

Wow, a lot of soap boxes out there...

In the theoretical world, yes, there are standards. However, in the practical world, if the standards are not enforced then there are no standards. I find I'm more productive if I live in the practical world and not the theoretical world - and no, I'm not lazy, just efficient.

Jeff, keep up the good work.

hank on March 6, 2009 5:56 AM

If you do write XHTML, you'd better get it right. I heard about CodeProject practically dropping off the map because of an XHTML error that caused Google to stop ranking them. Also, I've heard Safari can refuse to render XHTML that's not according to Hoyle. Apparently spiders and browsers expect sloppy HTML, but if you say you're writing anal-retentive XHTML, they take you at your word.

John on March 6, 2009 5:57 AM

Go from 99% to 99.9% and you reduce the possible set of global quirks by at least some factor

Given everything I've learned to date about browser idiosyncracies, I find it very hard to believe that validating as HTML 4.01 strict will make any difference whatsoever.

For one thing, validating HTML says *nothing* about the CSS or JavaScript that drive most sites these days..

Jeff Atwood on March 6, 2009 6:00 AM

1. validating as xhtml 1.0 transitional is pretty easy nowadays, strict less so. However, try writing css that validates and works in internet explorer. If you can't write xhtml that validates, there's something wrong with your tools. But this isn't about bashing Microsoft again. Okay it is, they're an easy target.

2. Ever wonder why web browsers are so hard to build? Why they are chock full of security bugs? Ever considered writing your own web browser? I didn't think so. The reason so many xhtml parsers are fail-fast is because we want to get away from this "liberal about inputs" world and return to a somewhat more sane "fail on bad inputs" world. Even writing a simple html parser for web scraping purposes can be such a waste of time because you *have* to assume the input will be invalid.

3. It's not terribly important for your site or your users to be good about (x)html validation but it is for the community at large. Which basically means everybody will start caring around about the same time pigs start flying.

wppds on March 6, 2009 6:04 AM

I don't find writing proper (validating) XHTML strict at all difficult, once you learn what is allowed or not. To me XHTML makes more sense than HTML 4: if we are going to write pseudo XML, then why not write real XML?

Marti on March 6, 2009 6:05 AM

I'm not by any means a front end guru, but I would think that valid html would get much more important when it came to 508 compliance. Screen readers will get confused a lot more easily than browsers, and invalid html can't help the situation.

razmaspaz on March 6, 2009 6:06 AM

Know the rules, then break them if you have a good reason to. It's as simple as that.

arle nadja on March 6, 2009 6:07 AM

Nobody cares if your HTML is valid
As pointed out, if indexing and your Google ranking do not matter, then take Jeff's advice that it is up to you; otherwise, validate your pages.

What on March 6, 2009 6:08 AM

The reason so many xhtml parsers are fail fast is because we want to get away from this liberal about inputs world and return to a somewhat more sane fail on bad inputs world.

The sooner we can smash pandora back in that darn box, the better off we all will be!

I dunno, there's a fine line between thought leadership and tilting at windmills. Validate if it's important to you. But realize that it's a small part of the overall equation.

Jeff Atwood on March 6, 2009 6:08 AM

if we are going to write pseudo XML then why not write real XML ?

Indeed, why write a comment when you can write a novel? Why pilot a boat when you could captain a battleship?

Jeff Atwood on March 6, 2009 6:10 AM

No, really, why would you expect target= to work in *strict* DTD? It's a frames related attribute and as such is valid in frames DTD only.

And as to it being harmless... No, it isn't. It's a quite basic usability breaker. See, normal link (no target=) opens wherever I want. If I want it in the same window/tab it will open there. If I want it open in a new one -- it will open there, as instructed. But there's no fscking way to tell your browser to *not* open it in a new window/tab if you don't want it to.

coven on March 6, 2009 6:10 AM

@Vordreller and John: citation needed.

Apparently spiders and browsers expect sloppy HTML, but if you say you're writing anal-retentive XHTML, they take you at your word.

Even if that's true, virtually nobody does that, anyway. Serving true XHTML is more than simply adding a doctype or a xmlns attribute, and if spiders relied on just that, easily more than half the websites which claim to use XHTML (via doctype) would be dropped from their index.

Daniel on March 6, 2009 6:11 AM

The opinions in this post are very similar to my feelings on using FxCop in .NET development. FxCop finds a few important defects very quickly for you *if* you've been fixing violations all along.

But, if you never fix the "Specify IFormatProvider" and "Specify CultureInfo" errors, you'll never find the "Exceptions should be public" errors that bite you in the future. I think electrical engineers refer to this as the signal-to-noise ratio.

Eddie on March 6, 2009 6:11 AM

I fully agree with this post: HTML validation, as it currently stands, is only useful for finding structural errors in your code, such as mismatched tags.

For one thing, lots of things that are illegal are, as you said, impossible or impractical with valid code. Javascript elements can only appear in certain places. Anchors can't have targets. There are other examples. In many cases the validator is essentially wrong, since no browser implements those rules. I believe HTML 5 is relaxing the rules a bit because of the way the browsers actually work.

Other things are just noise warnings for HTML: URLs are supposed to be SGML escaped (i.e. & becomes &amp;). This never matters in practice, so why do it?

I do agree that <td width="80"> is wrong, however, and so is <td style="width:80px">. The width should be declared in the CSS for the page. I wish HTML validators could highlight style= as an error, to make it easier for me to find hard-coded styles.

Mr. Shiny New on March 6, 2009 6:11 AM
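
The wish in the comment above, a checker that flags hard-coded styling, is simple to approximate outside the validator. Below is a minimal Python sketch built on html.parser. The rule set (flag style and width attributes), the class InlineStyleFinder, and the function find_inline_styling are assumptions made for this example rather than anything the W3C validator does, and some of what it flags is perfectly valid markup.

```python
from html.parser import HTMLParser

# House rule for this sketch only: these attributes belong in a stylesheet.
FLAGGED_ATTRIBUTES = {"style", "width"}


class InlineStyleFinder(HTMLParser):
    """Collects (tag, attribute, value) for every flagged attribute it sees."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in FLAGGED_ATTRIBUTES:
                self.findings.append((tag, name, value))


def find_inline_styling(markup):
    finder = InlineStyleFinder()
    finder.feed(markup)
    finder.close()
    return finder.findings


if __name__ == "__main__":
    sample = '<table><tr><td width="80"><br></td><td style="width:80px"><br></td></tr></table>'
    for tag, name, value in find_inline_styling(sample):
        print(tag, name, value)
```

Both of the table cells from the original post get reported, which matches the commenter's view that neither form is where the width belongs.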

Basically what coven said: No target attribute is a feature. If I want your link to open in a new window (which is never) or tab (which is often), I will do it my goshdarn self, thank you very much.

Kr on March 6, 2009 6:12 AM

Seems like new window links are really a behavior issue and don't necessarily need to be markup elements anyway---use Javascript for that.

Colin Jones on March 6, 2009 6:15 AM

The target attribute is so much a better way to open anything in a new window than doing it with JavaScript... At least the browser (Safari does it) can tell the user this link will open a new tab/window. This is way more accessible and user respectful.

I hate it when I click on a link and it opens a new window (because my Safari is set to open target=_blank in new tabs).
And, as a developer, I continue to use target=_blank for external links.

I think target attribute is forbidden on strict html/xhtml because it breaks history, can do weird things if the user doesn't notice the new window, etc. But doing STRICT and doing the exact same thing with JS is just stupid (even MORE stupid if they say we're xhtml strict valid).

Julien Tartarin on March 6, 2009 6:15 AM

My initial reaction to the removal of target was negative. I thought it was stupid. Then I thought some more and figured out the theory behind it. XHTML is trying to do the MVC pattern with the markup being the model. Actions, like opening in a different browser window, are in the controller, which is the browser, and if you want to do custom browser behavior, you use JavaScript. So it makes sense from the theory perspective. But my final conclusion rests on change, and on the fact that people only like change that makes things better. This makes things harder and more bug prone, especially for non-developers. So I agree with you that XHTML strict is likely to be a developer-only markup.

Just you wait until you see XFORMS; XHTML has nothing on them.

sblundy on March 6, 2009 6:16 AM

I would say validation should be important to developers. As HTML and CSS continue to evolve and become more advanced, browsers will have no choice but to become more and more strict, and when they do, the broken HTML will be truly broken. One must also wonder what kind of impact these loose interpretations have on the performance of browsers, since they have to spend a lot of time fixing stuff that should be done properly in the first place.

Steven Surowiec on March 6, 2009 6:20 AM

Rather than looking at the short term as everyone usually does, perhaps if everyone moved to valid HTML/XHTML we would have higher visual consistency in different browsers over time. It's a lot easier to code a display engine if everyone is following the same rules.

Thomas on March 6, 2009 6:21 AM

+1, for the removal of the target attribute.

Oh, how I hate those sites that think they're so important that they can take over my browser experience.

Tom on March 6, 2009 6:22 AM

XHTML has no advantage over HTML as of now, since IE doesn't support true xml. They won't in the future either.

http://blogs.msdn.com/ie/archive/2005/09/15/467901.aspx

It's fairly easy to make a HTML-Transitional page fully validate though.

Jin on March 6, 2009 6:24 AM

It's also worth noting that valid (X)HTML is MUCH easier for screen readers to parse. They aren't (always) as forgiving as your standard web browser.

Nor are the cut-down browsers in mobile devices, or in browsers that support older systems (ever try to surf using Lynx? THAT'S masochism)

Valid (X)HTML isn't just a nicety, it's the ramp outside the store, right next to the handicapped space. Just because YOU are sighted, not color blind, and can read 8-pt font comfortably doesn't mean everyone can.

It's also worth noting that if you use the web for a business and your website isn't accessible to screen readers, you may run afoul of some local laws.

Jeff on March 6, 2009 6:25 AM

It is very sad to read this post. I seriously doubt that the W3C standards were created just for the fun of it! I think the root of the problems lies in the forgiving nature of the EARLY browsers, newer browsers would then have to be forgiving as well, otherwise nobody would use them since they might not display your favorite site correctly (or at all).

BUT why not make things better? I'll wager that if browsers only displayed valid (X)HTML everyone would create valid (X)HTML. So I'm soooo glad XHTML Strict are really anal about validation!

If everyone then again had valid HTML, I'd bet a lot of things would be easier, e.g. parse XHTML via XML DOM...

Regarding JavaScript, if the JS engine only allowed valid changes to the DOM then there would be no problem.

Of course JS should be valid ecmascript (or whatever it's called this week) and the same really applies to CSS.

The attitude of "it's hard to implement and it works anyway, so I can't really be bothered" (IMHO) effectively prevents unleashing the full power of HTML/the web.

lhundertwasser on March 6, 2009 6:28 AM
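
To illustrate the point above about parsing XHTML via an XML DOM, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The sample markup and the page_title helper are hypothetical, made up for this illustration. Note that this checks only XML well-formedness, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical sample page, written as well-formed XHTML.
WELL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Demo</title></head>"
    "<body><p>Hello<br/>world</p></body></html>"
)

# The same page with an unclosed <br>, as commonly written in HTML 4.
TAG_SOUP = WELL_FORMED.replace("<br/>", "<br>")

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def page_title(markup: str) -> Optional[str]:
    """Return the <title> text, or None if the markup is not well-formed XML."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # A generic XML parser gives up entirely on tag soup.
        return None
    title = root.find(f"{XHTML_NS}head/{XHTML_NS}title")
    return title.text if title is not None else None


if __name__ == "__main__":
    print(page_title(WELL_FORMED))
    print(page_title(TAG_SOUP))
```

The well-formed page can be queried like any other XML document, while the version with a bare br tag is rejected outright, which is the trade-off being argued about in this thread.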

Few points:

1) target=_blank breaks (to some degree) the consistency of the web. If people want a new window, they can right-click-open in new window. If they want to replace your site with the new one, they'll just click. If they want it in a new tab, they'll middle-click. target=_blank breaks all of that good standard Web stuff. Even if you insist on your nonstandard behaviour, chuck it where it belongs: not in the document structure (probably in the Javascript).

2) <td width="80">
<br/>
.. versus this way?
<td style="width:80px">
<br>
- both of these are wrong. 1st one has terrible hard-to-correct in-page styling (chuck it in a stylesheet!) and the 2nd is worse: in-page styling AND a malformed line break element (which even when written correctly is rarely/never necessary if you have decent mark-up. But it's legal, so fair enough).

3) Why build away from standards? XHTML Strict isn't *that* hard (apart from if you want to keep the web-distorting _blankness). Get it right now. Doing mark-up quickly while doing it well is only zero-sum if you aren't very good at it, in which case: pay someone to do it.

Rob on March 6, 2009 6:30 AM

I generally like your posts but I am disappointed to see a prominent blog like yours advocating invalid markup and I feel sorry for those who follow your bad advice on this post.

"Sloppy code, everyone is doing it, and it works well enough, so why bother to do it right; no one notices the code" seems to be your message. Anyone who calls themselves a web developer ought to care enough to do it right or get out of the way and let someone who does care do it.

You are right to call out the popular sites that are not valid, they should be shamed into doing better, you are wrong to endorse bad practices just because there is so much successful crap out there.

Maybe it's difficult to validate a site once you've waited until it's all developed and in production, but with modern tools and awareness it is not difficult to build valid XHTML Transitional ASP.NET applications.

Joe Audette
Founder of mojoPortal an Open Source CMS that does produce valid Xhtml.

Joe Audette on March 6, 2009 6:31 AM

I have not looked at the sites listed, but do they say they apply to HTML 4.01 strict? Or have you validated with strict anyway? If you say your site is transitional or frameset it should be validated for that.

Maybe because I am a programmer I prefer strict. And I would probably use XHTML if the browsers really supported it. Firefox at least says in its Accept header that it likes XHTML. IE only sends */* as accepted data. I hate that. It has probably been over a year since I last tested which browsers really accept XHTML, so it is probably a lot better today. But with the world still being owned by IE6 I stick to HTML 4.01 strict.

smernaz on March 6, 2009 6:35 AM

I'm a bit disappointed that you don't get why target=_blank is bad. Here are some links:

http://www.useit.com/alertbox/990530.html
http://www.snyderconsulting.net/article_7tricks.htm#7

Julian on March 6, 2009 6:36 AM

Validation is pretty much at the very, very, very, bottom of priorities in any site I make. And even if by some miracle I ever actually got to this proverbial bottom, I wouldn't do it just because I think it's such a waste of time.

Paolo on March 6, 2009 6:36 AM

a perfectly harmless attribute for links that you want to open in a different browser tab/window

Therein be part of the problem - actions that *you* as developer want to happen to the user's machine, regardless of the user's wishes. For accessibility and usability reasons the user/user agent should remain in control of behaviour. Using JS is a better fit to separate behaviour from content but arguably this particular practice is undesirable anyway.

XHTML is just as easy to handcode as HTML once you are used to either one of them. Is this the same question as 'why does PHP need semi-colons at the end of each line when Python doesn't?', or 'why do I need a closing brace as well as an opening brace?' As noted though, sending XHTML is normally utterly pointless unless sent as application/xhtml+xml.

You're right that nobody cares if your code is valid except you, but it's akin to writing a statistical report where, because of rounding, the percentages don't add up to 100%. It doesn't really matter at the end of the day, but it's lazy and could confuse/give extra work to the reader (browser).

Alistair on March 6, 2009 6:37 AM

I agree with you completely. Validating your HTML is very useful since it exposes some obvious problems that could turn into real headaches. But, things like the target attribute are ridiculous. I actually ran into that recently.

So, I tossed the idea of completely validating my code right out the window. I used the validation check to see what needed to be fixed, but I did not fix it all. Fixing it all would lead to more problems. Instead, I just focused on what could actually cause complications down the line. All the rest, that have no effect on user experience, were left alone.

An example is when I set up a form with an empty action, and assign an event using MooTools. The event then directs an ajax submission when the user submits the form. But, when validating the HTML it comes up as an error that the form does not have an action assigned to it. Who cares? Nobody. It works.

Validating to 100% should only really matter if your site is running completely on HTML, with little to no JavaScript or other languages.

Timothy on March 6, 2009 6:38 AM

You can feed web browsers almost any sort of HTML markup or JavaScript code and they'll gamely try to make sense of what you've provided, and render it the best they can.

Having seen web-developers near-enough tear their hair out over cross-browser rendering issues, one finds it difficult to conclude that the above statement is a positive one.

In comparison, most programming languages are almost cruelly unforgiving.

This is a good thing, and this is the same standard that webpages should adhere to. If this were the case, it would be far easier to implement web browsers that could confidently apply standards-based layouts to pages, and would render cross-browser design issues a thing of the past. Web Developers that dig their heels in the ground and proclaim (about the lax attitude towards mark-up) "It has always been this way; why should we change?" are digging their own.

stadidas on March 6, 2009 6:39 AM

Not just 508: the W3C accessibility guidelines, of which Priority 2 is the accepted (although not legal) minimum standard in the UK, state that to achieve Priority 2 you must 'Create documents that validate to published formal grammars'.

I like having valid HTML. It helps when the browser rendering engines change so frequently; it's good to know you have a solid and correct foundation to build from.
And as someone else said it helps with Search Engine Rankings, MSN even more so (although the site I'm basing that on we also did it to pass accessibility guidelines too) I was amazed!

The other thing is if you built a building and it didn't pass building regulations you'd get sued by the client...

Dave on March 6, 2009 6:40 AM

Here's the reference for the story I mentioned about an XHTML error causing CodeProject to disappear.

How to Stop Google Indexing Your Site. A Bedtime Story.
By Chris Maunder

http://www.codeproject.com/KB/server-management/Google_Indexing_Problem.aspx

John on March 6, 2009 6:44 AM

target is one of the most annoying things in web pages. I've adopted the habit of opening *all* links in a new tab (middle mouse button) in the background because this way, the experience is consistent.
This way, I know what will happen when I click that link (well except for the hiccup when Adobe Acrobat loads into memory).

If I were the world's dictator I would force web browser developers to inform their users with a big red sign that the website they're currently looking at was not written carefully.

SealedSun on March 6, 2009 6:49 AM

I agree with you - if you see no ACTUAL TANGIBLE benefits using true XHTML (i.e. application/xhtml+xml) then it makes no sense to use it. However, if you like to mash together SVG or MathML with your HTML there is some benefit (though IE doesn't support those technologies and you'll have to use server-side negotiation for fallback to text/html).

Also, I thought there was an official W3C validator for CSS?

Jeff Schiller on March 6, 2009 6:49 AM
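
As a rough illustration of the server-side negotiation mentioned above, the following Python sketch picks a Content-Type from the request's Accept header. The function name pick_content_type is hypothetical, and the parsing is deliberately simplified: it serves application/xhtml+xml only when the client lists that type explicitly with a non-zero q-value, and it treats wildcards such as */* as a reason to fall back to text/html.

```python
def pick_content_type(accept_header: str) -> str:
    """Return application/xhtml+xml only if the client asks for it explicitly."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if media_type != "application/xhtml+xml":
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if q > 0:
            return "application/xhtml+xml"
    # Wildcards alone (e.g. "*/*") are not taken as real XHTML support.
    return "text/html"


if __name__ == "__main__":
    print(pick_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
    print(pick_content_type("*/*"))
```

A client that only announces */* gets text/html here, which matches the fallback behaviour the comment describes for browsers without true XHTML support.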

I agree with this post on the level that it is a pain to validate and the rules seem arbitrary, but I don't agree that it doesn't matter. For persons with Low vision/blindness who use text only browsers or screen readers, it is very important to have valid markup.

I look at valid XHTML the same way I look at XML. Well formed, valid markup is only a problem if you don't think about it until the end. Start from the beginning with valid markup as the goal, and you're fine.

Josh on March 6, 2009 6:52 AM

If you see this logo on a site, what does it mean to you?

It means very little: slightly less than half of pages using detectable validation icons actually passed validation. (http://dev.opera.com/articles/view/mama-markup-validation-report/ )

There's some relevant discussion of the reasoning behind HTML 5's conformance requirements at http://stackoverflow.com/questions/432933/will-html-5-validation-be-worth-the-candle/446732#446732

Philip Taylor on March 6, 2009 6:54 AM

Validation is pretty much at the very, very, very, bottom of priorities in any site I make. And even if by some miracle I ever actually got to this proverbial bottom, I wouldn't do it just because I think it's such a waste of time.

Boy, I imagine the quality of the code that you write is just superlative. Do you just vomit it out, and so long as it (sometimes) runs to the intended effect, ay okay?

Making correct rendered output is no different than ensuring your solution doesn't compile with thousands of warnings because of ignorant or sloppy coding practice. In the end you end up with something that is tremendously more maintainable.

Making valid HTML *from the outset* is trivial, and it's only when hacks do their worst and someone else inherits it that it becomes much more of a toss-up if it's worthwhile going back and fixing up their monstrosity.

And while Jeff mentioned the limited benefits of making StackOverflow validate, the reality is that the errors in it weren't minor little errors, they were *egregious* errors. Getting to a clean state offers them the ability to go forward and immediately catch and eliminate those errors quickly and easily, instead of being Yet Another Hack pontificating about how much of a waste of time it is.

Does it matter? To end users, probably not (unless they're blind or aren't using the majority browser, but who cares about those people anyways right?), however it absolutely speaks volumes about the care and concern put into the product by the developers.

Dennis Forbes on March 6, 2009 6:56 AM

Actually, Pandora was never *in* the box. She was the one who opened it. Now, there may have been some who wanted to stuff her in the box afterward but that's another story...

ChrisL on March 6, 2009 7:01 AM

I don't understand much about this stuff, but I think that validating against XHTML means that your website can be parsed by any XML parser, with that in mind there might be some browser and search engines optimizations when you validate as XHTML strict. But even if there are optimizations, that doesn't really matter, why would you want to make the job easier on google servers or in your user browser. People already got dual-cores and google has farms of them...

Hoffmann on March 6, 2009 7:03 AM

Not writing valid (x)HTML/CSS, etc. is just a sign of pure laziness.

If you're going to write code, whether it's markup, script or compiled, then you do it right, period. There is no excuse to write code that doesn't validate.

If you can't write markup that validates then you truly don't know or appreciate the markup you're writing.

Richard Fleming on March 6, 2009 7:07 AM

In a previous blog post (http://www.codinghorror.com/blog/archives/000723.html), commenters discussed the fact that Google's main page has dozens of validation errors. It was mentioned that bandwidth is the issue; when you're trying to serve a very simple page to a significant percentage of the internet's users each day, whether you specify that your style is text/css explicitly or let the browser figure it out starts to matter. Most websites don't serve billions a day, but a lot aspire to do so; if it's good enough for Google, why not for the rest of us?

As far as browser compatibility is concerned, I think it's a pipe dream. The ecosystem of browsers that fail to correctly support standards is already too large; there's insufficient incentive to be standards-compliant when standards-compliant code isn't going to render correctly on a significant percentage of your user base. I much prefer the jQuery philosophy of treating browsers as separate environments and adding the layer of abstraction needed to deal with their individual quirks. It's an extensible model that can adapt to new members of the browser ecosystem, as opposed to wiring down some CSS according to the spec and then praying that all browser authors understood the spec as well as you did.

The web is a jungle. And it's nice to imagine that standards will bring some civilization to that jungle, but I think it's too optimistic to believe they will.

Mark T. Tomczak on March 6, 2009 7:10 AM

Next you'll be telling us IE6 is ok! :)

Jo on March 6, 2009 7:11 AM

It's much more important to test your web site on all the platforms your visitors are likely to use.

Include all relevant web browsers, and maybe the most common screen reader (JAWS). If it works properly on all of those, you're golden.

Since adherence to HTML and XHTML strict rules does not guarantee platform compatibility, you should run these tests anyway. So why not cut out the middle man and just run the tests, and skip the strict? :)

Vance on March 6, 2009 7:12 AM

Uh. I am a bit bored with these discussions, but let me say this: client side also has lots to learn. It takes some time and effort to find hows and whys.
The fact that browsers handle HTML in a forgiving manner does not mean that one shouldn't spend time trying to learn markup.

Yes, that is true, the HTML 4.01 strict DTD does not define the target attribute. The Frameset DTD does (leaving aside the discussion of whether it is up to the author to decide how that link should open).

Regarding width=80 vs. style=width:80—these both should be absent from markup.
Style attribute is very nice to have in DOM, but it is ugly in markup, even if it is allowed.
And validation _is_ easy, regardless of the doctype. Producing meaningful markup is not so easy.

Does avoiding syntax errors in your C# code matter? Of course it does: your code won't run otherwise. Invalid XHTML proper (served with the correct MIME type) will give you the yellow screen of death. Invalid HTML most likely will be displayed, but that's a poor excuse.

Rimantas on March 6, 2009 7:22 AM

You should at least open and close your html tags and nest them correctly, and have a doctype on your page. Whether your doctype says html or xhtml is not so important. Those are the rules I must say are absolutely necessary. Otherwise I should say you are a careless, sloppy and motherfucking lazy ignorant jerk!
Why is it so important for programming code to be well-structured but not your html? Why should html not abide by the same strong rules as your server code does?

Jeff, believe in HTML!!!! And love your CSS!!! And don't use inline styles, put a class or an id on it instead.

jerko on March 6, 2009 7:23 AM

From a purely practical standpoint, validation does not matter and will never matter. It's kind of like suggesting that we ditch SMTP and implement something totally new to cut down on spam. It's not going to happen.

Validation on the web will never be the norm. It's just too late. That battle was lost (due to natural selection) about 15 years ago.

Chase Seibert on March 6, 2009 7:24 AM

I do my best to write valid XHTML or HTML Transitional code because it makes javascript and CSS debugging so much easier. I've solved dozens of cross-browser javascript and CSS issues simply by getting the HTML correct. Perhaps if you had valid HTML as a base, you wouldn't have so many complaints about javascript and CSS.

Scott on March 6, 2009 7:26 AM

I always make an effort to validate for the reasons given above, when you make a mistake it's easier to find using the validator. w3c also offer a CSS validation tool which is useful for the same reasons. I also feel better about the code if I feel I have managed to code to some standard :)

Browser authors seem to be making moves to be compliant gradually too, IE 8 will be compliant apparently, which will be a massive shock to the system for developers used to messing about to get IE to behave (6 was/is dreadful, 7 not so bad).

With the exception of IE <= 6 I have found that compliant code seems to render more consistently across browsers than non compliant (unless efforts are made specifically to handle particular browsers in said non compliant code).

JamWheel on March 6, 2009 7:31 AM

the reality is that the errors in it weren't minor little errors, they were *egregious* errors

I don't think egregious is the right word for..

<br> versus <br/>
<input> versus <input/>
<img> versus <img/>

etc, etc, ad nauseam. The validation was marginally useful, because we did have a few extra close tags hanging around in a few pages and it helped us find those. But as I said in the post, for us it was largely a six hour exercise in to-may-to versus to-mah-to.

Overall I liked @filini's list, posted earlier on in the thread:

--
1. Knowing what valid HTML is, is something any web developer should learn, especially if he wants to write CSS and JavaScript
2. Trying to validate as HTML 4.01 Transitional is always a good balance
3. Don't even bother putting the W3C Logo on your page... nobody cares
4. Writing XHTML is nice when the browser is not the only target of your piece of markup
--

When it comes to #4, I'd argue you really want an *API*, not a brain damaged "wow, we can scrape your site as XML!" mentality.

Jeff Atwood on March 6, 2009 7:31 AM
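
The "extra close tags" case mentioned above is the kind of problem a small script can surface without a full validator. The following Python sketch uses the standard library's html.parser to report closing tags that have no matching open tag. The class and function names and the short list of void elements are assumptions made for this illustration; it treats the br, input and img spellings with and without the trailing slash as equivalent, and it is nowhere near a substitute for the W3C Validator.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (a partial, illustrative list).
VOID = {"br", "img", "input", "hr", "meta", "link"}


class StrayCloseTagFinder(HTMLParser):
    """Collects (tag, line number) for closing tags with no matching opener."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.stray = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return  # covers both <br> and <br/>; nothing to balance
        if tag in self.open_tags:
            # Pop back to, and including, the matching opener.
            while self.open_tags.pop() != tag:
                pass
        else:
            self.stray.append((tag, self.getpos()[0]))


def find_stray_close_tags(markup):
    parser = StrayCloseTagFinder()
    parser.feed(markup)
    parser.close()
    return parser.stray


if __name__ == "__main__":
    print(find_stray_close_tags("<div><p>hi<br>there<br/></p></div></div>"))
```

Because the parser is as forgiving as a browser, the script reports only the stray closing tag and stays silent about the slash-or-no-slash differences.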

I've only been a professional web designer for one year and I'm shocked that anyone would find it difficult to comply with strict HTML/XHTML. There are exceptions but it seems to be mostly due to sheer laziness, and betrays a lack of understanding with regard to accessibility (and most often with it comes a failure to grasp separation of content, presentation and behaviour). Aside from that, personally I like to have the assurance that whatever else may be wrong, at least I know that my markup is well-formed.

Jim on March 6, 2009 7:34 AM

Overall, I think that:
1. Knowing what valid HTML is, is something any web developer should learn, especially if he wants to write CSS and JavaScript
2. Trying to validate as HTML 4.01 Transitional is always a good balance
3. Don't even bother putting the W3C Logo on your page... nobody cares
4. Writing XHTML is nice when the browser is not the only target of your piece of markup

Filini on March 6, 2009 7:38 AM

I agree completely with this article. Validating is only a small part of the equation.
The key here is: Far more important is delivering features that delight your users, or getting the job done.

Excellent reading!

Alejandro Cuervo on March 6, 2009 7:39 AM

It isn't hard to validate as xhtml strict if you set out with the intention that it will be valid and check it regularly as you build it.

Writing a site without using a validator (or having any knowledge of the rules) and then trying to make it validate is akin to writing a Java or C++ program in notepad and then compiling it for the first time once you've finished it.

I use the tidy/sgml plugin for Safari and Firefox that gives me a little tick in the corner of my page when my HTML is valid and a cross when it isn't. It was one of the first web development plugins I installed (before Firebug even existed) and is still the one I consider most indispensable.

It is true that making your site valid isn't a prerequisite for it to work properly, and it doesn't directly impact the user experience. And sometimes it is necessary to intentionally make your code invalid to improve the user experience (though target=_blank was a bad example), but you severely weaken your argument when you admit that you didn't do it because it was too hard! I managed it with my own site - does that mean I'm smarter than you? I can assure you it's not because I enjoy pain, it's just that I didn't find it to be painful.

Deciding that it's okay for your site markup to be invalid is an advanced decision best made by developers who have experience of developing several valid sites and are therefore qualified to make that call. You've built one site that kinda-sorta-almost validates - that doesn't now suddenly make you an expert on the benefits and disadvantages of valid html.

As you correctly state, validating your site makes you a better (more informed) coder, and you know what? Making it 100% valid xhtml will make you an *even* better coder. As someone who clearly considers bettering your own expertise to be worthwhile, I would have thought that would be incentive enough.

Nick on March 6, 2009 7:42 AM

To put it another way.
as a user, I couldn't care less if the html is validated or not, as long as the website is working and giving me value in return.

Alejandro Cuervo on March 6, 2009 7:43 AM

I'm guessing most of the commenters here are programmers, so I get why having your site validate sets your geek pleasure centers tingling, but let's get real here: For most people who author web content, validation is neither simple nor obvious. In fact, I'd argue the fault tolerance of HTML is precisely why the web took off the way it did: it made it easy for anyone to get in on the game. Now you can argue about whether that's actually a good thing, but (to unmangle Jeff's metaphor) the evils are out of the box and they're *not* going back in. XHTML is dead in the water. As TBL put it: "It is necessary to evolve HTML incrementally. The attempt to get the world to switch to XML, including quotes around attribute values and slashes in empty tags and namespaces all at once didn't work." (http://dig.csail.mit.edu/breadcrumbs/node/166)

James on March 6, 2009 7:44 AM

As usual most people here are missing the point of the post...

Browsers are already overly forgiving of everything that gets passed into them which is why half TheDailyWTF.com content is about web developers. Validation is a nice idea but unless browsers (all of them) actually rejected invalid content, what is the point?

The problem here is that browsers won't reject content or else they lose users, and developers aren't going to validate so long as browsers continue to accept their crap.

Which all leads back to the original question _why_validate_?

HB on March 6, 2009 7:54 AM

Requiring the JavaScript garbage in HTML tags is absolutely stupid.

What if users of your site are running Firefox with the NoScript extension enabled? By default large portions of your markup won't work *at all* until they allow your domain. And you might not have done anything yet to earn their trust to enable JavaScript on your site.

This is a case where the presentation (HTML) conflicts with the code (JavaScript). I don't care if the client renders it or not.

jtimberman on March 6, 2009 7:55 AM

The reasons I validate my website:

If the validator spits out "This document was successfully checked as XHTML 1.0 Strict!", it just feels good!

It's the same as compiling on Visual Studio using Warning Level 4. Yes, the program will work even if there are tons of warnings, but you simply start to worry anyway (only if you're pessimistic... and usually that's a good thing if you're a developer)

But, you need to do it from the start and you need to check it often. After some point it's simply useless to fix all warnings. Just close your eyes and pray.

Putting the w3c xhtml banner on the page is useless though. None of your visitors care, but you as a responsible developer should.

ZuBsPaCe on March 6, 2009 8:02 AM

Exactly as Thomas said. If everyone is out doing their own thing with invalid markup, and it works, browsers will never standardize.

HardCode on March 6, 2009 8:06 AM

Why pilot a boat when you could captain a battleship?

I don't see how someone could disagree with captaining a battleship because that is awesome.

wickethewok on March 6, 2009 8:09 AM

Yeah, just throw some garbage on the page and let browsers forgive you and parse the page anyway. Next year they decide to adhere to a standard and won't parse your hacks anymore but why would you care about something you will have to do a year from now?..

Alexey Bobyakov on March 6, 2009 8:09 AM

XHTML validation from the ground up is easy.
Additionally, I don't get why any programmer would prefer an inconsistent model (HTML) over a consistent model (XHTML).

Matt on March 6, 2009 8:09 AM

I'll follow standards and validate my web as soon as the new standards are better than the old ones. Not for browsers, but for me. So far, it seems that each new standard is making things harder to do.

Vlasta on March 6, 2009 8:13 AM

I understand the point about migrating existing html code to a standard compliant format, because that really is a waste of time if you follow the "if it ain't broke, don't fix it" mantra... but for new websites, if you design it from the beginning to be validated it's not a problem, and it provides the (questionable) advantage of having your markup in a consistent format that complies with standards, allowing you to use many tools without worrying about specific issues.

There is also the (again questionable) advantage that when a tool fails it is actually the tool's fault for not complying with the standard correctly.

I can't say too much from experience though since I learned HTML from the W3C docs and have always written it to pass the XHTML 1.0 validation check and to be compliant with that standard...

What I don't quite get is why a new website would be designed with disregard to the standards... I don't see any advantage myself, just a few (small) disadvantages. I also really disagree with the idea that using XHTML complicates things... I have no idea where that idea comes from, it's not like it adds any appreciable functionality absent in HTML 4.01 or vice-versa...

jheriko on March 6, 2009 8:13 AM

Incidentally the question about whether it matters if you use

<td width="80">
<br/>

versus

<td style="width:80px">
<br>

betrays a particularly poor understanding of modern web development. You clearly intend it as rhetorical but the answer is actually yes, it does matter, and here's why:

HTML is not a visual formatting language, it is a language for defining document structure. The inclusion of stylistic tags such as <font> and <b> and attributes such as width/color was a grievous mistake, which has now been rectified by deprecating those elements, and browsers only support them for backwards compatibility.

The correct way to define the appearance of your site is with css styles. Doing this separates presentation from content, and makes your content more re-usable. Of course to really take advantage of that you should move your styles out of the style attribute and into an external css file (which is why the style attribute is also deprecated as of xhtml 1.1).

While we're on the subject, your use of td and br tags as an example is also quite telling, as these are both quite controversial due to the tendency of bad web developers to abuse them. <br/> adds a line break, which is almost always a presentation rather than content-related decision. The only legitimate use of this that springs to mind is formatting an address or a poem, where the line break actually has a semantic meaning, but it is more often used for all kinds of things that it shouldn't be, such as adding white space between lines of text or images. I certainly can't imagine why you'd ever want one inside a table cell.

And of course the humble table cell - an element whose intended purpose is to mark up tabulated data, but which for some strange reason is still regularly used for visual formatting by lazy developers who can't be bothered to figure out how to create columns, rows or grids using divs and css (despite the fact that it is actually much easier and more flexible than using a table).

It's hard to explain to software engineers why this stuff matters because they typically don't see html as a data structure. The best analogy I can find is that using a table to mark up a list or format a paragraph is like using a two dimensional array to store the characters in a string - it's the wrong tool for the job because it neither represents the structure of the data, nor offers the optimal method for retrieving and manipulating it.

Certainly the end user doesn't care which you use, but as a programmer it should matter to you that your code is well written, and given the inherently open source nature of the web it should matter even more since your bad code is there for all to see (and judge).

Nick on March 6, 2009 8:16 AM

There is JSLint (http://www.jslint.com/) for validating JavaScript by the way :D

Sesarr on March 6, 2009 8:18 AM

I use jslint to validate my JavaScript, and I highly recommend it. I also think html validation is important so that browsers know that they will not break your site when they release a new version.

Kirk Cerny on March 6, 2009 8:22 AM

Alexey,

Future browsers adhering to a standard is incredibly unlikely, and that is precisely why so few people see a need to validate their HTML.

Look again at the list of failing websites. They are in a category that makes up 94% of the content on the world wide web. What market forces will compel Microsoft to make their HTML valid? What about Sourceforge? Slashdot? CNN? The main page of Google?

Re-coding a page costs resources: money and time. If current browsers render the pages correctly, the incentive just isn't there (except to simplify debugging). If a future browser didn't render these pages correctly, then the people who find Microsoft, Slashdot, Sourceforge, or CNN useful in their daily lives wouldn't use that browser. So websites have no incentive to standardize, and browser authors have disincentive to reject non-standard HTML. For the scenario you described to play out, a browser would have to exist that was such a 'killer app' that people felt a need to invest time and effort in supporting that browser.

In fact, Jeff, that might be a fun topic to touch on at Coding Horror. Could there exist a killer app for browsers, a technology that couldn't be coded on the web, had to exist in the browser itself, and was so useful that the browser author could compel websites to be compatible with them?

Incidentally, even if such a scenario occurred, it may not be enough to bring standardization. People would be coding to that browser's interpretation of HTML, not necessarily to the standard. Who's to say our magic 'killer app' implemented the spec correctly with no deviations?

Mark T. Tomczak on March 6, 2009 8:24 AM

Standards are there to aspire to, but for some reason send developers into frenzied defensive panic. Surely it is better to aspire to perfection, even if it isn't achieved. Sometimes, due to business requirements, it is plain impossible to accommodate.

I work in QA, I am tired of dealing with shoddily coded web apps, when running pages through simple validators, and observing a few simple standards would avoid a lot of issues.

Well coded pages will pass Transitional, in general, which in my eyes is good enough. There is no standards police to make you adhere to standards, but at least using them as a guide would save us all from the proliferation of shoddy websites by amateur coders who think they know best (i.e. coders who don't want to do much work, to see an end result).

jaffamonkey on March 6, 2009 8:30 AM

While I'm building the templates for a site I'm working on, I'm regularly passing them through an html validator and the css through a validator also. Part of the reason, I admit, is to learn where I'm being clumsy - especially with the css. Doubling up on work, overwriting previous rules etc etc.

I know it means absolutely nothing to the end users if I have completely valid html and css. But if I can get it right now, why wouldn't I for a couple of extra minutes work?

Saying that, I probably wouldn't go back and validate a site already complete and in production. Too much work for little benefit.

`Josh on March 6, 2009 8:30 AM

Not every client is a mainstream web browser, and that's when HTML validation does matter. All those non-browser apps will have to struggle to infer the intent of broken HTML to extract information.

Subbu Allamaraju on March 6, 2009 8:32 AM

Re-coding a page costs resources: money and time. If current browsers render the pages correctly, the incentive just isn't there (except to simplify debugging).

That's the problem. Managers don't care about your markup or your code. Clients don't care either. Even technology, like browsers, does not necessarily care.

But you, the guy maintaining the mess, you should care, for the same reasons you should refactor the codebase of a program.

ZuBsPaCe on March 6, 2009 8:34 AM

The comments to this entry are closed.