I <3 Steve McConnell*
Coding Horror
programming and human factors
by Jeff Atwood


27 posts from August 2004

August 31, 2004

Unbreakable Links Revisited

Philipp Lenssen pointed out that my concept of Unbreakable Links is, unsurprisingly, not a new one. It's also known as

All of these terms really refer to the same thing: using a search engine to build a unique URL. However, there are some not-so-obvious problems you'll encounter when building links this way. To work around the problems, the Robust Hyperlinks paper proposes using a combination of techniques:

  1. A Unique Identifier (UID) is a name unique within the document, as per ID attributes in SGML/XML. These survive the most violent document changes, except its own deletion.

  2. A Tree Walk describes the path from the root of the document, through internal structural nodes, to a point within media content at a leaf.

    In practice, tree walks are the central component of robust locations. Since tree walks incrementally refine the structural position in the document as the walk proceeds from root to leaf, they are robust to deletions of content that defeat unique ID and context locations. Thus, tree walks are especially helpful for documents such as those that transclude dynamic content, as with stock quotes, where the content itself changes while the structural position remains constant.

    We describe tree walks with a sequence of node child numbers and associated node tags (generic identifiers), terminating with an offset into a media element. This is both a simpler, less expressive, and more redundant, representation than is allowed by XPointer. For example, consider the following tree walk into a particular HTML document:

    21/Professor/8 0/ 0/ADDRESS 1/H3 0/BODY 0/HTML

  3. Context is a small amount of previous and following information from the document tree. We propose a context record containing a sequence of document content prior to the location, and a sequence of document content following the location. For example, for the location described by the tree walk above, let us suppose the word "Professor" is found in a sentence fragment that reads "congratulations on her promotion to Professor in the Computer Science Division". The context descriptor could be:

    her+promotion+to+Professo r+in+the+Computer+Science

They also propose appending this information to the URL in a querystring-- so you have both an absolute link and a relative fallback:

Given that lexical signatures are a good way to augment URLs, we are left with the issue of how to associate these with hyperlinks. Our primary requirement is that the solution fit into the existing Web infrastructure moderately well. Our proposal is to append a signature to a URL as if it were a query term. That is, if the URL is http://www.something.com/a/b/c, and the designated resource has the signature w1,...,w5, then the robust URL is

http://www.something.com/a/b/c?lexical-signature="w1+w2+w3+w4+w5"
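The querystring scheme quoted above is easy to sketch. This is a minimal Python illustration, not code from the paper: the function name `robust_url` is made up, and the `&` fallback for URLs that already carry a query string is an assumption the quoted passage doesn't cover.

```python
def robust_url(url, signature_words):
    """Append a lexical signature to a URL in the format quoted above."""
    signature = "+".join(signature_words)
    # Assumption: if the URL already has a query string, join with "&".
    # The quoted passage only shows the plain "?" case.
    separator = "&" if "?" in url else "?"
    return f'{url}{separator}lexical-signature="{signature}"'


print(robust_url("http://www.something.com/a/b/c", ["w1", "w2", "w3", "w4", "w5"]))
```

The signature words themselves still have to come from somewhere; picking them is the hard part.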

I do think, at some point in the future, all links will be constructed this way. The existing absolute link system breaks down over time, and I think it's fairly obvious by now that absolute keyword search is the most effective navigation metaphor for the web. My apologies to Yet Another Hierarchically Organized Oracle, but that style of tree-based directory navigation was always driven by the lack of a competent search engine, not actual choice.

Try building your own unbreakable link with The Incredible LinkTron 5000(tm)!

Posted by Jeff Atwood    3 Comments

August 30, 2004

You Think You Hate Mondays?

The silicon chip inside her head
Gets switched to overload.
And nobody's gonna go to school today,
She's going to make them stay at home.
And daddy doesn't understand it,
He always said she was as good as gold.
And he can see no reason
'Cause there are no reasons
What reason do you need to be shown?

Tell me why?
I don't like Mondays.
Tell me why?
I don't like Mondays.
Tell me why?
I don't like Mondays.
I want to shoot
The whole day down.

The telex machine is kept so clean
As it types to a waiting world.
And mother feels so shocked,
Father's world is rocked,
And their thoughts turn to
Their own little girl.
Sweet 16 ain't so peachy keen,
No, it ain't so neat to admit defeat.
They can see no reasons
'Cause there are no reasons
What reason do you need to be shown?

Tell me why?
I don't like Mondays.
Tell me why?
I don't like Mondays.
Tell me why?
I don't like Mondays.
I want to shoot
The whole day down.

All the playing's stopped in the playground now
She wants to play with her toys a while.
And school's out early and soon we'll be learning
And the lesson today is how to die.
And then the bullhorn crackles,
And the captain crackles,
With the problems and the how's and why's.
And he can see no reasons
'Cause there are no reasons
What reason do you need to die?

Tell me why?
I don't like Mondays.
Tell me why?
I don't like Mondays.
Tell me why?
I don't like Mondays.
I want to shoot
The whole day down.

-- Boomtown Rats, I Don't Like Mondays

Brenda Spencer: "I just started shooting, that's it. I just did it for the fun of it. I just don't like Mondays....I just did it because it's a way to cheer the day up. Nobody likes Mondays." There's no parole for the sniper who hated Mondays, and survivors still remember the 1979 Cleveland Elementary shooting.

Posted by Jeff Atwood    7 Comments

August 29, 2004

The Incredible LinkTron 5000(tm)!

I talked in a previous post about Unbreakable Links-- that is, stating every URL in terms of a Google search rather than an absolute address. Great concept, but how do you determine which words on a web page are most likely to generate a unique search result? Well, wonder no more:

Behold the Incredible LinkTron 5000(tm)!

As you might imagine, this involves quite a bit of Google abuse -- all of which is pre-cached for performance. Well, mostly pre-cached. If you have a page with a lot of words that I can't find in a dictionary, the LinkTron will take a little while to process it.

When researching this project, I found an invaluable source of information at Philipp Lenssen's Google Blogoscoped. For instance, this frequency distribution for the 26,000 most used words online. There's also a cool word frequency colorizer which visually depicts the "uniqueness" of a target URL.
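The core idea, picking the words on a page that are least common elsewhere, can be sketched in a few lines of Python. This is not the LinkTron's actual code: `rarest_words` is a hypothetical helper, and the frequency table below is made-up sample data standing in for a real word frequency list.

```python
import re


def rarest_words(text, frequency, count=4):
    """Return the `count` distinct words in `text` with the lowest frequency.

    `frequency` maps lowercase words to how often they occur in some
    reference corpus. Words missing from the table are treated as the
    rarest of all (frequency 0).
    """
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    # Sort by (frequency, word) so ties break alphabetically and the
    # result is deterministic.
    ranked = sorted(words, key=lambda w: (frequency.get(w, 0), w))
    return ranked[:count]


# Made-up counts, purely for illustration.
sample_frequency = {
    "the": 1000000,
    "and": 900000,
    "with": 800000,
    "data": 52000,
    "asp": 4100,
    "shaping": 900,
}
page = "Shaping the data with ASP and the DataGrid"
print(rarest_words(page, sample_frequency, count=3))
```

A real tool would also have to check that the chosen words actually produce the target page as the top search result, which is where the search engine queries come in.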

Posted by Jeff Atwood    13 Comments

August 28, 2004

Java vs. .NET RegEx performance

I was intrigued when I saw a cryptic reference to "the lackluster RegEx performance in .NET 1.1" on Don Park's blog. Don referred me to this page, which displays some really crazy benchmark results from a Java regex test class-- calling C#'s regex support "20 times slower than [Java]." First of all, them's fightin' words, and second, those results are just too crazy to really make sense. Why would C# be over an order of magnitude slower than Java at a classic computer science problem like a regular expression parser? I don't believe it.

So I downloaded the Java JDK, a freeware Java development environment, and I ran that benchmark class on my own machine:

Regular expression library: java.util.regex.Pattern

RE: ^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)
  MS    MAX     AVG     MIN     DEV     INPUT
  46    16      0.0046  0       0       'http://www.linux.com/'
  61    16      0.0061  0       0       'http://www.thelinuxshow.com/main.php3'
  61    16      0.0061  0       0       'usd 1234.00'
  172   16      0.0172  0       0       'he said she said he said no'
RE: usd [+-]?[0-9]+.[0-9][0-9]
  MS    MAX     AVG     MIN     DEV     INPUT
  0     0       0.0     0       0       'http://www.linux.com/'
  15    15      0.0015  0       0       'http://www.thelinuxshow.com/main.php3'
  15    15      0.0015  0       0       'usd 1234.00'
  15    15      0.0015  0       0       'he said she said he said no'
RE: \b(\w+)(\s+\1)+\b
  MS    MAX     AVG     MIN     DEV     INPUT
  0     0       0.0     0       0       'http://www.linux.com/'
  31    16      0.0031  0       0       'http://www.thelinuxshow.com/main.php3'
  31    16      0.0031  0       0       'usd 1234.00'
  47    16      0.0047  0       0       'he said she said he said no'
Total time taken: 266

Note that I only ran this for the "built in" Java regex library java.util.regex.Pattern; the benchmark has template code for dozens of alternative regex parsers. I snipped that code out for simplicity. The standard Java regex class performs very well according to the results shown on the referring page.

I then converted the sample code to VB.NET, and got these results:

Regular expression library: System.Text.RegularExpressions

RE: ^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)
  MS    MAX     AVG     MIN     DEV     INPUT
  32    3.033   0.0032  0.0025  0.0325  'http://www.linux.com/'
  63    3.04    0.0063  0.0053  0.0325  'http://www.thelinuxshow.com/main.php3'
  122   3.053   0.0122  0.0109  0.0327  'usd 1234.00'
  234   3.067   0.0234  0.0212  0.0328  'he said she said he said no'
RE: usd [+-]?[0-9]+.[0-9][0-9]
  MS    MAX     AVG     MIN     DEV     INPUT
  20    0.729   0.002   0.0017  0.0073  'http://www.linux.com/'
  40    0.732   0.004   0.0036  0.0073  'http://www.thelinuxshow.com/main.php3'
  63    1.748   0.0063  0.0056  0.0175  'usd 1234.00'
  82    1.751   0.0082  0.0075  0.0175  'he said she said he said no'
RE: \\b(\\w+)(\\s+\\1)+\\b
  MS    MAX     AVG     MIN     DEV     INPUT
  19    0.25    0.0019  0.0017  0.0037  'http://www.linux.com/'
  38    0.252   0.0038  0.0034  0.0038  'http://www.thelinuxshow.com/main.php3'
  62    4.961   0.0062  0.0053  0.0497  'usd 1234.00'
  81    4.963   0.0081  0.007   0.0497  'he said she said he said no'
Total time taken: 170

So.. yeah. I don't know what kind of crack the guys at manageability.org are smoking, but I can't seem to find a local vendor.

You may be interested in my VS.NET 2003 console solution which includes both the stripped down java class and the VB.NET equivalent, so you can run this test on your own machine. A few notes on the test:

  • My PC is an Athlon FX-53 (3800+), although the relative scores are all that matter. Just for fun, I'll try both versions on a few other boxes I have here, and post the results in the comments.
  • No optimizations were enabled for either the Java or .NET solutions.
  • Console apps were executed directly without a debugger. Having a debugger running will double your runtime. I check for this in the .NET version and display a warning if you have the debugger attached.
  • The timing code is a little different between the Java and .NET versions. The .NET conversion uses the QueryPerformanceCounter Windows API call to get accurate sub-millisecond timing. One side effect of this is that I have to make two passes to get all the benchmark results: the first pass times each of the 120,000 regex calls individually into an array, and the second pass times just the total execution time. You'll notice that the first pass takes twice as long; that's due to the overhead of calling QueryPerformanceCounter 120,000 times. The upside is that I provide far more accurate timing results, as you can see in the table above. I think the built in Java timer System.currentTimeMillis is kind of like the .NET DateTime.Now.Ticks, i.e., limited to a resolution of about 10ms. This was OK back in early 2002 when the Java benchmark was originally constructed, but on today's PCs, it's kind of tough to measure a single regex call with a granularity of 10ms.
  • I think there's a small bug in the original source. Notice that the innermost timing loop takes a start time before it enters the loop, then calculates the elapsed time inside the loop against that start time. This seems wrong to me, because each loop will reflect not only its time but the total time since the loop was entered. Anyway, I've preserved this "feature" in my VB.NET source so the timings will be comparable.
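The timing quirk in that last note is easier to see in code. The sketch below is Python rather than the Java or VB.NET from the benchmark, and `time_calls` is a hypothetical helper, not a port of the original loop. It records a reading after every call, either against a start time taken once before the loop (the behavior described above) or against a start time reset on each pass.

```python
import re
import time


def time_calls(pattern, text, iterations, reset_each_pass):
    """Run `iterations` regex searches and return one reading per call."""
    regex = re.compile(pattern)
    readings = []
    start = time.perf_counter()
    for _ in range(iterations):
        if reset_each_pass:
            # Per-call timing: restart the clock for every call.
            start = time.perf_counter()
        regex.search(text)
        # Without the reset, this is the time since the loop began,
        # so each reading includes every call that came before it.
        readings.append(time.perf_counter() - start)
    return readings


pattern = r"\b(\w+)(\s+\1)+\b"
text = "he said she said he said no"
cumulative = time_calls(pattern, text, 1000, reset_each_pass=False)
per_call = time_calls(pattern, text, 1000, reset_each_pass=True)
print("last reading, start taken once:      ", cumulative[-1])
print("last reading, start reset each pass: ", per_call[-1])
```

With the start time taken once, the readings only ever grow, so MAX and AVG computed from them describe the loop rather than a single call.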

Even though .NET appears to perform almost 40 percent faster than Java in this test, it's still interesting that in only 6 months since that Java benchmark was run (813ms), I can produce Java code that runs over three times as fast (266ms)! So before we put our language war hats on, consider that perhaps the real winner here is the hardware. Doh!

Posted by Jeff Atwood    30 Comments

August 27, 2004

Net.WebClient and GZip

The Net.WebClient class doesn't support HTTP compression. It won't decompress the response for you when you add the Accept-Encoding: gzip,deflate header to your request:

        Dim wc As New Net.WebClient
        '-- google will not gzip the content if the User-Agent header is missing!
        wc.Headers.Add("User-Agent", strHttpUserAgent)
        wc.Headers.Add("Accept-Encoding", "gzip,deflate")
        '-- download the target URL into a byte array
        Dim b() As Byte = wc.DownloadData(strUrl)

What you get is a gzipped array of bytes. It's pretty easy to add the missing gzip support, though. First, download the SharpZipLib and add a reference to ICSharpCode.SharpZipLib to your project. Then it's only a few more lines of code..

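        '-- size of each decompression read; 4096 is an assumed value, tune as needed
        Const intChunkSize As Integer = 4096
        '-- b is the gzipped byte array returned by DownloadData above
        Dim gz As New GZip.GZipInputStream(New MemoryStream(b))
        Dim intSizeRead As Integer
        '-- VB.NET arrays are declared by upper bound, hence the - 1
        Dim unzipBytes(intChunkSize - 1) As Byte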

        Dim OutputStream As New MemoryStream
        While True
            '-- this decompresses a chunk
            '-- remember the output will be larger than the input (one would hope)
            intSizeRead = gz.Read(unzipBytes, 0, intChunkSize)
            If intSizeRead > 0 Then
                OutputStream.Write(unzipBytes, 0, intSizeRead)
            Else
                Exit While
            End If
        End While

        '-- convert our decompressed bytestream into a UTF-8 string
        Return System.Text.Encoding.UTF8.GetString(OutputStream.ToArray)

And voila, the bandwidth, you have saved eet! How do I know this actually works? Using my network sniffer of course..
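For comparison, here is the same chunked decompression loop sketched in Python, using the standard library's gzip module instead of SharpZipLib. The helper name `gunzip_bytes` and the 4096-byte chunk size are assumptions, and the "downloaded" bytes are produced locally so the example runs without a network connection.

```python
import gzip
import io


def gunzip_bytes(compressed, chunk_size=4096):
    """Decompress a gzip byte string chunk by chunk and decode it as UTF-8."""
    output = io.BytesIO()
    with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as gz:
        while True:
            # Each read returns up to chunk_size decompressed bytes,
            # and an empty bytes object at end of stream.
            chunk = gz.read(chunk_size)
            if not chunk:
                break
            output.write(chunk)
    return output.getvalue().decode("utf-8")


# Stand-in for a downloaded response body: compress some text locally.
body = gzip.compress("<html>hello, hello, hello</html>".encode("utf-8"))
print(gunzip_bytes(body))
```

Like the VB.NET version, this only handles gzip. A server that answers the same request with deflate would need a different decoder.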

Posted by Jeff Atwood    8 Comments

August 26, 2004

Sniff this!

I've occasionally used network sniffers in the past, but with the rise of REST, XML, SOAP and .NET Remoting in the last year, sniffing has become an essential part of my development toolkit. I've evaluated a bunch of network sniffers, including the excellent open-source Ethereal, but the one I keep coming back to is Etherdetect:

[screenshot of the EtherDetect application]

Etherdetect isn't free, and it isn't perfect, but it offers the best blend of functionality and ease of use that I've found. Peeking behind the scenes at network traffic has solved some tough performance and debugging problems in our .NET apps. Highly recommended.

One tip: you typically can't sniff traffic going to localhost, at least not without some special workarounds; the loopback TCP/IP stack behaves very differently than the "normal" network paths. Also, you'll need the latest WinPcap libraries installed, particularly if you have a hyperthreading CPU.

Posted by Jeff Atwood    3 Comments

August 25, 2004

Building Unbreakable Links

I was reading through some of the DataGrid Girl's oh-so-cute article links, and I encountered a few dead ones. It's not really Marcie's fault; dead links are inevitable on any page as it ages. Such is the nature of absolute links. For example, this one:

http://msdn.microsoft.com/msdnmag/issues/02/03/cutting/cutting0203.asp

A few years ago, I had this thought: why do we use traditional absolute URLs any more? Why not build all of our links using relative Google search terms? For the above broken link, we can restate it like so:

http://www.google.com/search?q=msdnmag+asp.net+data+shaping&btnI=1

All you need to do is run a quick function to determine three or four of the most unique words on the page, then feed them to Google as query terms with the "I'm feeling lucky" parameter. Now you have a permanent, unbreakable link to that content. Well, permanent unless Google goes out of business, or the content disappears from the internet completely.
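Building the link itself is the easy half. A minimal Python sketch, assuming you already have the words in hand (`lucky_link` is a made-up name; the URL and the btnI parameter are the ones shown above):

```python
from urllib.parse import urlencode


def lucky_link(words):
    """Build a Google "I'm feeling lucky" URL from a list of search words."""
    # urlencode joins the words with "+" and escapes anything unsafe.
    query = urlencode({"q": " ".join(words), "btnI": "1"})
    return "http://www.google.com/search?" + query


print(lucky_link(["msdnmag", "asp.net", "data", "shaping"]))
```

Fed the four words from the example, this reproduces the restated link above.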

Of course, it's unlikely everyone will adopt this approach for the most obvious reason: Google would become unbelievably powerful. They would be the "link DNS" for the entire internet. But as a practical solution to Marcie's problem, I think it is totally workable. Whenever I link to articles in my code, I try to do so through very specific Google search terms, which are likely to produce valid links many years from now-- even if the content moves to a different place on the internet.

All I need is some sort of web-based tool to automatically parse a page and produce 4-5 unique words from that page. It's sort of like googlewhacking, but with a more practical bent.

Posted by Jeff Atwood    4 Comments

August 24, 2004

Why aren't my optimizations optimizing?

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." - Donald Knuth

Michael Teper's blog has a great post about a bread and butter optimization scenario involving string replacement. After implementing three logical alternatives, Mike looks at the benchmark runs and asks,

Why aren't my optimizations optimizing?

Optimizing code is a tricky business. I would have tried the exact same things-- probably in the same order. Many times approaches I just assume will be "faster" turn out not to be. That's why I tell developers, always measure performance. Never assume anything will be faster or slower until you've actually measured it to be so-- you'll be surprised how often your assumptions are wrong. Unfortunately, sometimes the way you measure performance can even be flawed. That's what revealed that Mike's third optimization was, in fact, an optimization:

it turns out that Replace is only fast when the input string does not contain the string (or character) that is intended for replacement. When the string does contain it, the performance of CleanString class drops, and, as expected, the character array exhibits better perf.
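The lesson generalizes: benchmark both the input that contains the target and the input that doesn't. Here is a rough Python sketch of that shape. It is not Mike's CleanString code, the two functions are hypothetical stand-ins, and Python's str.replace has its own performance profile, so expect different numbers than .NET gives.

```python
import timeit


def clean_with_replace(text, bad, good):
    """Library call, analogous to String.Replace."""
    return text.replace(bad, good)


def clean_with_list(text, bad, good):
    """Character-by-character rebuild, analogous to a character array."""
    return "".join(good if ch == bad else ch for ch in text)


# Two test cases: one where nothing needs replacing, one where a lot does.
cases = {
    "no match": "abcdefghij" * 100,
    "many matches": "a'b'c'd'e'" * 100,
}

for label, text in cases.items():
    # Both approaches must agree before their timings mean anything.
    assert clean_with_replace(text, "'", "_") == clean_with_list(text, "'", "_")
    for fn in (clean_with_replace, clean_with_list):
        seconds = timeit.timeit(lambda: fn(text, "'", "_"), number=200)
        print(f"{label:13} {fn.__name__:18} {seconds:.4f}s")
```

Timing only the "no match" case would have told half the story, which is exactly the trap described above.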

If you must optimize, make sure you're benchmarking valid test cases, with a reasonable set of test data, to ensure that you actually have an improvement. And before "improving" anything, take the optimization rules of M.A. Jackson to heart:

Rules of Optimization:

Rule 1: Don't do it.
Rule 2 (for experts only): Don't do it yet.

And I would add a third: don't optimize work that doesn't have to be done. Don't get me wrong, performance is incredibly important...

The basic advice regarding response times has been about the same for almost thirty years [Miller 1968; Card et al. 1991]:
  • 0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.
  • 1 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay. Normally, no special feedback is necessary during delays of more than 0.1 but less than 1.0 second, but the user does lose the feeling of operating directly on the data.
  • 10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect.

... but so is having a functioning, stable system. It's up to you to decide how to balance that. For more, there's an excellent treatment of this topic in chapter 9 of Programming Pearls, and Microsoft Performance Blogger Rico is a fun (and .NET specific) read as well.

Posted by Jeff Atwood    0 Comments

August 23, 2004

Showstopper!

A friend of mine recently returned the book Showstopper! after an extended loan. If you haven't heard of this book, allow me to quote the Amazon.com editorial summary:

Showstopper! is a vivid account of the creation of Microsoft Windows NT, perhaps the most complex software project ever undertaken. It is also a portrait of David Cutler, NT's brilliant and, at times, brutally aggressive chief architect.

Cutler surely ranks as one of the most impressive software engineers the field has ever produced. After leading the team that created the VMS operating system for Digital's VAX computer line--an accomplishment that most would regard as a lifetime achievement--he went on to conceive and lead the grueling multi-year project that ultimately produced Windows NT. Both admired and feared by his team, Cutler would let nothing stand in the way of realizing his design and often clashed with his programmers, senior Microsoft management, and even Gates himself.

I hadn't looked at this book since I originally read it in 1996, and I found myself casually skimming through it, eventually re-reading the entire thing. It's a critical part of Microsoft's history. Think about where Microsoft would be if the NT project, which began way back in 1988, had failed. Can you imagine running some variant of Windows 95 today?

It's also interesting to note that nobody is writing new operating systems any more. The world has devolved into UNIX and NT camps exclusively. Without NT, I think we'd all be running UNIX at this point, for better or worse. It certainly happened to Apple; their next-generation Copland OS never even got off the ground. And now they're using OS X which is based on Unix. There are some uncanny observations in the book that foreshadow this divide:

Besides, NT would still meet the goals closest to Cutler's heart: portability, reliability, and the ability to provide an alternative to Unix, the splintered high-end operating program.

The last goal was crucial to Cutler. "Unix is like Cutler's lifelong foe," said one team member who'd worked with Cutler for nearly two decades. "It's like his Moriarty [Sherlock Holmes's nemesis]. He thinks Unix is a junk operating system designed by a committee of Ph.D.s. There's never been one mind behind the whole thing, and it shows, so he's always been out to get Unix. But this is the first time he's had the chance."

In many ways, the story of Windows NT is the story of Dave Cutler: he comes across as the Ted Nugent anti-hero of software architects. There are some very amusing anecdotes in the book about his gonzo management style:

In truth, nobody worried about Rashid's etiquette. Of all people, Cutler deserved indelicate treatment. Other Microsoft leaders viewed him as a bully. One senior executive usually responded to a Cutler complaint with the succinct statement, "Fuck Dave." When asked why, the executive excused his boorishness with the reply, "Cutler tells me to fuck off all the time."

Cutler keeps an incredibly low profile today, which is strange for an architect of his stature. You won't find many interviews or articles about him. In fact, he still works at Microsoft today, and he was a key reason the 64-bit version of Windows XP even exists in the face of lackluster Intel support for 64-bit x86 extensions.

There are some interesting themes in the book that emerged after a second reading:

  • Eating your own dogfood. I've long been a proponent of this technique. Dogfooding keeps you honest. NT development was perhaps the ultimate dogfood scenario: developing a new OS using the current build of that OS.

  • The importance of R&D. By the time NT was truly viable on the desktop (Windows 2000), it was ten years after the initial 1989 design spec. This speaks volumes about strategic direction and R&D: if large corporations aren't actively planning ten years out, they're probably not going to last very long. Nathan Myhrvold presents a document to Bill Gates on page 31 that outlines the risk of Unix, portable code, and RISC-- all "DOS killers"-- that was absolutely prophetic in hindsight.

  • Process vs. People. It's shocking how little formal process was involved in the development of NT. Microsoft didn't really manage much at all: they just chose to build the company with the smartest people they could find and let them figure it out. This may sound surprising, but it clearly worked for NT, a project of almost unimaginable complexity. More supporting data on this can be found in McConnell's Quantifying Soft Factors editorial.

  • The importance of senior architectural oversight. Cutler goes to great lengths to prevent people from optimizing for x86 in the early development of NT, despite the intense pressure to do so for performance reasons. He intuitively knew that sacrificing portability this early would cripple the future design of the OS. Although, ironically, there's nothing left but x86-- the Alpha, MIPS, and PPC versions of NT were all discontinued due to lack of market demand-- the NT kernel has evolved and survived, and now lives on the desktops of millions of everyday users, not just "power users".

  • If it sounds like a bad idea, it probably is. E.g., Cairo. This was supposed to be Jim Allchin's "vision" for the next version of NT, what ultimately became NT 4.0. What the hell was Microsoft thinking? If you can't explain what you plan to do in practical, meaningful terms-- you're probably full of crap. I can certainly empathize with Dave's skepticism about Cairo, and in retrospect, he was correct. Cairo never went anywhere.

One of the last things Dave Cutler mentions in the book resonated with me:

The end of a project was always a difficult time for him. He always pushed to outdo himself, never lingering for long over his achievements and eschewing any examination of his motives and psychology. "My motivation is I like to do this stuff. I just like to do this stuff," he said. "I like to get [my code] done and see it work." Rather than monumental, his concept was Sisyphean. He dared not speculate about the benefit of his labors for society. Nor did he concern himself with his place in the history of technology. He only looked forward, abolishing the past as he went on. "This isn't the end," he said. "Ten years from now we'll be designing another system, and everyone will be sitting around bemoaning that it will have to be compatible with NT. That will happen."

I am not so sure. Unix is 30 years old and would unquestionably rule today's desktop if not for the existence of NT. Is it unreasonable to expect the NT kernel to last as long? In fact, I think it's possible we may not see another "from the ground up" OS developed in our lifetimes.

Posted by Jeff Atwood    16 Comments

August 22, 2004

HTTP Compression and IIS 6.0

HTTP compression is the ultimate no-brainer. The network is really slow, and CPU time is effectively free and getting faster and, uh, "free-er" every day. Compression typically reduces plaintext size by 75 percent: that quadruples your throughput! Every website should be serving up HTTP compressed pages to clients that can accept it. The client indicates ability to accept compressed contents in the request headers:

GET /blog/index.xml HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Host: www.codinghorror.com
Connection: Keep-Alive
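What the server does with that Accept-Encoding header can be sketched in a few lines of Python. This is an illustration only, not how IIS implements it: `accepts_gzip` and `encode_body` are hypothetical helpers, the `*` wildcard is ignored for brevity, and the sample page is artificially repetitive, so it compresses far better than a real page would.

```python
import gzip


def accepts_gzip(accept_encoding):
    """Return True if an Accept-Encoding header value allows gzip."""
    for item in accept_encoding.split(","):
        coding, _, params = item.partition(";")
        if coding.strip().lower() != "gzip":
            continue
        name, _, value = params.partition("=")
        if name.strip().lower() == "q":
            # A quality value of 0 means the client refuses this coding.
            try:
                return float(value) > 0
            except ValueError:
                return False
        return True
    return False


def encode_body(body, accept_encoding):
    """Return (headers, payload), compressing only when the client allows it."""
    if accepts_gzip(accept_encoding):
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body


page = b"<p>The quick brown fox jumps over the lazy dog.</p>\n" * 200
headers, payload = encode_body(page, "gzip, deflate")
print(headers, len(page), "->", len(payload), "bytes")
```

A client that sends no Accept-Encoding header, or only deflate, gets the original bytes back untouched.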

HTTP compression was available in IIS 5.0, but it was also horribly broken. I know that's a link from a vendor selling a competing product, but I can personally vouch for this-- it sucked. Don't enable compression in IIS 5.0. It's not worth the pain it will inevitably cause you. Fortunately, there is an alternative-- the free FlatCompression ISAPI filter. It's not very sophisticated. All outgoing content of the specified MIME type(s) is blindly compressed in real time with no caching, so it's ideal for sites with mostly dynamic content. Most importantly: it actually works, unlike the built in IIS 5.0 compression, and it's free and open source. If you control an IIS 5.0 server, you should have the FlatCompression ISAPI filter installed.

One of the things I was looking forward to in IIS 6.0 was an HTTP compression layer that actually worked. I thought I had HTTP compression enabled correctly in IIS 6.0 in the Properties, Service tab, but after looking at some sniffer traces.. not quite. I followed a few walkthroughs, such as the excellent Enabling HTTP Compression in IIS 6.0, and I was still getting spotty results. A few observations on my troubleshooting:

  • Adding the IIS compression filter .dll to the extension manager made absolutely no difference on my server. I'm not sure why people think they need to do that; it has no effect for me, and I tried it both ways a few times.
  • Despite what the MS documentation says, the metabase filename extension lists are not space delimited! They are CR/LF delimited.
  • You must restart IIS to get it to reload any changes you've made to the metabase.
  • There appear to be some non-obvious metabase entries that will prevent compression of script output.
  • Setting the file extensions to "blank" does not cause IIS to compress all content as specified in the documentation.

Getting a variety of static content extensions to compress was easy, but I had an absolute rip of a time getting dynamic script output to compress. You know, .cgi (Perl), .aspx, .asmx, etcetera. I followed every suggestion out there with no joy-- the sniffer kept showing uncompressed dynamic output coming back, but all the static files I tried came back compressed just fine. I'm still not sure which metabase settings I changed to get it to work, so I will post the current working version of the relevant IIS metabase sections in their entirety:

<IIsCompressionScheme	Location ="/LM/W3SVC/Filters/Compression/deflate"
		HcCompressionDll="%windir%\system32\inetsrv\gzip.dll"
		HcCreateFlags="0"
		HcDoDynamicCompression="TRUE"
		HcDoOnDemandCompression="TRUE"
		HcDoStaticCompression="TRUE"
		HcDynamicCompressionLevel="10"
		HcFileExtensions="htm
			html
			xml
			css
			txt
			rdf
			js"
		HcOnDemandCompLevel="10"
		HcPriority="1"
		HcScriptFileExtensions="asp
			cgi
			exe
			dll
			aspx
			asmx"
	>
</IIsCompressionScheme>
<IIsCompressionScheme	Location ="/LM/W3SVC/Filters/Compression/gzip"
		HcCompressionDll="%windir%\system32\inetsrv\gzip.dll"
		HcCreateFlags="1"
		HcDoDynamicCompression="TRUE"
		HcDoOnDemandCompression="TRUE"
		HcDoStaticCompression="TRUE"
		HcDynamicCompressionLevel="10"
		HcFileExtensions="htm
			html
			xml
			css
			txt
			rdf
			js"
		HcOnDemandCompLevel="10"
		HcPriority="1"
		HcScriptFileExtensions="asp
			cgi
			exe
			dll
			aspx
			asmx"
	>
</IIsCompressionScheme>
<IIsCompressionSchemes	Location ="/LM/W3SVC/Filters/Compression/Parameters"
		HcCacheControlHeader="max-age=86400"
		HcCompressionBufferSize="8192"
		HcCompressionDirectory="%windir%\IIS Temporary Compressed Files"
		HcDoDiskSpaceLimiting="FALSE"
		HcDoDynamicCompression="TRUE"
		HcDoOnDemandCompression="TRUE"
		HcDoStaticCompression="TRUE"
		HcExpiresHeader="Wed, 01 Jan 1997 12:00:00 GMT"
		HcFilesDeletedPerDiskFree="256"
		HcIoBufferSize="8192"
		HcMaxDiskSpaceUsage="99614720"
		HcMaxQueueLength="1000"
		HcMinFileSizeForComp="1"
		HcNoCompressionForHttp10="FALSE"
		HcNoCompressionForProxies="FALSE"
		HcNoCompressionForRange="FALSE"
		HcSendCacheHeaders="FALSE"
	>
</IIsCompressionSchemes>

The last things I tried were modifying the HcNoCompression* settings, and turning HcDoStaticCompression on for gzip. It's likely one of those.

I enabled HTTP compression for the .cgi script filetype, which covers the Perl scripts that Movable Type uses, and the interface was just blasting on the screen after I did that. It's truly amazing how much faster pages appear to load, even over a 100baseT local network, with HTTP compression enabled. It is dramatic. I can only imagine how much snappier pages load over a remote network.
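If you don't have a sniffer handy, a short script can tell you whether a given URL comes back compressed. This is a rough Python sketch, not a replacement for a real trace: `content_encoding` and `check_url` are hypothetical helpers, and the User-Agent string is an arbitrary placeholder.

```python
import urllib.request


def content_encoding(headers):
    """Return the lowercased Content-Encoding, or "identity" if absent."""
    for name, value in headers.items():
        if name.lower() == "content-encoding":
            return value.strip().lower()
    return "identity"


def check_url(url):
    """Request `url` advertising gzip support and report what came back."""
    request = urllib.request.Request(url, headers={
        "Accept-Encoding": "gzip, deflate",
        # Some servers skip compression when no User-Agent is sent.
        "User-Agent": "Mozilla/4.0 (compatible)",
    })
    with urllib.request.urlopen(request) as response:
        # urllib does not decompress the body, so this header reflects
        # what actually went over the wire.
        return content_encoding(response.headers)


# The header check on its own, without touching the network:
print(content_encoding({"Content-Type": "text/html", "Content-Encoding": "gzip"}))
print(content_encoding({"Content-Type": "text/html"}))
```

Running check_url against one static file and one script URL on the same server is a quick way to spot the static-works-but-dynamic-doesn't situation described above.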

Posted by Jeff Atwood    36 Comments
Content (c) 2009 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.