Net.WebClient and Deflate

January 12, 2005

In a previous entry, Net.WebClient and Gzip, I posted a code snippet that enables the missing HTTP compression in Net.WebClient, using the always handy SharpZipLib.

This code eventually made it into one of my CodeProject articles. An eagle-eyed CodeProject reader noted that, while my code worked for gzip compression, it failed miserably for websites that use deflate compression. This is a case of be careful what you ask for:

        Dim wc As New Net.WebClient
        '-- google will not gzip the content if the User-Agent header is missing!
        wc.Headers.Add("User-Agent", strHttpUserAgent)
        wc.Headers.Add("Accept-Encoding", "gzip,deflate")
        '-- download the target URL into a byte array
        Dim b() As Byte = wc.DownloadData(strUrl)

99% of the time, you'll get a gzipped array of bytes back from that request. For whatever reason, deflate compression is extremely rare on the open internet. The same reader also helpfully provided a URL that uses deflate: Redline Networks. So that was my test case. Although SharpZipLib supports deflate compression, I had difficulty getting it to work using the provided inflater stream class. And since it's such a rare case, I couldn't find any working code samples.
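Because the server picks the encoding, the response's Content-Encoding header is what tells you which one actually came back. The sketch below maps that header value to an enum. The members of HttpContentEncoding are an assumption (the enum's definition does not appear in this entry), and ParseContentEncoding is a hypothetical helper name.

```vbnet
Imports System

Module ContentEncodingSketch

    ' Assumed shape of the enum; the original definition is not shown
    ' in this entry, so these members are a guess.
    Public Enum HttpContentEncoding
        None
        Gzip
        Deflate
        Compress
        Unknown
    End Enum

    ' Hypothetical helper: maps a Content-Encoding header value to the enum.
    ' A missing or empty header means the body was not compressed.
    Public Function ParseContentEncoding(ByVal headerValue As String) As HttpContentEncoding
        If headerValue Is Nothing OrElse headerValue.Trim().Length = 0 Then
            Return HttpContentEncoding.None
        End If
        Select Case headerValue.Trim().ToLower()
            Case "gzip", "x-gzip"
                Return HttpContentEncoding.Gzip
            Case "deflate"
                Return HttpContentEncoding.Deflate
            Case "compress", "x-compress"
                Return HttpContentEncoding.Compress
            Case Else
                Return HttpContentEncoding.Unknown
        End Select
    End Function

    Sub Main()
        ' ToString() is explicit so the member name is printed, not its number
        Console.WriteLine(ParseContentEncoding("deflate").ToString())
        Console.WriteLine(ParseContentEncoding(Nothing).ToString())
    End Sub

End Module
```

After the DownloadData call, the header value should be available as wc.ResponseHeaders("Content-Encoding"), so the result of ParseContentEncoding on that value is what you would pass along to a decompression routine.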

In desperation-- my OCD prohibits me from letting that last 1% case go-- I turned to the only relevant Google result I could find, which happens to be on the SharpZipLib community forum. Jfreilly provided an answer within a day! Problem solved. He also maintains a very nice SharpZip Library FAQ. Kudos to you, sir.

    ''' <summary>
    ''' decompresses a compressed array of bytes 
    ''' via the specified HTTP compression type
    ''' </summary>
    Private Function Decompress(ByVal b() As Byte, _
      ByVal CompressionType As HttpContentEncoding) As Byte()

        Dim s As Stream
        Select Case CompressionType
            Case HttpContentEncoding.Deflate
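                '-- Inflater(True) = no header: treat the data as a raw
                '-- deflate stream with no zlib header or checksum
                s = New Zip.Compression.Streams.InflaterInputStream( _
                    New MemoryStream(b), _
                    New Zip.Compression.Inflater(True))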
            Case HttpContentEncoding.Gzip
                s = New GZip.GZipInputStream(New MemoryStream(b))
            Case Else
                Return b
        End Select

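        Dim ms As New MemoryStream
        Const intChunkSize As Integer = 2048

        Dim intSizeRead As Integer
        '-- VB array bounds are inclusive, so this is exactly intChunkSize bytes
        Dim unzipBytes(intChunkSize - 1) As Byte
        '-- copy the decompressed data out in chunks until the stream is empty
        While True
            intSizeRead = s.Read(unzipBytes, 0, intChunkSize)
            If intSizeRead > 0 Then
                ms.Write(unzipBytes, 0, intSizeRead)
            Else
                Exit While
            End If
        End While
        s.Close()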

        Return ms.ToArray
    End Function
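The True passed to the Inflater constructor above tells SharpZipLib to expect raw deflate data with no zlib wrapper. Servers are not consistent about this: some send raw deflate, others wrap it in a two-byte zlib header plus a trailing checksum. As a rough sketch, and not something from the original fix, a helper like the hypothetical LooksLikeZlibHeader below can guess which variant you received by checking the two header bytes as RFC 1950 defines them. It is a heuristic only, since raw deflate data could pass the check by coincidence.

```vbnet
Imports System

Module DeflateHeaderSketch

    ' Hypothetical helper: returns True when the first two bytes look like
    ' a zlib (RFC 1950) header. Heuristic only; raw deflate data can
    ' occasionally pass this check by chance.
    Public Function LooksLikeZlibHeader(ByVal b() As Byte) As Boolean
        If b Is Nothing OrElse b.Length < 2 Then Return False
        Dim cmf As Integer = b(0)
        Dim flg As Integer = b(1)
        ' The low four bits of the first byte are the compression method;
        ' 8 means deflate
        If (cmf And &HF) <> 8 Then Return False
        ' The two bytes, read as one big-endian number, must be a multiple of 31
        Return ((cmf * 256 + flg) Mod 31) = 0
    End Function

    Sub Main()
        ' &H78 &H9C is a common zlib header at the default compression level
        Console.WriteLine(LooksLikeZlibHeader(New Byte() {&H78, &H9C}))
    End Sub

End Module
```

If the check passes, constructing the Inflater with False (header expected) instead of True would be the thing to try, assuming the rest of Decompress stays the same.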

There is also a mysterious, third kind of HTTP compression, compress. Ok, it's not all that mysterious, but nobody seems to use it. What's up with that?

Posted by Jeff Atwood
6 Comments

deflate is the original Zip compression, unpatented and released to the public domain.

gzip is the same algorithm (IIRC) but the format has been made streamable and with less overhead (since it only contains one streamed file).

compress is an older, usually not-as-tight, and until last year patented compression algorithm.

Erich Stehr on January 26, 2005 1:25 AM

Thanks for the post. You've just saved me hours of work ... and probably some hair! ;)

Cheers, Pete

lad4bear on November 20, 2005 10:02 AM

Excellent work. Saved me hours of hassle

barrybevel on December 14, 2006 6:05 AM

I know this is old, but I thought I'd clarify a misconception by Erich. gzip is the same algorithm, but there's more overhead to it, as it adds a CRC.

Daniel on January 22, 2009 7:40 AM

Nice article. Quite useful.

John on May 3, 2009 12:54 PM

Perhaps someone would be interested to see more examples. http://johnwilliams.eu

John on May 3, 2009 12:55 PM
