I learned to appreciate the value of the Cyclic Redundancy Check (CRC) algorithm in my 8-bit, 300 baud file transferring days. If the CRC of the local file matched the CRC stored in the file (or on the server), I had a valid download. I also learned a little bit about the pigeonhole principle when I downloaded a file with a matching CRC that was corrupt! An 8-bit CRC only has 256 possible values, after all.
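To make the pigeonhole point concrete, here is a minimal sketch in VB.NET. It assumes one common CRC-8 variant (polynomial &H07, initial value 0), which is not necessarily the variant any particular transfer protocol used; the module name, the Crc8 function, and the "file-N" inputs are made up for illustration. With only 256 possible results, 257 distinct inputs must produce at least one collision:

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text

Module Crc8Demo
    ' Bitwise CRC-8, polynomial &H07, initial value 0 (an assumed variant).
    Function Crc8(data As Byte()) As Byte
        Dim crc As Integer = 0
        For Each b As Byte In data
            crc = crc Xor b
            For bit As Integer = 1 To 8
                If (crc And &H80) <> 0 Then
                    crc = ((crc << 1) Xor &H7) And &HFF
                Else
                    crc = (crc << 1) And &HFF
                End If
            Next
        Next
        Return CByte(crc)
    End Function

    Sub Main()
        ' Maps each CRC value seen so far to the input that produced it.
        Dim seen As New Dictionary(Of Byte, String)
        ' 257 distinct inputs, but only 256 possible CRC values.
        For n As Integer = 0 To 256
            Dim label As String = "file-" & n.ToString()
            Dim crc As Byte = Crc8(Encoding.ASCII.GetBytes(label))
            If seen.ContainsKey(crc) Then
                Console.WriteLine("{0} and {1} share CRC-8 value {2}", seen(crc), label, crc)
                Return
            End If
            seen.Add(crc, label)
        Next
    End Sub
End Module
```

The loop is guaranteed to print a colliding pair before it runs out of inputs, although in practice the first collision tends to show up long before the 257th input.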
Checksums are somewhat analogous to filesystem "fingerprints"-- no two should ever be alike, and any modification to the file should change the checksum. But checksums are unsuitable for any kind of security work:
CRCs cannot be safely relied upon to verify data integrity (that no changes whatsoever have occurred), since it's extremely easy to intentionally change data without modifying its CRC.
That's probably because CRC is a simple algorithm designed for speed-- not security. As I discovered, a checksum is really just a specific kind of hash. Steve Friedl's Illustrated Guide to Cryptographic Hashes is an excellent, highly visual introduction to the more general theory behind hashing. The .NET framework provides a few essential security-oriented hashing algorithms in the System.Security.Cryptography namespace:
As far as I can tell, there are only three algorithm families represented here: DES, MD5, and SHA. Strictly speaking, DES is a cipher rather than a hash; it shows up in this list as MACTripleDES, a keyed hash built on top of Triple DES. SHA is available in a few different sizes, and bigger is better: every extra bit doubles the number of possible hash values and thus reduces the pigeonhole effect. It also doubles the number of brute-force attempts one would theoretically need to make in an attack.
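To see the size differences, here is a rough sketch that runs the same string through several of the classes in System.Security.Cryptography and prints each algorithm's output size. The module name and the HexDigest and Report helpers are my own illustrative names, not part of the framework, and encoding the string as UTF-8 is an arbitrary choice:

```vb
Imports System
Imports System.Security.Cryptography
Imports System.Text

Module HashSizes
    ' Hex-encodes the digest of a UTF-8 string under the given algorithm.
    Function HexDigest(hasher As HashAlgorithm, message As String) As String
        Dim digest As Byte() = hasher.ComputeHash(Encoding.UTF8.GetBytes(message))
        Return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant()
    End Function

    ' Prints the algorithm name, its output size in bits, and the digest.
    Sub Report(name As String, hasher As HashAlgorithm, message As String)
        Console.WriteLine("{0,-7}{1,4} bits  {2}", name, hasher.HashSize, HexDigest(hasher, message))
        hasher.Dispose()
    End Sub

    Sub Main()
        Dim message As String = "Hash browns"
        Report("MD5", MD5.Create(), message)
        Report("SHA1", SHA1.Create(), message)
        Report("SHA256", SHA256.Create(), message)
        Report("SHA512", SHA512.Create(), message)
    End Sub
End Module
```

HashSize reports 128, 160, 256, and 512 bits respectively, so the hex digests come out at 32, 40, 64, and 128 characters.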
However, if all you need to do is tell two things apart, you don't need fancy security hashes. Just use the humble GetHashCode method:
Dim s As String = "Hash browns"
Console.WriteLine(s.GetHashCode)
I'm not clear exactly which algorithm was used to generate this hash, but I'm sure it's at least as good as my CRC32 class.
I hear more hashing algorithms will be introduced with .NET 2.0. I'd like to see CRC32 in there at the very least. For an interactive demonstration of the 13 most popular hash algorithms, I recommend SlavaSoft's HashCalc.
Anecdotally, System.String.GetHashCode seemed to produce a different result in .Net 2.0 Beta 1 from 1.1.
Yeah, the BCL guys said up front they reserve the right to break compatibility on hash code values between versions of the runtime. They're improving the hashing algorithms for better distributions, among other things.
Jeff Atwood on April 7, 2005 2:23 AM
I built an RSS reader that relied on string combination and hashes to identify items it had already seen, and changing framework versions seemingly changed every hash code in use, so I had a lot of unread items post-upgrade...
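One way around this kind of breakage is to persist a hash whose output is fixed by a published specification rather than by the runtime. The following is only a sketch under assumptions: StableItemId, its two parameters, and the example feed URL are hypothetical, and SHA-256 is just one reasonable choice of algorithm:

```vb
Imports System
Imports System.Globalization
Imports System.Security.Cryptography
Imports System.Text

Module StableIds
    ' Hypothetical helper: derives a persistent ID for a feed item from its fields.
    Function StableItemId(feedUrl As String, itemTitle As String) As String
        ' Length-prefix the first field so ("ab", "c") and ("a", "bc") combine differently.
        Dim prefix As String = feedUrl.Length.ToString(CultureInfo.InvariantCulture)
        Dim combined As String = prefix & ":" & feedUrl & itemTitle
        Using sha As SHA256 = SHA256.Create()
            Dim digest As Byte() = sha.ComputeHash(Encoding.UTF8.GetBytes(combined))
            Return BitConverter.ToString(digest).Replace("-", "")
        End Using
    End Function

    Sub Main()
        Dim id As String = StableItemId("http://example.com/feed", "Checksums and Hashes")
        Console.WriteLine(id)
        Console.WriteLine(id.Length)
    End Sub
End Module
```

The ID is 64 hex characters and depends only on the input bytes, so it should survive a framework upgrade. GetHashCode remains fine for in-memory uses such as dictionary lookups, where the value never outlives the process.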
Some posts about what GetHashCode does internally, from MS folks.
http://blogs.msdn.com/brada/archive/2003/10/06/50434.aspx
http://blogs.msdn.com/bclteam/archive/2003/10/31/49719.aspx
Bear in mind that System.String.GetHashCode produces a "true" hash, whereas Object.GetHashCode provides a "stable" random number, which clearly isn't a hash.
http://blogs.gotdotnet.com/BradA/commentview.aspx/b688ad81-1642-4a4b-bff8-a9fdb985fbbc