Don't Use ZIP, Use RAR

February 22, 2007

When I wrote Today is "Support Your Favorite Small Software Vendor Day", I made a commitment to spend at least $20 per month supporting my fellow independent software developers. WinRAR has become increasingly essential to my toolkit over the last year, so this month, I'm buying a WinRAR license.

Sure, ZIP support is built into most operating systems, but the support is rudimentary at best. I particularly dislike the limited "compressed folder wizard" I get by default in XP and Vista. In contrast, WinRAR is full-featured, powerful, and integrates seamlessly with the shell. There's a reason WinRAR won the best archive tool roundup at DonationCoder. And WinRAR is very much a living, breathing piece of software. It's frequently updated with neat little feature bumps and useful additions; two I noticed over the last year were dual-core support and real-time stats while compressing, such as estimated compression ratio and predicted completion time.

WinRAR fully supports creating and extracting ZIP archives, so choosing WinRAR doesn't mean you'll be forced into using the RAR compression format. But you should use it, because RAR, as a compression format, clobbers ZIP. It produces much smaller archives in roughly the same time. If you're worried the person on the receiving end of the archive won't have a RAR client, you can create a self-extracting executable archive (or SFX) at a minimal cost of about 60 KB additional filesize.

RAR also supports solid archives, so it can exploit inter-file redundancies. ZIP compresses each file independently, so it cannot. This is a big deal, because it can result in a substantially smaller archive when you're compressing a lot of files. When I compressed all the C# code snippets, the difference was enormous:

ZIP  229 KB
RAR   73 KB
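
To see why compressing files together helps, here is a minimal Python sketch. It is not RAR: it uses zlib (DEFLATE, the algorithm family ZIP normally uses) and simply compares compressing each file on its own against compressing all files as one continuous stream. The snippet contents are hypothetical stand-ins, not the actual C# snippets measured above.

```python
import zlib

def per_file_size(files):
    """Total bytes when every file is compressed on its own (ZIP-like)."""
    return sum(len(zlib.compress(data, 9)) for data in files)

def solid_size(files):
    """Bytes when all files are compressed as one continuous stream (solid-like)."""
    return len(zlib.compress(b"".join(files), 9))

# Hypothetical stand-ins for many small, similar source files.
snippets = [
    (
        "using System;\n"
        "public class Snippet%d {\n"
        "    public static void Main() {\n"
        "        Console.WriteLine(\"snippet %d\");\n"
        "    }\n"
        "}\n" % (i, i)
    ).encode("utf-8")
    for i in range(200)
]

print("per-file:", per_file_size(snippets), "bytes")
print("solid:   ", solid_size(snippets), "bytes")
```

The single-stream total comes out far smaller because the compressor can refer back to text it has already seen in earlier files. The trade-off is that extracting one file from a solid archive generally means decompressing everything that precedes it.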

But even in an apples-to-apples comparison, RAR offers some of the very best "bang for the byte" of all compression algorithms. Consider this recent, comprehensive multiple file compression benchmark. The author measured both compression size and compression time to produce an efficiency metric:

The most efficient (read: useful) program is calculated by multiplying the compression time (in seconds) it took to produce the archive with the power of the archive size divided by the lowest measured archive size.

2 ^ (((Size / SmallestSize) - 1) / 0.1) * ArchiveTime

The lower the score, the better. The basic idea is a compressor X has the same efficiency as compressor Y if X can compress twice as fast as Y and resulting archive size of X is 10% larger than size of Y.
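
As a quick check of that equivalence, here is a small Python sketch of the formula. The function name and the two compressors are hypothetical, chosen only to match the "twice as fast, 10% larger" example; they are not taken from the benchmark data.

```python
import math

def efficiency(size, smallest_size, archive_time_s):
    """Benchmark score as quoted above; lower is better."""
    return 2 ** (((size / smallest_size) - 1) / 0.1) * archive_time_s

# Hypothetical compressors, not taken from the benchmark data.
smallest = 100.0                               # smallest archive, in MB
score_y = efficiency(100.0, smallest, 200.0)   # Y: smallest archive, 200 s
score_x = efficiency(110.0, smallest, 100.0)   # X: 10% larger, twice as fast

print(score_y, score_x)
print("equal efficiency:", math.isclose(score_x, score_y))
```

Every additional 10% of archive size doubles the score, so a compressor has to halve its time to break even.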

And sure enough, if you sort the results by efficiency, WinRAR rises directly to the top. Its scores of 1871 (Good) and 1983 (Best) rank third and fourth out of 200. The top two spots are held by an archiver I've never heard of, SBC.

WinRAR and SBC 0.970 score very well on efficiency. Both SBC and WinRAR are capable of compressing the 301 MB testset down to 82 MB [a ~73% compression ratio] in under 3 minutes. People looking for good (but not ultimate) and fast compression should have a look at those two programs.

The raw data on the comparison page is a little hard to parse, so I pulled the data into Excel and created some alternative views of it. Here's a graph of compression ratio versus time, sorted by compression ratio, for all compared archive programs:

Compression Time vs. Compression Ratio graph

What I wanted to illustrate with this graph is that beyond about 73% compression ratio, performance falls off a cliff. This is something I've noted before in previous compression studies. You don't just hit the point of diminishing returns in compression, you slam into it like a brick wall. That's why the time scale is logarithmic in the above graph. Look at the massive differences in time as you move toward the peak compression ratio:

Ratio   Time      Archiver
72.58%  02:54     WinRAR 3.62
75.24%  11:20     UHARC 0.6b
77.16%  30:38     DURILCA 0.5
78.83%  05:51:19  PAQ8H
79.70%  08:30:03  WinRK 3.0.3

Note that I cherry-picked the most efficient archivers out of this data, so this represents best case performance. Is an additional two percent of compression worth taking five times longer? Is an additional four percent worth ten times longer? Under the right conditions, possibly. But the penalty is severe, and the reward minuscule.
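
If you'd rather see the multiples than eyeball the timestamps, a short Python sketch can work them out from the five rows above. It assumes the times are mm:ss, or hh:mm:ss where three fields are given, and it uses WinRAR 3.62 as the baseline.

```python
def to_seconds(stamp):
    """Convert 'mm:ss' or 'hh:mm:ss' to seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

# (compression ratio in percent, time, archiver) from the table above.
results = [
    (72.58, "02:54", "WinRAR 3.62"),
    (75.24, "11:20", "UHARC 0.6b"),
    (77.16, "30:38", "DURILCA 0.5"),
    (78.83, "05:51:19", "PAQ8H"),
    (79.70, "08:30:03", "WinRK 3.0.3"),
]

base_ratio, base_time, _ = results[0]
base_seconds = to_seconds(base_time)

for ratio, stamp, name in results[1:]:
    gain = ratio - base_ratio                       # extra percentage points
    slowdown = to_seconds(stamp) / base_seconds     # multiple of WinRAR's time
    print(f"{name}: +{gain:.2f} points of compression, {slowdown:.1f}x the time")
```

The first two steps come out at roughly four and ten times WinRAR's time; the last two are more than a hundred times slower for about six or seven extra points.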

If you're interested in crunching the multiple file compression benchmark study data yourself, I converted it to a few different formats for your convenience:

Personally, I recommend the Excel version. I had major performance problems with the Google spreadsheet version.

After poring over this data, I'm more convinced than ever. RAR offers a nearly perfect blend of compression efficiency and speed across all modern compression formats. And WinRAR is an exemplary GUI implementation of RAR. It's almost a no-brainer. Except in cases where backwards compatibility trumps all other concerns, we should abandon the archaic ZIP format and switch to the power and flexibility of WinRAR.

Posted by Jeff Atwood
138 Comments

I have been using winRAR for a long time, which makes me think I should actually purchase a license. I don't see a point for anything else. When I deal with EUs who can't use the .exe because of e-mail, I'll have them just install winRAR, and they end up buying because they love it. It's such a cleaner install too compared to WinZip.

Luke on February 23, 2007 12:40 PM

"I wasn't too sure at first but the carping of the Free Software extremists made me decide to go out and support something proprietary today!"

No kidding. If I weren't already a long-time WinRAR customer I'd buy a license just to spite these GNU fools...

Also, WinRAR is a good choice because it's excellent across the board. You get a great compression ratio AND fast compression AND good Windows integration AND a powerful command line AND lots of useful options AND a very low price compared to WinZip -- not just one or two of these things.

Chris Nahr on February 24, 2007 1:15 AM

Jeff,

re: solid archive - should it be "inter-file" redundancies? NOT intra-file ...

Herman on February 24, 2007 3:10 AM

tar + bzip2, EOD.

Joomla Degauss on February 24, 2007 9:25 AM

first, the person who says "RAR annoys me because it is not included with windows" is a fucking idiot. second, yes winrar is indeed the best... we've been using it since around 2001

cliff on February 25, 2007 5:52 AM

Cliff, I'm sorry my little comment negatively affected your opinion of the intelligence of a TOTAL STRANGER that typed ONE LINE into a text box. This is a big, dangerous world, and people should protect you from opinions that you don't agree with.

I just think compressing files is as important as editing text files, calculating simple sums or displaying web pages. I just think it is annoying to have to go and download WinRAR when I need to unpack a compressed file. No need to call me a fucking idiot.

Joost on February 25, 2007 7:13 AM

I became a WinRAR user when I started learning Japanese and realised that I couldn't use unicode filenames with WinZIP.

Paul Coddington on February 25, 2007 9:36 AM

RAR annoys me because it is not included with Windows like ZIP is.

Joost on February 25, 2007 12:46 PM

I have worked on one of the big realtime data warehousing projects, where we were persisting a large amount of data to a database from one daemon while another daemon was supposed to read it.
At some point we tried to use a ZIP-compressed datastream, and it worked just fine for the compression. However, decompressing the data with zip took at least 6-10x as long as reading the data normally.
So we had to look into other algorithms to do it.
I guess all I am saying is that before making a decision you need to find a match between the compression algorithm, the content of your data, and your process.

ZipIsOld on February 26, 2007 2:55 AM

If you send someone a rar and they cant open it.... then chances are they dont DESERVE! to open it =P. winrar r0x0r my s0x0r!

Arudis on February 26, 2007 11:43 AM

@Adam

"So, for GUI programs he's allowed to tweak them as much as they possibly can be, but command-line programs have to use default options? Come on!"

No I only said I will use the GUI when available and select the 'best' option there (which are sometimes not the best anyway). Command line programs are also tweaked for best compression but not all combinations are tested as it will take ages for programs like 7-zip (command-line mode) and EPM to find the best switches.

So, I try to find the best possible compression for both GUI and command line programs but not at the cost of hours of tweaking. Maybe the text on the site is a bit unclear. will try to fix that.

BTW I not only test the 'best' modes but also the 'normal' modes (of the command-line and gui programs).

Werner

Werner Bergmans on April 27, 2007 6:19 AM

Very informative post Jeff.

If compression time matters more in comparison with compression size, the best archiver is not WinRAR. Consider for instance that you have to send the file over a network and you can send 1MB per second. In this case you want to minimize compression size+2^20*compression time, and the top 10 winners are (in order):

Program          Switches  Efficiency (smaller is better)
PKZIP 2.50       (none)    137569935
THOR 0.93a       exx       140586704
THOR 0.93a       (none)    141152880
AIN 2.32         (none)    141791336
PKZIP 2.50       -exx      143382475
GZIP 1.2.4       (none)    144803092
WinXP (Builtin)  (none)    145697399
ALZip 6.32       (normal)  145758694
WINZIP 10.0      Normal    148619279
ESP 1.92         (none)    148719009

Bogdan on May 12, 2007 3:59 AM

I think that THOR 0.95 with -e3 is the best!

angel on May 18, 2007 5:29 AM

Hi,
Seem to finally have found some people knowing
about RAR ...

My problem concerns a huge file (about 1 TB).
A test-file with similar content from 100 MB
was compressed with RAR at a factor +1000.

This opens possibilities to really work with
the actual file, for it would be compressed to
something around 1 GB, if the ratio remains
1000.

But I need to work IN that file, yes reading
and ... writing too.

My life is hell since then.
Please bring me back in heaven.

Dummietoo on August 15, 2007 9:43 AM

Please don't take this advice. RAR is a niche format not natively supported by any platform. Stick with zip, or maybe try gzip or bzip2 on your tarball. With any of the above, you'll get native support in at least 2 of the 3 major platforms.

Kidding? on January 15, 2008 8:37 AM

Chaps, your kidding discussion is not worth a penny -- even for newbies!

http://www.maximumcompression.com
http://www.uclc.info

romulus on March 19, 2008 5:08 AM

I have both of them, so I sometimes use .zip and other times .rar, according to the files I receive or download.

Neo1027 on March 20, 2008 1:52 PM

"I wasn't too sure at first but the carping of the Free Software extremists made me decide to go out and support something proprietary today!"

Good on you! I'll buy a bottle of wine with those $20 instead... we can share it if you want. Want red or white?

I will stick with using my free copy of Linux and my free copy of PeaZip for opening RARs. I'm not a socialist, and I'm not an extremist, just a student who thinks that there are better ways to spend his money. I was happy to buy a Photoshop licence, for example, as there is not a REAL good free alternative on Linux (GIMP is getting better but it's not there yet).
But WinRAR... seriously, it's just an archive manager. I personally don't care if it takes me a little bit more to decompress a file also because, unless you're dealing with gigantic archives (and 95% of the users surely are not), you won't notice any difference. And I don't care if I can spare a few kb on my archive. Actually, today's hard drives are big and cheap enough to keep everything uncompressed and live happy.

Free software is also about being free to choose. You are free to pay for WinRAR, I'm free not to do so if I think that's better for me.

As for the matter of "how will I decompress my archives in 20 years?"
Are you guys really planning to keep 20 years worth of backups of your data? How often do you open backups of files say... 5 years old? Not very often, probably never I would say. In 20 years the problem simply won't be there.

nico on March 22, 2008 5:05 AM

You must be devilishly insane. This way you are supporting a closed format. Why use such ridiculous formats, when there are better ones?
Examples? FLAC, MPC [music], BH, and so on [arch].
Blabber talking!

marc on May 26, 2008 5:24 AM

Good luck finding a file that you know the name of, but not what directory path it is in. Both WinRAR and 7Zip list by path first, so to find a file you have to click through all those folders to find it. WinZip may not be the best but to find a file all I have to do is sort on file name and type the first or second letter to find it quickly. I can also sort on type or path. I admit I haven't looked to see if you can change it in WinRAR, but if you can't then I will continue using WinZIP even with the larger file sizes.

Mark Bernard on May 28, 2008 8:46 AM

Why don't you compare all, I mean all, the software that compresses with KGB Archiver?

KGB Archiver is the slowest of them and the software that compresses 1 GB to 10 MB

ghost_rider on June 8, 2008 1:30 PM

I love winRAR. Not only does it have a lot of useful features and great compression rates, it also gives me something useful to use at work. If someone annoys me by asking for files and being rude about it I tend to RAR it and see if they have the initiative to find out how to open it or not :D!!

It's amazing how polite they are when they call you asking how to open the file! ^_^'

Childish I know but amusing :D

Peter on September 16, 2008 8:49 AM

A lot of people here confuse the Software with the Format. What software you use is basically irrelevant, that's nothing but personal preference. What matters is the format it puts out.

I don't mind winrar that much, it can create zip files too. If you prefer its interface over winzip or 7zip, good for you. Go buy it and set it up to spit out zip files. Fyi, there is no gui for rar on non-windows platforms, only a commandline interface, so it not only locks you to winrar, but also to windows. Which might not seem like a problem now, but it might be in a few years. A lot of people hate vista, but are you really gonna stick with XP for the next 20 years? I doubt it.

I do mind rar-format files, because they're proprietary and force people to install software from 1 single vendor (= vendor lock-in). Too bad there's people like Jeff promoting this kind of nonsense, cause it has 1 or 2 slight advantages over the free (as in 'speech' and 'beer') alternatives. People buying this kind of software force others to do the same.

As long as the specifications of the format are not freely available, it's yet another piece of vendor lock-in to me, and i'll stick with recompressing any rar files i download to 7zip or bzip2. Or better yet, download the already uncompressed version, which is often a few % smaller too, if you look at video files for example...

Mephisto on October 1, 2008 8:54 AM

The compressing era is over, and has been since 2006 .. get over it
In a time where you can easily get a 12 Mb/s ADSL line or a 100 Mb/s cable line, why do you really need to compress away 2-5% of the original size? It would have to be a big fucking file you're sending, and then it's really hard to get it over the intra-WEB
.. no more comments ..

Regards
The Protagandasist

The Protagandasist on October 7, 2008 11:16 AM

The advice to use RAR is the exact same advice that caused Word .docs to be the standard document format. Using proprietary file formats is a fool's decision for small, short-term gains at the cost of large, long-term losses.

Don't make this mistake. It's been made too many times with too many formats.

Jeff, you also forgot to consider compression/decompression asymmetry.

Chris on March 16, 2009 1:39 PM

You've convinced me!

Yaarik on July 13, 2009 2:53 AM

Thank you for sharing your experience

It really helps me a lot

I found this by searching for "ZIP RAR" in Google.

Betty on July 22, 2009 3:23 AM

Interesting post. I redid the graph to use the X and Y axes for time and ratio respectively, I find it clearer that way.

After all your praise of RAR I find it funny that you post the Excel spreadsheet as a ZIP though ;-)

http://www.isotton.com/sandbox/graph.png

Aaron Isotton on February 6, 2010 10:03 PM

As a followup: the red lines are the means, the blue lines are the medians.

(The mean is the sum divided by the number of elements, the median is the value which divides the elements into two equal sized sets).

So the algorithms in the top left part of the blue cross are the ones which are "better than most". Someone else could label them...

Aaron Isotton on February 6, 2010 10:03 PM

I have to agree with Noah Slater and all the others who insist on using a free compression algorithm.

Archives (that's valid for all file formats, but archives is what we're talking about here) are bound to stay here for a long time - maybe even decades. So saving your data in a proprietary format is the most stupid thing you can possibly do. What will you do if you want to repack a RAR in 10 years - for whatever reason? Winrar might not exist any more, and you might not have a Windows box available for the task.

What will you do *now* if you want to pack or unpack a RAR on your Sparc, Alpha or Itanium server? The answer is you can't.

File formats should be open, cross-platform and there should be at least one major open source implementation for them before I consider using them for my work. This has nothing to do with "socialism" as some seem to think; it has to do with me wanting to be able to access my data now and in the future.

I think it is great if someone writes a commercial, closed source Zip program which is twice as fast as anything else - I can buy it and use it if I want, but it will give me a standard Zip file I can work with using dozens of programs on dozens of platforms. Or I can even implement my own if I feel like. *That's* the point of open file formats.

Aaron Isotton on February 6, 2010 10:03 PM

Phil Katz, RIP :)

Another vote for ZIP until RAR is implemented on a bunch of platforms. You can still get PKZip for Windows if you don't like WinZip, among a bunch of other free Zip clients.

The RAR guys should recognize that it is in their own interests to release the RAR format so that others can write to their format. Their superior compression would actually stand a chance at the kind of ubiquitous adoption that ZIP has.

Aaron Erickson on February 6, 2010 10:03 PM

Edit...

Another vote for ZIP until RAR is implemented by **more than one vendor**.

Aaron Erickson on February 6, 2010 10:03 PM

Ugh, I used RAR for a while and while it is interesting for comparison, I'm not really interested in using yet another proprietary tool for general purpose archiving. I lost interest in how many percent one compression tool wins over another under what circumstances some time in the 90s (remember ARJ?), and the issue is moot to me. WinZip's latest version stopped working, so I just switched to 7Zip...I can extract and create Zip and BZip2 files just fine. As far as I'm concerned bz2 should be the future.

Aaron H on February 6, 2010 10:03 PM

Allan Clark had a great point: "Your code snippets, for example, will be compressed ONCE, but downloaded and decompressed MANY TIMES. Take the download time, and add to it the decompression time, and you have a more accurate number."

The point is to think about what you're compressing and why, rather than just blindly picking one based on raw speed, or GUI (ooh, look, aqua buttons!!!). If you're backing up files on your own system and plan to extract as often as you compress then by all means pick fast and shiny. If you're distributing something large then take the one time hit on compression time and save the GB's of bandwidth.

Whatever you use, anything's better than the built in Windows Compressed Files support when it comes to speed. 7-zip's compression to standard ZIP format runs around 100 times faster than compressed folders on my machine.

Jon Galloway on February 6, 2010 10:03 PM

Love the commie free software nuts. Yep, doesn't matter how good something is, don't touch it if you have to pay for it or you can't find enough information to build it yourself!

(Boooring...)

Jeff, you know that MS Office 2007 documents are zips, right?

Aaron G on February 6, 2010 10:03 PM

ZIP sucks as it lacks options (lack of solid, or varied algorithms)

RAR is great as it has features allowing you to specify compression details (ie, enable or disable sharkin/ppm compression, or go down to setting up its dictionary size), as well as practical features (such as delayed reads, which save the hd from thrashing)

7ZIP seems to be more geared towards people that truly like to experiment with their parameters. Sure, if you know what you are doing it's amazing. (ie : ''7za a -t7z -mx=9 -m0=LZMA:a=2:d=55m:fb=255:lc=8:lp=4:pb=4:mf=pat4h '' is a very simple line)

PAQ series is the one that true compression fans go for. It's headed by comp-sci people and fueled by challenge and fame. http://cs.fit.edu/~mmahoney/compression/

compression fan on February 6, 2010 10:03 PM

RAR is a poor choice, for a number of reasons:

1. It is inextricably bound to one supplier.
2. No source for using it in other applications.

If you want a good compression for archives, try tar+bzip2. The compression is slower but the files are ultrasmall. tar+gzip is the fastest combination, smaller than zip but fast enough to be usable.

Ramon on February 6, 2010 10:03 PM

Hi Guys! A nice post here! I like your comments!
I used all the compression formats available on the web because I have a 12kb upload speed, so for uploads I need to compress the data!
My hardware is a Acer Aspire One Netbook, 1.6ghz, 2gb Ram, And 160gb Harddisk.

I tested this on an image folder of 74.9 MB containing only JPEGs.
My own comparisons are:

The 7-Zip format is good but doesn't compress below 98 percent. It is very slow to compress but fast to decompress. All the other formats in 7-Zip are very slow. It gives the same compression as the WinZip legacy format. It depends on the hardware and on people's needs.

The Zip legacy format is fast but there is not much difference from the original file (70.4 MB).
The new Zipx format is fast and the compression is very good (68 MB).

The Stuffit SITX format is a memory eater, but its compression is second in the list to KGB Archiver (65 MB). I have used it on most files; something like 80 MB was reduced to approx. 50 MB.

WinRAR at best compression doesn't change the file much and can be compared to the Zip legacy format.

Now, my conclusion after this is that Zip legacy AND 7-Zip are the best for uploaders or any client or receiver, because the person receiving the files does not need to buy a WinRAR or Stuffit license! Use the 7-Zip archiver with the Zip format!

Nejispiral on May 5, 2011 8:45 AM
