May 2, 2011
As an early advocate of solid state hard drives …
… I feel ethically and morally obligated to let you in on a dirty little secret I've discovered in the last two years of full time SSD ownership. Solid state hard drives fail. A lot. And not just any fail. I'm talking about catastrophic, oh-my-God-what-just-happened-to-all-my-data instant gigafail. It's not pretty.
I bought a set of three Crucial 128 GB SSDs in October 2009 for the original two members of the Stack Overflow team plus myself. As of last month, two out of three of those had failed. And just the other day I was chatting with Joel on the podcast (yep, it's back), and he casually mentioned to me that the Intel SSD in his Thinkpad, which was purchased roughly around the same time as ours, had also failed.
Portman Wills, friend of the company and generally awesome guy, has a far scarier tale to tell. He got infected with the SSD religion based on my original 2009 blog post, and he went all in. He purchased eight SSDs over the last two years … and all of them failed. The tale of the tape is frankly a little terrifying:
- Super Talent 32 GB SSD, failed after 137 days
- OCZ Vertex 1 250 GB SSD, failed after 512 days
- G.Skill 64 GB SSD, failed after 251 days
- G.Skill 64 GB SSD, failed after 276 days
- Crucial 64 GB SSD, failed after 350 days
- OCZ Agility 60 GB SSD, failed after 72 days
- Intel X25-M 80 GB SSD, failed after 15 days
- Intel X25-M 80 GB SSD, failed after 206 days
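For the curious, here is a quick back-of-the-envelope sketch in Python that summarizes the lifetimes in the list above. The variable names are mine, and eight drives from one buyer is an anecdote, not a failure-rate study, so treat the result as a rough feel for the numbers and nothing more.

```python
from statistics import mean, median

# (drive, days until failure), copied from the list above.
failures = [
    ("Super Talent 32 GB", 137),
    ("OCZ Vertex 1 250 GB", 512),
    ("G.Skill 64 GB", 251),
    ("G.Skill 64 GB", 276),
    ("Crucial 64 GB", 350),
    ("OCZ Agility 60 GB", 72),
    ("Intel X25-M 80 GB", 15),
    ("Intel X25-M 80 GB", 206),
]

days = [lifetime for _, lifetime in failures]

average_days = mean(days)    # arithmetic mean of the eight lifetimes
median_days = median(days)   # middle value, less sensitive to the 15-day outlier

print(f"average lifetime: {average_days:.1f} days")
print(f"median lifetime:  {median_days:.1f} days")
print(f"shortest: {min(days)} days, longest: {max(days)} days")
```

On this small sample both the average and the median land between seven and eight months.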
You might think after this I'd be swearing off SSDs as unstable, unreliable technology. Particularly since I am the world's foremost expert on backups.
Well, you'd be wrong. I just went out and bought myself a hot new OCZ Vertex 3 SSD, the clear winner of the latest generation of SSDs to arrive this year. Storage Review calls it "the fastest SATA SSD we've seen".
Beta firmware or not though, the Vertex 3 is a scorcher. We'll get into the details later in the review, but our numbers show it as clearly the fastest SATA SSD to hit our bench.
While that shouldn't be entirely surprising, it's not just faster like, "Woo, it edged out the prior generation SF-1200 SSDs, yeah!" It's faster like, "Holy @% that's fast," boasting 69% faster results in some of our real-world tests.
Solid state hard drives are so freaking amazing performance wise, and the experience you will have with them is so transformative, that I don't even care if they fail every 12 months on average! I can't imagine using a computer without a SSD any more; it'd be like going back to dial-up internet or 13" CRTs or single button mice. Over my dead body, man!
It may seem irrational, but … well, I believe the phenomenon was explained best on the television show How I Met Your Mother by Barney Stinson, a character played brilliantly by geek favorite Neil Patrick Harris:
Barney: There's no way she's above the line on the 'hot/crazy' scale.
Ted: She's not even on the 'hot/crazy' scale; she's just hot.
Robin: Wait, 'hot/crazy' scale?
Barney: Let me illustrate!
Barney: A girl is allowed to be crazy as long as she is equally hot. Thus, if she's this crazy, she has to be this hot. You want the girl to be above this line. Also known as the 'Vickie Mendoza Diagonal'. This girl I dated. She played jump rope with that line. She'd shave her head, then lose 10 pounds. She'd stab me with a fork, then get a boob job. [pause] I should give her a call.
Thing is, SSDs are so scorching hot that I'm willing to put up with their craziness. Consider that just in the last two years, their performance has doubled. Doubled! And the latest, fastest SSDs can even saturate existing SATA interfaces; they need brand new 6 Gbps interfaces to fully strut their stuff. No CPU or memory upgrade can come close to touching that kind of real world performance increase.
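To put a rough number on "saturate", here is a minimal sketch of the interface ceiling. It assumes the "existing" interfaces are 3 Gbps SATA and relies on SATA's 8b/10b encoding, where every 8 bits of data travel as 10 bits on the wire. The function name is mine, and real-world throughput will be somewhat lower once protocol overhead is counted.

```python
def usable_megabytes_per_second(line_rate_gbps: int) -> float:
    """Payload ceiling of a SATA link, ignoring protocol overhead."""
    line_rate_megabits = line_rate_gbps * 1000   # Gbps -> Mbps
    payload_megabits = line_rate_megabits * 8 / 10  # 8b/10b: 8 data bits per 10 wire bits
    return payload_megabits / 8                  # bits -> bytes

for rate in (3, 6):
    print(f"SATA {rate} Gbps: about {usable_megabytes_per_second(rate):.0f} MB/s usable")
```

So a drive that can sustain more than roughly 300 MB/s is held back by the older interface, which is why the newest drives want the 6 Gbps ports.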
Just make sure you have a good backup plan if you're running on a SSD. I do hope they iron out the reliability kinks in the next 2 generations … but I've spent the last two months checking out the hot/crazy solid state drive scale in excruciating detail, and trust me, you want one of these new Vertex 3 SSDs right now.
Posted by Jeff Atwood
I've had one 80GB Intel X25-M for over two years, another for almost two years, with no problems with either. One drive was used in a MacBook for most of its life; the other was used first inside a Mac Pro, then in a FW800 enclosure with an i7 iMac. I'm sure I've swapped them once or twice since I sold the MacBook and one of the drives became a backup. These drives have generally been used as system drives, holding only the base OS and Applications, with user data on a separate drive.
Once I experienced the performance difference in the MacBook, I couldn't stand using a platter HD for my OS in any machine going forward. Discovering that I still got most of the performance benefits (zero latency, random read performance, etc) via FireWire, it allowed me to upgrade from an old Mac Pro to a much faster i7 iMac, for less than a comparable new Mac Pro, and get a 'free' 27" display.
I have been pondering the whole hot/crazy scale for some time now, and I realized the reality of it. It depends on the type of crazy. If she occasionally runs down the hall screaming Banzai!!! or wears weird socks, we are good. If she comes after me with a chainsaw, there is no level of hot that is good enough.
Likewise, if the drive occasionally runs really slow, or refuses to boot, but never loses my data, we might be good, if it's hot enough. But losing all of my data is not an acceptable trade.
Case in point: my phone occasionally starts to lag a bit, so I have to hit it with a task killer. Depending on how nasty of a task killer I have to use in order to get the resources freed up, sometimes I need to reboot before the USB connection to the PC (rarely used) works again. However, it is so neato that I am OK with this trade off. If it regularly dropped calls or just randomly destroyed all of my data, I would junk it.
Thanks for telling the truth about SSD! I'm also a tortured SSD-addict. They told me: you can't use Sleep mode in Windows, it will kill your SSD. I used it in a Dell Vostro 3700.
OCZ Vertex 2 60 GB, failed after 65 days
Nice writeup, but I have to stray off topic for a bit and mention that your overuse/misuse of the word "crazy" is needless and, to be quite honest, offensive.
So as not to derail this topic any further, I urge anyone who is interested in learning more to please read this writeup:
The English language is full of very descriptive words that would be perfect alternatives for conveying the thrust of your argument without resorting to what basically amounts to a slur.
How large are your builds? 4GB of RAM is $40. Building off a RAM disk is going to be quite a bit faster than an SSD, for 1/10 the cost and much less evident risk.
@Alexei You're dead on. Every application opens faster. Every build is faster. Every run of my full test suite is faster (we got a bump from 30 minutes to 5 minutes on a big suite that we were running every few minutes).
I currently have a ton of SSDs on various machines that I manage/own. The oldest, at two years old, appeared to have trouble when Windows died during an update, but none of my data was lost and I reformatted and kept going just fine (replacing the primary drive with a second-gen SSD that was bigger).
Of these drives, I have two 60GB OCZ Vertex at 3 and 2.5 years. Both have seen some heavy development/test use. I have a 100GB OCZ Vertex/Agility (not sure which) running as a repo server (that is backed up onto platter) that sees heavy read and write, and it is about 2-2.5 years old. I have an OCZ Vertex 60GB running in a laptop that sees not much usage, save when I travel. I have an OCZ Agility 120GB (maybe a second gen, don't recall) running on a server hosting a large webapp (which has super light traffic and everything is in memory anyway) for over a year (traffic picking up, knock on wood). I have another 120GB OCZ Agility running on a backup server for 3 months that has not much usage. Yet another OCZ 100GB is running as the main drive on a server that hosts a few small web applications, email for a few dozen clients and a boatload of static web pages, but it's not moved into production (it is one year old; my old platter-based system partially croaked and I'm migrating it now).
I also have 3 large servers running RAIDed Intel SSDs (not sure which). Oddly, the OCZ drives were awful in this setup, and the Intels performed several times better. These servers are over a year old and one of them sees a ton of traffic and has regular small, manual writes (backup dumps to SSD because it's fast and I want little/no downtime, then copied to platter and erased).
Also, when I game it's nuts. The performance boost of a faster disk is the biggest speed jump I've seen since I moved from a 486 to a Pentium. I missed the hints in Fallout 3 because the load screen blinked by too fast (glad I read online about the Pip-Boy light). Jumping to a different solar system in EVE is hugely faster as it loads everything almost immediately (even when I jump into Jita at prime time). I easily save a couple hours a day, whether I'm at work or at play.
There are a fair number of little netbooks with SSD drives (my brother in law has a 3 or 4 year old one running Linux).
You'd think we'd hear about those SSDs failing. Any idea if their failure rate is any different? Maybe the usage pattern is different?
This column made me think about my old IBM Deskstar . . . that actually works.
Just Googling the Vertex 3 reliability after I've just had mine die after 4 days.
Bought on the 24th...
RMA'd for a refund this morning (died overnight).
Will go for an Intel Elmcrest instead - Sure - not as fast, but seemingly more reliable.
Just to pass it on to you, if not other readers.
Pretty annoyed at the reliability even with the latest firmware due to all the time lost in installing everything...
The problem with a situation in which you expect your main hard drives to fail this often is that you can't just rely on a backup strategy.
By switching to SSD, you're taking on more risk.
If BACKUPS are for extreme rare cases, then your backups to your failing SSDs are no longer true BACKUPS -- they've become data that you're likely to use, not just in extreme rare case.
So you need BACKUPS to your backups, i.e. you need to have two different backup strategies: two different softwares, two different channels, two different target media/hosts.
Otherwise, you've just raised your risk level across the board.
1) Most or all of your failure data is based on Gen1 MLC SSD's. scary for people yaknow
2) most of the replies saying their ssd's are fine are people with gen2 mlc drives
3) This is sparta ! (err gen3)
4) Even if SSD's rock, they're currently way overpriced for any inclusion in any "normal" config i.e. the pc for mr. youtube-facebook-mail (about 94% of the pc users)
5) Even if SSD's fail like hell, their IOPS numbers are just madness for either virtualization or databases. Anyone saying their reliability is an issue in the matter clearly doesn't know how to setup a raid.
We use the Samsung 256 GB SSD that came from Dell and I am over 2 years without a problem. Love it and would never go back.
RAID of Drives obviously this is only feasible for a desktop. Well you may want to look at : http://www.wiebetech.com/products/ToughTech-Duo-QR.php
Hard drives used to make floppy disks seem reliable. In particular the HD's with removable platters were really unreliable. They would fail all the time.
SSD's will hopefully improve given some time.
Performance has become a key requirement - Solid State Disks make using a PC a joy - it requires a new concept in computing and you have nailed it well here. It makes your PC more like the old days of dumb terminals connected to mainframes. Fast - but the data is held elsewhere. It's a good technique to store your data away from your OS and apps - let alone up in the cloud which has a lot of buzz about it right now.
Got my triple-SSD raid0 (192GB) Sony Z11 March, 2010. What's the date now? 18 months ...
I do back up the important stuff to a 2TB drive, but still going strong with no issues.
I have been considering an ssd for a few months, waiting for the magic 120gb under £100.....
my pc seemed pretty sluggish and I knew that an ssd would make it much faster.
deep down I also knew my pc wasn't sluggish when I got it.
then my SEAGATE HDD popped its clogs. pushed up daisies. fell off the perch.
my data was safe in another partition on the same disk, but backed up.
except it wasn't.
so windows users who stick their data on another drive and don't worry about the ssd failing hopefully have a back up of the users folders.
anyway, a new magnetic drive is installed. new install of windows. all clean and tidy.
and i don't need an ssd for speed anymore.
I've had 4 out of 13 OCZ SSDs fail within a few months of purchase. This sure demonstrates the need to keep relatively recent images of every SSD boot drive. Your post was especially disconcerting because I had figured this high failure rate was unique to OCZ. I've since switched to Crucial SSDs. Only time will tell.
Had an OCZ Vertex 2 60GB fail on me after 6 months and now the replacement OCZ Vertex 3 60GB failed after about 9 days !!!
This article is really great, and I think SSDs have some other problems; they need a warranty from the company or extra checking by the company. You can use other disks for good results. Thanks for the good article and all the commentary. :-(
@Thibaut on May 2, 2011 4:42 PM
@Flori on May 3, 2011 5:52 AM
Thanks for the thoughts about the impact on the environment, it seems that there's still a lack of concern on this.
My Intel X25 160GB HD failed today. It went into service in my daily use laptop on or about June 15th, 2010. Today is December 27, 2011. It first failed on December 15th, 2011. 17 months.
INTEL part number SSDSA2MH160G2R5 R.
It held my Linux OS and home directory the entire time.
The first sign was a suspicious boot failure.
I immediately performed a full backup which copied 69 of 70 GB.
It continued to mount, so I ran numerous high and low level tests on it including chkdsk, bad block, various SMART checks, including the BIOS test in the laptop. All tools said the drive was fine.
A couple days later it would not mount at all. The partition table was totally wiped.
The BIOS check said it was fine even after the partition table was wiped and it wouldn't mount.
I just ordered a new Dell XPS17 because of this incident. Not sure the drive controller in my old HP laptop wasn't part of the problem.
I'm not sure what I will do for an OS drive in my new laptop. I find it hard to believe that SSDs are that much on the crazy side of the crazy-hot line, but then I just about lost a lot of data with ZERO warning.
I'll be running a RAID 1 setup using the eSATA port on the XPS17 !
Hope this helps someone.
my Acer Aspire One netbook fails, from one day to the next. I tried reinstalling the OS, but nothing works.
Took my pathetically bricked OCZ Vertex 2 80Gb SSD (Win 7 64 bit OS) and smashed it to bits with a hammer. Cut two of my fingers to ribbons in the process and/but it sure felt good. (Every so often you've just got to remind i.t who the boss is. Know what I mean?) Thing is I'm still considering buying one of the newer models; once you taste that delicious boot-up speed it's hard to go back.
A little of that rebellious Luddite spirit goes a long way,
"oh-my-God-what-just-happened-to-all-my-data instant gigafail"
LOL that was epic !!!