Tue - Sep 08, 2009 : 10:12 am
Hard Drive Crisis
Okay... So three days ago (that would be Saturday morning), I found that my server was having weird problems. I was getting an I/O error when I tried to start a movie for my daughter. Yeah, that can't be good. I'd seen that problem before when LVM got out of sync somehow (after about 6 months of uptime), so I decided to reboot it. Upon rebooting, I noticed the computer couldn't make it past the BIOS, and I heard a not-very-familiar, yet very-widely-known *click* sound coming from the server.
Yeah... I was getting the "click of death" from one of my hard drives.
Later that night, I found it was the newest Western Digital RE2 drive, which I had bought from PCClub a little over a year ago. Under normal circumstances, a hard drive going bad to the point that it's unreadable by the OS is a very bad thing.
In my situation, it would have been a *very* bad thing, because the broken drive was part of an LVM2 array which houses everything, including my movies, my music, and all the digital photography and videography of my family for the past 6 years. And according to what I've researched, one cannot restore a broken LVM2 array without all the drives being present. I soooooooo hope I'm wrong on this.
Anyway... All is not lost, because I keep a nightly backup of the entire array. So, I wasn't worried at all about the array being down. Even if I have to rebuild the whole array, I still have the data backed up on another computer.... Right?
Yeah... That was right... until yesterday.
Yesterday, I powered on my backup machine (which houses another LVM2 array which contains all the aforementioned backups), and it wouldn't get past the BIOS. Chills went immediately up my spine, all the way to the back of my head.
I rebooted, and this time, I could NOT believe what I heard. The backup server was - clicking.
This couldn't be happening.
Three days ago, I had ordered a new 640 gig WD Black hard drive to replace the one in my server, and, due to the Labor Day weekend, it won't be here for another two days, giving me a 5-day window to get a new hard drive, install it, and copy all the files over. FIVE DAYS!!!! That's all I needed.
I rebooted. ...nothing.
So, I opened the backup box to find the exact same 500 gig Western Digital RE2 drive, which I had bought the same day, from the same place, a little over a year ago. Yeah, it was dead too.
I spent hours last night googling options, from restoring partial LVM2 arrays, to reviving dead drives, to professional data restoration... Because now, I was up a creek. I mean, who could imagine that both the server and the backup server could go completely dead within a week? Oh, and get this... Both drives are warrantied until 2011 - THAT'S TWO YEARS FROM NOW!!! So, theoretically, neither of them should have died... Much less both of them, and even less within 3 days of each other. Talk about horrible luck.
I would have preferred that all 5 of the drives in my server explode into fireballs and physically melt my server than this. Gah....
So... The resolution is this:
I've heard from many different sources that you can temporarily revive a hard drive by putting it in the freezer, and then, when it's fully frozen, taking it out, connecting it, and doing your best to get all the data off it before it tanks again.... But I've never heard first-hand of this working. It's always been a guy who knew a guy.
Western Digital also makes a tool that can supposedly tell me why my drive is dead.
So, when I have my new 640GB drive in hand, I plan on running WD's tool to find out everything I can about the two broken drives, then calling WD support to find out as much as I can from them... Then, if no other solution presents itself, I'm gonna freeze the drive which I think is the least broken, and see if I can use the pvmove LVM2 command to migrate the stuff from the broken HD to the new HD. I only need this to work once - on ONE of the drives, and I'm back in business.
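For the curious, here's roughly the order of LVM2 commands I have in mind for that pvmove step, sketched as a little Python dry run that only prints the commands and never runs them. The volume group name (vg0) and the device paths are placeholders I made up for illustration, not my real setup, and the plan_pvmove helper is hypothetical. It also assumes the thawed drive stays readable long enough for pvmove to finish, which is a big if.

```python
def plan_pvmove(vg, failing_pv, new_pv):
    """Return, in order, the LVM2 commands to migrate extents off failing_pv.

    All arguments are placeholders supplied by the caller; nothing is executed.
    """
    return [
        ["pvcreate", new_pv],            # initialise the new drive as a physical volume
        ["vgextend", vg, new_pv],        # add it to the existing volume group
        ["pvmove", failing_pv, new_pv],  # move allocated extents off the failing drive
        ["vgreduce", vg, failing_pv],    # drop the emptied drive from the volume group
    ]


if __name__ == "__main__":
    # Hypothetical names: vg0, /dev/sdb1 (failing drive), /dev/sdc1 (new drive).
    for cmd in plan_pvmove("vg0", "/dev/sdb1", "/dev/sdc1"):
        print(" ".join(cmd))
```

The new drive has to be added to the volume group (pvcreate, then vgextend) before pvmove can use it as a destination; vgreduce at the end drops the old drive from the group once it's empty.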
If not, there's gonna be a whole lot of weeping, wailing and gnashing of teeth in the Jones household...
I'll keep you all posted on the results. My new drive is expected to arrive this Thursday.
If you know of any solution other than what I've stated, please comment. Also, if you know of any way to get a partial LVM2 array to assemble itself, please comment.
Both drives are completely dead, unable to be recognized at all by the BIOS or the OS - and I'm running Gentoo Linux on both servers.