Server Near-Death Experience…
What have you been doing for the past week, you might ask? Unfortunately, it hasn’t been anything even remotely fun or interesting.
My server died. And, of course, it’s not one of the minor servers, the few that are running web crawlers or seemingly nonsensical tests. Nope, the big one… The one I haven’t backed up entirely in almost six months… The one with all my financial data and web applications… Client data and business records… Custom source code for new applications and software license records… Email… Saved games… Website server logs… Legal correspondence…
Stop me when you get the picture.
Worse yet is that I had just powered everything down for no real reason other than to reset it and then to perform a comprehensive backup of everything — something I knew was long overdue. Of course, the server was working just fine until I powered it down.
The problem was that it wouldn’t power back *up* again!
The system: A low-end Pentium with 256MB RAM, a Promise UDMA controller, an old 4GB Western Digital drive, and two identical 60GB UDMA/100 drives; the rest of the hardware is irrelevant. The Red Hat 7.1 Linux operating system booted off the Western Digital and automatically mounted the two 60GB drives. One was a public file system that contained digital media such as the network jukebox and photo album; for security reasons, the other separately held all the non-public information.
The symptoms: Linux booted normally, but could not fully mount the file systems on either of the two 60GB drives. At startup, the Promise controller card recognized the drives less than half the time. Sporadically, it properly read one of the drives or the other between reboots.
Days One and Two: In an effort to recover as much data as possible at no expense and to reduce the unnecessary stress of needless interruptions, I shut off the phones and let the mail pile up. I tested and retested for about a day and a half in a desperate struggle to rescue the most vital pieces of data.
With the system as it was, the outlook wasn’t great, but not TOO bad.
Day Three: Promise Technology is not known for quality or reliability, and logic applied to the symptoms led me to the controller card as the culprit. As I did not want to replace the controller card with another that would likely fail again, AND I had been wanting to upgrade the antiquated Pentium Pro CPU anyway, I opted for a bigger overall upgrade. Failing power supplies can also cause weird symptoms in otherwise working parts.
Replacing those parts should work just fine!
A new Pentium IV 2.4GHz CPU (no point in paying up to ten times the price for one faster than 3.0GHz!) meant a new motherboard. CPU and motherboard bundled: $215. A new ATX-style motherboard meant a new power supply for $59. Fortunately, the case had a P4-ready backplane, so I could use the same awesome, ášš-kicking Antec case. Of course, the domino effect dictated that the 168-pin memory I had in the old system would have to be replaced with a newer 184-pin DDR DIMM. And, as luck would have it, I didn’t discover that until AFTER I’d already gotten back home. A quick (though it still felt needless) drive to Best Buy led to the great discovery of a 256MB DIMM on sale for $25! Total so far: $299.
The problem now was that, even after replacing almost everything else, neither drive worked!
Day Four: Still no clue.
Day Five: Woke up panicky when I realized that my QuickBooks business data would be difficult — if not impossible — to re-create. After about twenty bazillion reconfigurations and combinations of parts, I discovered that the drive that was previously installed as a slave on the primary IDE drive bus seemed to work fine as a master on the new bus. A breakthrough, albeit a confusing one. It was the digital media drive, but I backed it up anyway onto another computer.
Once the “good” old drive was backed up, I no longer cared about its future. So, with a Herculean effort, I carefully sacrificed parts from the working hard drive and installed them on the “bad” old drive. It fired right up, perfectly, which I never really expected. Success! I backed up all the old private data with absolutely no loss. Amazing, considering the circumstances!
Another trip to the electronics store resulted in the purchase of two identical 120GB drives for an astoundingly low total of $220. This time, I installed each drive as the master on its own channel, using the third and fourth on-board IDE channels. My mediocre luck continued as Red Hat 7.1 did not adequately support some of the new hardware and devices, so I opted for a brand-new installation of RH 9.0, configuring the two new drives as RAID-1 devices. Now, if one goes bad, the other kicks in with no data loss or downtime. Flawless installation…
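For the curious, a mirror only protects you if you notice when one half dies. Here is a minimal sketch of how that could be checked. It assumes Linux software RAID (md), which reports array status as text in /proc/mdstat; the setup described above may or may not have been built that way. The array name md0, the device names hde1 and hdg1, the sample status text, and the function degraded_arrays are all made up for illustration.

```python
import re

# Hypothetical /proc/mdstat contents for a healthy two-disk mirror.
HEALTHY = """Personalities : [raid1]
md0 : active raid1 hdg1[1] hde1[0]
      117220736 blocks [2/2] [UU]

unused devices: <none>
"""

# The same (hypothetical) array after one member has dropped out.
DEGRADED = """Personalities : [raid1]
md0 : active raid1 hde1[0]
      117220736 blocks [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that report a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # An array section starts with a line like "md0 : active raid1 ...".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line looks like "... blocks [2/2] [UU]":
        # [expected/active] followed by one U (up) or _ (missing) per member.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and status:
            expected = int(status.group(1))
            active = int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


print(degraded_arrays(HEALTHY))
print(degraded_arrays(DEGRADED))
```

On a live system the text would come from reading /proc/mdstat instead of the sample strings, and a check like this could run on a schedule so that a dead drive gets replaced before the second one goes too.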
Days Six and Seven: Additional backups, restoration, configuration, testing, reorganizing — the usual rigmarole.
Huge sigh of relief! And now I’m answering the phone again…