Recovering Data from a Failed Synology NAS
by Ganesh T S on August 22, 2014 6:00 AM EST
It was bound to happen. After 4+ years of running multiple NAS units 24x7, I finally ended up in a situation that brought my data availability to a complete halt. Even though I perform a RAID rebuild as part of every NAS evaluation, I had never needed to do one in the course of regular usage. On two occasions (once with a Seagate Barracuda 1 TB drive in a Netgear NV+ v2 and another time with a Samsung Spinpoint 1 TB drive in a QNAP TS-659 Pro II), the NAS UI complained about increasing reallocated sector counts on the drive, and I promptly backed up the data and reinitialized the units with new drives.
Failure Symptoms
I woke up last Saturday morning to incessant beeping from the recently commissioned Synology DS414j. All four HDD lights were blinking furiously and the status light was glowing orange. The unit's web UI was inaccessible. Left with no other option, I powered down the unit with a long press of the front panel power button and restarted it. This time around, the web UI was accessible, but I was presented with the dreaded message that there were no hard drives in the unit.
The NAS, as per its intended usage scenario, had been only very lightly loaded in terms of network and disk traffic. However, in the interest of full disclosure, I have to note that the unit had been used against Synology's directions with reference to hot-swapping during the review process. The unit doesn't support hot-swap, but we tested it out and found that it worked. However, the drives that were used for long term testing were never hot-swapped.
Data Availability at Stake
In my original DS414j review, I had indicated its suitability as a backup NAS. After prolonged usage, it was re-purposed slightly. The Cloud Station and related packages were uninstalled as they simply refused to let the disks go to sleep. However, I created a shared folder for storing data and mapped it on a Windows 8.1 VM in the QNAP TS-451 NAS (that is currently under evaluation). By configuring that shared folder as the local path for QSync (QNAP's Dropbox-like package), I intended to get any data uploaded to the DS414j's shared folder backed up in real time to the QNAP AT-TS-451's QSync folder (and vice-versa). The net result was that I was expecting data to be backed up irrespective of whether I uploaded it to the TS-451 or the DS414j. Almost all the data I was storing on the NAS units at that time was being generated by benchmark runs for various reviews in progress.
My first task after seeing the 'hard disk not present' message on the DS414j web page was to ensure that my data backup was up to date on the QNAP TS-451. I had copied over some results to the DS414j on Friday afternoon, but, to my consternation, I found that QSync had failed me. The updates that had occurred in the mapped Samba share had not propagated to the QSync folder on the TS-451 (the last version seemed to be from Thursday night, which leads me to suspect that QSync either wasn't doing real-time monitoring / updates or wasn't recognizing updates made to a monitored folder from another machine). In any case, I had apparently lost a day's work (machine time, mainly) worth of data.
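A sync gap like this can be caught before a failure by periodically comparing the two folders. The following is only a minimal sketch, not anything QSync or DSM provides: it assumes both shares are mounted as ordinary directories on one machine, and the function name `find_stale_files` and the two-second `tolerance_s` (to absorb timestamp rounding between filesystems) are hypothetical choices made for illustration.

```python
import os
from pathlib import Path


def find_stale_files(source_root, backup_root, tolerance_s=2.0):
    """List files under source_root that the backup lacks or holds an older copy of.

    Returns a sorted list of (relative_path, reason) tuples, where reason is
    "missing" or "outdated". Only modification times are compared; file
    contents are not read. All names here are illustrative assumptions.
    """
    source_root = Path(source_root)
    backup_root = Path(backup_root)
    stale = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = Path(dirpath) / name
            rel = src.relative_to(source_root)
            dst = backup_root / rel
            if not dst.is_file():
                # The backup never received this file.
                stale.append((rel.as_posix(), "missing"))
            elif src.stat().st_mtime - dst.stat().st_mtime > tolerance_s:
                # The source copy is newer than the backup copy.
                stale.append((rel.as_posix(), "outdated"))
    return sorted(stale)
```

Calling `find_stale_files` with the mount point of the DS414j shared folder as the first argument and the QSync folder as the second would return an empty list when the backup is current; any entries it returns are files that would be lost if the source failed at that moment.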
55 Comments
Impulses - Friday, August 22, 2014
While I generally agree with your logic (having never given my desktop up as my primary system, and being single), saying "just plug the laptop in via TB" or whatever isn't exactly a viable alternative for many users.
I don't own a NAS, but it seems to me the biggest market for units is laptop-dependent and/or multi-user households... When you have a couple and possibly kids each with their own laptop it's much easier to have a centralized media store in a NAS than anything directly attached.
Gigaplex - Saturday, August 23, 2014
"A consumer can just buy high availability as a service (such as from Amazon services)"
Not on ADSL2+ when dealing with multiple TBs of data I can't.
wintermute000 - Saturday, August 23, 2014
QFTW
HangFire - Friday, August 22, 2014
Losing a day's work is considered acceptable in most environments. In theory yesterday (or last Friday) is fresh in everyone's mind, and the raw source material (emails, experimental data, FAXes, etc.) is still available in its original form to redo any data entry.
What is interesting about the QSync situation is the cascading effect of failures. If caught early, through log examination, dashboards, whatever, true disaster can be averted. If minor issues like sync fails are allowed to continue, and RAID failure follows, say, a month later, then a month's worth of work can be lost. That is not acceptable in any environment.
Kougar - Friday, August 22, 2014
Thanks for the article! I have a ~6 year old TS-409 Pro that is still running great, but internal component failure has been on my mind for a while now. I'll be bookmarking this in case I ever need to use recovery options on it, as I wasn't aware of either of these tools!
kmmatney - Friday, August 22, 2014
Nice article! I use a WHS 2011 server, with StableBit DrivePool for redundancy. The nice thing about DrivePool is that the drives are kept in standard NTFS format. You can just take the drive out, and plug it into any computer to retrieve files, so data recovery is a piece of cake.
Impulses - Friday, August 22, 2014
Shame WHS is now RIP
DanNeely - Friday, August 22, 2014
Yeah. I'm really hoping ZFS or Btrfs NASes (without huge price premiums) will be available in the next year and a half as reasonable replacements for my current WHS 2011 box.
Impulses - Friday, August 22, 2014
That'd be nice, I never bought my own but I recommended and set up several for various clients & family members with small businesses...
No clue what I'd tell them to migrate to right now if one were to break down, the ease of recovery and expansion was one of the biggest draws to WHS and in fact the reason many picked it over cheaper NAS boxes.
Gigaplex - Saturday, August 23, 2014
There's a reason it was killed off. It has some serious design flaws that trigger data corruption, and Microsoft couldn't figure out how to resolve them. It has great flexibility but I wouldn't trust it with my data.