SUMMARY: doing the impossible data recovery

From: Paul Caskey (pcaskey@bassetbyte.com)
Date: Fri Dec 27 1996 - 12:50:29 CST


My question was:

-------------------- Begin Original Post --------------------
I've got a user who managed to blow away his entire home directory *and*
the online backup copy he had of it. The tape backup doesn't have his
critical files because of a permission problem. (My fault and his.)
He's hurtin'. Like 6-12 months of lost software development work.

On his online backup, he blew it away with an 'rm -r' on a local Solaris
2.5 file system. He hasn't touched that disk, since. It hasn't been
unmounted. Is there any way in hell of recovering those files, even if
all the file names are lost? I know this is a reach but I have to try.
What if I use dd and extract the raw data off the entire disk/partition,
dump it into a giant file, and sort through it, later? Any better
ideas? I think I've heard of places you can send disks to for emergency
recoveries like this ... ?
-------------------- End Original Post --------------------

I will append all the responses I got to the end of this message. But
here's a summary of what happened:

The #1 place people pointed me to for data recovery was OnTrack Data
Recovery (800-752-7557), so I gave them a call, before I did anything.
They said that chances of recovery from a UNIX "rm -r" were very slim,
but it might be worth a shot. After further discussion and a question
directed at one of their engineers, their response was that all they
could give me would be an ASCII dump of all the data on the disk, for me
to sort through. I could do that, myself, so they said to go for it.

Things looked pretty grim. I went over some possible options with the
user. The data that had to be recovered was a bunch of C++ source
files. I asked him if he had any printouts. No. Had he emailed any of
it to his peers for review? No. Had he made a distribution tape and
sent that off to a customer? No. Was an [old] copy of any of it on
another disk or server from any type of backup or cache or anything? No.

This all started when he went to make his own backup of his home
directory to his backup disk, "/home1". He would go into the Sun File
Manager and drag-n-drop his home directory over to /home1. But since he
also uses a PC, and can never remember which one moves/copies
with/without the "Control" key held down, he managed to MOVE his home
directory instead of COPY it, this time.

Well, when he went over to /home1, he noticed that the backups he had
been making this way for months had been stacking up on one another.
Say his home dir name was "someuser". /home/someuser was what he was
copying. What he had was /home1/someuser/someuser/someuser... each one
had its own subdirectory tree containing older and older backups.

He saw that mess, thought to himself, "This is ugly! I don't need all
these backups!" And proceeded to remove all the copies through File
Manager. It wouldn't let him, or it asked him too many confirmations,
so he got sick of that and went to the shell and did the dreaded
"rm -rf /home1/someuser". Poof. Then he went to make a second, clean
backup copy of /home/someuser, only to discover it was gone -- since it
had been moved instead of copied.

He panicked, but figured he would be saved by the tape backup system
that ran every night. But alas, when I restored from tape, we
discovered that most of the important stuff wasn't there! This requires
a diversion in the story, so bear with me. When I set up this
department's backup system, we had a long meeting to go over the
convenience/security aspects, and they [including this user!] decided it
was important to be able to easily EXCLUDE things from being backed up,
because at least two people on the network had sensitive projects they
were backing up themselves, and they didn't want those things ending up
on an easily-stealable tape down the hall. Also, having /.rhosts was
too insecure. And we had tried some fancier backup packages that used
their own net protocol, and hated them all. (That's yet another story.)

So the solution we all agreed upon was to use NFS. Not blazing fast,
but not too bad for doing reads. (NFS writes are what's really slow.)
Each client on the network exported all its partitions to the backup
server, with standard "ro" (read only) privileges. A little more
secure than /.rhosts. I wrote a script on the backup server that, for
each client partition, mounted it, backed it up, and unmounted it. Easy
enough. This was all done as root, but of course root doesn't have
special read privileges on NFS mounts unless that's explicitly allowed.
Which it wasn't. So the caveat I drilled into their heads was that THEY
HAD TO HAVE DIRS AND FILES WORLD READABLE IN ORDER FOR THEM TO BE BACKED
UP. They all said they understood. I felt nervous.

My backup script happens to print an error for every file it hits "permission
denied" on. So one of the first things I intended to do was send each
user an email message containing a list of all the dirs/files that were
not being backed up, for them to double check. But I got sidetracked
and never did this. My bad. Adding to the regret factor was that just
days before this whole disaster happened, I also intended to change the
"ro" to "root" in /etc/dfs/dfstab everywhere, because this whole
situation of needing to exclude certain data had vanished. But, alas, I
was too late.
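
The per-user report described above could be produced by something as
small as the following Python sketch. It is not the backup script from
this post: the function name not_world_readable is made up for
illustration, and it assumes ordinary POSIX permission bits. It checks
only the "other" read bit (plus search permission on directories),
which is what a non-root NFS client needs in order to read a file.

```python
import os
import stat


def not_world_readable(root):
    """Return paths under root that lack world-read permission.

    Hypothetical helper: directories must also be world-searchable,
    otherwise nothing beneath them can be reached by the backup.
    """
    skipped = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # permission bits on symlinks are not meaningful
            needed = stat.S_IROTH
            if stat.S_ISDIR(mode):
                needed |= stat.S_IXOTH
            if mode & needed != needed:
                skipped.append(path)
    return sorted(skipped)
```

Files inside a flagged directory are not listed individually when
their own bits look fine, so a flagged directory should be read as
"everything below this is missing from the tape".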

Anyway, as you've guessed by now, this user's critical C++ source dirs
were not world readable. So they weren't on the backup tape.

Our only saving grace was that when this user did the massive delete, he
had a bunch of text editors open on his screen, each one with one of his
crucial C++ files in it which he was working on. He was bright enough
to "Save As" those files to a safe place, not on /home1, and that saved
his bacon. He estimated that alone bumped the lost work down from 12
months to 2 months. Now it was up to us to pull as much as possible
from the unwiped data on the /home1 disk. That was his other saving
grace -- that /home1 disk was inactive, and with four to five copies of
every file (remember the nested backups) on there, we had a good chance
of pulling something useful off it.

By the time I got to the user's machine, the disk in question had been
mounted for over 24 hours, and he had been all over it (with reads
only). Some people [see below] told me to power off the disk, and
others said to unmount it as normal. I figured at this point, I should
unmount it because it had been on so long, anyway, and I didn't want
fsck running on it during reboot, which might do god-knows-what to the
critical bits on that disk.

So I umount'd the disk, and went to work. It's a 1 GB disk, one big
partition. I did 'dd if=/dev/rdsk/c0t1d0s2 of=/remov/home1.raw' (/remov
is a 4GB removable SCSI-2 hard drive with nothing on it.) This is on a
Sun Ultra, and it took an hour! My next step was to 'split -b' the .raw
file into 64MB chunks, a more manageable size. This split command was
going to take an hour, also! I couldn't understand why it was going so
slow (around 300 KB/sec for local disk access?!?). For kicks, I ftp'd
the .raw file over to an SGI, and did the split there. Ran 10 TIMES as
fast, finishing in 7 minutes. But then my next step, which was to run
"strings" on each piece, started taking forever on the SGI (a new
Indigo2)! So I hauled it back over to the Ultra, which blitzed through
the "strings" process in 10 minutes or less. Go figure.
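
For readers without dd, split and strings to hand, the split-then-
extract step can be approximated in Python. This is a rough sketch
under stated assumptions, not what was run here: split_image and
extract_strings are invented names, the 4-character minimum mirrors
the usual strings(1) default, and tabs are deliberately excluded to
match the behaviour described in this post (other implementations of
strings may treat tabs differently).

```python
import re

# Runs of 4 or more printable ASCII characters (space through tilde).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def split_image(image_path, chunk_size=64 * 1024 * 1024):
    """Split a raw image into numbered chunk files, in the spirit of split -b."""
    paths = []
    with open(image_path, "rb") as image:
        index = 0
        while True:
            data = image.read(chunk_size)
            if not data:
                break
            path = f"{image_path}.{index:03d}"
            with open(path, "wb") as out:
                out.write(data)
            paths.append(path)
            index += 1
    return paths


def extract_strings(chunk_path):
    """Return the printable runs found in one chunk file, in disk order."""
    with open(chunk_path, "rb") as chunk:
        data = chunk.read()
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]
```

A run of text that straddles a chunk boundary comes out as two pieces,
one at the end of a chunk and one at the start of the next, which is
also true of running strings on split output.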

What we ended up with was 16 files of 12MB to 40MB each, full of lines
of text to wade through. I opened emacs full screen and we went for it.
The key thing to search for turned out to be "//", which hit the
occasional URL, but also hit all C++ comments. This user had a habit of
putting the file name in a comment as the first line of every file,
which turned out to be very helpful. It gave us a BOF marker as well as
a filename. We found several of his files fairly soon, and a Makefile
or two. Cluttering the mess was a huge source tree for a CAD package he
has a source license for, which is also all in C++. But almost all of
the CAD source had TABS for indentation, and he used spaces. The tabs got
filtered out by "strings", so all the CAD source stuck out because it
was all crammed up against the left margin. This made it much easier to
human-scan for his source, because it looked so different.
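
The hunt for those first-line comments was done by eye in emacs, but
it can be sketched as a small filter over the "strings" output. The
exact comment layout and the file extensions are not given in this
post, so the pattern below is a guess (a line such as "// widget.cc"),
and find_file_markers is a made-up name.

```python
import re

# Assumed marker: a comment at the start of a line whose first word looks
# like a C++ source or header file name. Extensions are illustrative only.
MARKER = re.compile(r"^\s*//\s*(\S+\.(?:cc|cpp|cxx|C|h|hpp))\b")


def find_file_markers(lines):
    """Yield (line_number, file_name) for comment lines that name a source file."""
    for number, line in enumerate(lines, start=1):
        match = MARKER.match(line)
        if match:
            yield number, match.group(1)
```

Because the pattern is anchored at the start of the line, URLs such as
http://... in the middle of a line are not reported, which removes one
source of false hits from a plain search for "//".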

I believe the Solaris/SYSV block size is 8KB, which is pretty big
compared to source code. So we got several complete files. There was
also a huge swap file on that disk which added to the clutter. (I now
want to learn how to fill a swap file full of zeros, which could have
been done before the raw dump.) We spent three 4-hour afternoons, a
total of 12 long hours. When we were all done, he estimated he was down
to 2 weeks or less of lost work. We saved the "strings" files for him
to use later if he wanted to search for something explicitly that we
might have missed.

I came out of this with a much clearer understanding of just how much
data 1 gigabyte is. And an even stronger appreciation for the
importance of good backups. I quickly did the "ro-->root" change
everywhere so the backups are working properly, now. Life goes on.

-------------------------------------------------------------------------
Paul Caskey mailto:pcaskey@swcp.com http://www.swcp.com/pcaskey
-------------------------------------------------------------------------
"Even if you're on the right track,
 you'll get run over if you just sit there." --Will Rogers (1879-1935)

---------------------- Responses Follow [Edited] ----------------------

>From bonzo@swcp.com Wed Dec 18 17:19 MST 1996
Date: Wed, 18 Dec 1996 16:19:21 -0800
From: Bonzo Amin <bonzo@swcp.com>
Subject: Re: doing the impossible data recovery

Paul Caskey wrote:
>
> I've got a user who managed to blow away his entire home directory *and*
> the online backup copy he had of it. The tape backup doesn't have his
> critical files because of a permission problem. (My fault and his.)
> He's hurtin'. Like 6-12 months of lost software development work.

Ow ow ow ow ow ow ow!!!

> On his online backup, he blew it away with an 'rm -r' on a local Solaris
> 2.5 file system. He hasn't touched that disk, since. It hasn't been
> unmounted. Is there any way in hell of recovering those files, even if
> all the file names are lost? I know this is a reach but I have to try.
> What if I use dd and extract the raw data off the entire disk/partition,
> dump it into a giant file, and sort through it, later? Any better
> ideas?

If I were considering such things, I'd probably power off the disk first
and think about umounting it or shutting the system down later. In
reality, that probably wouldn't matter much because the system probably
synced that disk as much as it's going to no later than fifteen seconds
after the damage was done. In any case, you should probably take that
disk offline or at least mount -r -o remount as soon as possible if you
want to recover anything from it. I'd be thoroughly paranoid of some
random background process scribbling some trifle on the disk and blowing
the whole gig.

I think there used to be tools out there like fsdb or somesuch that
would help you pick through disk internals, but I've never messed with
them. I don't know if any of them know how to deal with UFS disks, but
it's probably worth a look. I wish you could get fsck to do something
useful in this case.

> I think I've heard of places you can send disks to for emergency
> recoveries like this ... ?

I've seen them listed in the Computer Shopper, but haven't heard
anything about any of them. Chances are Excite will find you at least
a couple. There's probably quite a few around, but at this point in
time most of them probably specialize in Novell server disks. If you
can find one to do the job, it'll probably save you a lot of personal
grief in that you won't have to pull your hair out trying to learn file
system internals while under fire.

Good luck. These things happen, but they suck.

(:.:)

>From janice@pinata.West.Sun.COM Wed Dec 18 17:20 MST 1996
Date: Wed, 18 Dec 1996 17:18:38 -0700 (MST)
From: "Janice Anthes [Sun New Mexico SE]" <janice@pinata.West.Sun.COM>
Reply-To: "Janice Anthes [Sun New Mexico SE]" <janice@pinata.West.Sun.COM>
Subject: Re: doing the impossible data recovery
To: pcaskey@bassetbyte.com

Before you panic. Go to Sun's online Catalyst
catalog. http://cataylst.sun.com
and do a search on Data Recovery.

I found several companies who claim they provide this
service.

Good luck.

Janice

=============================================
       Janice Anthes, Systems Engineer
         janice.anthes@west.sun.COM
           Albuquerque, New Mexico
Phone: (505) 262-5204 FAX: (505) 268-5264
=============================================
        

>From mstier@mindspring.com Wed Dec 18 18:07 MST 1996
X-Sender: mstier@mindspring.com
Mime-Version: 1.0
Date: Wed, 18 Dec 1996 20:09:04 -0500
To: pcaskey@bassetbyte.com (Paul Caskey)
From: Matthew Stier <mstier@mindspring.com>
Subject: Re: doing the impossible data recovery
Content-Type: text/plain; charset="us-ascii"
Content-Length: 1252

DD'ing seems to be the only solution. However be prepared to do a LOT of
scavenging.

--
Matthew Stier
mstier@mindspring.com
http://www.mindspring.com/~mstier

>From Ian_MacPhedran@mackenzie.usask.ca Wed Dec 18 18:57 MST 1996
X-Authentication-Warning: imhotep.USask.Ca: macphed owned process doing -bs
Date: Wed, 18 Dec 1996 19:57:22 -0600 (CST)
From: Ian MacPhedran <Ian_MacPhedran@mackenzie.usask.ca>
X-Sender: macphed@imhotep
To: Paul Caskey <pcaskey@bassetbyte.com>
Subject: Re: doing the impossible data recovery
In-Reply-To: <199612182345.QAA17145@sherlock.cmc.sandia.gov>

Well, if no one else is writing to that disk, it may still be okay. (It might have been better to dismount it immediately to protect the data on it.)

You can try to sort out stuff from a copy of the raw disk, it might work for you.

As you say, there are places which do this sort of thing - you might
want to do a search through the web:

http://www.datarec.com/ - Data Recovery Labs
http://www.cbltech.com/ - CBL Data Recovery Technologies
http://www.vantagetech.com/ - VANTAGE Technologies, Inc
http://www.mind.net/adr/adr.htm - Advanced Data Recovery Inc
etc.

(Note: I have not used the services of any of these vendors - I just did
a search, and these came back as ones offering this service. You don't
say where you are located - you will want to find a vendor close to
you.)

Ian.
----------------------------------------------------------------------------
Ian MacPhedran, Engineering Computer Centre, 2B13 Engineering Building,
University of Saskatchewan, 57 Campus Drive, Saskatoon SK S7N 5A9, CANADA
Phone: (306)966-4832 Fax: (306)966-5205 Email: Ian_MacPhedran@engr.USask.CA

>From danno@fv.com Wed Dec 18 19:02 MST 1996
X-Authentication-Warning: mailrus.fv.com: danno owned process doing -bs
Date: Wed, 18 Dec 1996 21:04:24 -0500 (EST)
From: Dan Pritts <danno@fv.com>
To: Paul Caskey <pcaskey@bassetbyte.com>
Subject: Re: doing the impossible data recovery
In-Reply-To: <199612182345.QAA17145@sherlock.cmc.sandia.gov>

On Wed, 18 Dec 1996, Paul Caskey wrote:

call ontrack data recovery. I don't know their phone number but I bet
it's in 800 information.

If the disk has not been touched since this happened, the chances are very good that they can get stuff back. It won't be cheap, but it will be cheaper than 6 months of lost work.

I am afraid i don't know whether to suggest that you umount the disk, or just power off the system. Call ontrack quick, though.

dan pritts
Unix System Admin
First Virtual Holdings, Inc.
danno@fv.com 313-213-3791

>From shifter@portal.stwing.upenn.edu Wed Dec 18 19:59 MST 1996
From: Shifter <shifter@portal.stwing.upenn.edu>
Subject: Re: doing the impossible data recovery
To: pcaskey@bassetbyte.com
Date: Wed, 18 Dec 1996 22:01:51 -0500 (EST)
In-Reply-To: <199612182345.QAA17145@sherlock.cmc.sandia.gov> from "Paul Caskey" at Dec 18, 96 04:45:40 pm

If the disk hasn't been written to since, I would do a dd on the raw device, and put that into a file (on a separate filesystem). Then you could use emacs and go thru the tedious cutting and pasting things out of that big file and into smaller files.

-John

--

Shifter shifter@portal.stwing.upenn.edu

>From 6@swcp.com Wed Dec 18 21:36 MST 1996
Date: Wed, 18 Dec 1996 20:38:17 -0800
From: 6 <6@swcp.com>
Organization: Light Dreams
MIME-Version: 1.0
To: Paul Caskey <pcaskey@bassetbyte.com>
Subject: Re: doing the impossible data recovery
References: <199612182345.QAA17146@sherlock.cmc.sandia.gov>

Paul Caskey wrote:
>
> What if I use dd and extract the raw data off the entire disk/partition,
> dump it into a giant file, and sort through it, later? Any better
> ideas? I think I've heard of places you can send disks to for emergency
> recoveries like this ... ?

If it is text there is a pretty good chance at salvaging A LOT of it.

First TURN OFF THE DISK. Do not unmount; get it out from under OS control.

Next, DD the raw disk device out to tape.

You might be able to get away with just going through the DD and
extracting text files from there. Other options are to go at it sector
by sector, pulling the information out.

>From jefi@kat.ina.de Thu Dec 19 01:53 MST 1996
From: Jens Fischer <jefi@kat.ina.de>
Date: Thu, 19 Dec 1996 09:55:06 +0100
To: pcaskey@bassetbyte.com
Subject: Re: doing the impossible data recovery
X-Sun-Charset: US-ASCII

Hi Paul,

have a look at man fsdb and man fsdb_ufs. fsdb is a tool for examining
and reconstructing damaged filesystems. However, it will not be easy to
reconstruct your data, as you need a lot of knowledge about filesystem
structures.

Hope that helps

Regards - Jens Fischer

>From harvey@iotek.ns.ca Thu Dec 19 08:02 MST 1996
Date: Thu, 19 Dec 1996 11:08:29 -0400 (AST)
From: Harvey Wamboldt <harvey@iotek.ns.ca>
To: Paul Caskey <pcaskey@bassetbyte.com>
Subject: Re: doing the impossible data recovery
In-Reply-To: <199612182345.QAA17145@sherlock.cmc.sandia.gov>

On Wed, 18 Dec 1996, Paul Caskey wrote:

> I've got a user who managed to blow away his entire home directory
> *and* the online backup copy he had of it.

> ...

> What if I use dd and extract the raw data off the entire disk/partition,
> dump it into a giant file, and sort through it, later? Any better
> ideas? I think I've heard of places you can send disks to for emergency
> recoveries like this ... ?

I'm no expert on Unix data recovery, but on PCs it's fairly trivial to
read a disk block, then run it through a filter which decides if it is
text, and if it is, write that text block into a file. Then later, with
a text editor, you can visually stitch your text files back together.
This doesn't work for binaries though, and you have to handle the
partial text blocks at the end of files intelligently. It is fairly
simple to "score" a block of text based on letter pair frequencies,
i.e. pairs such as "th", "ed", "es" score high while "3x", "u{" etc.
score low. These blocks can even be sorted on "long words" to move
related information closer to each other. I don't have any programs to
help with this, but the crypto guys might.

Best of luck,

-H-

Harvey M Wamboldt                  ^ E-Mail: harvey@iotek.ns.ca
MDA Inc 1000 Windmill Rd. Suite 60 ^ Fax:    (902)468-2278
Dartmouth NS, B3B 1L7, Canada      ^ Phone:  (902)481-3531
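
The letter-pair scoring idea in the reply above can be sketched in a
few lines of Python. Everything specific here is an assumption: the
names text_score and looks_like_text are invented, the list of common
pairs is a small hand-picked sample rather than a measured frequency
table, and the 0.1 threshold is arbitrary. Source code contains fewer
English letter pairs than prose, so a threshold would need tuning
against known-good blocks.

```python
# A few common English letter pairs. Illustrative only: a real table
# would be built by counting pairs in sample text.
COMMON_PAIRS = {"th", "he", "in", "er", "an", "re",
                "ed", "es", "on", "st", "nt", "en"}


def text_score(block):
    """Return the fraction of adjacent character pairs that are common letter pairs."""
    text = block.decode("latin-1").lower()
    pairs = [text[i:i + 2] for i in range(len(text) - 1)]
    if not pairs:
        return 0.0
    hits = sum(1 for pair in pairs if pair in COMMON_PAIRS)
    return hits / len(pairs)


def looks_like_text(block, threshold=0.1):
    """Classify one disk block as probable text when its score reaches the threshold."""
    return text_score(block) >= threshold
```

A caller would read the raw image one block at a time, keep the blocks
for which looks_like_text is true, and write them out in disk order
for stitching together in an editor.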

>From iv08480@issc02.mdc.com Thu Dec 19 09:47 MST 1996
Date: Thu, 19 Dec 1996 10:48:35 -0600
From: iv08480@issc02.mdc.com (Colin Melville)
To: pcaskey@bassetbyte.com
Subject: Re: doing the impossible data recovery
X-Sun-Charset: US-ASCII
Content-Type: text
Content-Length: 2218

Paul,

Don't know about the dd stuff, sounds like a real long-shot.

Search the web for data recovery, I know they're out there. If you
can't find anything, send me a note, I'll call our local HP engineer.
He mentioned a disk recovery service he had to use for a platter crash
once...very expensive (multi K$!!).

Good luck,
Colin

%)====================================================%)
%) Colin Melville            | cmelville@bigfoot.com  %)
%) UNIX Systems Administrator| NTS Technical Services %)
%) UNIX Server Support Team  | http://www.ntstech.com %)
%)                                                    %)
%)            Views expressed are my own.             %)
%)                                                    %)
%) Supporting: McDonnell Douglas Aircraft Corp.       %)
%)             http://www.mdc.com                     %)
%)====================================================%)

>From foster@bial1.ucsd.edu Thu Dec 19 10:31 MST 1996
From: foster@bial1.ucsd.edu
Date: Thu, 19 Dec 1996 09:32:54 +0800
To: pcaskey@bassetbyte.com
Subject: Re: doing the impossible data recovery
X-Sun-Charset: US-ASCII
Content-Type: text
Content-Length: 302

Try OnTrack Data Recovery. It's a bit expensive, but not when you're talking about 6mo. of software development.

800-752-7557

They were able to recover all files from an optical disk that had its file table trashed! I think they could recover your files for you.

Dave Foster foster@bial1.ucsd.edu

>From jk@stallion.ee Fri Dec 20 12:42 MST 1996
Date: Fri, 20 Dec 1996 21:44:09 +0200 (EET)
From: Jyri Kaljundi <jk@stallion.ee>
X-Sender: jk@nebula
To: Paul Caskey <pcaskey@bassetbyte.com>
Subject: Re: doing the impossible data recovery
In-Reply-To: <199612182345.QAA17145@sherlock.cmc.sandia.gov>

Hi Paul,

On Wed, 18 Dec 1996, Paul Caskey wrote:

> I've got a user who managed to blow away his entire home directory *and*
> the online backup copy he had of it. The tape backup doesn't have his
> critical files because of a permission problem. (My fault and his.)
> He's hurtin'. Like 6-12 months of lost software development work.

There is one great company in Norway that does data recovery. Have a look at their web site at http://www.ibas.no/ or e-mail ibas@ibas.no

Usually they recover disks after crashes and disasters, but rm -rf might have a solution also. It is not cheap, beginning from 1000-2000 US dollars. But then 6-12 months work is not cheap either.

I hope they can help you,

Juri


