SUMMARY: compress/uncompress, large files and corruption

From: Michael Mark O <mark.michael@es.bss.boeing.com>
Date: Tue Aug 14 2001 - 15:59:04 EDT
(for the Boeing folk, this is *not* the root cause analysis report; it's the
mailing list summary for sunmanagers ...)

I received a range of answers, mostly falling into one of two categories:

- never heard of such a thing happening; /usr/bin/compress can and should work
OK

- yup, I had it happen too, and switching to gtar/gzip was the cure

For the record, we always have the databases shut down properly before beginning
the copy and compress step (a precaution noted by several folks, thanks for the
thought!).

Also, for the record, we have already considered whether the cause could be
Solaris 2.6 (we are on the 105181-26 Recommended Patches cluster), Veritas
VxVM (3.1.1 plus patches) or VxFS (3.4 plus patches).  Nope.

Finally, for the record, after fixing a nagging problem with a gigabit ethernet
switch, rebooting the E10K domain and power-cycling the disk arrays, the problem
has not recurred.

(Of course, until the next time, whenever that is ... :-)

Nevertheless, we intend to continue work on improving the integrity of our
disk-based database backups.

Our alternatives are:

- do nothing, rebooting the domain and power-cycling the disk arrays made the
problem go away

- switch out compress for gzip, leave everything else identical

- use gtar instead of tar and add the z option for compression, thus reversing
the tar and compress stages (a compressed archive instead of an archive of
compressed files; sketched after this list)

- use vxdump, with a lot of mt commands to forward-space the tape to the right
dump image for each mount point (ugh; also sketched after this list)

- just buy a lot (!) more disk, upgrade the backup network interfaces to gigabit
ethernet and stop using compress
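
For concreteness, here is a rough sketch of the gtar and vxdump alternatives.
The device names, paths and file counts are made-up placeholders, and none of
this has been tested on our boxes yet:

    # today: compress each copied datafile, then tar the compressed copies
    compress /backup/ora01/*.dbf
    tar cf /dev/rmt/0n /backup/ora01

    # gtar alternative: one gzip-compressed archive of the uncompressed copies
    gtar czf /dev/rmt/0n /backup/ora01

    # vxdump alternative: one dump image per mount point on the same tape ...
    vxdump 0uf /dev/rmt/0n /ora01
    vxdump 0uf /dev/rmt/0n /ora02

    # ... and mt forward-spacing to reach the right image at restore time
    mt -f /dev/rmt/0n rewind
    mt -f /dev/rmt/0n fsf 1      # skip the first image to get at /ora02
    vxrestore rf /dev/rmt/0n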

There is an issue with gtar 1.12 (the version currently deployed on our couple
of hundred servers) and sparse files on largefile filesystems.  It just plain
fails.  The ChangeLog for 1.13 indicates that this is fixed.  I'm getting 1.13
deployed, after which we will test.
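
The test will probably look something like this rough sketch (the size and
paths are placeholders; mkfile -n creates the file sparse, and the final cksum
assumes the cksum on that box is largefile-aware):

    # create a sparse 5 GB test file on a largefile-enabled filesystem
    mkfile -n 5120m /ora01/test/sparse.dat

    # archive with sparse handling (S) plus compression (z), then extract;
    # gtar strips the leading / on extraction, hence the path under /tmp
    gtar cSzf /tmp/sparse-test.tar.gz /ora01/test/sparse.dat
    mkdir -p /tmp/restore
    gtar xSzf /tmp/sparse-test.tar.gz -C /tmp/restore

    # the checksums and byte counts must match
    cksum /ora01/test/sparse.dat /tmp/restore/ora01/test/sparse.dat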

I'll do my best to provide a follow-up after the results of those tests are
available.

ttfn
mm
mark.michael@es.bss.boeing.com

> Anyone have any experience with large (Oracle) datafiles being corrupted on
> uncompress after being compress'ed?  How did you solve the problem, short of
> buying more disk for on-line (hot) backup storage in uncompressed format?
> 
> Urgent.  We are trying to transfer compressed data via tape.  It worked on this
> server's predecessor, which doesn't have the 2.6 patch cluster containing
> 105181-26.  There is a patch 107786-02 that replaced /usr/bin/compress,
> /usr/bin/uncompress and /usr/bin/zcat (they're hardlinked anyway).
> 
> Both old and new servers are E10K domains.
> 
> I will summarize.

> I've had this problem before.  I moved all of my scripts to gzip and I've
> had no problems since.
> 
> Thanks.
> 
> bfg

> 
> I've never heard of anyone having this problem.  Compressing should not
> corrupt anything.
> 

> > 
> > Sorry to be the bearer of bad news, but......
> > 
> > Oracle Corp. specifically states never to compress a .dbf file.  They say
> > they cannot guarantee consistency in any file that is compressed and then
> > uncompressed, and if you do so it is at your own risk.  I've done it a few
> > times and managed to get by (luckily!), but several senior-level DBAs have
> > told me I've been extremely lucky and not to push it or I'm gonna "get bit".
> > While they are sometimes considered alarmists, they still tell me everyone
> > who compresses Oracle .dbf files eventually runs into this problem (often
> > when it's too late to do anything about it, unfortunately).
> > 
> > Have a Great Day!!!
> > 
> > 
> >                      --- Vern Walls
> 
> I got that from both our on-staff senior-level DBAs as well as directly from
> Oracle Support (we have a Silver contract).
> 
>                      --- Vern
> 

> 
> If Oracle is running you will always get corruption; the datafile changes
> continuously.  Try an export and compress the export.  Export is done inside
> Oracle.
> 
> -Mike
> 
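
(For what it's worth, Mike's export route can be combined with on-the-fly
compression through a named pipe, so the dump never needs uncompressed disk
space.  A minimal sketch, with placeholder credentials and paths:

    # compress an Oracle full export on the fly via a named pipe
    mkfifo /tmp/exp.pipe
    gzip -c < /tmp/exp.pipe > /backup/full_export.dmp.gz &
    exp userid=system/manager full=y file=/tmp/exp.pipe
    rm /tmp/exp.pipe

The same pipe trick works with imp on the restore side. -- mm)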

> 
> Your best bet is to do an export of the database and then an import;
> compress has issues with 'empty' spaces in the data.
> 
> On a side note, you may want to use GNU tar first, then compress.

> I haven't seen a summary on this, so I will throw my hat in.
> 
> Yes, I have had a gzip problem with large (>4 GB) database files.  When
> uncompressing the files, I always get CRC errors and it fails.  I never saw
> that when actually using "compress" to compress files, though.  This was
> under Tru64 (the old Digital Unix).
> 
> I have routinely compressed/uncompressed 8 GB files and larger.  Of course
> the speed and compression factor weren't as good as gzip's, but at least I
> could uncompress the file.
> 
> For the past year and some, I have been working strictly in a Solaris shop,
> and haven't had to compress large files like this.  If you can get the gzip
> source code, I would try to compile it with 64-bit file-offset support.  I
> think the standard gzip binary that everyone tends to use is a 32-bit build,
> and may not handle large files well.
> 
> -- 
> ----------------------------------------------------
> Andrew Stueve              |   Office 703-758-5221
> Team Lead/Sr. Engineer     |   Mobile 703-898-8917
> Worldcom                   |   Pager 1-888-454-7594
> ----------------------------------------------------
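
(Andrew's 32-bit theory is easy enough to test if we stay with gzip: rebuild
it with the standard Solaris largefile compile flags.  A sketch, untested, and
assuming gzip's configure script honors CFLAGS:

    # ask the system for the ILP32 largefile compile/link flags ...
    CFLAGS="`getconf LFS_CFLAGS`" LDFLAGS="`getconf LFS_LDFLAGS`" \
        LIBS="`getconf LFS_LIBS`" ./configure
    make
    # ... then spot-check a round trip on a >4 GB file before trusting it

-- mm)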

>    We haven't seen this directly, but I bet if you started the compression
>    while the DB was running (and therefore the file is not guaranteed to be
>    quiescent), you could easily have problems when uncompressing; something
>    changing in the "front" of the file after gzip/compress/whatever has
>    already read past that section.
> 
> -- 
> Karl Vogel                <vogelke@dnaco.net>
> ASC/YCOA, Wright-Patterson AFB, OH 45433, USA
> 
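
(Karl's point argues for a cheap safety net regardless of which tool wins:
checksum the quiescent original, then verify a trial uncompress before the
copy goes to tape.  A sketch, with a placeholder datafile name:

    # with the database down, record the checksum of the original ...
    cksum /ora01/system01.dbf

    # ... compress to the backup area, then verify by trial uncompress;
    # the checksum and byte count from both commands must match
    compress -c /ora01/system01.dbf > /backup/system01.dbf.Z
    uncompress -c /backup/system01.dbf.Z | cksum

-- mm)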
 
-- 
mark michael
enterprise computing unix, info svcs, boeing satellite systems
e-mail mark.o.michael@boeing.com ph 310 364 6759 fax 310 364 5331
snail-mail po box 92919 m/s sc s50 x340 los angeles ca 90009-2919 usa