SUMMARY: Backup Copilot and RPC: Timed out

From: Dirk Behrens (behrens@imes.uni-hannover.de)
Date: Thu Sep 09 1993 - 08:58:45 CDT


Some time ago I asked:

>Having successfully set up the Backup Copilot to make a full dump of about
>9 GB on two DAT tapes (WangDAT 2600) at the weekend, I get the following
>mail every time:
>
>> Successful completion host/conffile
>> Could not obtain disaster recovery tape listing.
>> Be sure the command '/usr/etc/dumpdm -s host -V tapelist' executes correctly.
>
>The log of the dump shows:
>
>> ....
>> DUMP: DUMP IS DONE
>> dumpex: exec (/usr/etc/dumpdm -s host dbinfo; /usr/etc/dumpdm -s host -V tapelist)> /var....
>> host: RPC: Timed out
>
>What happened, and how can I handle this situation?
>(I found nothing in the documentation.)
>
>Environment: SUN 10-30, OS 4.1.3, Backup Copilot 1.0, Backup Copilot Jumbo
> patch 100551-02 installed.
>
>Thanks for any help. I will summarize.
>
>BTW: Some time ago I asked why Backup Copilot in long play mode couldn't
> find the end of the last dump on the DAT tape after the machine had been
> rebooted or the tape had been ejected and then reinserted. There was
> only one reply: "me too". I know this is mentioned in the documentation,
> but I am surprised that nobody else seems to have this problem and that
> it is still not solved.

I received two answers. One said the "RPC: Timed out" message could be caused
by too little disk space, but that cannot be the case here. The other said he
has the same problem when a very big filesystem is the last one to be dumped.
He suggested using the -m option of rpc.dumpdb and increasing it from 1 to 5.
Because we have a large filesystem (about 4 GB) at the end of every dump cycle,
and the failure described above only appears during a level 0 backup, I tried
this, but without success. So I called the Sun hotline in Germany. They
suggested applying the following patches:

100551-03: Backup Copilot jumbo patch
100726-09: sun4m jumbo patch
100075-11: rpc.lockd jumbo patch

After applying these patches we still have the same problems as before, plus a
new one: sometimes the Backup Copilot run crashes the whole system ...
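
Since both replies below mention free space on the database filesystem, a
quick check before the dump starts may be worthwhile. The following is only a
sketch for a POSIX shell, not something taken from the Backup Copilot
documentation. DBDIR and MIN_KB are hypothetical placeholders (the real
location of the dump database is not given here), and the script relies on
"df -Pk", which the df shipped with SunOS 4.1.3 may not accept.

```shell
#!/bin/sh
# Sketch only: warn when the filesystem holding the dump database is nearly full.

# free_kb DIR: print the free kilobytes on the filesystem that holds DIR.
# "df -Pk" is the POSIX portable output format (one line per filesystem,
# available space in column 4); older df versions may need other options.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# enough_space DIR MIN_KB: succeed if at least MIN_KB kilobytes are free.
enough_space() {
    avail=$(free_kb "$1")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# DBDIR and MIN_KB are hypothetical placeholders, not Backup Copilot settings.
DBDIR=${DBDIR:-.}
MIN_KB=${MIN_KB:-1024}

if enough_space "$DBDIR" "$MIN_KB"; then
    echo "database filesystem has at least ${MIN_KB} KB free"
else
    echo "WARNING: less than ${MIN_KB} KB free on filesystem of ${DBDIR}"
fi
```

Run with DBDIR set to the directory that holds the database, e.g. from cron
shortly before the dump is scheduled.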

The positioning problem in long play mode seems to be as follows.
When the system crashes while a dump is running, which causes the tape to
rewind, Backup Copilot positions the tape absolutely at the last fileset
written (e.g. 7) and repeats the corrupted dump of the last filesystem. This is
correct.

But when the system crashes while no dump is running, Backup Copilot does not
position the tape at the end of the last fileset when the next dump is due. It
overwrites the existing filesets starting at 1, but logs the correct fileset
number (e.g. 8) in its database. This leaves the database inconsistent!
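
The mismatch described above could in principle be caught before a dump starts
by comparing the fileset number the database is about to log with the file
number the tape drive reports. A minimal sketch for a POSIX shell follows. It
assumes that filesets are numbered from 1 and that the drive counts tape files
from 0. How to obtain the two numbers is left open, and position_matches and
check_position are hypothetical helpers, not part of Backup Copilot.

```shell
#!/bin/sh
# Sketch only: detect a rewound tape before a fileset gets overwritten.

# position_matches NEXT_FILESET TAPE_FILE
# Succeeds when the tape is positioned right after the last fileset,
# i.e. when the next fileset number equals the tape file number plus one.
# Assumption: filesets count from 1, tape files count from 0.
position_matches() {
    [ "$1" -eq $(( $2 + 1 )) ]
}

# check_position NEXT_FILESET TAPE_FILE: print a verdict for the operator.
check_position() {
    if position_matches "$1" "$2"; then
        echo "ok: fileset $1 will be appended"
    else
        echo "MISMATCH: database expects fileset $1, tape would write fileset $(( $2 + 1 ))"
    fi
}

check_position 8 7    # tape still at the end of fileset 7
check_position 8 0    # tape rewound, e.g. after a crash or an eject
```

With the numbers from the example above, the first call reports that fileset 8
will be appended, and the second reports the mismatch that leads to the
inconsistent database.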

Thanks to Richard Perlotto and Rafi Sadowsky.
See their responses attached:

> From vlsiphx!paladin!perlotto@enuucp.eas.asu.edu Wed Jul 21 13:39:21 1993
> Reply-To: vlsiphx!paladin!perlotto@enuucp.eas.asu.edu
> Date: Tue, 20 Jul 93 14:38:18 MST
> From: vlsiphx!paladin!perlotto@enuucp.eas.asu.edu (Richard E. Perlotto II)
> To: behrens@imes.uni-hannover.de
> Subject: Re: Backup Copilot and RPC: Timed out
> X-Sun-Charset: US-ASCII
> Content-Length: 2191
>
> Check your disk space. The times that I saw this problem,
> the backup database had filled up an entire partition. You
> may need to delete older data from the database.
>
> Richard
>
> P.S.
>
> Try a recovery, it probably will not work either.

> From rafi@tavor.openu.ac.il Wed Jul 28 16:01:46 1993
> From: Rafi Sadowsky <rafi@tavor.openu.ac.il>
> Subject: Re: Backup Copilot and RPC: Timed out
> To: behrens@imes.uni-hannover.de
> Date: Wed, 28 Jul 1993 16:57:56 +0200 (EET DST)
> X-Mailer: ELM [version 2.4 PL21]
> Mime-Version: 1.0
> Content-Type: text/plain; charset=US-ASCII
> Content-Transfer-Encoding: 7bit
> Content-Length: 2477
>
> I had a similar problem and solved it by adding the "-m 5" option to rpc.dumpdb.
> It seems to me that if the last filesystem you dump has a large no. of inodes
> it locks the database for a _long_ time - until it completes the update.
> This option makes rpc.dumpdb update faster with a large fs - and avoids the
> timeout (which also causes other problems).
> Also, you should make sure you have enough free space on the database fs
> - otherwise things can get really strange.
> BTW: the "-m" option is documented - the connection I made isn't ;-(
> You may of course need a larger argument (the default is 1, max is 200).
..
..
> I had no problems with long play on an HP DAT (I do remove the cassette
> between dumps).
>
> Good Luck,
> Rafi
>
> --
> +-------------------------------+---------------------------------------+
> | Rafi Sadowsky | rafi@tavor.openu.ac.il |
> | Comp.Sci. dept |-[also postmaster@openu.ac.il]---------+
> | Open University of Israel | Voice: +972-3-6460592 |
> | Tel-Aviv, Israel | Fax: +972-3-6460483 |
> +-------------------------------+---------------------------------------+
>


