SUMMARY: ufsdump 'hanging' problems

From: Alastair Cannon (Alastair_Cannon@ivax.com)
Date: Wed May 07 1997 - 02:55:45 CDT


My unmitigated thanks to all for some *very* enlightening responses to this
one!

Forgive the lapse into nostalgia - but I used to use dump & restore on the
Sun 3 & 4 series machines running SUNOS 4.2 BSD with absolutely no problems
at all. Why can't they leave things alone?

However, I digress..............

Responses received follow:-

---------------------------------------------------------------------------
---------------------------------------------------

Yup, I've seen it. And, yes, it is very hard to diagnose.
What I would suggest is that you try to gently kill one of the stuck dump
jobs so that you get e-mail from cron with the stdout/stderr from the dump.
That was the only way we figured out what its problem was. Try kill -TERM
and see if you get the mail...

--
magi	David Wiseman, Network Manager	       e-mail: magi@csd.uwo.ca
     Department of Computer Science
     The University of Western Ontario	  fax: +1 519 661 3515
     London Ontario Canada N6A 5B7	      voice: +1 519 679 2111 x6879
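
A minimal sketch of the "gentle kill" suggested above, assuming a ps that
accepts -o (the function names are hypothetical and not from the thread).
It sends TERM only, so the stuck job can exit and cron can mail its output:

```sh
#!/bin/sh
# Sketch only: the function names are hypothetical, not from the thread.

# Read "PID COMMAND" lines on stdin and print the PIDs whose command is
# ufsdump, whether given as a bare name or as a full path.
filter_ufsdump_pids() {
    awk '$2 ~ /(^|\/)ufsdump$/ { print $1 }'
}

# Send SIGTERM (never SIGKILL) to every ufsdump process found, so that the
# job exits and cron can mail whatever it wrote to stdout/stderr.
term_stuck_dumps() {
    for pid in $(ps -e -o pid= -o comm= | filter_ufsdump_pids); do
        echo "sending TERM to ufsdump pid $pid"
        kill -TERM "$pid"
    done
}

# Usage (as root, on the host with the stuck job):
#   term_stuck_dumps
```

If the process ignores TERM, that in itself is a useful data point before
resorting to anything harsher.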
---------------------------------------------------------------------------
------------------------------------------------------------
Does it hang the most on your empty filesystems or when doing
incremental dumps?
Try:
103844-02: SunOS 5.5: ufsdump patch
103847-02: SunOS 5.5.1: /usr/lib/fs/ufs/ufsdump patch
103848-02: SunOS 5.5.1_x86: /usr/lib/fs/ufs/ufsdump patch
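
Before applying one of these patches it may help to check whether it is
already installed. The sketch below assumes the patch listing (for example
from showrev -p) has one patch per line with the ID as a whitespace-separated
token such as 103844-02; that layout, and the has_patch name, are assumptions.
On Solaris of this vintage, nawk may be needed in place of awk for -v.

```sh
#!/bin/sh
# Sketch only: has_patch is a hypothetical helper, and the input layout
# (patch IDs as whitespace-separated tokens like 103844-02) is assumed.

# Succeed if the listing on stdin contains patch BASE at revision MINREV
# or later, e.g. has_patch 103844 2 matches 103844-02 and 103844-05.
has_patch() {
    base=$1
    minrev=$2
    awk -v base="$base" -v minrev="$minrev" '
        {
            for (i = 1; i <= NF; i++) {
                if (index($i, base "-") == 1) {
                    # Numeric revision follows the dash after the base ID.
                    rev = substr($i, length(base) + 2) + 0
                    if (rev >= minrev + 0) found = 1
                }
            }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage:
#   showrev -p | has_patch 103844 2 && echo "ufsdump patch present"
```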
---------------------------------------------------------------------------
------------------------------------------------------------
Wrap your ufsdump commands up in a shell script, and run the script from
cron instead of running ufsdump directly.  Redirect the standard output
and standard error to a log file.  What's in the logfile should tell you
what you're doing wrong.  One likely fix is to use a full pathname to
ufsdump, since cron has a very limited idea of a search path.
--
Tim Evans                     |    E.I. du Pont de Nemours & Co.
tkevans@eplrx7.es.dupont.com  |    Experimental Station
(302) 695-9353/8638 (FAX)     |    P.O. Box 80357
EVANSTK AT A1 AT ESVAX        |    Wilmington, Delaware 19880-0357
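
The wrapper approach described above might look roughly like this. It is a
sketch, not anyone's actual script: the tape device, log location and list
of file systems are assumptions, and /usr/sbin/ufsdump is the usual Solaris
location but should be checked on the system in question.

```sh
#!/bin/sh
# Sketch only: TAPE, LOG and the file system list are hypothetical values.

UFSDUMP=${UFSDUMP:-/usr/sbin/ufsdump}   # full path, since cron's PATH is minimal
TAPE=${TAPE:-/dev/rmt/0n}               # assumed no-rewind tape device
LOG=${LOG:-/var/tmp/nightly-dump.log}   # assumed log location

# Dump each file system named on the command line at level 0, reporting
# any non-zero exit status. Returns 1 if any dump failed.
run_dumps() {
    status=0
    for fs in "$@"; do
        echo "=== $(date): dumping $fs"
        "$UFSDUMP" 0uf "$TAPE" "$fs" || {
            echo "ufsdump of $fs exited with status $?"
            status=1
        }
    done
    echo "=== $(date): finished"
    return $status
}

# Usage, as the last line of the script that cron runs:
#   run_dumps / /usr /export/home > "$LOG" 2>&1
```

Keeping the redirection inside the script means the log is written even if
the crontab entry itself is later edited.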
---------------------------------------------------------------------------
-------------------------------------------------------------
Alastair,
This is one of the two primary reasons I strongly recommend against relying
on ufsdump for nightly backups. It is well worth the cost to purchase a
commercial grade backup facility, such as Solstice Backup or something
similar that does not depend on ufsdump underneath (as Budtool does). While
these tools have their own problems, they are designed to handle the
complexities of running distributed backups on busy networks, with
occasional errors and such.
As for why you are seeing ufsdump hang, I would bet LAN congestion is the
problem. You might not think that an unattended ufsdump in the middle of
the night could run into such a condition. It turns out that with a shared
ethernet, if two pairs of SPARCstation 20/10 class machines or faster start
doing something that loads the network (backups, builds, etc.), then the
LAN will be running well into its top capacity. I have found ufsdump likes
to hang in that case. Before my current customer purchased a higher end
backup product, I had unattended ufsdumps running, using multiple servers
to get enough tape online to back up all our systems. I found I had to
stagger the times each backup host ran, because if I had two running at the
same time, I was likely to get a ufsdump hang. Of course, the only way to
clear the problem once I got a hang was to reboot the machine hosting the
tape drive for the hung ufsdump and possibly the client system as well. We
were doing a LOT of rebooting.
I hope this helps,
-Marc
Marc S. Gibian
Telos Comsys                  phone: (617) 377-6350
PRISM/TFS                     email: gibian@stars1.hanscom.af.mil
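
The staggering described above can be sketched as a small helper that prints
one crontab line per backup host. The host names, the script path and the
stagger_entries name are all hypothetical, and the gap has to be chosen to
exceed how long one host's dumps actually take.

```sh
#!/bin/sh
# Sketch only: stagger_entries and the script path are hypothetical.
# Needs a shell with $(( )) arithmetic (ksh or any POSIX sh).

# Print one crontab line per backup host, each starting gap_hours after
# the previous one, so that no two hosts start dumping at the same time.
stagger_entries() {
    hour=$1
    gap_hours=$2
    shift 2
    for host in "$@"; do
        echo "$host: 0 $hour * * * /usr/local/bin/nightly-dump.sh"
        hour=$(( (hour + gap_hours) % 24 ))
    done
}

# Usage: three tape hosts, the first at 21:00, then every two hours.
#   stagger_entries 21 2 tapehost1 tapehost2 tapehost3
```

Each printed line (minus the "host:" prefix) goes into the crontab on that
host.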
---------------------------------------------------------------------------
-------------------------------------------------------------
According to the ufsdump man pages, it may be the 'c' or 'b' options that
you are specifying that are causing the problem.
  Since you're using DAT, there shouldn't be a need to specify either
the b or c options, since ufsdump auto-detects the media type and
length.
  As for the seemingly random file system/time of day things, that's
more likely to be a problem with your crontab, because ufsdump just
does what it's told...
  It all sounds remarkably like a crontab problem actually. Check the
formatting of your crontab entry very carefully.
     M. Reynolds.
--
#  Matthew Reynolds,  ****  Aston Space Geodesy  ****       #
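
One way to check crontab formatting mechanically is sketched below. It only
tests that each entry has five time fields followed by a command; it does
not validate the field values, and check_crontab is a hypothetical name.

```sh
#!/bin/sh
# Sketch only: check_crontab is a hypothetical helper.

# Read crontab text on stdin and report every entry that does not have
# five time fields plus a command. Comments and blank lines are skipped.
# Exits non-zero if any malformed entry was found.
check_crontab() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }
        NF < 6 {
            printf "line %d: expected 5 time fields plus a command: %s\n", NR, $0
            bad = 1
        }
        END { exit(bad ? 1 : 0) }
    '
}

# Usage:
#   crontab -l | check_crontab
```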
---------------------------------------------------------------------------
------------------------------------------------------------
I have similar problems from time to time. You need to have a log of the
errors to get some clues.  My log tells me every time what went wrong.
Here is my cron entry:
9 21 * * 5 /usr/backupeng2.sh > /usr/backupeng2.log 2>&1
good luck
sean
---------------------------------------------------------------------------
------------------------------------------------------------
Dear Alastair :
I had the same problem with my client a few months ago. The only
difference between your system and mine is that I am running Solaris 2.4.
I searched through the mailing list and installed patch 102680-04, which
solved the problem. If I am not mistaken, the similar patch for Solaris 2.5
is 103844-02.
Hope this helps!

   \\\//
  ( @ @ )
-----ooO Q Ooo-----
Liew Chee Wah
---------------------------------------------------------------------------
-------------------------------------------------------------
Hello,
This type of problem was in 2.4 and maybe 2.5. I believe there was a patch
for it. The intermittent problem was caused by trying to dump empty
directories. Sorry, I don't remember the patch; you may want to try the
SunSolve server.
--
Jim Coby                   : Control Data Systems Inc.
email                      : James.E.Coby.Jr@cdc.com
phone                      : 1-800-345-6628 U.S. & Canada
                           : 612-482-3434   International
WWW Support Services page  : http://www.cdc.com/support
---------------------------------------------------------------------------
-------------------------------------------------------------


