late SUMMARY: detailed disksuite questions

From: Padmanabhan Ramadurai <durai_at_head-cfa.harvard.edu>
Date: Mon Sep 08 2003 - 12:43:49 EDT
Hello all,

Sorry for the late reply. 

Thanks a bunch to...

	Hichael Morton 
	Kevin Buterbaugh
	BAUMLER Julie L
	Russell_C_Page

Original email is at the end.....

Consensus answers are: 

Q1, Q2, Q3:

	Yes, / can be UFS-logged, and it is a good idea to do so.
	Logging is recommended for all file systems to avoid long fscks
	during reboot. It is not needed on read-only mounts.
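
For anyone wanting to check which file systems already have logging turned on, here is a minimal Python sketch. It assumes logging is enabled through the "logging" mount option in the seventh field of /etc/vfstab. The sample table is hypothetical: the metadevice names follow the df listing in the original email, but the options column (including the read-only /internal) is made up for illustration.

```python
# Hypothetical /etc/vfstab contents. Device names follow the df listing in
# the original question; the mount options shown are assumptions.
SAMPLE_VFSTAB = """\
#device            device              mount      FS    fsck  mount    mount
#to mount          to fsck             point      type  pass  at boot  options
/dev/md/dsk/d30    /dev/md/rdsk/d30    /          ufs   1     no       logging
/dev/md/dsk/d31    /dev/md/rdsk/d31    /usr       ufs   1     no       logging
/dev/md/dsk/d33    /dev/md/rdsk/d33    /var       ufs   1     no       -
/dev/md/dsk/d35    /dev/md/rdsk/d35    /internal  ufs   2     yes      ro
/dev/md/dsk/d34    -                   -          swap  -     no       -
"""


def ufs_without_logging(vfstab_text):
    """Return mount points of read-write UFS entries lacking 'logging'."""
    missing = []
    for line in vfstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) != 7:
            continue  # skip malformed entries
        mount_point, fs_type, options = fields[2], fields[3], fields[6]
        if fs_type != "ufs":
            continue  # swap and other types are not UFS-logged
        opts = [] if options == "-" else options.split(",")
        if "ro" in opts:
            continue  # read-only mounts do not need logging
        if "logging" not in opts:
            missing.append(mount_point)
    return missing


print(ufs_without_logging(SAMPLE_VFSTAB))
```

On a real system the text would come from reading /etc/vfstab rather than from the sample string.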

	Mirroring swap is needed if you want a crash dump to be available
	every time the system panics, for crash/panic analysis. Apparently
	many people do it. Mirroring / and /usr is important for
	high-availability systems.

Q4, Q5:

	OK, this is what delayed the 'summary'. The delay in reboot is
	NOT due to bug IDs 4361013/4621691. See SunSolve for bug details.
	This was easy to isolate, first by adding comments to the lvm.init
	and lvm.sync scripts: the delay occurred well before they ran. It
	was confirmed by removing the metadevices from the E3000 entirely
	and running plain ufs. The delay was still there, about 35 minutes
	to boot.

	Of course, the box is running all the latest patches as of 01sep03.

	This box used to have an A1000 attached to it, but no longer does.
	It still had the RMgr software installed, though. We suspected this
	software and the entries it makes in the sd.conf and path_to_inst
	files.

	Removing those entries, and the RMgr package as well, did not help.
	This distracted me for some time.

	I tried booting with debugging turned on in the /etc/system file to
	see where the 'block/delay' was happening. Most of the time it
	happened just after loading the 'atata' module, but not always.

	The system throws the message

	clock-board-tod does not match the io-boards

	though that should not happen, since we are running 103346-30.
	Also, executing the command

	copy-clock-tod-to-io-boards

	at the boot PROM did not help.

	BUT we have a fix, and it has worked every time so far (3 times),
	though I don't know why.

	If the I/O board is installed in slot-0 after the CPU board
	is installed in slot-3, there is a delay in booting (up to 40 min).
	But if the CPU board is installed in slot-3 after the I/O board,
	the boot time is only 3 to 5 minutes.

	I don't know why, but that is how it is behaving. By the way,
	even with the 'delayed boot', once booted the system runs fine:
	no issues, no slowness, nothing. Just a very slow boot.

oh well, thanks for your time,
Durai


---------- Forwarded message ----------
Date: Fri, 22 Aug 2003 12:34:10 -0400 (EDT)
From: Padmanabhan Ramadurai <durai@head-cfa.harvard.edu>
To: sunmanagers@sunmanagers.org
Subject: detailed disksuite questions

Hello all,

----------------------------------------------------------------------
Solaris 2.8 (Kernel 108528-22) 	Disksuite 4.2.1 with 108693-17
----------------------------------------------------------------------
Q1:

The Disksuite 4.2.1 Answerbook documentation says that UFS logging can be 
enabled/used with all file systems other than /. But the System Administrator 
Answerbook says that UFS logging can be used on all file systems, including 
the root (/) file system.

Which info is correct? If ufs-logging indeed cannot be used with some 
system filesystems, what are they?
----------------------------------------------------------------------
Q2:

Assuming that ufs-logging can be used on all system filesystems,
does it make sense to turn on ufslogging for /, /var and /usr?
----------------------------------------------------------------------
Q3:

What about swap? 
Some people suggest that mirroring it is a plain waste of resources. I'm
looking to learn from the collective wisdom here.
----------------------------------------------------------------------
Q4:

(Sunsolve) Bug ID 4361013 [Dirty region bit maps at shutdown can cause 
long mirror resyncs on next boot] which was backed out by BugId 4477775;
Later fix (4515606) was again backed out by 4621691.

108693-17 still does not seem to fix 4361013/4621691. 

What resolution is the community using for this problem? Is turning
off the logging the only solution?
----------------------------------------------------------------------
Q5:

My partitions are like below...

> df -kl | grep dsk
/dev/md/dsk/d30       493983  195984  248601    45%    /
/dev/md/dsk/d31      2054959  877580 1115731    45%    /usr
/dev/md/dsk/d33       493983  135816  308769    31%    /var
/dev/md/dsk/d35      2486843 2229707  207400    92%    /internal

[d34 is swap and 3GB in size]

On an E3000 (248MHz) with 5GB of RAM, with ufslogging turned ON
for all filesystems, a soft reboot takes about 90 minutes, due,
I believe, to bug ID 4361013. 

But 

Both with ufslogging turned OFF only for /internal and with 
ufslogging turned OFF for all partitions, it takes about 30 
minutes for the reboot.  (sendmail start time from syslog as
my mile marker)

[no reconfigure, no hardware power on, with min diag in all cases]
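
The boot times above were read off syslog by hand. A minimal Python sketch of the same measurement follows, assuming the usual "Mon DD HH:MM:SS" prefix on each syslog line. The sample lines, the hostname, the marker strings and the function names are all hypothetical and chosen only for illustration.

```python
from datetime import datetime

# Hypothetical syslog excerpt; host name, PID and times are made up.
SAMPLE_LOG = """\
Aug 22 10:02:11 e3000 reboot: rebooted by root
Aug 22 10:05:30 e3000 unix: some unrelated message
Aug 22 10:31:47 e3000 sendmail[412]: starting daemon
"""


def syslog_time(line, year=2003):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    syslog omits the year, so the caller supplies one.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def boot_minutes(log_text, start_marker, end_marker, year=2003):
    """Minutes between the first start_marker line and the next end_marker line."""
    start = end = None
    for line in log_text.splitlines():
        if start is None and start_marker in line:
            start = syslog_time(line, year)
        elif start is not None and end_marker in line:
            end = syslog_time(line, year)
            break
    if start is None or end is None:
        raise ValueError("markers not found in the expected order")
    return (end - start).total_seconds() / 60.0


elapsed = boot_minutes(SAMPLE_LOG, "rebooted by", "sendmail")
print(f"{elapsed:.1f} minutes")
```

The sketch does not handle a reboot that spans midnight on 31 December, since the year is fixed by the caller.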

These 30 minutes look rather long to me. I get only the standard
force overload warnings. System performance is fine after the reboot.
A 90-minute reboot with ufslogging turned on is unacceptable...
I could just as well detach the mirrors before the reboot and
attach/resync them afterwards.
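
The detach/attach idea can be sketched as a dry run that only prints the commands. This is a hedged Python illustration, not a tested procedure: it assumes the DiskSuite syntax "metadetach mirror submirror" and "metattach mirror submirror". The mirror names come from the df listing above, while the submirror names (d20 and so on) are hypothetical.

```python
# Hypothetical mapping of each mirror to the submirror to detach.
# Mirror names are from the df listing; submirror names are made up.
MIRRORS = {
    "d30": "d20",  # /
    "d31": "d21",  # /usr
    "d33": "d23",  # /var
    "d34": "d24",  # swap
    "d35": "d25",  # /internal
}


def mirror_commands(mirrors, action):
    """Build one 'metadetach' or 'metattach' command line per mirror."""
    if action not in ("metadetach", "metattach"):
        raise ValueError(f"unsupported action: {action}")
    return [f"{action} {mirror} {sub}" for mirror, sub in sorted(mirrors.items())]


print("# before the reboot")
for cmd in mirror_commands(MIRRORS, "metadetach"):
    print(cmd)

print("# after the reboot")
for cmd in mirror_commands(MIRRORS, "metattach"):
    print(cmd)
```

Note that reattaching a submirror triggers a full resync of it, and the file systems run unmirrored in between, so this trades boot time for a window without redundancy.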

What are the options/suggestions/pointers which would help me here?

thanks a lot,
will summarize next week,

Durai
Received on Mon Sep 8 12:48:11 2003
