Hi Netters!
Finally, got some time to post a summary. This job would be great if
it wasn't for all the students, faculty, users.......:>)
My original post (included below) asked you fine folk why I was seeing
a load average of > 1 on only some of my SPARC systems. The many
responses (which, as always, are greatly appreciated) indicate the
following:
Load averages above 1 are OK and not that unusual. Don't worry about it.
A little more detail might be helpful:
-Load averages = average number of jobs in the run queue over the last
1, 5 and 15 minutes (man pages state this, which I had already found).
-Load averages can go up because of (possibly)
    -many processes that use up lots of cpu resources, making
     other processes wait
    -more than one process wanting to run at the same time
-Suspended shells are one possible cause. The fix is to put the
shell into the foreground, then suspend it again.
-vmstat 1 shows some of the same information as uptime (the
command I used to find the load averages in the first place). vmstat 1
shows the number of processes in the run queue, as well as those blocked
waiting for a resource.
-A few asked if the performance of the system seemed to suffer. It did not.
-Some mentioned that a haywire process (like PC-NFS) had caused their
load averages to go nuts.
-Some attributed the problem to bugs in 4.1.3.
-A few users mentioned a hacker problem. This is not so outlandish, as we
ourselves just recently found that some of the systems had been invaded. One
of the problems was that the ps command had been replaced with one that
did not show that the hacker's snooping program was running.
-A user or two mentioned unusual kernel activity, like excessive page/swap
or network stuff.
-A process can hang if waiting for a resource. This can cause a load average
over 1. I haven't noticed this, but it makes sense.
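
For anyone who wants to watch these numbers from a script, here is a
minimal sketch (it is not part of Pat's script). It assumes a POSIX shell
and awk (nawk on Solaris 2), that uptime ends its line with
"load average: a, b, c", and that vmstat puts the run queue in its first
column ("r"). The function names load_averages and avg_run_queue are made
up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not from perf_scr.

# Read one line of uptime output on stdin and print the 1, 5 and
# 15 minute load averages separated by spaces.
load_averages() {
    sed -n 's/.*load average[s]*: *//p' | tr -d ','
}

# Read "vmstat 1 N" output on stdin and print the average of the
# run queue column (first field, "r"). Header lines are skipped
# because their first field is not a number; the first data line
# is skipped because it is an average since boot.
avg_run_queue() {
    awk '$1 ~ /^[0-9]+$/ {
             samples++
             if (samples > 1) { sum += $1; count++ }
         }
         END { if (count > 0) printf("%.2f\n", sum / count) }'
}
```

Typical use would be something like "uptime | load_averages" and
"vmstat 1 10 | avg_run_queue". If the averaged run queue stays well above
the number of CPUs, that suggests processes really are waiting for the CPU.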
Take this information for what it is worth. Some of it is common sense. I
have not tried to verify all of these, but I have looked around my systems
and, based on the info presented here, I am beginning to get a better
understanding of what is going on.
One respondent deserves special kudos for the performance tuning script he
sent (called perf_scr, included below). Pat Cain's script is a dandy,
checking out cpu times, paging & swapping configurations, disk saturation
(not sure what that is), and things like network configuration. I have
tested this script on a few of my systems, and have learned that I have
an 8% collision rate! Maybe I didn't want to know all this.....
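
For the curious, the collision rate the script reports is collisions
divided by output packets, summed over all interfaces except loopback (see
check_network in the script below). Here is a minimal sketch of the same
arithmetic as a single awk pass. It assumes the SunOS "netstat -i" column
order that the script itself relies on (Name Mtu Net/Dest Address Ipkts
Ierrs Opkts Oerrs Collis Queue) and skips any interface whose name starts
with "lo". The function name collision_rate and the sample numbers in the
note after it are made up for illustration.

```shell
#!/bin/sh
# Sketch only: collision_rate is a hypothetical helper, not part of perf_scr.
# Reads "netstat -i" output on stdin and prints collisions as a
# percentage of output packets, ignoring the header and loopback.
collision_rate() {
    awk '$1 != "Name" && $1 !~ /^lo/ {
             opkts  += $7      # Opkts column
             collis += $9      # Collis column
         }
         END {
             if (opkts > 0)
                 printf("%.1f\n", collis * 100 / opkts)
             else
                 print "n/a"   # no output packets seen
         }'
}
```

Run it as "netstat -i | collision_rate". An interface with 10000 output
packets and 800 collisions would print 8.0; the script flags anything
above 5 %.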
My thanks go to the following respondents. Ain't this newsgroup great??
olav.lerbrekk@geologi.uio.no (Olav Lerbrekk)
jv@nl.net (Johan Vromans)
Eckhard.Rueggeberg@ts.go.dlr.de (Eckhard Rueggeberg)
cciolori@tatca.tc.faa.gov (Chris Ciolorito)
ericb@telecnnct.com (Eric Burger)
Justin Keery <justin@indep.co.uk>
Robert.Wolf@dciem.dnd.ca (Robert Wolf)
jallen@nersc.gov (John Allen)
Dan Stromberg - OAC-DCS <strombrg@hydra.acs.uci.edu>
Pat Cain (Denver) <pjc@denver.ssds.com>
rjcronin@uop.com (Robert J. Cronin)
Dave Fetrow <fetrow@biostat.washington.edu>
Gautam Das <gautam@bwc.org>
"Patrick O'Callaghan" <poc@usb.ve>
*************************************************************
Now Pat's script. Please note that I do not certify that this
script is safe, and I am not responsible for any damage that it
may inflict on your poor systems. Use at your own risk.
*************************************************************
--------------------- BEGIN perf_scr INCLUSION ------------------------
#! /bin/sh
if [ `uname -s` != "SunOS" ]; then
echo Silly rabbit! Don\'t you know trix are for SunOS\?
exit
fi
# page out and swap out threshold values
PO_THRESH=0
SO_THRESH=0
AWK_FILE=/tmp/p.awk_$$
once=1
while getopts c c 2> /dev/null ; do
    case $c in
    c ) once=0
        ;;
    * ) echo Use: `basename $0` '[-c]'
        exit 1
        ;;
    esac
done
shift `expr $OPTIND - 1` # drop the parsed options once, after the loop
# whooo are you? ooh ooh ooh ooh
release=`/bin/uname -r`
# Set up functions and variables for each OS
case $release in
4.1.* ) PATH=/bin:/usr/bin:/usr/ucb:/usr/etc
        ECHO=/usr/5bin/echo
        psax() { /bin/ps ax; }
        psaxc() { /bin/ps axc; }
        awk() { /bin/awk "$@"; }
        sol2=0
        ;;
5.* )   PATH=/bin
        ECHO=echo
        psax() { /usr/bin/ps -ea; }
        psaxc() { /usr/bin/ps -ea; }
        awk() { /usr/bin/nawk "$@"; } # awk dumps core on 5.1
        sol2=1
        ;;
* )     echo Unknown release: $release
        exit 1 ;;
esac
if [ -f core ]; then
echo Core file must be removed from this directory before running.
exit
fi
# no args
eat_4_lines() {
(line ; line ; line ; line) > /dev/null
}
# no args
get_cpu_times() {
$ECHO 'Getting cpu times...\c'
f=/tmp/$$
vmstat 1 10 | (
eat_4_lines
t_intfaults=0 # initialize counter
t_sysfaults=0 # initialize counter
t_cswitch=0 # initialize counter
t_usr=0 # initialize counter
t_sys=0 # initialize counter
t_idle=0 # initialize counter
for i in 1 2 3 4 5 6 7 8 ; do # for each of the rem. lines
set -- `line` # use shell to split the line into fields
n=`expr $# - 6` # skip all but the last 6 fields,
shift $n # leaving: in sy cs us sy id
t_intfaults=`expr $1 + $t_intfaults` # accumulate interrupt faults
shift
t_sysfaults=`expr $1 + $t_sysfaults` # accumulate syscall faults
shift
t_cswitch=`expr $1 + $t_cswitch` # accumulate context switches
shift
t_usr=`expr $1 + $t_usr` # accumulate user
shift
t_sys=`expr $1 + $t_sys` # accumulate system
shift
t_idle=`expr $1 + $t_idle` # accumulate idle
done
avg_intfaults=`expr $t_intfaults / 8`
avg_sysfaults=`expr $t_sysfaults / 8`
avg_cswitch=`expr $t_cswitch / 8`
avg_usr=`expr $t_usr / 8`
avg_sys=`expr $t_sys / 8`
avg_idle=`expr $t_idle / 8`
echo $avg_intfaults $avg_sysfaults $avg_cswitch $avg_usr $avg_sys $avg_idle
) > $f
read avg_intfaults avg_sysfaults avg_cswitch avg_usr avg_sys avg_idle < $f
/bin/rm -f $f
echo ok
}
# no args
check_paging_swapping() {
$ECHO 'Checking paging/swapping...\c'
vmstat -S 1 10 | (
eat_4_lines
po=0
so=0
for i in 1 2 3 4 5 6 7 8 ; do # for each of the rem. lines
set -- `line` # use shell to extract args
so=`expr $so + $7` # swap out value
po=`expr $po + $9` # page out value
done
if [ $po -gt $PO_THRESH -o $so -gt $SO_THRESH ]; then
$ECHO '\n\tAdd memory.'
$ECHO '\tRearrange process load.'
$ECHO '\tAnalyze process behaviour.'
$ECHO '\tUse tmpfs or mmap().'
else
echo ok
fi
)
}
# no args
generate_awk_file() {
cat > $AWK_FILE << EOF
BEGIN {
go_ahead_and_debug_it = 0;
}
{
lines++;
if ( lines == 1 )
for(i=0; i<NF; i++) {
j = i + 1;
disk_name[i] = \$j;
}
if ( lines < 4 )
next;
count++;
n_disks = NF / 3;
read_index = 1;
write_index = 2;
for(i=0; i<n_disks; i++) {
rps[i] += \$read_index;
wps[i] += \$write_index;
rwps[i] += ( \$read_index + \$write_index );
read_index += 3;
write_index += 3;
}
}
END {
if (go_ahead_and_debug_it) {
printf("n_disks is %d\n", n_disks);
printf("rps array is ");
for(i=0; i<n_disks; i++)
printf("%d ", rps[i]);
printf("\n");
printf("wps array is ");
for(i=0; i<n_disks; i++)
printf("%d ", wps[i]);
printf("\n");
printf("rwps array is ");
for(i=0; i<n_disks; i++)
printf("%d ", rwps[i]);
printf("\n");
}
for(i=0; i<n_disks; i++) {
rps[i] /= count;
wps[i] /= count;
rwps[i] /= count;
}
for(i=0; i<n_disks; i++)
for(j=i+1; j<n_disks; j++) {
n = rwps[i] - rwps[j];
if (n < 0) {
if (n < -30) {
n *= -1;
if (rwps[i] > 0) {
diff = n / rwps[i];
if (diff > 0.20) {
printf(" Disk %s has %g %% more activity than Disk %s\n", \
disk_name[j], diff * 100.0, disk_name[i]);
unbalanced = 1;
}
} else {
printf(" Disk %s has %d more r-w/second than Disk %s\n", \
disk_name[j], n, disk_name[i]);
unbalanced = 1;
}
}
} else {
if (n > 30) {
if (rwps[j] > 0) {
diff = n / rwps[j];
if (diff > 0.20) {
printf(" Disk %s has %g %% more activity than Disk %s\n", \
disk_name[i], diff * 100.0, disk_name[j]);
unbalanced = 1;
}
} else {
printf(" Disk %s has %d more r-w/second than Disk %s\n", \
disk_name[i], n, disk_name[j]);
unbalanced = 1;
}
}
}
}
if (unbalanced)
printf(" Unbalanced disk load. Try moving data or striping.\n");
unbalanced = 0;
for(i=0; i<n_disks; i++)
if ((rps[i] >= 15) && (wps[i] >= (5 * rps[i]))) {
printf(" Writes/sec are %g %% the reads/sec on disk %s\n", \
(wps[i] / rps[i]) * 100.0, disk_name[i]);
unbalanced = 1;
}
if (unbalanced)
printf(" Unbalanced read/write load. Try adding PrestoServe.\n");
}
EOF
}
# no args
check_disk_saturation() {
echo 'Checking disk saturation...'
generate_awk_file
iostat -D 1 10 | awk -f $AWK_FILE
}
# no args
check_dnlc() {
echo 'Checking DNLC hit rate...'
if [ $sol2 -eq 0 ]; then
# gadzooks. someone tell me how to find maxusers another way.
set -- `(echo 'nproc?D' | adb /vmunix | ( line > /dev/null ; line))`
maxusers=`expr '(' $2 - 10 ')' / 16`
else
maxusers=`fgrep maxusers /etc/system | sed 's/^.*= *//' | cut -f1`
if [ -z "$maxusers" ]; then
set -- `(echo 'maxusers?D' | adb /kernel/unix | (line > /dev/null;line))`
maxusers=$2
fi
fi
set -- `vmstat -s | fgrep 'total name lookups'`
if [ $sol2 -eq 0 ]; then
hit_rate=`echo $7 | tr -d %`
else
hit_rate=`echo $7 | sed 's/%)//g'`
fi
total_lookups=$1
set -- $hit_rate
if [ $1 -lt 80 ]; then
if [ $1 -lt 0 ]; then
$ECHO '\tOverflow on DNLC. Re-run shortly after next reboot.'
else
$ECHO "\tDNLC hit rate is only $1 %. Should be at least 80 %."
if [ $maxusers -lt 64 ]; then
more=`expr $maxusers + 8`
if [ $more -gt 64 ]; then
more=64
fi
$ECHO "\tTry increasing MAXUSERS from $maxusers to $more"
else
$ECHO '\tTry increasing ncsize in param.c'
fi
fi
fi
set -- `vmstat -s | fgrep toolong`
if [ $sol2 -eq 0 ]; then
toolong=$2
else
toolong=$1
fi
echo $toolong $total_lookups | awk '{
n = (($1 / $2) * 100);
if (n > 10.0) {
printf(" Too-long pathnames are %5.2f %% of total lookups.\n", n);
printf(" Should be no more than 10 %%.\n");
}
}'
}
check_cpu() {
echo 'Checking CPU times...'
if [ $avg_sys -gt 30 ]; then
if [ $avg_sysfaults -gt 11000 ]; then # 30% of 33000 (peak)
$ECHO '\tInefficient use of system calls.'
fi
if [ $avg_cswitch -gt 750 ]; then # 30% of 2500 (peak)
$ECHO '\tHigh context switch rate.'
fi
fi
if [ $avg_usr -gt 70 ]; then
n_procs=`psax | awk '{
if ( $1 == "PID" || $1 < 300 )
next;
n++;
}
# n + 0 forces a number, so the test below never sees an empty string
END { print n + 0 }'`
if [ $n_procs -gt $maxusers ]; then
$ECHO '\tHigh user time w/many processes.'
$ECHO '\tMigrate to MP or use cron or nice.'
else
$ECHO '\tHigh user time w/few processes.'
$ECHO '\tDivide processes into subprocesses, profile and optimize code.'
fi
fi
if [ $avg_intfaults -gt 1000 ]; then # 30% of 3000 (peak)
$ECHO '\tHigh interrupt rate. Culprits are:'
vmstat -i | awk '{
if ( $1 == "interrupt" || substr($1, 1, 4) == "----" || $1 == "Total" )
next;
if ( $NF > 30 && $1 != "clock" )
printf("%s %d\n", $1, $NF);
}' | while read device rate ; do
$ECHO "\t\t$device ( $rate / second )"
case $device in
ie* | le* ) $ECHO '\t\t\tCheck transceiver or try NC400.' ;;
mti* ) $ECHO '\t\t\tTry intelligent terminal servers.' ;;
zs* ) $ECHO '\t\t\tCheck for noisy ports or try HSI.' ;;
esp* ) $ECHO '\t\t\tTry SBE.' ;;
* ) $ECHO '\t\t\tUnknown solution (now).'
esac
done
fi
}
check_network() {
echo 'Checking network condition...'
if [ $sol2 -eq 0 ]; then
nfs_mounts=`df -t nfs | wc -l` # actually, minus one for the header
else
nfs_mounts=`df -F nfs | wc -l` # actually, minus one for the header
fi
f=/tmp/$$
netstat -i | egrep -v '^Name|^lo0' | (
while read name mtu net add ipkts ierrs opkts oerrs collis queue ; do
if [ -z "$t_ipkts" ]; then t_ipkts=0; fi
if [ -z "$t_ierrs" ]; then t_ierrs=0; fi
if [ -z "$t_collis" ]; then t_collis=0; fi
if [ -z "$t_opkts" ]; then t_opkts=0; fi
t_ipkts=`expr $t_ipkts + $ipkts`
t_ierrs=`expr $t_ierrs + $ierrs`
t_collis=`expr $t_collis + $collis`
t_opkts=`expr $t_opkts + $opkts`
done
echo $t_ipkts $t_ierrs $t_collis $t_opkts
) > $f
read t_ipkts t_ierrs t_collis t_opkts < $f
/bin/rm -f $f
echo $t_collis $t_opkts $t_ierrs $t_ipkts | awk '{
coll_rate = 0;
err_rate = 0;
if ($2 > 0) coll_rate = $1 / $2; # guard against division by zero
if ($4 > 0) err_rate = $3 / $4;
if (coll_rate > 0.05)
printf(" High collision rate ( %g %% ). Subnet or check cabling.\n", \
(coll_rate * 100.0));
if (err_rate > 0.00025)
printf(" Error rate not zero ( %g %% ). Increase buffer space.\n", \
(err_rate * 100.0));
}'
set -- `nfsstat -rc | tail -1`
echo $1 $3 $4 | awk '{
# echo passes three fields: calls, retrans, badxid
calls = $1;
retrans = $2;
badxid = $3;
if (calls > 0 && ( retrans / calls ) > 0.05 )
if (( badxid / calls ) < 0.05 ) {
printf(" High retransmission rate.\n");
printf(" Check routers and bridges for dropped packets.\n");
printf(" Try decreasing rsize and wsize in fstab\n");
printf(" to improve NFS client I/O.\n");
} else {
printf(" Bad server response time for client.\n");
printf(" Try increasing timeo in fstab to improve\n");
printf(" NFS client I/O.\n");
}
}'
if [ $sol2 -eq 0 ]; then
udp_overflows=`netstat -s | fgrep 'socket overflows' | awk '{ print $1 }'`
else
udp_overflows=`netstat -s | fgrep udpInOverflows | awk '{ print $6 }'`
fi
if [ $udp_overflows -gt 0 ]; then
n_nfsd=`psaxc | fgrep nfsd | wc -l`
nn_nfsd=`expr $n_nfsd + 4`
$ECHO "\tOverrun of nfsd processes ( $udp_overflows times )"
$ECHO '\tTry increasing from' $n_nfsd to $nn_nfsd
fi
f=/tmp/$$
nfsstat -s | tail -5 | egrep -v 'wrcache|mkdir' | (
set -- `line`
getattr=$4
shift 11
readlink=$1
nread=$3
set -- `line`
nwrite=$4
line > /dev/null
echo $getattr $readlink $nread $nwrite
) | tr -d '%' > $f
read getattr readlink nread nwrite < $f
/bin/rm -f $f
if [ $getattr -gt 35 ]; then
$ECHO "\tHigh getattr count ($getattr %)."
$ECHO '\tCheck actimeo in fstab for client NFS I/O'
$ECHO '\t\tand increase for read-only clients.'
fi
if [ $readlink -gt 5 ]; then
$ECHO "\tHigh readlink count ($readlink %)."
$ECHO '\tCut down on number of symbolic links on NFS mounts for clients.'
fi
if [ $sol2 -eq 0 ]; then
strings /vmunix | grep -is presto
has_presto=$?
else
strings /kernel/unix | grep -is presto
has_presto=$?
fi
if [ $nwrite -gt 5 ]; then
$ECHO "\tHigh percentage of NFS writes ($nwrite %).\c"
if [ $has_presto -eq 1 ]; then
echo " Add PrestoServe."
else
echo " PrestoServe already installed."
fi
fi
if [ $nread -gt 30 ]; then
$ECHO "\tHigh percentage of NFS reads ($nread %). Add NC400."
fi
}
trap "/bin/rm -f $AWK_FILE" 0
while true
do
get_cpu_times
check_paging_swapping
check_disk_saturation
check_dnlc
check_cpu
check_network
if [ $once -eq 1 ]; then
exit
fi
done
--------------------- END perf_scr INCLUSION ------------------------