SUMMARY: Disk Space Usage

From: Mariela Esser-Jaime (esser@parc.xerox.com)
Date: Wed Sep 02 1992 - 23:55:52 CDT


Well, I have to say it too. This list is a great source of information.
It has saved many of us from re-inventing the wheel.

MANY THANKS TO:
--------------
Michael Osborne
kevinmac@ll.mit.edu (Kevin McElearney)
rwolf@dretor.dciem.dnd.ca
tim@otis.hssc.scarolina.edu
osicki@hasler.ascom.ch (Osicki Chris)
Kerry Duke <kerry.duke@analog.com>
stergios@kt22.stanford.edu (Stergios)
era@niwot.scd.ucar.edu (Ed Arnold)
marc@math.mcgill.ca (Marc Sokolowski)
Mike Raffety <miker@sbcoc.com>
ept@eptsun1.ctd.ornl.gov (E P Tinnel)

REQUEST:
-------
I am looking for code to monitor disk space usage.

I'll welcome whatever you have working, but what I have in mind is a script I can run when the capacity of a file system gets above a specified percentage.
Such a script would determine the largest directories in the filesystem, as well as the percentage of files in each directory that have not been accessed in more than n days.
A list of these files (or subdirectories) is also needed. If all the files under a subdirectory have not been accessed in more than n days, I would rather have only the name of the subdirectory.
For obvious reasons, the script should not change the access time as in the -atime option of the find command.

The goal is to get an idea of users' disk usage patterns.
Also, I am sure there are a few users who would be more inclined to delete/archive data if they were given a list of the directories or files that have not been accessed in the last n months/years.
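
As a rough illustration of the request above (this is not one of the scripts received), the check could be sketched in Bourne shell along the following lines. The function names, the percentage limit and the day count are all assumptions made for the example. It relies on "df -P" for a predictable column layout and on "find -atime", which only inspects inode data; as far as I know that leaves the access times of the files themselves alone, though scanning a directory may update the directory's own access time.

```shell
#!/bin/sh
# Sketch only: function names and arguments are hypothetical.

# Print the capacity percentage (without the % sign) of the
# file system that holds the path given as $1.
fs_pct_used() {
    # df -P prints one line per file system; column 5 is the capacity.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# List regular files under $1 whose access time is more than $2 days old.
stale_files() {
    find "$1" -type f -atime +"$2" -print
}

# Report only when the file system is fuller than the limit.
# usage: check_fs path limit_percent days
check_fs() {
    pct=$(fs_pct_used "$1")
    if [ "$pct" -gt "$2" ]; then
        echo "$1 is at ${pct}% (limit $2%); files idle for more than $3 days:"
        stale_files "$1" "$3"
    fi
}
```

For example, "check_fs /home 90 180" would stay silent until /home passes 90% and would then list files not read in roughly six months. Collapsing fully idle subdirectories to a single name, as asked for in the request, is not attempted here.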

SUMMARY OF RESPONSES:
--------------------

A couple of people recommended "Watcher" by Kenneth Ingham.
"The source and a paper on watcher are available by anonymous ftp from
ariel.unm.edu (129.24.8.1), but keep in mind that the demo scripts were not
written for SunOS, and will require changing."

I also got a few scripts (forwarded below). You'll have to modify them to suit your needs
and environment, but they are a pretty good start.

Mariela Esser-Jaime (esser@parc.xerox.com)
CSNS - Computing Systems and Network Services
Xerox PARC

>>>>>>>>>>>>>>>>>>>>>>>>>>>DUSAGE<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
#! /bin/sh
# This is a shell archive. Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file". To overwrite existing
# files, type "sh file -c". You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g.. If this archive is complete, you
# will see the following message at the end:
# "End of shell archive."
# Contents: README dusage.1 dusage.pl
# Wrapped by sport@blackbird on Wed Sep 2 08:37:04 1992
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'README' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'README'\"
else
echo shar: Extracting \"'README'\" \(1310 characters\)
sed "s/^X//" >'README' <<'END_OF_FILE'
X#
X# Wrapped by Johan Vromans <jv@mh.nl> on Sat Feb 3 21:36:57 1990
X# This program requires perl version 3.0, patchlevel 4 or higher.
X
X# Copyright 1990 Johan Vromans, all rights reserved.
X# Peaceware. This program may be used, modified and distributed as long as
X# this copyright notice remains part of the source. It may not be sold, or
X# be used to harm any living creature including the world and the universe.
X
XFrom sparkyfs!ames!apple!usc!cs.utexas.edu!uunet!allbery Wed Feb 14 08:08:05 PST 1990
X
XPosting-number: Volume 10, Issue 65
XSubmitted-by: jv@mh.nl (Johan Vromans)
XArchive-name: dusage.pl
X
XGuarding disk space is one of the problems of system management.
X
XSome time ago I converted an old awk/sed/sh script to keep track of
Xdisk usage to a new perl program, added new features, options; even
Xwrote a manual page.
X
XUsing a list of pathnames, this program filters the output of du(1) to
Xfind the amount of disk space used for each of the paths (actually, it
Xcollects all values in one single du run). It adds the new value to
Xthe list, shifting old values up. It then generates a nice report of
Xthe amount of disk space occupied in each of the specified paths,
Xtogether with the amount it grew (or shrinked) since the previous run,
Xand since 7 runs ago. When run daily, this gives daily and weekly
END_OF_FILE
if test 1310 -ne `wc -c <'README'`; then
    echo shar: \"'README'\" unpacked with wrong size!
fi
# end of 'README'
fi
if test -f 'dusage.1' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'dusage.1'\"
else
echo shar: Extracting \"'dusage.1'\" \(5139 characters\)
sed "s/^X//" >'dusage.1' <<'END_OF_FILE'
X.TH DUSAGE 1
X.SH NAME
Xdusage \- provide disk usage statistics
X.SH SYNOPSIS
X.B dusage
X.RB [ \-afghruD ]
X.RI "[\fB\-i\fR" " input" ]
X.RI "[\fB\-p\fR" " dir" ]
X.RI [ "control file" ]
X.SH DESCRIPTION
X.I Dusage
Xis a perl script which produces disk usage statistics. These
Xstatistics include the number of blocks, the increment since the previous run
X(which is assumed to be yesterday if run daily), and the increment
Xsince 7 runs ago (which could be interpreted as a week if run daily).
X.I Dusage
Xis driven by a
X.IR "control file" ,
Xwhich describes the names of the files (directories) to be reported,
Xand which also contains the results of previous runs.
X.PP
XWhen
X.I dusage
Xis run, it reads the
X.IR "control file" ,
X[optionally] gathers new disk usage values by calling
X.IR du (1),
Xprints the report, and [optionally] updates the
X.I control file
Xwith the new information.
X.PP
XFilenames in the control file may have wildcards. In this case, the
Xwildcards are expanded, and all entries reported. Both the expanded
Xnames as the wildcard info are maintained in the control file. New
Xfiles in these directories will automatically show up, deleted files
Xwill disappear when they have run out of data in the control file (but
Xsee the
X.B \-r
Xoption).
X.br
XWildcard expansion only adds filenames which are not already on the list.
X.PP
XThe control file may also contain filenames preceded with an
Xexclamation mark ``!''; these entries are skipped. This is meaningful
Xin conjunction with wildcards, to exclude entries which result from a
Xwildcard expansion.
X.PP
XThe control file may have lines starting with a dash ``\-'',
Xwhich causes the report to start on a new page. Any text following the
Xdash is placed in the page header, immediately following the text
X``Disk usage statistics''.
X.PP
XThe available command line options are:
X.TP 5
X.B \-D
XTurns on debugging, which yields lots of trace information.
X.TP
X.B \-a
XReports the statistics for this and all previous runs, as opposed to
Xthe normal case, which is to generate the statistics for this run, and
Xthe differences between the previous and 7th previous run.
X.TP
X.B \-f
XReports file statistics also. Default is to only report directories.
X.TP
X.B \-g
XGathers new data by calling
X.IR du (1).
X.TP
X.B \-h
XProvides a help message. No work is done.
X.TP
X.BI \-i " input"
XUses
X.I input
Xas data obtained by calling
X.IR du (1).
X.TP
X.BI \-p " dir"
XAll filenames in the control file are interpreted relative to this
Xdirectory.
X.TP
X.B \-r
XRetains entries which don't have any data anymore. If this option is
Xnot used, entries without data are not reported, and removed from the
Xcontrol file.
X.TP
X.B \-u
XUpdate the control file with new values.
X.PP
XThe default name for the control file is
X.BR .du.ctl ,
Xoptionally preceded by the name supplied with the
X.B \-p
Xoption.
X.SH EXAMPLES
XGiven the following control file:
X.sp
X.nf
X.ne 3
X.in +.5i
X\- for manual page
Xmaildir
Xmaildir/*
X!maildir/unimportant
Xsrc
X.in
X.fi
X.sp
XThis will generate the following (example) report when running the
Xcommand ``dusage -gu controlfile'':
X.sp
X.nf
X.ne 7
X.in +.5i
XDisk usage statistics for manual page Wed Jan 10 13:38
X
X blocks +day +week directory
X------- ------- ------- --------------------------------
X 6518 maildir
X 2 maildir/dirent
X 498 src
X.in
X.fi
X.sp
XAfter updating the control file, it will contain:
X.sp
X.nf
X.ne 4
X.in +.5i
X\- for manual page
Xmaildir 6518::::::
Xmaildir/dirent 2::::::
Xmaildir/*
X!maildir/unimportant
Xsrc 498::::::
X.in
X.fi
X.sp
XThe names in the control file are separated by the values with a TAB;
Xthe values are separated with colons. Also, the entries found by
Xexpanding the wildcard are added. If the wildcard expansion had
Xgenerated a name ``maildir/unimportant'' it would have been skipped.
X.br
XWhen the program is rerun after one day, it could print the following
Xreport:
X.sp
X.nf
X.ne 7
X.in +.5i
XDisk usage statistics for manual page Wed Jan 10 13:38
X
X blocks +day +week directory
X------- ------- ------- --------------------------------
X 6524 +6 maildir
X 2 0 maildir/dirent
X 486 -12 src
X.in
X.fi
X.sp
XThe control file will contain:
X.sp
X.nf
X.ne 4
X.in +.5i
X\- for manual page
Xmaildir 6524:6518:::::
Xmaildir/dirent 2:2:::::
Xmaildir/*
X!maildir/unimportant
Xsrc 486:498:::::
X.in
X.fi
X.sp
XIt takes very little fantasy to imagine what will happen on subsequent
Xruns...
X.PP
XWhen the contents of the control file are to be changed, e.g. to add
Xnew filenames, a normal text editor can be used. Just add or remove
Xlines, and they will be taken into account automatically.
X.PP
XWhen run without
X.B \-g
Xor
X.B \-u
Xoptions, it actually reproduces the report from the previous run.
X.PP
XWhen multiple runs are required, save the output of
X.IR du (1)
Xin a file, and pass this file to
X.I dusage
Xusing the
X.BI \-i "file"
Xoption.
X.SH BUGS
XRunning the same control file with different values of the
X.B \-f
Xand
X.B \-r
Xoptions may cause strange results.
X.SH AUTHOR
XJohan Vromans, Multihouse Research, Gouda, The Netherlands.
X.sp
XSend bugs and remarks to <jv@mh.nl> .
END_OF_FILE
if test 5139 -ne `wc -c <'dusage.1'`; then
    echo shar: \"'dusage.1'\" unpacked with wrong size!
fi
# end of 'dusage.1'
fi
if test -f 'dusage.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'dusage.pl'\"
else
echo shar: Extracting \"'dusage.pl'\" \(10356 characters\)
sed "s/^X//" >'dusage.pl' <<'END_OF_FILE'
X#!/usr/bin/perl
X
X# This program requires perl version 3.0, patchlevel 4 or higher.
X
X# Copyright 1990 Johan Vromans, all rights reserved.
X# Peaceware. This program may be used, modified and distributed as long as
X# this copyright notice remains part of the source. It may not be sold, or
X# be used to harm any living creature including the world and the universe.
X
X$my_name = $0;
X
X################ usage ################
X
Xsub usage {
X local ($help) = shift (@_);
X local ($usg) = "usage: $my_name [-afghruD][-i input][-p dir] ctlfile";
X die "$usg\nstopped" unless $help;
X print STDERR "$usg\n";
X print STDERR <<EndOfHelp
X
X -D - provide debugging info
X -a - provide all statis
X -f - also report file statistics
X -g - gather new data
X -h - this help message
X -i input - input data as obtained by 'du dir' [def = 'du dir']
X -p dir - path to which files in the control file are relative
X -r - do not discard entries which don't have data
X -u - update the control file with new values
X ctlfile - file which controls which dirs to report [def = dir/.du.ctl]
XEndOfHelp
X ;
X exit 1;
X}
X
X################ main stream ################
X
X&do_get_options; # process options
X&do_parse_ctl; # read the control file
X&do_gather if $gather; # gather new info
X&do_report_and_update; # report and update
X
X################ end of main stream ################
X
X################ other subroutines ################
X
Xsub do_get_options {
X
X # Default values for options
X
X $debug = 0;
X $noupdate = 1;
X $retain = 0;
X $gather = 0;
X $allfiles = 0;
X $allstats = 0;
X
X # Command line options. We use a modified version of getopts.pl.
X
X &usage (0) if &Getopts ("Dafghi:p:ru");
X &usage (1) if $opt_h;
X &usage (0) if $#ARGV > 0;
X
X $debug |= $opt_D if defined $opt_D; # -D -> debug
X $allstats |= $opt_a if defined $opt_a; # -a -> all stats
X $allfiles |= $opt_f if defined $opt_f; # -f -> report all files
X $gather |= $opt_g if defined $opt_g; # -g -> gather new data
X $retain |= $opt_r if defined $opt_r; # -r -> retain old entries
X $noupdate = !$opt_u if defined $opt_u; # -u -> update the control file
X $du = $opt_i if defined $opt_i; # -i input file
X if ( defined $opt_p ) { # -p path
X $root = $opt_p;
X $root = $` while ($root =~ m|/$|);
X $prefix = "$root/";
X $root = "/" if $root eq "";
X }
X else {
X $prefix = $root = "";
X }
X $table = ($#ARGV == 0) ? shift (@ARGV) : "$prefix.du.ctl";
X $runtype = $allfiles ? "file" : "directory";
X if ($debug) {
X print STDERR "@(#)@ dusage 1.7 - dusage.pl\n";
X print STDERR "Options:";
X print STDERR " debug" if $debug; # silly, isn't it...
X print STDERR $noupdate ? " no" : " ", "update";
X print STDERR $retain ? " " : " no", "retain";
X print STDERR $gather ? " " : " no", "gather";
X print STDERR $allstats ? " " : " no", "allstats";
X print STDERR "\n";
X print STDERR "Root = $root [prefix = $prefix]\n";
X print STDERR "Control file = $table\n";
X print STDERR "Input data = $du\n" if defined $du;
X print STDERR "Run type = $runtype\n";
X print STDERR "\n";
X }
X}
X
Xsub do_parse_ctl {
X
X # Parsing the control file.
X #
X # This file contains the names of the (sub)directories to tally,
X # and the values dereived from previous runs.
X # The names of the directories are relative to the $root.
X # The name may contain '*' or '?' characters, and will be globbed if so.
X # An entry starting with ! is excluded.
X #
X # To add a new dir, just add the name. The special name '.' may
X # be used to denote the $root directory. If used, '-p' must be
X # specified.
X #
X # Upon completion:
X # - %oldblocks is filled with the previous values,
X # colon separated, for each directory.
X # - @targets contains a list of names to be looked for. These include
X # break indications and globs info, which will be stripped from
X # the actual search list.
X
X open (tb, "<$table") || die "Cannot open control file $table, stopped";
X @targets = ();
X %oldblocks = ();
X %newblocks = ();
X
X while ($tb = <tb>) {
X chop ($tb);
X
X # preferred syntax: <dir><TAB><size>:<size>:....
X # allowable <dir><TAB><size> <size> ...
X # possible <dir>
X
X if ( $tb =~ /^-/ ) { # break
X push (@targets, "$tb");
X printf STDERR "tb: *break* $tb\n" if $debug;
X next;
X }
X
X if ( $tb =~ /^!/ ) { # exclude
X $excl = $'; #';
X @a = grep ($_ ne $excl, @targets);
X @targets = @a;
X push (@targets, "*$tb");
X printf STDERR "tb: *excl* $tb\n" if $debug;
X next;
X }
X
X if ($tb =~ /^(.+)\t([\d: ]+)/) {
X $name = $1;
X @blocks = split (/[ :]/, $2);
X }
X else {
X $name = $tb;
X @blocks = ("","","","","","","","");
X }
X
X if ($name eq ".") {
X if ( $root eq "" ) {
X printf STDERR "Warning: \".\" in control file w/o \"-p path\" - ignored\n";
X next;
X }
X $name = $root;
X } else {
X $name = $prefix . $name unless ord($name) == ord ("/");
X }
X
X # Check for globs ...
X if ( $gather && $name =~ /\*|\?/ ) {
X print STDERR "glob: $name\n" if $debug;
X foreach $n ( <${name}> ) {
X next unless $allfiles || -d $n;
X # Globs never overwrite existing entries
X if ( !defined $oldblocks{$n} ) {
X $oldblocks{$n} = ":::::::";
X push (@targets, $n);
X }
X printf STDERR "glob: -> $n\n" if $debug;
X }
X # Put on the globs list, and terminate this entry
X push (@targets, "*$name");
X next;
X }
X
X push (@targets, "$name");
X # Entry may be rewritten (in case of globs)
X $oldblocks{$name} = join (":", @blocks[0..7]);
X
X print STDERR "tb: $name\t$oldblocks{$name}\n" if $debug;
X }
X close (tb);
X}
X
Xsub do_gather {
X
X # Build a targets match string, and an optimized list of directories to
X # search.
X $targets = "//";
X @list = ();
X $last = "///";
X foreach $name (sort (@targets)) {
X next if $name =~ /^[-*]/;
X next unless $allfiles || -d $name;
X $targets .= "$name//";
X next if ($name =~ m|^$last/|);
X push (@list, $name);
X $last = $name;
X }
X
X print STDERR "targets: $targets\n" if $debug;
X print STDERR "list: @list\n" if $debug;
X print STDERR "reports: @targets\n" if $debug;
X
X $du = "du " . ($allfiles ? "-a" : "") . " @list|"
X unless defined $du; # in which case we have a data file
X
X # Process the data. If a name is found in the target list,
X # %newblocks will be set to the new blocks value.
X
X open (du, "$du") || die "Cannot get data from $du, stopped";
X while ($du = <du>) {
X chop ($du);
X ($blocks,$name) = split (/\t/, $du);
X if (($i = index ($targets, "//$name//")) >= 0) {
X # tally and remove entry from search list
X $newblocks{$name} = $blocks;
X print STDERR "du: $name $blocks\n" if $debug;
X substr ($targets, $i, length($name) + 2) = "";
X }
X }
X close (du);
X}
X
X
X# Report generation
X
Xformat std_hdr =
XDisk usage statistics@<<<<<<<<<<<<<<<<<<<<<@<<<<<<<<<<<<<<<
X$subtitle, $date
X
X blocks +day +week @<<<<<<<<<<<<<<<
X$runtype
X------- ------- ------- --------------------------------
X.
Xformat std_out =
X@>>>>>> @>>>>>>> @>>>>>>> ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<..
X$blocks, $d_day, $d_week, $name
X.
X
Xformat all_hdr =
XDisk usage statistics@<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<
X$subtitle, $date
X
X --0-- --1-- --2-- --3-- --4-- --5-- --6-- --7-- @<<<<<<<<<<<<<<<
X$runtype
X------- ------- ------- ------- ------- ------- ------- ------- --------------------------------
X.
Xformat all_out =
X@>>>>>> @>>>>>>> @>>>>>>> @>>>>>>> @>>>>>>> @>>>>>>> @>>>>>>> @>>>>>>> ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<..
X$a[0], $a[1], $a[2], $a[3], $a[4], $a[5], $a[6], $a[7], $name
X.
X
Xsub do_report_and_update {
X
X # Prepare update of the control file
X if ( !$noupdate ) {
X if ( !open (tb, ">$table") ) {
X print STDERR "Warning: cannot update control file $table - continuing\n";
X $noupdate = 1;
X }
X }
X
X if ( $allstats ) {
X $^ = "all_hdr";
X $~ = "all_out";
X }
X else {
X $^ = "std_hdr";
X $~ = "std_out";
X }
X $date = `date`;
X chop ($date);
X
X # In one pass the report is generated, and the control file rewritten.
X
X foreach $name (@targets) {
X if ($name =~ /^-/ ) {
X $subtitle = $'; #';
X print tb "$name\n" unless $noupdate;
X print STDERR "tb: $name\n" if $debug;
X $- = -1;
X next;
X }
X if ($name =~ /^\*$prefix/ ) {
X print tb "$'\n" unless $noupdate; #';
X print STDERR "tb: $'\n" if $debug; #';
X next;
X }
X @a = split (/:/, $oldblocks{$name});
X unshift (@a, $newblocks{$name}) if $gather;
X $name = "." if $name eq $root;
X $name = $' if $name =~ /^$prefix/; #';
X if ($#a < 0) { # no data?
X if ($retain) {
X @a = ("","","","","","","","");
X }
X else {
X # Discard
X print STDERR "--: $name\n" if $debug;
X next;
X }
X }
X print STDERR "Warning: ", 1+$#a, " entries for $name\n"
X if ($debug && $#a != 8);
X $line = "$name\t" . join(":",@a[0..7]) . "\n";
X print tb $line unless $noupdate;
X print STDERR "tb: $line" if $debug;
X
X $blocks = $a[0];
X if ( !$allstats ) {
X $d_day = $d_week = "";
X if ($blocks ne "") {
X if ($a[1] ne "") { # dayly delta
X $d_day = $blocks - $a[1];
X $d_day = "+" . $d_day if $d_day > 0;
X }
X if ($a[7] ne "") { # weekly delta
X $d_week = $blocks - $a[7];
X $d_week = "+" . $d_week if $d_week > 0;
X }
X }
X }
X write;
X }
X
X # Close control file, if opened
X close (tb) unless $noupdate;
X}
X
X# Modified version of getopts ...
X
Xsub Getopts {
X local($argumentative) = @_;
X local(@args,$_,$first,$rest);
X local($opterr) = 0;
X
X @args = split( / */, $argumentative );
X while(($_ = $ARGV[0]) =~ /^-(.)(.*)/) {
X ($first,$rest) = ($1,$2);
X $pos = index($argumentative,$first);
X if($pos >= $[) {
X if($args[$pos+1] eq ':') {
X shift(@ARGV);
X if($rest eq '') {
X $rest = shift(@ARGV);
X }
X eval "\$opt_$first = \$rest;";
X }
X else {
X eval "\$opt_$first = 1";
X if($rest eq '') {
X shift(@ARGV);
X }
X else {
X $ARGV[0] = "-$rest";
X }
X }
X }
X else {
X print stderr "Unknown option: $first\n";
X $opterr++;
X if($rest ne '') {
X $ARGV[0] = "-$rest";
X }
X else {
X shift(@ARGV);
X }
X }
X }
X return $opterr;
X}
END_OF_FILE
if test 10356 -ne `wc -c <'dusage.pl'`; then
    echo shar: \"'dusage.pl'\" unpacked with wrong size!
fi
# end of 'dusage.pl'
fi
echo shar: End of shell archive.
exit 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>END OF DISKUSAGE<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
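
The dusage manual page above describes a control file in which each name is followed by a TAB and a colon-separated history of block counts, newest first. The "+day" and "+week" columns of the report are then the newest value minus the second value and minus the eighth value. A minimal sketch of that arithmetic, assuming the format exactly as described (the function name ctl_deltas is made up for this example and is not part of dusage):

```shell
#!/bin/sh
# Sketch only: ctl_deltas is a hypothetical helper, not part of dusage.
# Reads a dusage-style control file on stdin and prints
#   name<TAB>blocks<TAB>+day<TAB>+week
# for every entry that carries data.
ctl_deltas() {
    awk -F'\t' '
        /^[-!]/ || NF < 2 { next }   # skip page breaks, exclusions, bare names
        {
            split($2, v, ":")        # v[1] is the newest value
            day = ""; week = ""
            if (v[1] != "" && v[2] != "") {
                d = v[1] - v[2]
                day = (d > 0 ? "+" : "") d
            }
            if (v[1] != "" && v[8] != "") {
                w = v[1] - v[8]
                week = (w > 0 ? "+" : "") w
            }
            printf "%s\t%s\t%s\t%s\n", $1, v[1], day, week
        }'
}
```

Running "ctl_deltas < .du.ctl" on the second example control file from the manual page would give +6 for maildir, 0 for maildir/dirent and -12 for src, with the week column left empty until enough runs have accumulated.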

>>>>>>>>>>>>>>>>>>>>>>>>>>>BEGINNING OF DISKHOGS<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>From dretor.dciem.dnd.ca!rwolf Wed Sep 2 06:34:22 1992
Date: Wed, 2 Sep 1992 06:32:35 -0700
From: rwolf@dretor.dciem.dnd.ca
To: esser@parc.xerox.com
Subject: disk usage
Content-Length: 2188

Here is a simple script I use to see who is hogging the disk.

------------------------------------------------------------------------------
Robert J Wolf, Sun System Admin. DCIEM, CFB Toronto
rwolf@dciem.dnd.ca PO Box 2000 1133 Sheppard Avenue West
uunet!csri.toronto.edu!dciem!rwolf North York (Toronto), Ont., Canada M3M 3B9
Internet: 192.16.207.3 Phone: (416)635-2073 FAX: (416)635-2104
"Capitalism with environmental ethics will benefit the entire world."
------------------------------------------------------------------------------
:
# diskhogs.sh
#
# Function: Totals the disk allocation for each of the user directories
# and prints out the top ten disk-space holders for each partition.
#
# Usage: # /usr/local/bin/diskhogs.sh e_mail_user
#
# The following cron entry is usually used:
# 30 10,14 * * 1-5 /bin/nice --5 /usr/local/bin/diskhogs.sh sys-admin
#
# Log: 1991 Jun 19 rwolf First Version, based on something similar.
# 1992 Apr 22 rwolf Convert to a cron type script that will send
# a mail message.
# 1992 Jul 13 rwolf Generalize to whom the mail gets sent.

temp=/tmp/diskhogs.rpt

# Output the program id and version
prg=diskhogs.sh
ver=1.2

# Check the number of command line arguments
if [ $# -ne 1 ] ; then
    echo ERROR: Invalid usage, $prg USER_NAME
    exit 1
fi

if [ -f $temp ] ; then
    /bin/rm -f $temp
fi

echo `date` starting $prg ver $ver on `hostname` > $temp
echo " " >> $temp
echo " TOP TEN DISKHOGS ON EACH USER PARTITION" >> $temp
echo " " >> $temp
echo " # # Files not accessed in:" >> $temp
echo "blks files owner 30days 60days 90days" >> $temp

for server in dretor client1 client2 ; do
    echo " " >> $temp
    rsh $server -n /usr/etc/quot -fv /home.$server | /usr/ucb/head -11 >> $temp
done

echo " " >> $temp
rsh client2 -n /usr/etc/quot -fv /pcbackup | head -11 >> $temp

echo "" >> $temp
echo "To get a hard copy of this document run the following command" >> $temp
echo "dretor# /usr/local/bin/enscript -2r $temp" >> $temp

/usr/ucb/mail -s 'Diskhogs Report' $1 < $temp

exit 0

>>>>>>>>>>>>>>>>>>>>>>>>>>>>END OF DISKHOGS<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
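
The diskhogs script depends on quot(8) and rsh. Where those are not available, a similar "top ten" list can be approximated from du(1) output. The following is only a sketch under that assumption; the function name top_hogs is invented for the example, and it ranks directories rather than file owners, so the numbers will not match what quot reports.

```shell
#!/bin/sh
# Sketch only: top_hogs is a hypothetical helper.
# Reads "kilobytes<TAB>name" lines (the format du -sk prints) on stdin
# and prints the N largest entries, N defaulting to 10.
top_hogs() {
    sort -rn | head -n "${1:-10}" |
        awk -F'\t' '{ printf "%8d  %s\n", $1, $2 }'
}
```

A typical call might be "du -sk /home/* | top_hogs", or "du -sk /home/* | top_hogs 5" for a shorter list.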

>>>>>>>>>>>>>>>>>>>>>>>>>>>>BEGINNING OF BADUSERS<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>From hasler.ascom.ch!osicki Wed Sep 2 07:09:52 1992
Date: Wed, 2 Sep 1992 06:58:42 -0700
From: osicki@hasler.ascom.ch (Osicki Chris)
To: Mariela Esser-Jaime <esser@parc.xerox.com>
Reply-To: osicki@hasler.ascom.ch
Subject: Re: Disk Space Usage
Content-Length: 2311

Mariela,

We use the 'badusers' program written by J. Robert Ward. You can find it
on an ftp server near you. A part of its man page follows.

NAME
     badusers - determine who is using excessive disc storage

SYNOPSIS
     badusers [ -e ] [ filesystem . . . ]
 
DESCRIPTION
     Badusers examines one or more file-systems to determine who
     is responsible for excessive disc usage.

     If invoked with no arguments, badusers tries to access each
     special file defined in /etc/fstab. Otherwise, if a speci-
     fied file-system corresponds to a named directory or file in
     /etc/fstab (fs_file), then badusers tries to access the
     corresponding special device (fs_spec). In both cases, it
     tries to access the corresponding raw device and then, fail-
     ing that, the corresponding block device.

     If a specified file-system does not correspond to a named
     directory or file in /etc/fstab, badusers tries to access
     that file directly.

     For each user who owns a file or directory on a specified
     file-system, badusers lists his name, his group, the number
     of files that he owns and the corresponding disc usage meas-
     ured in kilobytes. The reported disc usage is also divided
     into categories showing how much disc space has not been
     accessed for a period of three weeks, two weeks, one week
     and three days. (Badusers looks at the accessed time infor-
     mation (di_atime) stored in each inode to determine this
     information). The output is sorted by the total disc usage
     of each user.

     Badusers looks at the inode table of each file-system to
     determine disc usage. Therefore, files referenced by multi-
     ple links are accounted properly.

OPTIONS
     Badusers recognises the following option -
 
     -e Examine all mounted file-systems except those speci-
             fied in the argument list. An argument may be a
             file or special file known to /etc/fstab.
             
  .....
  
>>>>>>>>>>>>>>>>>>>>>>>>>>>>END OF BADUSERS<<<<<<<<<<<<<<<<<<<<<<<<<<<

>>>>>>>>>>>>>>>>>>>>>>>>>>>>BEGINNING OF QUOTA_REPORT<<<<<<<<<<<<<<<<<
#! /bin/sh
# Name: quota_report.sh
# Purpose: This sh script runs several other programs in order to generate a
# report on disk usage within the DSP portion of SPD. The final
# reports are left in the /usr/duke/csh/quota_report.dir directory.
# NOTES:
# 1. This is usually run as ROOT on Zion
#
# Modification Dates:
# 901208 - KED Original creation date
# 910520 - KED Added total amount of diskspace available now
# 911220 - KED Added report by group
# 911227 - KED Added HOG report
#

# Set some variables
DATEext=`date '+%y%m%d'`
DISKSIZE="/tmp/disksize"
BL="/tmp/quota.bottomline"
DAT="/tmp/quota.dat"
RESULTS="/tmp/quota.results"
BY_GROUP="/tmp/quota.by_group"
#
# mailing lists...
MANAGERS="greg.koker doug.garde jim.wilson john.hayden george.morton paul.kramarz peter.szabosr gerald.mcguire"

echo " "
echo -n "Starting Quota Report..."; date
#
# This line issues the 'quot' command on all systems.
echo "Gathering the raw data...please wait, this takes several hours..."
/usr/duke/csh/do_all.csh "/usr/etc/quot -a -v" > $DAT
#
# This generates the list of people authorized to have files on our systems.
touch /tmp/foo.doc

# Systems with significant "passwd" files
SYS_LIST="zeus"

# Start loop
for SYSTEMS in $SYS_LIST
do

echo "Copying "$SYSTEMS"'s password file over..."
rcp "$SYSTEMS":/etc/passwd /tmp/foo.$SYSTEMS

echo "Generating a list of users from $SYSTEMS..."
awk 'BEGIN { FS = ":" } \
        { \
        if (($1 !~ /\+/) && ($1 != "")) \
           print $1 \
        }' /tmp/foo.$SYSTEMS >> /tmp/foo.doc

done
# End loop

# Sort this file and get rid of duplicates.
echo "Sorting the list..."
sort -u /tmp/foo.doc > /tmp/foo.sorted

# Add the non-people to the list.
echo "non-people" >> /tmp/foo.sorted

awk '{ \
        print "awk -f /usr/duke/csh/quota_report.awk USER_NAME=\"" $1 "\" /tmp/quota.dat" \
     }' /tmp/foo.sorted > /tmp/foo.do

# Generate the report
echo "Generating report..."
chmod +x /tmp/foo.do
/tmp/foo.do > $RESULTS

# run quota_bottomline.csh
echo "Generating bottomline report..."
awk -f /usr/duke/csh/quota_bottomline.awk $RESULTS > $BL

#
# 12-27-91 KED
# Generate bottomline biggest HOG report
echo "Generating biggest HOG report..."
# Strip the header and footer lines, and the total line from the report
awk ' NF==9 { print $0 }' $BL | grep -v Total > /tmp/foo.bl
# Put the header back on.
echo "Disk Space Report - By largest user" > /tmp/foo.bl.HOG
echo " " >> /tmp/foo.bl.HOG
echo " Blocks not accessed within" >> /tmp/foo.bl.HOG
echo "User Blocks 30 days 60 days 90 days " >> /tmp/foo.bl.HOG
echo "----------------------------------------------------------------" >> /tmp/foo.bl.HOG
# then sort it...
sort -r -n +1 /tmp/foo.bl >> /tmp/foo.bl.HOG
# then put the footer back on
echo " " >> /tmp/foo.bl.HOG
echo "----------------------------------------------------------------" >> /tmp/foo.bl.HOG
grep -w Total $BL >> /tmp/foo.bl.HOG

# 5-20-91 KED
# get the total amount of diskspace available now
echo "Generating diskspace report..."
/usr/duke/csh/do_all.csh "/usr/duke/csh/disksize.csh" > $DISKSIZE

TOTALMB=`grep "c:" $DISKSIZE | awk 'BEGIN {SUM = 0} {SUM = SUM + $2} END {print SUM }'`

echo " " >> $DISKSIZE
echo " The total amount of available diskspace = " $TOTALMB " Mb" >> $DISKSIZE
echo " " >> $BL
echo " The total amount of available diskspace = " $TOTALMB " Mb" >> $BL

TOTALBLKS=`grep -w Total $BL | awk '{ print $2 }'`

# 12-26-91 KED
echo "Generating usage report by group..."

echo -n "Disk Space Report - By Group" > $BY_GROUP
echo " " >> $BY_GROUP
echo " Blocks not accessed within Annual Cost" >> $BY_GROUP
echo "Group Blocks 30 days 60 days 90 days % (x\$2300/Gb)" >> $BY_GROUP
echo "--------------------------------------------------------------------------------" >> $BY_GROUP

ypcat group | sort > /tmp/foo.by_group

for GROUP in `awk ' BEGIN {FS=":" } {print $1}' /tmp/foo.by_group`
do
    touch /tmp/foo.$GROUP
 
    for NAME in `grep -w $GROUP /tmp/foo.by_group | awk ' BEGIN {FS=":" } {print $4}' | awk ' BEGIN {FS="," } {print $1 " " $2 " " $3 " " $4 " " $5 " " $6 " " $7 " " $8 " " $9 " " $10 " " $11 " " $12 " " $13 " " $14 " " $15 " " $16 " " $17 " " $18 " " $19 " " $20 " " $21 " " $22 " " $23 " " $24 " " $25 " " $26 " " $27 " " $28 " " $29 " " $30 }' `
    do
        grep -w $NAME $BL >> /tmp/foo.$GROUP
    done # for NAME

    # does the summary for the group
    if [ -s /tmp/foo.$GROUP ]; then
        awk -f /usr/duke/csh/quota_by_group.awk GROUP_NAME=$GROUP TOTAL=$TOTALBLKS /tmp/foo.$GROUP >> $BY_GROUP
    fi

done # for GROUP

echo " " >> $BY_GROUP
echo " " >> $BY_GROUP
echo " " >> $BY_GROUP
echo " " >> $BY_GROUP
echo " " >> $BY_GROUP
echo "Note: " >> $BY_GROUP
echo " Current 'group' assignments can be viewed by issuing this command: " >> $BY_GROUP
echo " ypcat group | sort " >> $BY_GROUP

# Move the data files back to /usr/duke/csh/quota_report.dir
echo "Moving the data files to /usr/duke/csh/quota_report.dir"
cp $DISKSIZE /usr/duke/csh/quota_report.dir/disksize.$DATEext
cp $BL /usr/duke/csh/quota_report.dir/quota.$DATEext.bl
cp $DAT /usr/duke/csh/quota_report.dir/quota.dat
cp $RESULTS /usr/duke/csh/quota_report.dir/quota.$DATEext
cp $BY_GROUP /usr/duke/csh/quota_report.dir/quota.$DATEext.by_group
cp /tmp/foo.bl.HOG /usr/duke/csh/quota_report.dir/quota.$DATEext.HOG

/usr/etc/chown duke /usr/duke/csh/quota_report.dir/disksize.$DATEext
/usr/etc/chown duke /usr/duke/csh/quota_report.dir/quota.$DATEext.bl
/usr/etc/chown duke /usr/duke/csh/quota_report.dir/quota.dat
/usr/etc/chown duke /usr/duke/csh/quota_report.dir/quota.$DATEext
/usr/etc/chown duke /usr/duke/csh/quota_report.dir/quota.$DATEext.by_group
/usr/etc/chown duke /usr/duke/csh/quota_report.dir/quota.$DATEext.HOG

# Get rid of all the evidence
rm /tmp/foo.*
rm $DISKSIZE
rm $BL
rm $DAT
rm $RESULTS
rm $BY_GROUP

/usr/local/bin/enscript -r -2 -fCourier-Bold6 /usr/duke/csh/quota_report.dir/quota.$DATEext.bl
/usr/local/bin/enscript -r -2 -fCourier-Bold6 /usr/duke/csh/quota_report.dir/quota.$DATEext.HOG
/usr/local/bin/enscript -r -2 -fCourier-Bold6 /usr/duke/csh/quota_report.dir/quota.$DATEext.by_group

echo -n "Finished..."; date

# Send mail when finished
/usr/ucb/mail -s "Disk Quota Report" kerry.duke << End_of_file

Your bi-monthly disk quota report has finished. You can find the results:

    1. On the System Administration's LaserWriter Plus printer and
    2. In /usr/duke/csh/quota_report.dir

        quota.$DATEext Information by user and by system
        quota.$DATEext.bl Summary data sorted by user
        quota.dat Raw data from each system by partition
        quota.$DATEext.by_group Summary data by group
        quota.$DATEext.HOG Summary data sorted by biggest HOG

Kerry

End_of_file

# Send mail to $MANAGERS
#/usr/ucb/mail -s "Disk Quota Report - by Group" $MANAGERS < /usr/duke/csh/quota_report.dir/quota.$DATEext.by_group

# Send mail to all sun users
#/usr/ucb/mail -s "Disk Quota Report - by largest user" all_sunusers < /usr/duke/csh/quota_report.dir/quota.$DATEext.HOG

# End

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>END OF QUOTA_REPORT<<<<<<<<<<<<<<<<<<<<<<<<<<<

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>BEGINNING OF STERGIOS<<<<<<<<<<<<<<<<<<<<<<<<<
#! /usr/local/bin/perl

#die("SYNTAX CHECK") ;
$debug = 0 ;

# boolean flag to notify users by mail
$mailusers = 1 ;

# list of people to receive results
$maillist = 'dennis@jessica stergios@jessica' ;

@db = () ;

# machine where users receive mail
$machine = "leland.stanford.edu" ;

$| = 1 ;
# limit in K before a user gets flagged
$sizelimit = 3000 ;

# users who don't get hassled
@exemptlist = ('stergios', 'dennis', 'buc', 'linton', 'lpd' ) ;

$okmessage = "

Hi, even though you have been exempted from normal disk usage
allotment you are still receiving this message from the disk usage
program. Consider it an FYI message, as no action is necessary on
your part.

Sincerely,

The Management
" ;

$badmessage = "

Starting soon disk quotas will be reactivated, and will be fully
phased in by the beginning of winter quarter.

You have probably noticed disk quotas were not enforced on Portia for
over a year. Instead we asked users to police their own space. This
policy has failed as evidenced by the excessive down time directly
attributable to disk fill ups.

As a large consumer of space you should consider deleting unwanted
files before quotas are activated. Do not feel pressured to delete
files associated with class work immediately. These files can be
deleted when the class has finished.

For large files that you need regularly, you should consider the
'compress' program. Do a 'man compress' for more info.

This situation is critical. It requires immediate action on your part.

Sincerely,

The Management
" ;

sub usage {
        print "Usage: $0 -[re]\n" ;
        exit ;
}

&usage if $#ARGV < 0 ;

if ($ARGV[0] eq '-r') {
        $mailusers = 0 ;
} elsif ($ARGV[0] eq '-e') {
        $mailusers = 1 ;
} else {
        &usage ;
}

;# assumes you have a home dir setup that looks like
;#
;#                        /
;#          s1            s2            s3
;#        A B C         A B C         A B C
;#   john stergios     mary sue     dave bill
;#
;# it's trivial to change if you don't, and fails smartly if it's close
;#

if ($debug == 1) {
        @sdirs = ('/s3') ;
        @ssecs = ('A', 'B', 'C', 'D', 'E') ;
} else {
        @sdirs = ('/s1', '/s2', '/s3', '/s4', '/s5', '/s6', '/s7' ) ;
        @ssecs = ('A', 'B', 'C', 'D', 'E') ;
}

foreach $top (@sdirs) {
        foreach $bot (@ssecs) {
                $dir = $top . "/" . $bot ;
                if ( -e $dir ) {
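                        # skip this section if we cannot enter or read it,
                        # otherwise du would run in the wrong directory
                        chdir($dir) || next ;
                        opendir(DIR, $dir) || next ;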
                        @d = sort readdir DIR ;
                        closedir (DIR) ;
                        $userdirs = join (' ', splice( @d , 2)) ;
                        push(@db, split(/\n/, `du -s $userdirs`)) ;
                }
        }
}

@sorteddb = sort bysize @db ;
chop ($hostname = `hostname` ) ;
chop ($date = `date +%D` ) ;

dbmopen(DH, "dhdbm", 0666) ;
# double quotes so that perl, not the shell, expands $maillist
open (MAILROOT, "| mail $maillist") ;
print MAILROOT "List of $hostname disk hogs. Size is in K\n" ;

 mailtohogs: foreach (@sorteddb) {
         ($size, $user) = split ;
         
         if ($size < $sizelimit) { last mailtohogs ; }

         if ($mailusers) {
                 open (MAILUSER, "| mail -s 'DISK WARNING' $user\@$machine") ;
                print MAILUSER
  "You have consumed $size K of disk space on your AIR UNIX Account.\n" ;
                 if (grep($_ eq $user, @exemptlist) >0) {
                         print MAILUSER $okmessage ;
                 } else {
                         print MAILUSER $badmessage ;
                 }
                 close MAILUSER ;
         }
         $DH{$user} .= $size . " " . $date . "-" ;
         print MAILROOT "$size $user\n" ;
         sleep (5) ; # wait five seconds before we mail again.
 }

close MAILROOT ;
dbmclose(DH) ;

exit 0 ;

sub bysize {
        ($na, $stra) = split (/[ \t]+/, $a, 2) ;
        ($nb, $strb) = split (/[ \t]+/, $b, 2) ;
        $nb - $na ;
}

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>END OF STERGIOS<<<<<<<<<<<<<<<<<<<<<<<<<<<<
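
The core of the script above (collect "du -s" output, sort it largest
first, stop at the first entry under the limit) can be tried without
mailing anyone. What follows is only a minimal sketch in POSIX sh, not
part of Stergios's script: the function name list_hogs and the sample
data are made up for illustration, and it assumes that directory names
contain no whitespace.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): read "du -s" style
# lines ("<size-in-K> <name>") on stdin and print the entries whose size
# is at least the given limit, largest first.
list_hogs() {
    limit=$1
    # Numeric, descending sort on the size field, then keep only the
    # entries at or above the limit (same cut-off as $sizelimit above).
    sort -k1,1nr | awk -v limit="$limit" '$1 + 0 >= limit + 0 { print $1, $2 }'
}

# Canned input standing in for a real "du -s" run:
printf '%s\n' '1200 john' '4500 mary' '3000 sue' '9100 dave' | list_hogs 3000
```

With the sample input and a limit of 3000 this prints dave (9100), mary
(4500) and sue (3000), in that order; john is under the limit and is
dropped. For real data, replace the printf with something like "du -s *"
run inside one of the home directory sections.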
 


