SUMMARY: Re: WANTED: Unix File Expiration Utility

From: Rex Espiritu (espiritu@cgi.com)
Date: Thu Apr 29 1993 - 01:57:54 CDT


REPLIES/RECOMMENDATIONS/SUGGESTIONS:

1) Use: quot -v
        or quot -av
        or a diskhogs script that uses quot and massages its output.
> You can massage quot's output to report
> how many blocks of storage each user has left
> untouched for 30, 60, or 90 days.
        --
> Try the "quot -av" command. For each user it reports three
> columns of blocks that have not been accessed in the previous
> 30, 60, and 90 days. We then "publish" the list of major
> offenders and let peer pressure take care of deleting the stale
> files. In our case, assuming that a stale file should be
> automatically deleted would get me into hot water!
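
For anyone who does want to go this route, a minimal sketch of such a
diskhogs wrapper follows. The column positions are an assumption based
on SunOS 4.x quot, where -v appends the blocks unreferenced in 30, 60,
and 90 days after the usual blocks-and-user columns; check your quot
manual page before trusting it:

        #!/bin/sh
        # diskhogs: rank users by blocks untouched for 90+ days.
        # Assumed "quot -v" layout: blocks, user, 30-, 60-, 90-day columns.
        for fs in ${*:-/usr}
        do
                echo "=== $fs ==="
                quot -v $fs |
                        awk '$1 ~ /^[0-9]+$/ && NF >= 5 && $5 > 0 { print $5, $2 }' |
                        sort -rn | head -20
        done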

>> This doesn't really do what we need.

2) Use: find . -size +1000 -exec ls -l {} \;
> findbig
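
A findbig-style report might look like the following (assuming -size
counts 512-byte blocks, as on SunOS, and that filenames contain no
whitespace, which plain xargs mishandles):

        # List files of 1000+ blocks, biggest first.
        find . -size +1000 -print | xargs ls -ls | sort -rn | head -20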

>> Neither does this.

3) Use: find /usr/dir -atime -90 -print | xargs rm
        This reply suggested...
> You need to use a -90, not a +90
> xargs optimizes the command line to rm, exec spawns an rm for each
> file.

>> Contrary to their suggestion, we found +90 to be correct: we want
>> files last accessed *more* than 90 days ago, which is what +90 selects.
>> Their recommendation to use xargs was helpful.
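
Putting that together, the sweep we settled on looks roughly like this
(/usr/dir is a placeholder, as in the original request):

        # Pass 1: report files not accessed in more than 90 days
        # (ls -u makes -l show access time instead of modify time).
        find /usr/dir -type f -atime +90 -print | xargs ls -lsu
        # Pass 2: remove them.  xargs batches many files per rm,
        # rather than spawning one rm per file as -exec does.
        find /usr/dir -type f -atime +90 -print | xargs rm -f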

4) Use: find /usr/dir -mtime +90 -ls
> When I "sweep" our file systems I use "-mtime +NN".
> I want to know what's been modified not accessed.

>> Last-accessed time is what we need.
>> Last-modified time would flag files that were modified long ago
>> but are still accessed daily.
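
The distinction is easy to see on any file (the -u flag makes ls -l
report access time; "somefile" is a placeholder):

        ls -l  somefile     # last-modified time (mtime)
        ls -lu somefile     # last-accessed time (atime)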

5) Try: perl with some of the examples in the perl book.
> You might want to look at files & file access attributes with perl,
> but that will require more programming time.

>> We've not yet tried this one, but have been meaning to delve into
>> using perl at some point.
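
When we do, a first sketch might simply post-process find output with
perl's -A file test (age in days since last access); this is untested:

        #!/usr/bin/perl
        # sweep.pl -- print pathnames read from stdin whose
        # last-access age exceeds 90 days.
        while (<>) {
                chop;                   # strip the trailing newline
                print "$_\n" if -f $_ && -A $_ > 90;
        }

Run it as, for example:

        find /usr/dir -print | perl sweep.pl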

6) Buy: Epoch's nifty file cascade backup.
> If a file hasn't been accessed in 30 days
> it gets moved to a floptical
> with a pointer in the old file to the new location.
> After 90 days the file is moved to tape.

>> Interesting, but probably not. At least, not yet.

7) Get: lfu
> Don't know what exactly you're up to, but a utility we use here is
> called lfu. Here's the scenario: We have a central file server with a
> complete /usr/local tree (actually a couple, mounted with amd).
> Client workstations have local /usr/local directories and NFS mount
> the central /usr/local. When a file is accessed, it gets cached to
> the local /usr/local. When it doesn't get accessed for a while it
> gets removed and a symlink to the NFS mounted version is put in its
> place. If you start accessing it over the symlink it will eventually
> get brought back into the cache to replace the symlink. It has
> configuration options to maintain free disk space, etc. It's a bit
> buggy at the moment, but someone here is working with the developer
> to iron out some of the bugs.
> The above scenario of course assumes you're trying to weed out
> replication and not completely rm the file out of online existence.
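
As we understand it, the eviction step amounts to something like this
in shell terms (an illustration of the mechanism only, not lfu itself;
the paths are made up):

        # Evict a cold file from the local cache, leaving a symlink
        # to the master copy on the NFS-mounted central tree.
        rm /usr/local/bin/bigtool
        ln -s /nfs/central/usr/local/bin/bigtool /usr/local/bin/bigtool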

>> Not sure this really addresses what we need or whether this would be useful
>> for any purpose(s) at our site.

8) Maybe:
> > It seems that using "find" with the -atime switch/option
> > doesn't give us as much of a hit list as we might expect.
        --
> Possibly due to (from the manual page):
> find does not follow symbolic links to other files or direc-
> tories; it applies the selection criteria to the symbolic
> links themselves, as if they were ordinary files.

>> This does not apply in our case, since we run find directly on the
>> actual local (hard-mounted) disk filesystem(s).
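
Had symlinks been the issue, a find that supports the -follow option
(SunOS and GNU find both do; an assumption worth checking on other
systems) would apply the tests to the link targets instead:

        # Apply -atime to symlink targets rather than to the
        # links themselves.
        find /usr/dir -follow -atime +90 -print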

9) Maybe:
> Could it be that you're accessing more files than you expect? Perhaps
> a security sweep that does "file" or checksums, etc., which is reading
> the files? Someone grepping through everything in a directory?

--
>      My experience (at least with SunOS 4.x) is that find works
> correctly.  You just don't have many files that have not been accessed
> in the last 90 days.  Maybe your users actually use their files :-)
>      You might also check to see if you are doing some other kind of
> system sweep that accesses files (e.g., a grep through all files).
> Also, some users might be doing this to subvert your efforts to get
> rid of old files.

>> It appears to be correct that find does work as expected.
>> It appears that Backup Copilot, which we use to do dumps for unattended
>> backups, is the culprit that changes the access times every time we run
>> it via crontab (daily incrementals and weekly level-zero dumps).
>> Does this concur with other people's experiences with Backup Copilot?
>> I will send out a request for more information regarding this question.
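
One way to test that theory (this uses -anewer, which GNU find has;
stock SunOS find may not, so treat it as an assumption):

        # Drop a timestamp file just before the nightly dump fires,
        # then see which files the backup "accessed" afterwards.
        touch /tmp/before-backup
        # ... let the scheduled backup run ...
        find /usr/dir -anewer /tmp/before-backup -print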

ORIGINAL REQUEST:

--- Forwarded mail from espiritu@cgi.com (Rex Espiritu)

From espiritu@cgi.com Fri Apr 16 15:00:30 1993
From sun-managers-relay@ra.mcs.anl.gov Fri Apr 16 22:42:48 1993
To: Sun Managers Mailing List <sun-managers@eecs.nwu.edu>
Subject: WANTED: Unix File Expiration Utility

We're attempting to establish a regular (monthly) sweep of our Unix filesystems to determine which files (and directory hierarchies) have not been accessed for a "long" time.

It seems that using "find" with the -atime switch/option doesn't give us as much of a hit list as we might expect. We're currently using a shell script with something similar to the following:

find /usr/dir -atime +90 -ls
...
find /usr/dir -atime +90 -exec rm {} \;

Are there any utilities available which would help us accomplish this?

Any suggestions on how better to use find and/or recommendations would be greatly appreciated.

Thanks in advance.

--
M. Rex Espiritu, Jr.          Carnegie Group, Inc.
espiritu@cgi.com              5 PPG Place
Voice: 412 642-6900 x233      Pittsburgh, PA 15222
FAX: -6906

--- End of forwarded message from espiritu@cgi.com (Rex Espiritu)

THANKS TO:

David Fetrow <fetrow@biostat.washington.edu>
Michael G. Harrington <mgh@bihobl2.bih.harvard.edu>
Daniel Trinkle <trinkle@cs.purdue.edu>
Steve Holmes <sjh@math.purdue.edu>
David T. Bath <dtb@otto.bf.rmit.oz.au>
John Marsh <john@rod.mitre.org>
John A. Murphy <jam@philabs.philips.com>
Paul Begley <peb@sandoz.ueci.com>
Bert Robbins <bert@penril.com>
Lewis E. Wolfgang <wolfgang@sunspot.nosc.mil>
danny@ews7.dseg.ti.com
Fuat C. Baran <fuat@watsun.cc.columbia.edu>
Mike Robinson <mike@castle.edinburgh.ac.uk>


