SUMMARY - find -exec vs xargs

From: Steve Nelson <sanelson_at_gmail.com>
Date: Thu Mar 17 2005 - 04:12:07 EST

Thanks very much for the quick responses - herewith a summary of what I learned:

The problem is not one of recursion limit, but one of (in)efficiency.

It is much more efficient to use find | xargs than find -exec {} \;

"find ... -exec command {} \;" runs the command once for each name.

If find identifies 10000 files, using -exec grep ... as an example,
find forks a child copy of itself, and the copy becomes a grep
working on a single file.  Meanwhile the original parent find sleeps,
waiting for its child (grep) to finish before searching for the next
file that meets the find criteria.  That means 10000 executions of
grep and 10000 pauses of find.
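
To make the once-per-name behaviour visible, here is a minimal sketch.
The temporary directory, the five test files and the log file are
made up for the demo; the only point is that the log gains one line
each time find starts the command.

```shell
#!/bin/sh
# Sketch only: the temporary files and the log file are invented for this demo.
tmpdir=$(mktemp -d)
for i in 1 2 3 4 5; do
    echo "data $i" > "$tmpdir/file$i.txt"
done
log="$tmpdir/invocations.log"

# Every time find starts the command, the command appends one line to the log.
# Inside sh -c, $0 is the log path and $1 is the file name found by find.
find "$tmpdir" -type f -name '*.txt' \
    -exec sh -c 'echo "$1" >> "$0"' "$log" {} \;

# One log line per started command.
exec_runs=$(wc -l < "$log" | tr -d ' ')
echo "files: 5, commands started by -exec: $exec_runs"
rm -rf "$tmpdir"
```

With five files the count should come out as five, i.e. one fork and
exec per file.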

"xargs command" reads names on its standard input, and feeds them in
bunches to command, so that command is run far fewer times (only
once, if the list is short enough).  So, if we use find | xargs grep
..., the find is able to work "non-stop" filling the pipe with found
file names.  xargs collects groups of about 20 - 50 names (the exact
number depends on the length of the names and on xargs' command-line
size limit) and does a single grep for each group.  Find doesn't
pause and you only do about 200 greps, not 10000.  There isn't a
forked process for every single file.
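
The batching can be sketched the same way.  Again the temporary
files and the log file are invented for the demo, and the sketch
assumes the temporary path contains no spaces.  The -n 2 option is
used only to force small groups so that the grouping shows up with
just five names; normally xargs picks the group size itself.

```shell
#!/bin/sh
# Sketch only: the temporary files and the log file are invented for this demo.
tmpdir=$(mktemp -d)
for i in 1 2 3 4 5; do
    echo "data $i" > "$tmpdir/file$i.txt"
done
log="$tmpdir/invocations.log"

# Default batching: xargs puts as many names as it can on one command line,
# so five short names end up in a single run of the command.
find "$tmpdir" -type f -name '*.txt' | xargs sh -c 'echo run >> "$0"' "$log"
batched_runs=$(wc -l < "$log" | tr -d ' ')

# Force groups of at most 2 names: 5 names are split into 3 runs (2 + 2 + 1).
: > "$log"
find "$tmpdir" -type f -name '*.txt' | xargs -n 2 sh -c 'echo run >> "$0"' "$log"
capped_runs=$(wc -l < "$log" | tr -d ' ')

echo "runs with default batching: $batched_runs"
echo "runs with -n 2:             $capped_runs"
rm -rf "$tmpdir"
```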

The weaknesses of xargs are (1) it can be confused by "funny"
filenames, such as names containing spaces, quotes or newlines (which
is why GNU has find -print0 | xargs -0, or you can pipe through sed
to add backslashes everywhere), and (2) it can feed zero arguments to
command, which might then just sit there waiting for input (this is
why GNU xargs has -r, which means: don't run command if stdin is
empty).
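
Both weaknesses can be illustrated with a small sketch.  It assumes
GNU find and GNU xargs (for -print0, -0 and -r); the file name "two
words.txt" and the search string NEEDLE are made up for the demo.

```shell
#!/bin/sh
# Sketch only: assumes GNU findutils; file name and search string are invented.
tmpdir=$(mktemp -d)
echo "NEEDLE" > "$tmpdir/two words.txt"

# (1) Plain xargs splits the name at the space, so grep is handed two
#     non-existent paths and finds nothing ("|| true" hides its error status).
plain=$(find "$tmpdir" -type f | xargs grep -l NEEDLE 2>/dev/null || true)

#     With NUL-terminated names the file is passed to grep as one argument.
safe=$(find "$tmpdir" -type f -print0 | xargs -0 grep -l NEEDLE)

# (2) With -r, empty input means the command is not run at all.
empty_with_r=$(printf '' | xargs -r echo "command was run")

echo "plain xargs found:  '$plain'"
echo "-print0 / -0 found: '$safe'"
echo "-r on empty input:  '$empty_with_r'"
rm -rf "$tmpdir"
```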

These days, you can use "find ... -exec command {} +" which groups
arguments together.  However, I found that this did not give
satisfactory grep results when invoked as:

find /usr/include -type f -exec grep NFS_VERSION {} \+ -print

(with or without the backslash before the +).
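
One possible reason, offered as a guess and not as a diagnosis of
the run above: POSIX specifies that -exec with + always evaluates as
true, so a following -print lists every file and not only the ones
grep matched.  The sketch below uses made-up files and a made-up
search string to show the difference, and then uses grep -l as one
way to get the matching names with the + form.

```shell
#!/bin/sh
# Sketch only: the three files and the search string are invented for this demo.
tmpdir=$(mktemp -d)
echo "nothing here" > "$tmpdir/a.txt"
echo "NEEDLE"       > "$tmpdir/b.txt"
echo "nothing here" > "$tmpdir/c.txt"

# With \; the exit status of grep decides whether -print runs for that file.
semi_printed=$(find "$tmpdir" -type f -exec grep -q NEEDLE {} \; -print | wc -l | tr -d ' ')

# With + the -exec primary is always true, so -print runs for every file.
plus_printed=$(find "$tmpdir" -type f -exec grep -q NEEDLE {} + -print | wc -l | tr -d ' ')

# Letting grep report the names itself avoids relying on -print.
matched=$(find "$tmpdir" -type f -exec grep -l NEEDLE {} +)

echo "names printed after -exec ... \\; : $semi_printed"
echo "names printed after -exec ... +  : $plus_printed"
echo "grep -l with + reports           : $matched"
rm -rf "$tmpdir"
```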

The following examples show the time difference (and slightly more
useful output of xargs):

$ time find /usr/include -type f -exec grep NFS_VERSION {} \; -print
#define NFS_VERSION     ((rpcvers_t)2)
/usr/include/nfs/nfs.h
#define NFS_VERSION     2
/usr/include/rpcsvc/nfs_prot.h
version NFS_VERSION {
/usr/include/rpcsvc/nfs_prot.x

real    0m9.883s
user    0m3.340s
sys     0m5.900s
$ time find /usr/include -type f|xargs grep NFS_VERSION
/usr/include/nfs/nfs.h:#define  NFS_VERSION     ((rpcvers_t)2)
/usr/include/rpcsvc/nfs_prot.h:#define  NFS_VERSION     2
/usr/include/rpcsvc/nfs_prot.x: version NFS_VERSION {

real    0m0.677s
user    0m0.380s
sys     0m0.310s

Thanks,

Steve Nelson
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Thu Mar 17 04:12:29 2005
