Ah, the serendipity is marvelous! I’ve just been writing about the find command for the new Tiger edition of my best-selling book Learning Unix for Mac OS X, so it’s all very fresh in my mind.
Whenever you have a nested file hierarchy that you want to search, you should always, automatically reach for the find monkey wrench, coupled with its partner command xargs. But let’s step through this slowly so you can see how they all work together, because we’re going to use three different commands in a pipe to accomplish what you seek.
First off, the find command has some of the weirdest syntax in Unix, so if you want to learn more about it, run man find within Terminal. For now, just follow along. To find all HTML files at or below the current directory, you’d use:
$ find . -name "*html" -print
Notice that leaving the dot out of the pattern (*html rather than *.html) means this also matches files that have the suffix “shtml” (typically server-side include HTML). The command generates a long list of filenames. To search through those files for a specific pattern, you want to use the grep command, as you know, but the wrinkle is that you can’t just do something like find | grep: grep treats whatever arrives on standard input (stdin) as the text to search, so it would match against the filenames themselves rather than looking inside the files.
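If you'd like to see both of those points for yourself, here's a small sketch you can run in a scratch directory. The directory and file names are made up purely for illustration, and the tree is created with mktemp so nothing of yours is touched.

```shell
# Build a throwaway tree (hypothetical file names, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/site/includes"
echo "Copyright 2004" > "$demo/site/index.html"
echo "Copyright 2004" > "$demo/site/includes/footer.shtml"
echo "plain notes"    > "$demo/site/notes.txt"
cd "$demo" || exit 1

# "*.html" requires a literal dot before "html", so footer.shtml is skipped.
dot_matches=$(find . -name "*.html" -print | sort)

# "*html" matches any name ending in "html", including footer.shtml.
loose_matches=$(find . -name "*html" -print | sort)

# Piping straight into grep filters the file *names*, not the contents.
# No name contains "Copyright", so this comes back empty.
# (grep exits non-zero on no match, hence the "|| true".)
name_hits=$(find . -name "*html" -print | grep "Copyright" || true)

echo "strict pattern: $dot_matches"
echo "loose pattern:  $loose_matches"
echo "name_hits=[$name_hits]"
```

Both files contain the copyright line, yet name_hits ends up empty, because grep only ever saw the list of names.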
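That’s where our pal xargs comes in. The xargs command does expect a list of filenames from stdin: it reads them and hands them to another command as ordinary command-line arguments, so it acts as a wrapper for Unix commands that don’t read filenames from stdin themselves.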
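As a rough sketch of what that wrapping looks like, the snippet below feeds a few words to xargs and lets it build an echo command from them. It also shows one caveat worth knowing: filenames with spaces get split apart unless you use the -print0 and -0 options, which I'm assuming your versions of find and xargs support. The file name here is invented for the demonstration.

```shell
# xargs reads whitespace-separated items from stdin and appends them
# as arguments to the command that follows it.
joined=$(printf 'one\ntwo\nthree\n' | xargs echo)
echo "$joined"

# Caveat: a name containing a space is split into two arguments.
demo=$(mktemp -d)
touch "$demo/my page.html"   # hypothetical file name

# -n 1 runs echo once per argument, so counting lines counts arguments.
count_plain=$(cd "$demo" && find . -name "*html" -print | xargs -n 1 echo | wc -l | tr -d ' ')

# With -print0 and -0, names are separated by NUL bytes and stay intact.
count_safe=$(cd "$demo" && find . -name "*html" -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "plain=$count_plain safe=$count_safe"
```

If none of your filenames contain spaces, the plain -print form works fine.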
Putting them all together, here’s how you could find all HTML files that have a 2004 copyright notice in them, just as a topical example: