I’m taking a Linux shell script programming class and our latest assignment is to count files and executables across all the directories in our PATH. I am stumped. Do you have some pointers to get me going, please?
While the vast majority of our interaction with computers nowadays is through graphical interfaces (think smartphone) or voice (Alexa devices), there are still cases where a command line interface and keyboard input are the most efficient form of interaction. There are also millions of systems on the Internet that are running Linux, and while Linux has graphical interface options, a lot of developers prefer to crack open the command line for their power tasks.
I’m also a long-time Linux user, and Unix before that, so I have a very positive association with the command line too, which is why I’m happy to help you out with this. Heck, I’ve written quite a few books about Unix and Linux too, including the best-selling Wicked Cool Shell Scripts that I recommend both to you and your instructor. 😁
Okay, back on task! Now, let’s start breaking down the task, because, like any other programming project, the key to success is to divide and conquer. There are three pieces to this script that will need to be solved: How to count total files, how to count executables, and how to step through directories listed in the PATH. Let’s check it out, step by step…
HOW TO STEP THROUGH YOUR $PATH
There are a half-dozen or so system settings that are set as part of your interactive shell instantiation, including HOME, USER, SHELL, and PATH. All system variables of this nature are in all-caps, and you can see every one in your own interactive shell by typing the command env. You can try it yourself to see what’s shown, but it’s typically 12-20 different variables, including a few oddities like _=/usr/bin/env.
The most important is the PATH and that can be very easily displayed as shown:
    $ echo $PATH
    /opt/local/bin:/opt/local/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Applications/VMware Fusion Tech Preview.app/Contents/Public
I’m running these tests on my Mac system, which has a full Linux-esque system accessible through the Terminal app, which is why I might have a few that are different to what you’ll see, most notably the VMware Fusion directory. No worries, the script is going to be portable, which is one of the great joys of Linux, Unix, and its related operating systems.
The challenge with stepping through the directories in the PATH is that the fields are not separated by spaces but rather by the colon (“:”) symbol. There are a number of ways to solve this, the most common of which is to change the Internal Field Separator (IFS) value within the shell, but that leads to other problems, so instead I’m going to use the read command and an input redirect at the end of a while loop. Here’s the script fragment I would use to step through my PATH:

    while read -d ':' dir
    do
      echo "$dir"
    done <<< "$PATH:"

That’s easy enough, and the “<<<” notation is what’s known as a “here string”, which hands the subsequent value to the loop as if it were standard input. Note the trailing “:” too: read only reports success when it reaches its delimiter, so without that final colon the loop would end before processing the last PATH value! As is, here’s what I see upon execution:

    $ sh count-executables.sh
    /opt/local/bin
    /opt/local/sbin
    /usr/local/bin
    /System/Cryptexes/App/usr/bin
    /usr/bin
    /bin
    /usr/sbin
    /sbin
    /Applications/VMware Fusion Tech Preview.app/Contents/Public
That’s exactly what I want because once I can isolate each directory, I can then use that as a parameter to a different command that can count files or executables. We’ll get to that next…
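If you're curious about the IFS approach mentioned above, here's a minimal sketch of it. The function name split_path is made up for this example, and wrapping the work in a subshell is just one way to keep the IFS change from leaking into the rest of a script:

```shell
# Hypothetical helper: print each entry of a colon-separated list on its own line.
split_path() {
    (
        IFS=':'     # split on colons; the change stays inside this subshell
        set -f      # turn off globbing so an entry like "*" stays literal
        for entry in $1; do
            printf '%s\n' "$entry"
        done
    )
}

# Entries that contain spaces survive intact because only ":" splits fields.
split_path "/usr/bin:/bin:/Applications/VMware Fusion Tech Preview.app/Contents/Public"
```

This version sticks to plain sh features, which can be handy if your class machines use a shell that lacks read -d or the “<<<” notation.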
HOW TO COUNT FILES IN A DIRECTORY
The next puzzle to solve is how to count how many files (not folders!) are in a directory and how many are marked as executable. There are a number of ways to do that, including using the ls command, but I think find is your best bet, specifically using the -type parameter. A quick peek at the find man page reveals its options; the one we want is -type f, which matches regular files.

This will let you count files in a directory, though -type has no value that means “executable”. Okay, divide and conquer. For this block the code is fairly straightforward once you know this option for find, but let’s start with a command line test:

    $ find /usr/bin -type f -depth 1 | wc -l
    894

Sheesh, 894 files just in /usr/bin. Impressive. You’re probably wondering what the -depth 1 does that’s added here; it’s to ensure that the count doesn’t include files that might be in subdirectories because by default find examines “the specified location and all subdirectories therein”. One caveat: -depth 1 is the BSD find syntax that ships with macOS. GNU find on Linux treats -depth as a flag that takes no number, so there you’d write -maxdepth 1 instead.

Also, wc -l is, of course, an easy way to get a count of lines output by the previous portion of the command pipe (‘-l’ counts lines; the wc command has other useful capabilities too).
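To try the find-plus-wc idea somewhere safe, here's a small sketch that builds a scratch directory and counts what's in it. The count_files name and the scratch files are invented for the example, and it uses -maxdepth 1 on the assumption that your find accepts it (both the GNU and BSD versions do):

```shell
# Hypothetical helper: count regular files directly inside a directory.
count_files() {
    # -maxdepth 1 keeps find out of subdirectories; tr strips the padding
    # that some versions of wc put in front of the number.
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

demo=$(mktemp -d)
touch "$demo/one" "$demo/two"
mkdir "$demo/sub"
touch "$demo/sub/three"     # not counted: it lives in a subdirectory
count_files "$demo"         # prints 2
rm -rf "$demo"
```

Counting lines does assume one filename per line, so a filename containing a newline would throw the total off. That's rare enough that it's usually ignored in a class assignment.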
Adding this to our code fragment:
    while read -d ':' dir
    do
      echo "$dir"
      find "$dir" -type f -depth 1 | wc -l
    done <<< "$PATH:"
The result has a very interesting output:
    $ sh count-executables.sh
    /opt/local/bin
    find: /opt/local/bin: No such file or directory
    0
    /opt/local/sbin
    find: /opt/local/sbin: No such file or directory
    0
    /usr/local/bin
    0
    /System/Cryptexes/App/usr/bin
    1
    /usr/bin
    894
    /bin
    37
    /usr/sbin
    find: /usr/sbin/authserver: Permission denied
    197
    /sbin
    40
    /Applications/VMware Fusion Tech Preview.app/Contents/Public
    find: /Applications/VMware Fusion Tech Preview.app/Contents/Public: No such file or directory
    0

What’s the story behind those errors? Turns out that it’s entirely possible to have a directory specified in PATH that doesn’t actually exist. It’s sloppy and should be fixed, but it also might be set deep in the OS where you have no ability to fix it. This means it’s up to our code to test and ensure that the folder exists before we try to count entries with find. That can be accomplished with an if-then test; the -x check used here only succeeds if the directory both exists and is searchable by you:

    while read -d ':' dir
    do
      if [ -x "$dir" ] ; then
        echo "$dir"
        find "$dir" -type f -depth 1 | wc -l
      else
        echo "$dir not found"
      fi
    done <<< "$PATH:"
This includes an error message with the else clause but you can skip this if you want (or for bonus points accumulate the names of all these directories and output it with your summary data). I’ll show it once, then skip it for subsequent output in this tutorial to keep things reasonably succinct.
Now when I run the script, the output’s a bit less clumsy:
    $ sh count-executables.sh
    /opt/local/bin not found
    /opt/local/sbin not found
    /usr/local/bin
    0
    /System/Cryptexes/App/usr/bin
    1
    /usr/bin
    894
    /bin
    37
    /usr/sbin
    find: /usr/sbin/authserver: Permission denied
    197
    /sbin
    40
    /Applications/VMware Fusion Tech Preview.app/Contents/Public not found
Handling error conditions within a script is a must, and in this instance also helps ensure an accurate result.
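For the bonus-points idea of collecting the missing directories and reporting them at the end, one possible sketch looks like this. The note_if_missing function and the missing variable are names I've made up, and it uses the -d test (is this a directory?) as a slightly stricter alternative to -x:

```shell
missing=""      # newline-separated list of entries that are not directories

# Hypothetical helper: succeed if $1 is a directory, otherwise remember it.
note_if_missing() {
    if [ -d "$1" ]; then
        return 0
    fi
    missing="${missing}$1
"
    return 1
}

demo=$(mktemp -d)
for dir in "$demo" "$demo/no such dir"; do
    if note_if_missing "$dir"; then
        echo "$dir exists"
    fi
done
printf 'Not found:\n%s' "$missing"
rm -rf "$demo"
```

In the real script you would call the helper inside the PATH loop and print the accumulated list after the summary line.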
WHAT ABOUT SYMBOLIC AND HARD LINKS?
Another hiccup above is that directories like /usr/local/bin aren’t empty, but all of the contents therein are symbolic links, not actual files, according to how find defines things:

    $ ls -l /usr/local/bin
    total 0
    lrwxr-xr-x 1 root wheel 68 Jun 13 21:18 prl_convert@ -> /Applications/Parallels Desktop.app/Contents/MacOS/parallels_wrapper
    lrwxr-xr-x 1 root wheel 68 Jun 13 21:18 prl_disk_tool@ -> /Applications/Parallels Desktop.app/Contents/MacOS/parallels_wrapper
    lrwxr-xr-x 1 root wheel 68 Jun 13 21:18 prl_perf_ctl@ -> /Applications/Parallels Desktop.app/Contents/MacOS/parallels_wrapper
    lrwxr-xr-x 1 root wheel 62 Jun 13 21:18 prlcore2dmp@ -> /Applications/Parallels Desktop.app/Contents/MacOS/prlcore2dmp
    lrwxr-xr-x 1 root wheel 68 Jun 13 21:18 prlctl@ -> /Applications/Parallels Desktop.app/Contents/MacOS/parallels_wrapper
    lrwxr-xr-x 1 root wheel 58 Jun 13 21:18 prlexec@ -> /Applications/Parallels Desktop.app/Contents/MacOS/prlexec
    lrwxr-xr-x 1 root wheel 68 Jun 13 21:18 prlsrvctl@ -> /Applications/Parallels Desktop.app/Contents/MacOS/parallels_wrapper
If you want to count those as files (which might be something that needs clarifying by your teacher) then it’s doable, but the find command in our script gets just a bit more complex:
    find "$dir" \( -type f -or -type l \) -depth 1 | wc -l
In other words, we’re going to count files of type ‘f’ or of type ‘l’. Don’t need to account for links? Remove the entire OR statement.
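Here's a quick way to see the difference the extra test makes, sketched against a scratch directory with one real file and one symbolic link. The file names are invented, and I've used -o (the standard spelling of -or) and -maxdepth 1 so the same lines should work with either GNU or BSD find:

```shell
demo=$(mktemp -d)
touch "$demo/real"
ln -s "$demo/real" "$demo/link"     # a symbolic link pointing at the real file

# Regular files only.
files_only=$(find "$demo" -maxdepth 1 -type f | wc -l | tr -d ' ')

# Regular files or symbolic links; the parentheses are escaped so the shell
# hands them to find instead of interpreting them itself.
with_links=$(find "$demo" -maxdepth 1 \( -type f -o -type l \) | wc -l | tr -d ' ')

echo "regular files: $files_only, files plus symlinks: $with_links"
# prints: regular files: 1, files plus symlinks: 2
rm -rf "$demo"
```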
HOW TO SUM UP NUMBERS IN A SCRIPT
We can now identify the number of files in each directory, but how do we sum up these values as we go? The answer to that is to use some of the shell’s built-in math capabilities. The classic way to do that is:
((sum=$sum+$value))
The wrinkle is that we also need to assign the output of the find|wc to a variable so we can add it to the summating value. Doable, and all reflected in this updated fragment:
    while read -d ':' dir
    do
      if [ -x "$dir" ] ; then
        echo "$dir"
        count="$(find "$dir" -type f -depth 1 | wc -l)"
        echo " $count"
        (( files=$files+$count ))
      else
        echo "$dir not found"
      fi
    done <<< "$PATH:"
That’s much of the project solved, with a few tweaks that I’ll show at the end, including the final output command.
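If your shell doesn't understand the double-parentheses form, the $(( )) arithmetic expansion does the same job and is part of standard sh. Here's a tiny sketch of a running total; the three numbers are simply sample counts borrowed from the output earlier in this article:

```shell
total=0
for count in 894 37 197; do
    # $(( ... )) evaluates the expression and substitutes the result.
    total=$(( total + count ))
done
echo "$total"       # prints 1128
```

One thing to watch for: some versions of wc pad their output with leading spaces, so it can be worth stripping those (with tr -d ' ', for instance) before doing math or printing the value.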
COUNTING EXECUTABLE FILES IN A DIRECTORY
The last challenge to solve is how to count executables. Disappointingly, find doesn’t have that as an option with the -type parameter, but it turns out that the test command (as utilized by its alias ‘[‘ in scripts) has exactly what we seek with the ‘-x’ conditional. In other words:
    if [ -x "$file" ] ; then
      echo "$file is executable"
    fi
can be directly translated into a script fragment (with numbers summed up too) as shown:
    for file in $(ls "$dir"|sed 's/ /~~/g') ; do
      file="$(echo $file|sed 's/~~/ /g')"
      if [ -x "$dir/$file" ] ; then
        ((executable=$executable+1))
      fi
    done

In this instance, ‘executable’ is counting executables in the current directory, and ‘execs’, which shows up in the full script, is the overall summary. Notice the use of sed in the loop? It’s to protect any filenames that have spaces in them: Each space is changed to a “~~” sequence, then immediately changed back once the individual entries have been extracted from the ls output. A common script technique!
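As an alternative to the sed trick, you can let the shell expand a wildcard and quote the variable, which keeps filenames with spaces in one piece. This is only a sketch, with count_executables being a name I've invented, and it adds a -f test so that subdirectories (which normally have their execute bit set) aren't counted as programs:

```shell
# Hypothetical helper: count executable regular files directly inside $1.
count_executables() {
    n=0
    for f in "$1"/*; do
        # -f is true for regular files (and symlinks that point to them);
        # -x is true if you are allowed to execute the file.
        if [ -f "$f" ] && [ -x "$f" ]; then
            n=$(( n + 1 ))
        fi
    done
    echo "$n"
}

demo=$(mktemp -d)
printf '#!/bin/sh\necho hi\n' > "$demo/my tool"     # a name with a space in it
chmod 755 "$demo/my tool"
touch "$demo/notes.txt"
chmod 644 "$demo/notes.txt"
mkdir "$demo/sub"
count_executables "$demo"       # prints 1
rm -rf "$demo"
```

Like ls, the wildcard skips hidden files whose names start with a dot, so the two approaches look at the same set of entries.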
And that’s it. Here’s my full script:
    #!/bin/sh
    # count executables across each directory in the PATH
    # note: read -d, <<< and (( )) are not supported by every sh (dash, for
    # example), so run this with bash if your sh complains

    files=0   # summarize total files found
    execs=0   # summarize total executables
    dir=$HOME

    while read -d ':' dir
    do
      executable=0
      if [ -x "$dir" ] ; then
        echo "$dir"

        # total files (with GNU find on Linux, use -maxdepth 1 in place of -depth 1)
        count="$(find "$dir" \( -type f -or -type l \) -depth 1 | wc -l)"
        echo " files in directory: $count"
        ((files=$files+$count ))

        # executables
        for file in $(ls "$dir"|sed 's/ /~~/g') ; do
          file="$(echo $file|sed 's/~~/ /g')"
          if [ -x "$dir/$file" ] ; then
            ((executable=$executable+1))
          fi
        done
        echo " executables in directory: $executable"
        ((execs=$execs+$executable))
      fi
    done <<< "$PATH:"

    echo "Total files found across PATH = $files, with $execs executable."
    exit 0
And here’s the output with my Mac system:
    $ sh count-executables.sh
    /usr/local/bin
     files in directory: 7
     executables in directory: 7
    /System/Cryptexes/App/usr/bin
     files in directory: 1
     executables in directory: 1
    /usr/bin
     files in directory: 933
     executables in directory: 933
    /bin
     files in directory: 37
     executables in directory: 37
    /usr/sbin
     files in directory: 227
     executables in directory: 224
    /sbin
     files in directory: 62
     executables in directory: 62
    Total files found across PATH = 1267, with 1264 executable.
That’s it. Mission accomplished. Now, good luck with your own coding efforts!