I am writing a shell script that summarizes large amounts of data and would like to display the sum values with thousands separators. By default I just get a long stream of digits. What’s a smart way to solve this in a shell script?
This is a classic programming problem, and in the world of Linux and Bash shell scripts there are quite a few ways to solve it. Where it gets interesting, however, is that different locales have different styles for thousands separators. For example, in the United States 12,345.00 is the correct format, but in French-speaking Canada that would be written as 12 345,00 and in Spain as 12.345,00. A bit confusing, but in Linux the choice is controlled by LANG and a variety of LC_* environment variables. In this case, the two values we care about are the radix character (the element between the whole and fractional portions, between 5 and 0 above), known in the locale as decimal_point, and the thousands separator, known as thousands_sep.
But this is probably way more than you want to know. The key is that your easiest answer is to use the printf command (its name comes from the function of the same name in the C programming language). It lets you specify a format string that describes the output you want, followed by the value or values to be included.
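If you're curious what your own system uses, the locale command can print those two values. This is a small sketch, assuming Bash and a system that ships the locale utility (glibc-based Linux distributions do); show_numeric_locale is a made-up helper name for illustration only.

```shell
#!/usr/bin/env bash
# show_numeric_locale is a hypothetical helper name used only in this sketch.
# It prints the decimal point and thousands separator for the given locale,
# one value per line, defaulting to the C locale.
show_numeric_locale() {
    local loc="${1:-C}"
    # `locale KEYWORD...` prints the value of each named locale keyword.
    LC_ALL="$loc" locale decimal_point thousands_sep
}

show_numeric_locale C
```

In the C locale the first line is a period and the second line is empty, because that locale defines no thousands separator. Pass a different locale name as the argument to compare.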
In this case, let’s say that num=12345678.90 to make things interesting. Use echo and, well, you’ll get just that:
$ echo $num
12345678.90
But if we use printf and specify the %f format for a floating-point number (i.e. one that can have a fractional part), we get:
$ printf "%f\n" $num
12345678.900000
There are no separators yet, but printf has at least recognized the decimal component, padding it to its default of six digits after the decimal point. To add the thousands separator, you need to include a single quote (') in the % sequence. To trim those six digits after the radix or decimal point to just two, we'll also add .2. The result:
$ printf "%'.2f\n" $num
12,345,678.90
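Since the original question was about summarizing data, here's a rough sketch of how the same format string might sit in a script. It assumes Bash and awk; format_amount and the two sample values are invented for illustration.

```shell
#!/usr/bin/env bash
# format_amount is a hypothetical helper name used only in this sketch.
format_amount() {
    # The ' flag requests the locale's digit grouping;
    # .2 keeps two digits after the decimal point.
    printf "%'.2f\n" "$1"
}

# Sum made-up sample values with awk, then hand the total to printf.
total=$(printf '%s\n' 1234567.25 11111111.65 | awk '{ s += $1 } END { printf "%.2f", s }')
format_amount "$total"
```

Whether separators actually appear depends on the locale the script runs under. In the C or POSIX locale the ' flag is accepted but no grouping is applied.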
That works, but there's still a catch: what if you want this comma & period format and you're in a locale that specifies a different format that you don't much like?
The output will be correct for your location, but how can you force the commas and period? By temporarily specifying a different locale just for the invocation of this command:
$ LC_ALL=en_US.UTF-8 printf "%'.2f\n" $num
12,345,678.90
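One caveat is that the locale you name has to be installed on the machine running the script. Here's a hedged sketch of wrapping the override in a function and checking for the locale first. It assumes Bash and the locale utility; us_format is a made-up name.

```shell
#!/usr/bin/env bash
# us_format is a hypothetical helper name used only in this sketch.
us_format() {
    # The LC_ALL assignment applies only to this single printf invocation.
    LC_ALL=en_US.UTF-8 printf "%'.2f\n" "$1"
}

# `locale -a` lists installed locales; the name may appear as en_US.utf8.
if locale -a 2>/dev/null | grep -qiE '^en_US\.utf-?8$'; then
    us_format 12345678.90
else
    echo "en_US.UTF-8 does not appear to be installed; grouping may not be applied" >&2
fi
```

If the locale is missing, Bash typically prints a warning and printf falls back to ungrouped digits, so the check keeps the script from silently producing the long stream of digits again.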
It turns out that GNU coreutils, and therefore most Linux distributions, also includes a handy command called numfmt. With this utility you can simply specify numfmt --grouping $num and get the number grouped according to your locale in the same way. You can, of course, use either of these strategies directly in a shell script as needed, either as a small function or inline in your script.
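As a small function, that might look something like the sketch below. It assumes Bash, uses a whole number to keep things simple, and falls back to printf when numfmt isn't available; group_number is a made-up name.

```shell
#!/usr/bin/env bash
# group_number is a hypothetical helper name used only in this sketch.
# It groups the digits of a whole number according to the current locale.
group_number() {
    if command -v numfmt >/dev/null 2>&1; then
        # --grouping applies the locale's digit grouping.
        numfmt --grouping "$1"
    else
        # Fallback when GNU numfmt is not installed.
        printf "%'d\n" "$1"
    fi
}

group_number 1234567
```

Like printf, numfmt --grouping has no visible effect in the C or POSIX locale, so the locale override shown earlier still applies if you need a specific style.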
Hope that helps you out!
Pro Tip: I've been writing about shell script programming since before Linux was a thing! Check out my shell script programming help for lots more tutorials and Q&A discussion.