I’ve been learning Linux and the Linux command shell through writing shell scripts. I also love Wordle. Putting those two together, I’m wondering if there’s any way to write a Wordle help utility script that will assist me while solving the puzzle! Can you help out?
Wordle is a popular word-guessing puzzle in the spirit of the old puzzle game Mastermind. All answers are five-letter words, and each time you guess a word the program marks the letters that are in the right spot and the letters that are present in the solution but in the wrong spot. If the word is “KAZOO” I might guess “BOOTH” and learn that no letters are in the correct spot, but that two letters (O and O) are present in the mystery word. Can you then use this information to comb through a language dictionary to find possible solutions?
Of course you can! There’s not much you can’t do in a Linux or macOS shell script, actually. It might take a bit of ingenuity, but we can make that happen. The first step is to ensure that you have a word list on your computer. On macOS that’s conveniently located in the directory “/usr/share/dict/” with the filename “web2”; many Linux distributions ship a file called “words” in the same directory instead, in which case substitute that name in the commands below. Can’t find one there? Try the ‘dict’ command; if that works, you should be able to reference its man page to find the name of the dictionary it’s using.
EXPLORING THE LINUX DICT FILE
The actual file “web2” is simply a long list of English words, one per line. Sound big? It is. In fact, the current version is over 200,000 words strong, ranging from ‘a’ to ‘antidisestablishmentarianism’. Here are the first few words:
$ head /usr/share/dict/web2
A
a
aa
aal
aalii
aam
Aani
aardvark
aardwolf
Aaron
How do you get just the five-letter words out of the dictionary? That’s a task for the grep program and it’s reasonably easy. Each ‘.’ in a pattern matches any single character, so five dots anchored between the special beginning of line (^) and end of line ($) symbols will pick out just the 5-letter words:
$ grep '^.....$' /usr/share/dict/web2 | more
aalii
Aaron
abaca
aback
abaff
abaft
Abama
abase
abash
abask
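Since a dot matches any character, not only letters, a stricter variant may be useful for word lists that include entries with apostrophes or digits. The following is a minimal sketch, not part of the original script: the helper name five_letter_words and the tiny sample list are made up for illustration so it runs without a system dictionary.

```shell
# five_letter_words FILE: print only the entries of FILE (one word per
# line) that consist of exactly five alphabetic characters.
# Hypothetical helper name, for illustration only.
five_letter_words() {
  grep -E '^[[:alpha:]]{5}$' "$1"
}

# Small stand-in word list instead of /usr/share/dict/web2.
sample=$(mktemp)
printf '%s\n' a aalii Aaron aardvark "don't" abaca > "$sample"

five_letter_words "$sample"
rm -f "$sample"
```

With this sample list, only aalii, Aaron and abaca are printed; “don't” is five characters long but is rejected because of the apostrophe.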
Now that we have a stream of five-letter words, there are two tasks we want to accomplish: filter by known letters in known locations, and filter by letters that must be present somewhere in the word.
FILTER BY LETTERS IN KNOWN LOCATIONS
The most important part of this script is to be able to identify words that have specific letters in specific locations. Once we normalize every word to be all lowercase, we can easily accomplish that by replacing one or more of the dots with the actual letter. For example, if we wanted 5-letter words that began and ended with the letter ‘s’, we could do this:
$ grep '^s...s$' /usr/share/dict/web2 | head
saros
sarus
scads
scobs
scops
secos
sekos
semis
shaps
shies
As you can see, there are a fair number of possibilities. We’ll still need to transliterate all the dictionary words to ensure they’re lowercase, but for now, this basic pattern works fine.
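The lowercase step and the positional pattern can be combined in one short pipeline. This is a rough sketch under stated assumptions: match_positions is a hypothetical helper name, and the sample list stands in for the real dictionary.

```shell
# match_positions PATTERN FILE: lowercase every word in FILE, then keep
# only the words matching PATTERN. Hypothetical helper, for illustration.
match_positions() {
  tr '[:upper:]' '[:lower:]' < "$2" | grep "$1"
}

# Stand-in word list with mixed case, a wrong ending, and a short word.
sample=$(mktemp)
printf '%s\n' Saros scads SHIES sekos shade stars sass > "$sample"

match_positions '^s...s$' "$sample"
rm -f "$sample"
```

Because tr runs before grep, capitalized entries such as Saros and SHIES are matched as saros and shies, while shade (wrong last letter) and sass (four letters) are dropped.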
FILTER BY KNOWN LETTERS IN UNKNOWN LOCATIONS
The other search we need is one that keeps only the words containing each and every required letter. In the list above, what if the word required the letter ‘k’? That drops it down to a single possibility, sekos. While there’s a lot you can do with sophisticated regular expressions, I’m going to take a simpler, albeit less efficient path: using a chain of grep commands. Here’s how that might look:
$ grep '^s...s$' /usr/share/dict/web2 | grep 'k' | head
sekos
skies
Ah, turns out that there are TWO words that begin and end with the letter ‘s’, contain five letters total, and also have the letter ‘k’ somewhere in the word: sekos and skies. Note: “sekos” is a synonym for “sanctuary” if you’re curious.
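Each extra grep in the chain only sees the lines that survived the previous one, so the chain behaves like a logical AND of all the conditions. Here is a small sketch with two required letters; the function name chain_filter and the sample words are invented for the example, and the letters are hard-coded to keep it short.

```shell
# chain_filter FILE: five-letter s...s words from FILE that also contain
# both 'k' and 'e'. Hypothetical helper with hard-coded letters.
chain_filter() {
  grep '^s...s$' "$1" | grep 'k' | grep 'e'
}

# Stand-in word list; in practice this would be the dictionary file.
words=$(mktemp)
printf '%s\n' saros sekos skies shies socks > "$words"

chain_filter "$words"
rm -f "$words"
```

With this sample, socks passes the ‘k’ stage but fails the ‘e’ stage, so only sekos and skies are printed. The order of the letter filters does not change the result.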
TURNING IT ALL INTO A SHELL SCRIPT
The next question is how you want users to specify this information. One easy way is to simply use the positional arguments as individual letters. For example, words that start and end with an ‘s’ might be S – – – S. Knowing it’s constrained to five letters, the sixth argument could be a list of other letters that have to show up too. In other words:
wordle-helper s - - - s k
The pattern conversion is straightforward: change all dashes to dots. Adding on the prefix and suffix characters and ensuring that the user’s input is all lowercase, here’s the solution:
pattern="$(echo "^$1$2$3$4$5$" | tr '[A-Z]' '[a-z]' | tr '-' '.')"
Keep in mind that the $( ) sequence causes everything within to be executed by a subshell, so for the example above the variable ‘pattern’ will contain the sequence ^s...s$.
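To see the conversion in isolation, the same idea can be wrapped in a small function and tried with a couple of inputs. This is only a sketch: build_pattern is a hypothetical name, and it uses printf where the script uses echo.

```shell
# build_pattern A B C D E: turn five positional guesses (a letter, or a
# dash for "unknown") into an anchored grep pattern.
# Hypothetical helper name, for illustration only.
build_pattern() {
  printf '^%s%s%s%s%s$\n' "$1" "$2" "$3" "$4" "$5" |
    tr '[:upper:]' '[:lower:]' | tr '-' '.'
}

build_pattern S - - - s    # prints ^s...s$
build_pattern s h - - -    # prints ^sh...$
```

Uppercase input is folded to lowercase first, so S and s produce the same pattern.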
Next up, the filter to ensure that the matching words also contain the letter or letters specified. This is a bit trickier because we are going to allow the user to specify a sequence of letters like ksrtw but need to break that down on a letter-by-letter basis. This can be done with the rarely used fold command. It’s intended to ensure lines are a maximum length, but give it a width of 1 (the -w1 flag) and it’ll break a word down into individual letters:
$ echo "hello" | fold -w1
h
e
l
l
o
Most of the work will actually be done in the “for” loop itself, as you can see:
for letter in $(echo "$6" | fold -w1 | tr '[[:upper:]]' '[[:lower:]]')
do
  addon="$addon | grep '$letter'"
done
Notice that the result will be stored in the variable “addon” and that it builds a pipe-separated command. If the specified letters are tw then the result will be
| grep 't' | grep 'w'
Which looks wrong, doesn’t it? You’ll see, it’ll work fine.
At its most basic, the resultant command (also ensuring that the dictionary is all lowercase) will be
tr '[[:upper:]]' '[[:lower:]]' < $DICT | grep "$pattern" $addon
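Except there’s a problem: as-is, the shell won’t treat the pipe symbols inside $addon as pipes. It splits a command line into pipeline stages before it expands variables, so the expanded text is simply handed to grep as extra arguments. Instead, drop the entire expression within an eval statement, which makes the shell parse the expanded string a second time:
eval "tr '[[:upper:]]' '[[:lower:]]' < $DICT | \
grep '$pattern' $addon | paste - - - - - - - "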
I also added the paste command so if there are lots of results, they’ll be shown in a multi-column output to look a bit more attractive.
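Because eval re-parses its argument as shell code, anything unusual a user types on the command line ends up being interpreted by the shell. One possible alternative, sketched here and not taken from the original script, is to apply the required-letter filters one at a time in a loop. The function name filter_words is hypothetical, and the sketch assumes the sixth argument contains plain letters only.

```shell
# filter_words PATTERN LETTERS FILE: lowercase FILE, keep words matching
# PATTERN, then keep only words containing every letter in LETTERS.
# Hypothetical helper; assumes LETTERS is made of plain letters.
filter_words() {
  matches=$(tr '[:upper:]' '[:lower:]' < "$3" | grep -- "$1")
  for letter in $(printf '%s\n' "$2" | tr '[:upper:]' '[:lower:]' | fold -w1)
  do
    # grep -F treats the letter as a literal string, not a pattern.
    matches=$(printf '%s\n' "$matches" | grep -F -- "$letter")
  done
  # Print nothing at all when no word survived the filters.
  if [ -n "$matches" ]; then
    printf '%s\n' "$matches"
  fi
}

# Stand-in word list instead of the real dictionary.
words=$(mktemp)
printf '%s\n' Sekos skies saros shies > "$words"

filter_words '^s...s$' k "$words"
rm -f "$words"
```

The trade-off is that the intermediate results are held in a shell variable and re-filtered once per letter, which is slower than a single pipeline but avoids building a command string. The output could still be piped through paste for the multi-column layout.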
THE FINAL WORDLE-HELP SCRIPT
There’s only one more thing to add and that’s an initial test to ensure the user specified at least five arguments. If they omit the ‘right letter, any location’ argument (arg 6) that’s fine. Here’s my final script:
#!/bin/sh
# Wordle Help - helps find matching 5-letter words for WORDLE
#   Usage: "wordle-help.sh 1 2 3 4 5 XXX"
#   where each space is either a dash (unknown letter) or a letter
DICT="/usr/share/dict/web2"
if [ $# -lt 5 ] ; then
  echo "Usage: $(basename "$0") 1 2 3 4 5 XXX"
  echo "  where each argument is a dash (unknown) or a letter"
  echo "  last arg, XXX, is a list of letters that must appear in the word"
  exit 1
fi
# build the known letter, unknown location filter
for letter in $(echo "$6" | fold -w1 | tr '[[:upper:]]' '[[:lower:]]')
do
  addon="$addon | grep '$letter'"
done
# and the specified letters in location pattern
pattern="$(echo "^$1$2$3$4$5$" | tr '[A-Z]' '[a-z]' | tr '-' '.')"
# and let's do it! (single quotes keep the pattern intact inside eval)
eval "tr '[[:upper:]]' '[[:lower:]]' < $DICT | grep '$pattern' $addon | \
  paste - - - - - - - "
exit 0
Cool. Let’s try it out!
$ sh wordle-help.sh s - - - s k
sekos  skies
That’s our original query: A 5-letter word that begins and ends with ‘s’ and has a k somewhere in the word. Two results.
How about just a list of 5-letter words that begin with ‘sh’? Doable:
$ sh wordle-help.sh s h - - -
shack  shade  shady  shaft  shahi  shaka  shake
shako  shaku  shaky  shale  shall  shalt  shaly
shama  shame  shane  shang  shank  shant  shape
shape  shaps  shapy  shard  share  shari  shark
sharn  sharp  shaul  shaup  shave  shawl  shawm
shawn  shawy  sheaf  sheal  shean  shear  sheat
sheen  sheep  sheer  sheet  sheik  shela  sheld
shelf  shell  shemu  shend  sheng  sheol  sheth
sheva  shewa  shiah  shice  shide  shied  shiel
shier  shies  shift  shiko  shilf  shilh  shill
shina  shine  shiny  shire  shirk  shirl  shirr
shirt  shish  shisn  shita  shive  shivy  shluh
shoad  shoal  shoat  shock  shode  shoer  shogi
shoji  shojo  shola  shole  shona  shone  shood
shooi  shook  shool  shoop  shoor  shoot  shore
shorn  short  shote  shott  shout  shove  shown
showy  shoya  shrab  shraf  shrag  shram  shrap
shred  shree  shree  shrew  shrip  shrog  shrub
shrug  shuba  shuck  shuff  shune  shunt  shure
shurf  shush  shyam  shyer  shyly
With that result, you can really see the benefit of the paste command to format the output.
One more:
$ sh wordle-help.sh s h - - - ae
shade  shake  shale  shame  shane  shape  shape
share  shave  sheaf  sheal  shean  shear  sheat
shela  sheva  shewa
See how that works? Now you can take the script further and improve it as you desire. One thing you could do is add another argument that lets you exclude specific letters from the resultant word, which could be done by adding the ‘-v’ flag to grep: grep -v 'r' will remove any words that contain the letter ‘r’.
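One way the exclusion idea might be sketched is as a small filter that reads words on standard input. The name exclude_letters is hypothetical, and the sketch assumes the excluded letters are plain letters, since characters such as ] or ^ have special meaning inside a bracket expression.

```shell
# exclude_letters LETTERS: read words on stdin and drop any word that
# contains one of LETTERS. With no letters given, pass everything through.
# Hypothetical helper; assumes LETTERS contains plain letters only.
exclude_letters() {
  if [ -n "$1" ]; then
    grep -v "[$1]"
  else
    cat
  fi
}

# Drop words containing r or k, then remove repeated entries.
printf '%s\n' shade shake shape shape share | exclude_letters rk | sort -u
```

This prints shade and shape. The sort -u at the end is an optional extra: the sample runs above list shape and shree twice, which is consistent with capitalized and lowercase dictionary entries becoming identical after the tr step, and sort -u would collapse such repeats.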
Pro Tip: I’ve been writing about Linux scripting for years. I even wrote a book about it! Please check out my Linux help library for lots more tutorials, including an area specifically on Linux shell script programming help!