I’d like to be able to use one of the new generative AI image creation tools, but every time I specify a name or phrase, it gets horribly misspelled. Is there any way to correct the spelling?
There are a surprising number of generative AI-powered image creation tools available nowadays, and some are built into your favorite operating system or Web browser. Not enough? There are more of these tools on their way and if you want to really be overwhelmed, try searching for “AI image tools” in your favorite mobile device app store! While many seem to share a common backend or underlying system, there are enough variations that it’s interesting to experiment any time you’re trying to create something wonderful.
There are some limitations to this technology, of course, most famously that these programs seem to have a hard time with human hands. Too many fingers, too few, weirdly distorted hands, hands that merge with objects around them: the results can be somewhat troubling!
Shortcuts: Gemini | Copilot/Dall-E | Firefly | Stable Diffusion | OpenArt | Ideogram | ImageFX
The other common challenge for these AI utilities is language. They understand prompts and can figure out intention with surprising accuracy. But ask one to add the word “Mom” and you’ll get a million variations on the theme but very few that have those letters, in that order, without any additions.
If you think about how these systems work, it makes sense: We humans have learned that letters string together to create words, and words string together in a specific order to create phrases and sentences. To the computer, however, it’s just letters, and each subsequent letter is calculated based both on what you have prompted and on what shape is statistically likely to follow the current letterform. In other words, these tools can fake it, but they really don’t have a clue what the strings of letters mean!
Fortunately, some AI image creation tool developers have been working on fixing this problem (and probably getting human hands correct too, but that’s another article!). Let’s have a look.
GOOGLE GEMINI
Google renamed its Bard AI system to Gemini and promptly got rather embarrassed by some of its default image creation tendencies, to the point where the company just turned off image creation entirely. It’s recently returned in a limited fashion, but not without some glitches:
Why Gemini would think I’m asking for a person to be created is a bit of a puzzler, but we’re going to have to try a different tool to succeed here.
MICROSOFT COPILOT / OPENAI DALL-E
One of the most easily accessible generative AI tools available is Microsoft’s Copilot, which offers a free interface to OpenAI’s ChatGPT and OpenAI’s DALL-E image creation system. You can find Copilot in Windows 11 now, along with Microsoft Edge, the company’s browser built on Chromium, the same foundation as Google Chrome. You can also just go to copilot.microsoft.com.
One of the more mature models available, it does an… interesting… job with the prompt:
The signs are cheery and entertaining but, wow, you could have a cat run across the keyboard and get closer to the correct spelling of every word!
One trick that some people say works (though I have never had it actually help) is to add spaces between letters so that they’re seen as a sequence of letterforms rather than words. I tried it with “I A M N O T S O G O O D A T S P E L L I N G”. This makes it worse, actually!
The images themselves are wonderful but oh! Those words!
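If you want to try the letter-spacing trick yourself without retyping every phrase by hand, here is a minimal Python sketch that spaces out a phrase the same way I did above. The function name space_out_letters is my own invention for illustration, not part of any of these tools, and I’m assuming you only want letters kept (spaces and punctuation are dropped). As noted, don’t expect it to improve the results!

```python
def space_out_letters(phrase: str) -> str:
    """Return the phrase as uppercase letters separated by single spaces.

    Hypothetical helper for experimenting with prompts; it is not part of
    any image generation tool's API.
    """
    # Keep only alphabetic characters, so spaces and punctuation are dropped.
    letters = [ch.upper() for ch in phrase if ch.isalpha()]
    return " ".join(letters)


if __name__ == "__main__":
    # Prints: I A M N O T S O G O O D A T S P E L L I N G
    print(space_out_letters("I am not so good at spelling"))
```

Paste the output into your prompt in place of the original phrase. Because punctuation is dropped, “I’m” becomes “I M”, so you may want to write out “I am” first.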
ADOBE FIREFLY
One of the more interesting tools is part of the beloved Adobe Creative Cloud: Firefly. The program does a great job with images and has dozens of reference styles for art, but for words? Well, you can see what happened when I gave it the same prompt and chose a few stylistic modifiers:
Beautiful images but, um, that spelling is unrecognizably atrocious. Sorry, Adobe.
STABLE DIFFUSION
One of the best generative AI tools for creating human images is Stable Diffusion but what happens when I give it the same prompt and indicate it should use a “comic book” style? This:
Perhaps there’s a bit of comic-book irony at work here, but still, it’s not the result we seek!
OPENART: BETTER?
Another tool that’s somewhat of an up-and-comer is OpenArt.ai, but when given the prompt, it not only can’t figure out the words but also creates rather boring images:
It’s not going to win any prizes with these results, sorry to say!
IDEOGRAM GETS REALLY CLOSE
Before you give up hope, one of my favorite generative AI image sites, ideogram.ai, actually has put a lot of effort into getting spelling correct, and it shows:
Two out of three actually have every word spelled correctly, even if the middle sign offers up yet more AI irony (and 3-fingered hands!).
IMAGEFX GETS IT RIGHT
The last site I tried got the signage correct: ImageFX. It’s actually part of the Google AI team’s work too, as part of AI Test Kitchen at Google Labs. Same prompt, great result:
The fact that the middle image of the three has exactly the same typographical error as the previous example seems to hint at them using the same underlying engine, but perhaps not.
One more with ImageFX. This time my prompt was “a cute sign that says “I’m not so good at spelling” and includes flowers and baby animals watercolor”. Cute apparently means needlepoint, which then overrides watercolor, but still, these are very nice:
This time all three have the correct spelling. Success. Now to work on the rest of the prompt…
Pro Tip: I’ve been writing about AI for a while now. Please check out my AI and ChatGPT Help Area for more tutorials and help articles while you’re visiting!