OpenAI’s text-to-image generator DALL·E 2 produces pictures with uncanny creativity on demand. Has it invented its own language as well?
What’s new: Ask DALL·E 2 to generate an image that includes text, and often its output will include seemingly random characters. Giannis Daras and Alexandros G. Dimakis at the University of Texas at Austin discovered that, if you feed the gibberish back into the model, sometimes it generates images that match the subject of your earlier request.
How it works: The authors devised a simple process to determine whether DALL·E 2’s gibberish has meaning to the model.
- They prompted the model to generate images that include text.
- Many of the characters produced were distorted, so the authors transcribed the text manually, interpreting hard-to-read characters as best they could.
- They fed the transcribed strings back into the model as prompts to produce new images (a rough sketch of this loop in code follows this list).
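The authors worked with DALL·E 2 directly and transcribed the gibberish by hand. As a hedged sketch, the same generate-transcribe-regenerate loop might look like the following, assuming access to DALL·E 2 through OpenAI’s images API; the helper name `generate`, the image size, and the number of samples are our assumptions, not details from the paper:

```python
# Sketch of the authors' loop, assuming DALL·E 2 access via OpenAI's
# images API. The transcription step is manual in the paper and stays
# manual here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate(prompt: str, n: int = 4) -> list[str]:
    """Return URLs of n images generated for the given prompt."""
    response = client.images.generate(
        model="dall-e-2", prompt=prompt, n=n, size="512x512"
    )
    return [image.url for image in response.data]

# Step 1: ask for an image that contains text.
print(generate("two whales talking about food, with subtitles"))

# Step 2 (manual): view the images and transcribe the gibberish text.

# Step 3: feed the transcription back in and inspect the new images.
print(generate("Wa ch zod rea"))
```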
Results: The authors provide few quantitative results, but the findings are intriguing. They report that “a lot of experimentation” was required to find gibberish that produced consistent images.
- Asking DALL·E 2 to generate an image of “two whales talking about food, with subtitles” produced an image containing text the authors transcribed as “Wa ch zod rea.” Prompting the model with “Wa ch zod rea” produced images of seafood.
- Prompting DALL·E 2 with “Apoploe vesrreaitais” yielded images of birds and other flying creatures in most attempts, though the authors don’t say how many attempts they made.
- The prompt “Contarra ccetnxniams luryca tanniounons” resulted in images of insects around half the time and an apparently random assortment of other creatures the other half.
- “Apoploe vesrreaitais eating Contarra ccetnxniams luryca tanniounons” brought forth images of — you guessed it — birds with bugs in their beaks.
Inside the mind of DALL·E 2: Inputs to DALL·E 2 are tokenized as subwords (for instance, apoploe may divide into apo, plo, e), and subwords can compose any possible input text, gibberish included. Since DALL·E 2 was trained to generate coherent images in response to any input text, it’s no surprise that gibberish produces good images (the tokenizer sketch after the list below illustrates this). But why does the authors’ method for deriving this gibberish produce consistent images in some cases, random images in others, and a 50/50 mix of consistent and random images in still others? The authors and denizens of social media came up with a few hypotheses:
- The authors suggest that the model formed its own internal language with rules that may not make sense to people. In that case, the images produced in response to a given prompt, whether they look similar or dissimilar to us, would share something the model discovered but people may not recognize.
- One Twitter user theorized that DALL·E 2’s gibberish is based on subword patterns in its training dataset. For instance, if “apo” and “plo” are common components of Latin bird species names, then using both syllables would yield images of birds. On the other hand, subwords of “Contarra ccetnxniams luryca tanniounons” might be related to bugs in 50 percent of occurrences in the training set and to random other animals in the rest.
- Other Twitter users chalked up the authors’ findings to chance, asserting that the phenomenon is random and unrelated to patterns in the training dataset.
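To make the subword point above concrete, here’s a minimal sketch of how gibberish tokenizes. It uses CLIP’s BPE tokenizer (via Hugging Face’s transformers library) as a stand-in, since DALL·E 2 builds on CLIP; DALL·E 2’s production tokenizer may differ.

```python
# Minimal sketch: any string, gibberish included, decomposes into valid
# subword tokens. CLIP's BPE tokenizer stands in for DALL·E 2's, which
# may differ.
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

for prompt in ["a photo of a bird", "Apoploe vesrreaitais"]:
    print(prompt, "->", tokenizer.tokenize(prompt))
# Both prompts map to in-vocabulary subword tokens, so the model embeds
# the gibberish in the same space as ordinary English and generates
# images for it just the same.
```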
Why it matters: The discovery that DALL·E 2’s vocabulary may extend beyond its training data highlights the black-box nature of deep learning and the value of interpretable models. Can users benefit from understanding the model’s idiosyncratic style of communication? Does its apparent ability to respond to gibberish open a back door that would allow hackers to get results the model is designed to block? Do builders of natural language models need to start accounting for gibberish inputs? These questions may seem fanciful, but they may be critical to making such models dependable and secure.
We’re thinking: AI puzzles always spur an appetite, and right now a plate of fresh wa ch zod rea would hit the spot!