This AI neural network transforms text captions into art, like a jellyfish Pikachu

Appropriately named after the artist Salvador Dalí and Pixar's WALL-E, OpenAI's DALL-E is an AI artist: a 12-billion-parameter version of GPT-3 "trained to generate images from text descriptions, using a dataset of text–image pairs."

As its creators put it: "We've found that it has a diverse set of capabilities, including creating anthropomorphized versions of animals and objects, combining unrelated concepts in plausible ways, rendering text, and applying transformations to existing images."


For example: "an illustration of an avocado with a mustache playing guitar."

"We find it interesting how DALL·E adapts human body parts onto animals," DALL-E's creators explain on the site. "For example, when asked to draw a daikon radish blowing its nose, sipping a latte, or riding a unicycle, DALL·E often draws the kerchief, hands, and feet in plausible locations."

Or, "a professional high quality illustration of a jellyfish pikachu chimera. a jellyfish imitating a pikachu. a jellyfish made of pikachu."

For this one, the creators note that:

DALL·E is sometimes able to combine distinct animals in plausible ways. We include "pikachu" to explore DALL·E's ability to incorporate knowledge of popular media, and "robot" to explore its ability to generate animal cyborgs. Generally, the features of the second animal mentioned in the caption tend to be dominant. 

We also find that inserting the phrase "professional high quality" before "illustration" and "emoji" sometimes improves the quality and consistency of the results.

The DALL-E website lets you play around with the captions and generate your own weird images. To be fair, the resulting images are not raw, spontaneous output; rather, as the creators explain:

The samples shown for each caption in the visuals are obtained by taking the top 32 of 512 after reranking with CLIP, but we do not use any manual cherry-picking, aside from the thumbnails and standalone images that appear outside.
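
To make the "top 32 of 512" step concrete, here is a minimal Python sketch of that kind of reranking. It is an illustration only, not OpenAI's code: the sample names and the `rerank_top_k` helper are hypothetical, and random numbers stand in for the scores that CLIP would assign to each image-caption pair.

```python
import random


def rerank_top_k(candidates, score_fn, k=32):
    """Return the k highest-scoring candidates, best first."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]


# Hypothetical stand-in data: 512 candidate image IDs, each with a random
# score in place of a real CLIP image-caption similarity score.
rng = random.Random(0)
scores = {f"sample_{i:03d}": rng.random() for i in range(512)}

# Keep only the 32 candidates that score highest against the caption.
top_32 = rerank_top_k(scores.keys(), scores.get, k=32)
print(len(top_32))
```

In other words, the model produces many candidates per caption and a second model decides which ones are shown, so the filtering is automatic rather than done by hand.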

DALL-E: Creating Images From Text