Wish you could draw, but your elephants look like crocodiles? Sketchforme doesn’t have that problem. This AI agent roughs out simple scenes based on text descriptions.
What’s new: Sketchforme generates crude drawings from natural-language descriptions such as “an apple on a tree” or “a dog next to a house.” According to a new paper, people who viewed its output judged the drawings to be human-made a third of the time.
How it works: Sketchforme relies on two neural networks:
- The scene composer generates scene layouts. It was trained on the Visual Genome data set of photos annotated with captions, bounding boxes, and class information.
- The object sketcher draws the objects according to their real-world scale. It was trained on the Quick, Draw! data set of 50 million labeled sketches of individual objects.
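The two-stage flow described above can be illustrated with a toy pipeline. This is only a minimal sketch under loud assumptions: the names `Box`, `compose_scene`, `sketch_object`, `place`, and `generate_sketch` are hypothetical, and both "models" are hard-coded lookup tables standing in for the trained networks, not the authors' implementation. It shows only the division of labor: one stage decides where each object goes, the other draws each object, and the strokes are then scaled into the layout.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


@dataclass
class Box:
    """A labeled bounding box on a unit canvas (hypothetical layout format)."""
    label: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def compose_scene(description: str) -> List[Box]:
    """Stand-in for the scene composer: a canned layout, not a trained model."""
    layouts = {
        "a dog next to a house": [
            Box("house", 0.10, 0.20, 0.50, 0.60),
            Box("dog", 0.65, 0.60, 0.25, 0.20),
        ],
    }
    return layouts[description]


def sketch_object(label: str) -> List[Stroke]:
    """Stand-in for the object sketcher: canned strokes inside a unit square."""
    templates = {
        "house": [[(0.0, 1.0), (0.0, 0.4), (0.5, 0.0), (1.0, 0.4), (1.0, 1.0), (0.0, 1.0)]],
        "dog": [
            [(0.0, 0.5), (1.0, 0.5)],  # body
            [(0.2, 0.5), (0.2, 1.0)],  # front leg
            [(0.8, 0.5), (0.8, 1.0)],  # back leg
        ],
    }
    return templates[label]


def place(strokes: List[Stroke], box: Box) -> List[Stroke]:
    """Scale and translate unit-square strokes into the given bounding box."""
    return [
        [(box.x + px * box.w, box.y + py * box.h) for px, py in stroke]
        for stroke in strokes
    ]


def generate_sketch(description: str) -> List[Stroke]:
    """Run both stages: lay out the scene, then draw each object in its box."""
    scene: List[Stroke] = []
    for box in compose_scene(description):
        scene.extend(place(sketch_object(box.label), box))
    return scene


if __name__ == "__main__":
    for stroke in generate_sketch("a dog next to a house"):
        print(stroke)
```

Keeping layout and drawing separate means each stage can be trained on the data set that suits it, which matches the split between Visual Genome and Quick, Draw! described above.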
Behind the news: Building a sketch generator was a thorny problem until the advent of neural networks. Sketch-RNN, an early neural-network sketcher introduced in 2017, was trained on crowd-sourced drawings and draws a requested object using an initial stroke as a prompt. Sketchforme builds on that work.
Bottom line: Sketchforme’s output is remarkably true to human notions of what objects look like in the abstract. UC Berkeley researchers Forrest Huang and John F. Canny point out that sketching is a natural way to convey ideas quickly and a useful thinking tool in applications like learning languages. But the fact is, Sketchforme is just plain fun, and no slouch at drawing, either.