Mar 24, 2021
Pictures From Words and Gestures: AI model generates captions as users mouse over images.
A new system combines verbal descriptions and crude lines to visualize complex scenes. Google researchers led by Jing Yu Koh proposed Tag-Retrieve-Compose-Synthesize (TReCS), a system that generates photorealistic images by describing what they want to see while mousing around on a blank screen.