GILL (Generating Images with Large Language Models)

Text or Images, Input or Output: GILL, an innovative approach to multimodal model training

GPT-4V introduced a large multimodal model that generates text from images and, with help from DALL-E 3, generates images from text. However, OpenAI hasn’t fully explained how it built the system. A separate group of researchers described their own method.
