Gopher
Right-Sizing Models for the Dataset: Finding the Best Data-To-Parameter Ratio for NLP Models
The route to improving transformer-based language models like GPT-3 and Gopher, which are trained on immense quantities of text scraped from the web, has been to increase their size. But research shows that, given a fixed compute budget, bigger doesn’t necessarily mean better: how training data is balanced against parameter count matters just as much.
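To make the trade-off concrete, here is a minimal sketch of the compute-optimal sizing idea from DeepMind's follow-up work on scaling laws. It assumes the common approximations that training cost is roughly C ≈ 6·N·D FLOPs (N parameters, D training tokens) and that the optimal token count is roughly 20 tokens per parameter; both are rules of thumb, not exact results, and the function name is illustrative.

```python
import math

def compute_optimal_size(flops_budget: float, tokens_per_param: float = 20.0):
    """Estimate a compute-optimal (parameters, tokens) split for a FLOPs budget.

    Uses two rough approximations:
      C ~= 6 * N * D          (training cost in FLOPs)
      D ~= tokens_per_param * N  (roughly 20 tokens per parameter)
    Substituting gives C ~= 6 * tokens_per_param * N**2, so we solve for N.
    """
    n = math.sqrt(flops_budget / (6.0 * tokens_per_param))
    d = tokens_per_param * n
    return n, d

# Roughly the compute budget of a 70B-parameter model trained on 1.4T tokens:
params, tokens = compute_optimal_size(5.9e23)
```

Under this rule of thumb, doubling the compute budget should go toward scaling parameters and training tokens in equal proportion, rather than pouring it all into a larger model.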