École Normale Supérieure
Memorize Less; Retrieve More: How small language models can perform specialized tasks.
Large language models are trained only to predict the next word based on previous ones. Yet, given a modest fine-tuning set, they acquire enough information to learn how to perform tasks such as answering questions.