Text-to-Text Transfer Transformer (T5)
Bigger, Faster Transformers: Increasing parameters without slowing down the network
Performance on language tasks rises with model size, yet as a model’s parameter count grows, so does the time it takes to generate output. New work pumps up the number of parameters without slowing down the network; a minimal sketch of one way to do that follows.
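The blurb doesn’t spell out the mechanism, but the usual way to add parameters without adding per-token compute is sparse expert routing: a mixture-of-experts layer in which each token activates only one of many expert sub-networks. The sketch below is a hypothetical illustration of that general idea in PyTorch, not the paper’s implementation; the names MoELayer, num_experts, and the top-1 routing scheme are assumptions for illustration.

```python
# Sketch of sparse expert routing: parameters grow with num_experts,
# but each token is processed by exactly one expert, so per-token
# compute stays roughly constant. Illustrative only.
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    """Feed-forward layer with top-1 expert routing (hypothetical)."""

    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # picks an expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        gates = self.router(x).softmax(dim=-1)   # routing probabilities
        top_gate, top_idx = gates.max(dim=-1)    # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i                  # tokens routed to expert i
            if mask.any():
                # scale by the gate value so routing stays differentiable
                out[mask] = top_gate[mask, None] * expert(x[mask])
        return out

tokens = torch.randn(8, 64)
small, big = MoELayer(64, 256, num_experts=4), MoELayer(64, 256, num_experts=8)
print(sum(p.numel() for p in small.parameters()),
      sum(p.numel() for p in big.parameters()))  # roughly 2x the parameters
print(small(tokens).shape)                       # (8, 64) either way
```

Doubling num_experts here roughly doubles the layer’s parameter count, but each token still passes through a single expert, so the cost of generating output stays about the same.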