Longformer

Graphs related to different attention mechanisms

More Efficient Transformers: BigBird is an efficient attention mechanism for transformers.

As transformer networks move to the fore in applications from language to vision, the time it takes them to crunch longer sequences becomes a more pressing issue. A new method lightens the computational load using sparse attention.
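To make the saving concrete, here is a minimal sketch that counts how many query-key pairs are scored under full attention versus a simple sliding-window pattern. The sliding window is only one ingredient of sparse schemes such as Longformer and BigBird, which also add global tokens (and, in BigBird, random connections), so this is an illustration under simplified assumptions, not either method's implementation. The function names and the window size are hypothetical choices made for this example.

```python
def full_attention_pairs(n: int) -> int:
    """Full attention scores every query against every key: n * n pairs."""
    return n * n


def sliding_window_pairs(n: int, window: int) -> int:
    """Count pairs when each token attends only to keys within
    `window` positions on either side of itself (illustrative pattern)."""
    total = 0
    for i in range(n):
        lo = max(0, i - window)        # leftmost key this query can see
        hi = min(n - 1, i + window)    # rightmost key this query can see
        total += hi - lo + 1
    return total


if __name__ == "__main__":
    window = 64  # assumed window size, chosen only for illustration
    for n in (512, 4096, 32768):
        dense = full_attention_pairs(n)
        sparse = sliding_window_pairs(n, window)
        print(f"n={n}: full={dense}, windowed={sparse}, ratio={dense / sparse:.1f}x")
```

With a fixed window, the windowed count grows roughly linearly with sequence length while the full count grows quadratically, which is why the gap widens as sequences get longer.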
