Data-Efficient Image Transformers
Cookbook for Vision Transformers: A Formula for Training Vision Transformers
Vision Transformers (ViTs) are overtaking convolutional neural networks (CNNs) in many vision tasks, but the procedures used to train them are still tailored to CNNs. New research investigated how various training ingredients affect ViT performance.