ImageNet

16 Posts

Masked Pretraining for CNNs: ConvNeXt V2, the new model family that boosts ConvNet performance

Vision transformers have bested convolutional neural networks (CNNs) in a number of key vision tasks. Have CNNs hit their limit? New research suggests otherwise.

Vision Transformers Made Manageable: FlexiViT, the vision transformer that allows users to specify the patch size

Vision transformers typically process images in patches of fixed size. Smaller patches yield higher accuracy but require more computation. A new training method lets AI engineers adjust the tradeoff.
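
The tradeoff comes straight from how patch size sets sequence length. A back-of-the-envelope sketch in plain Python (an illustrative cost model, not FlexiViT’s actual FLOP counts):

```python
# A 224x224 image cut into PxP patches yields (224/P)^2 tokens, and
# self-attention cost grows with the square of the token count.
IMAGE_SIZE = 224

for patch in (32, 16, 8):
    tokens = (IMAGE_SIZE // patch) ** 2
    rel_cost = tokens ** 2  # attention is quadratic in sequence length
    print(f"patch {patch:2d}px -> {tokens:4d} tokens, relative attention cost {rel_cost:,}")
```

Halving the patch size quadruples the token count, so a single model that accepts multiple patch sizes spans a wide accuracy/compute range.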

Diffusion Transformed: A new class of diffusion models based on the transformer architecture

A tweak to diffusion models, which are responsible for most of the recent excitement about AI-generated images, enables them to produce more realistic output.

Stable Biases: Stable Diffusion may amplify biases in its training data.

Stable Diffusion may amplify biases in its training data in ways that promote deeply ingrained social stereotypes.

Cookbook for Vision Transformers: A Formula for Training Vision Transformers

Vision Transformers (ViTs) are overtaking convolutional neural networks (CNNs) in many vision tasks, but procedures for training them are still tailored for CNNs. New research investigated how various training ingredients affect ViT performance.

Abeba Birhane: Clean up web datasets

From language to vision models, deep neural networks are marked by improved performance, higher efficiency, and better generalization. Yet these systems also perpetuate bias and injustice.

Transformer Speed-Up Sped Up: How to Speed Up Image Transformers

The transformer architecture is notoriously inefficient when processing long sequences — a problem in processing images, which are essentially long sequences of pixels. One way around this is to break up input images and process the pieces separately.
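
A minimal NumPy sketch of the patching idea (the function name is mine, and shapes are illustrative; real models add a linear projection and positional embeddings on top):

```python
import numpy as np

def patchify(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an HxWxC image into non-overlapping patch-x-patch pieces,
    flattened into a sequence of tokens (one row per piece)."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must divide evenly"
    x = image.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)            # (rows, cols, patch, patch, C)
    return x.reshape(-1, patch * patch * c)   # (num_pieces, patch*patch*C)

image = np.random.rand(224, 224, 3)
tokens = patchify(image, 16)
print(tokens.shape)  # (196, 768): 196 tokens instead of 50,176 pixels
```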

Labeling Errors Everywhere: Many deep learning datasets contain mislabeled data.

Key machine learning datasets are riddled with mistakes: on average, 3.4 percent of examples in 10 commonly used benchmark datasets are mislabeled, and the detrimental impact of such errors rises with model size.
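
At that rate, a 50,000-example test set would contain roughly 1,700 mislabeled examples. A common way to surface candidates, sketched here in simplified form (the research used a more careful confident-learning procedure), is to flag examples where a model confidently disagrees with the recorded label:

```python
import numpy as np

def flag_label_errors(probs: np.ndarray, labels: np.ndarray,
                      threshold: float = 0.95) -> np.ndarray:
    """Return indices where the model puts high probability on a class
    other than the recorded label. probs: (n, n_classes) out-of-sample
    predicted probabilities; labels: (n,) recorded integer labels."""
    predicted = probs.argmax(axis=1)
    confidence = probs.max(axis=1)
    suspect = (predicted != labels) & (confidence >= threshold)
    return np.flatnonzero(suspect)

# Toy example: the second example's recorded label looks wrong.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.01, 0.98, 0.01],
                  [0.30, 0.40, 0.30]])
labels = np.array([0, 2, 1])
print(flag_label_errors(probs, labels))  # [1]
```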

De-Facing ImageNet: Researchers blur all faces in ImageNet.

ImageNet now comes with privacy protection. The team that manages the machine learning community’s go-to image dataset blurred all the human faces pictured in it and tested how models trained on the modified images performed on a variety of image recognition tasks.
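
The blurring step itself is simple to approximate. An illustrative sketch using OpenCV’s stock face detector (the ImageNet team’s pipeline used stronger detection plus human review; file names here are hypothetical):

```python
import cv2

# Haar-cascade face detector that ships with OpenCV.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def blur_faces(path_in: str, path_out: str) -> None:
    image = cv2.imread(path_in)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
        # Replace each detected face with a heavily blurred version.
        face = image[y:y+h, x:x+w]
        image[y:y+h, x:x+w] = cv2.GaussianBlur(face, (51, 51), 0)
    cv2.imwrite(path_out, image)

blur_faces("photo.jpg", "photo_blurred.jpg")  # hypothetical file names
```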

Pretraining on Uncurated Data: How unlabeled data improved computer vision accuracy.

It’s well established that pretraining a model on a large dataset improves performance on downstream tasks after fine-tuning. In sufficient quantity and paired with a big model, even data scraped from the internet at random can contribute to the performance boost.

ImageNet Performance, No Panacea: ImageNet pretraining won't always improve computer vision.

It’s commonly assumed that models pretrained to achieve high performance on ImageNet will perform better on other visual tasks after fine-tuning. But is it always true? A new study reached surprising conclusions.

Facing Failure to Generalize: Why some AI models exhibit underspecification.

The same models trained on the same data may show the same performance in the lab, and yet respond very differently to data they haven’t seen before. New work finds this inconsistency to be pervasive.
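
One way to see the effect on synthetic data (a hedged sketch, not the paper’s experiments): train two copies of the same model that differ only in random seed, confirm their test accuracy matches, then compare their predictions under a distribution shift.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # labels depend on two features
X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

# Identical architecture and data; only the random seed differs.
models = [MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                        random_state=seed).fit(X_train, y_train)
          for seed in (1, 2)]
print([round(m.score(X_test, y_test), 3) for m in models])  # near-identical

# Out-of-distribution inputs: feature scales the models never saw.
X_shift = rng.normal(size=(500, 20)) * 5.0
agreement = (models[0].predict(X_shift) == models[1].predict(X_shift)).mean()
print(f"agreement under shift: {agreement:.2f}")  # typically below 1.0
```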

A Privacy Threat Revealed: How researchers cracked InstaHide for computer vision.

With access to a trained model, an attacker can use a reconstruction attack to approximate its training data. A method called InstaHide recently won acclaim for promising to make such examples unrecognizable to human eyes while retaining their utility for training.
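
For context, InstaHide encodes each training image roughly as follows (a simplified sketch of the published scheme, not the authors’ code): mix the image with a few others using random weights, then flip the sign of each pixel at random.

```python
import numpy as np

def instahide_encode(private: np.ndarray, others: np.ndarray,
                     k: int = 4, seed: int = 0) -> np.ndarray:
    """Simplified InstaHide-style encoding of one image.
    private: (H, W, C) image with pixel values in [-1, 1];
    others: pool of images to mix with, shape (N, H, W, C)."""
    rng = np.random.default_rng(seed)
    weights = rng.dirichlet(np.ones(k))        # random mixup coefficients
    picks = others[rng.choice(len(others), size=k - 1, replace=False)]
    mixed = weights[0] * private + np.tensordot(weights[1:], picks, axes=1)
    mask = rng.choice([-1.0, 1.0], size=mixed.shape)  # random sign flip
    return mask * mixed

pool = np.random.uniform(-1, 1, size=(100, 32, 32, 3))
encoded = instahide_encode(pool[0], pool[1:])
print(encoded.shape)  # (32, 32, 3)
```

The attack showed that, despite the random mask, enough structure survives the mixing for an attacker to approximately recover the private images.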

Representing the Underrepresented: Many important AI datasets contain bias.

Some of deep learning’s bedrock datasets came under scrutiny as researchers combed them for built-in biases. They found that popular datasets impart biases against socially marginalized groups to trained models due to the ways the datasets were compiled, labeled, and used.

Unsupervised Prejudice: Image classification models learned bias from ImageNet.

Social biases are well documented in decisions made by supervised models trained on ImageNet’s labels. But they also crept into the output of unsupervised models pretrained on the same dataset.