LAION-400M

2 Posts

Abeba Birhane: Clean up web datasets

From language to vision models, deep neural networks are marked by improved performance, higher efficiency, and better generalization. Yet these systems are also marked by the perpetuation of bias and injustice.
Series of examples of images matched accurately and inaccurately to text

Crawl the Web, Absorb the Bias: NLP Models Absorb Biases from Web Training Data

The emerging generation of trillion-parameter models needs datasets of billions of examples, but the most readily available source of examples on that scale — the web — is polluted with bias and antisocial expressions. A new study examines the issue.
