Things Look Different Over There

The most widely used image-recognition systems are better at identifying items from wealthy households than from poor ones.
What’s new: Facebook researchers tested object recognition systems on images from the Dollar Street corpus, a collection of household scenes from 50 countries ranked by income level. Services from Amazon, Clarifai, Google, IBM, and Microsoft performed poorly on photos from several African and Asian countries, relative to their performance on images from Europe and North America. Facebook’s own system was roughly 20 percent more accurate on photos from the wealthiest households than on those from the poorest, where even common items like soap may look very different.
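A minimal sketch of this kind of audit, assuming a Dollar Street-style list of (image, label, monthly income) samples and a stand-in `classify_image` function in place of any particular vendor’s API:

```python
from collections import defaultdict

def accuracy_by_income(samples, classify_image):
    """Tally classifier accuracy per income bracket.

    samples: iterable of (image_path, true_label, monthly_income_usd).
    classify_image: callable returning a predicted label for an image path
    (hypothetical stand-in for a cloud vision service call).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for path, label, income in samples:
        # Bucket households by income, as Dollar Street does when browsing.
        # The $200/$1000 cutoffs here are illustrative, not from the study.
        bracket = "low" if income < 200 else "mid" if income < 1000 else "high"
        predicted = classify_image(path)
        hits[bracket] += int(predicted == label)
        totals[bracket] += 1
    return {bracket: hits[bracket] / totals[bracket] for bracket in totals}
```

Comparing the returned per-bracket accuracies is what surfaces the kind of gap the researchers report.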
Behind the news: Nearly all photos in ImageNet, COCO, and Open Images come from Europe and North America, the researchers point out. So systems trained on those data sets are better at recognizing, say, a Western-style wedding than an Indian-style wedding. Moreover, photos labeled “wedding,” with their veiled brides and tuxedoed grooms, look very different from those labeled with the equivalent word in Hindi (शादी), with their bright-red accents. Systems designed around English labels may ignore relevant photos from elsewhere, and vice versa.
We're thinking: Bias in machine learning runs deep and manifests in unexpected ways, and the stakes can be especially high in applications like healthcare. There is no simple solution. Ideally, data sets would represent social values. Yet different parts of society hold different values, making it hard to define a single representative data distribution. We urge data set creators to examine ways in which their data may be skewed and work to reduce any biases.