A number of countries restrict commercial use of personal data without consent unless the data is fully anonymized. A new paper proposes a way to anonymize images of faces, purportedly without degrading their usefulness in applications that rely on face detection.
What’s new: Researchers from the Norwegian University of Science and Technology introduced DeepPrivacy, a system that anonymizes images of people by synthesizing replacement faces. They also offer the Flickr Diverse Faces dataset, 1.47 million faces with accompanying annotations, which they used to train DeepPrivacy.
Key insight: The original faces are never exposed to the face generator. Authors Håkon Hukkelås, Rudolf Mester, and Frank Lindseth argue that this strategy preserves privacy more effectively than traditional anonymization techniques such as pixelation and blurring.
How it works: DeepPrivacy is a conditional generative adversarial network that synthesizes novel images similar to those it observed in training. A discriminator classifies images as real or generated, while a generator based on the U-Net architecture is optimized to create images that fool the discriminator.
- A Single Shot Scale-invariant Face Detector locates faces in the input image.
- For each face, Mask R-CNN locates keypoints for eyes, nose, ears, and shoulders.
- The pixels within each face's bounding box are replaced with random values, so the generator never sees the original face.
- The generator receives the keypoints, which convey the deleted face's position and orientation, along with the corresponding faceless image. From these inputs, it learns to synthesize replacement faces that the discriminator can't distinguish from real faces in the training data. (A simplified sketch of this pipeline follows below.)
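To make the pipeline concrete, here is a minimal PyTorch sketch of the conditioning scheme described above. Everything in it is a simplified assumption rather than the authors' implementation: the module names, layer sizes, and one-hot heatmap encoding of keypoints are hypothetical stand-ins for the paper's much deeper U-Net generator.

```python
# Illustrative sketch of a DeepPrivacy-style conditional GAN step.
# All names, shapes, and layers are simplified assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_KEYPOINTS = 7  # eyes, nose, ears, shoulders


def mask_face_with_noise(image: torch.Tensor, box: tuple) -> torch.Tensor:
    """Overwrite the detected face region with random values so the
    generator never observes the original face."""
    x0, y0, x1, y1 = box
    faceless = image.clone()
    faceless[:, :, y0:y1, x0:x1] = torch.rand_like(faceless[:, :, y0:y1, x0:x1])
    return faceless


def keypoints_to_heatmaps(keypoints: torch.Tensor, size: int) -> torch.Tensor:
    """Encode (x, y) keypoints as one-hot spatial maps, one channel per
    keypoint, so the generator can condition on face pose."""
    n, k, _ = keypoints.shape
    maps = torch.zeros(n, k, size, size)
    for i in range(n):
        for j in range(k):
            x, y = keypoints[i, j].long().clamp(0, size - 1).tolist()
            maps[i, j, y, x] = 1.0
    return maps


class TinyGenerator(nn.Module):
    """Toy encoder-decoder standing in for the U-Net generator; a real
    U-Net adds skip connections between encoder and decoder stages."""

    def __init__(self):
        super().__init__()
        self.enc = nn.Conv2d(3 + NUM_KEYPOINTS, 64, 4, stride=2, padding=1)
        self.dec = nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1)

    def forward(self, faceless, heatmaps):
        h = F.relu(self.enc(torch.cat([faceless, heatmaps], dim=1)))
        return torch.tanh(self.dec(h))


class TinyDiscriminator(nn.Module):
    """Classifies images as real (training data) or generated."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.LazyLinear(1),
        )

    def forward(self, image):
        return self.net(image)


def adversarial_step(gen, disc, real, faceless, heatmaps, g_opt, d_opt):
    """One conditional-GAN update: the discriminator learns to separate
    real faces from generated ones; the generator learns to fool it."""
    fake = gen(faceless, heatmaps)

    # Discriminator update: real images are labeled 1, generated 0.
    d_real = disc(real)
    d_fake = disc(fake.detach())
    d_loss = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) \
           + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator update: try to make the discriminator output "real".
    g_out = disc(fake)
    g_loss = F.binary_cross_entropy_with_logits(g_out, torch.ones_like(g_out))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

In the actual system, the faceless image and keypoints fed to the generator come from the detector and Mask R-CNN outputs in the steps above.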
Results: The researchers processed the WIDER-Face dataset (roughly 32,000 images containing around 394,000 faces) using DeepPrivacy as well as traditional anonymization methods. On images anonymized by traditional techniques, Dual Shot Face Detector retained 96.7 percent of its usual performance. On DeepPrivacy's output, it retained 99.3 percent. The researchers don't provide metrics to evaluate the relative degree of anonymity imparted by the various methods.
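Retained performance here presumably means the ratio of the detector's average precision on anonymized images to its average precision on the originals. A quick illustration (the AP values are hypothetical, chosen only to reproduce the reported percentages):

```python
def retention(ap_anonymized: float, ap_original: float) -> float:
    """Percent of the detector's original average precision preserved."""
    return 100.0 * ap_anonymized / ap_original

# Hypothetical AP values that reproduce the reported retention figures.
print(retention(0.993, 1.0))  # DeepPrivacy: 99.3
print(retention(0.967, 1.0))  # traditional methods: 96.7
```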
Why it matters: Laws like the European Union’s General Data Protection Regulation set a high bar for data-driven applications by placing tight limits on how personal data can be used. DeepPrivacy transforms photos of people into a less identifiable format that still contains faces neural networks can detect.
Yes, but: DeepPrivacy addresses the privacy implications of faces only. An image purged of faces but still containing, say, clothing with identifiable markings, such as an athlete’s number, would allow a sophisticated model to infer the wearer’s identity.
We’re thinking: Machine learning’s reliance on data is both a gift and a curse. Aggregation of data has allowed for great progress in the field. Yet privacy advocates are inclined to keep personal data under wraps. DeepPrivacy is an intriguing step toward a compromise that could satisfy AI engineers and users alike.