In any training dataset, some classes may have relatively few examples. A new technique can improve a trained model’s performance on such underrepresented classes.
What’s new: Researchers at Jilin University, Megvii Inc., Beihang University, Huazhong University, and Tsinghua University, led by Jialun Liu and Yifan Sun, introduced a method that synthesizes extracted features of underrepresented classes.
Key insight: The researchers trained a model and then mapped the extracted features for each class into a two-dimensional visualization. Classes with fewer samples spanned a smaller region of the feature space, leaving nearby decision boundaries sensitive to small variations in the features. The researchers reasoned that artificially expanding the spread of underrepresented classes to match that of well-represented classes should yield more robust predictions on the underrepresented classes.
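This kind of inspection is easy to sketch. The example below, a minimal illustration rather than the authors' code, assumes a hypothetical feature matrix `features` and label array `labels` produced by a trained model, projects the features to two dimensions, and prints a rough measure of each class’s spread.

```python
# Illustrative sketch of inspecting per-class feature spread in 2D; `features` and
# `labels` are hypothetical placeholders for a trained model's extracted features.
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_class_spread(features, labels):
    # Project high-dimensional features onto a 2D map.
    coords = TSNE(n_components=2, init="pca", random_state=0).fit_transform(features)
    for c in np.unique(labels):
        pts = coords[labels == c]
        plt.scatter(pts[:, 0], pts[:, 1], s=5, label=f"class {c}")
        # Crude proxy for how much of the 2D map each class covers.
        print(f"class {c}: spread = {pts.std(axis=0).mean():.2f}")
    plt.legend()
    plt.show()

# Example call with random stand-in data.
plot_class_spread(np.random.randn(200, 64), np.random.randint(0, 5, 200))
```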
How it works: The researchers used well-represented classes to predict the distribution of features in classes with fewer samples.
- The researchers measured the distribution of features in a given class by locating the center of all training features in that class. The distribution’s shape is defined by the variance of the angles between each feature and the center (the tan box in the animation above).
- For each example of an underrepresented class, the researchers generated a cloud of artificial points so the cloud’s angular variance matched that of a well-represented class (the yellow oval to the right of the dotted-line decision boundary above). They labeled the synthetic features as the underrepresented class and added them to the extracted features (see the first sketch after this list).
- The network learned from the artificial features using a loss function similar to ArcFace, which maximizes the angular distance between the center of each class’s feature distribution and the decision boundary (see the second sketch after this list).
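The feature-synthesis step can be sketched in a few lines of PyTorch. This is an illustrative reconstruction under stated assumptions, not the authors’ code: it assumes L2-normalized embeddings, and `features`, `labels`, `head_class`, and `tail_class` are hypothetical placeholders filled with random stand-in data.

```python
# Illustrative reconstruction of the feature-synthesis step, not the authors' code.
import torch
import torch.nn.functional as F

def angular_stats(feats):
    """Return the class center and the std of angles between each feature and the center."""
    center = F.normalize(feats.mean(dim=0), dim=0)
    cos = F.normalize(feats, dim=1) @ center                     # cosine similarity to the center
    angles = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    return center, angles.std()

def synthesize(center, angular_std, num_samples, dim):
    """Sample unit vectors whose angular spread around `center` roughly matches `angular_std`."""
    thetas = torch.randn(num_samples) * angular_std              # sampled angular offsets
    rand = torch.randn(num_samples, dim)                         # random tilt directions...
    rand = F.normalize(rand - (rand @ center).unsqueeze(1) * center, dim=1)  # ...orthogonal to center
    synth = torch.cos(thetas).unsqueeze(1) * center + torch.sin(thetas).unsqueeze(1) * rand
    return F.normalize(synth, dim=1)

# Dummy stand-ins for extracted features and labels (hypothetical shapes and class IDs).
features = F.normalize(torch.randn(500, 128), dim=1)
labels = torch.randint(0, 10, (500,))
head_class, tail_class = 0, 9

# Transfer the angular spread of a well-represented class to an underrepresented one.
_, head_std = angular_stats(features[labels == head_class])
tail_center, _ = angular_stats(features[labels == tail_class])
fake_feats = synthesize(tail_center, head_std, num_samples=64, dim=features.size(1))
fake_labels = torch.full((64,), tail_class, dtype=torch.long)
```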
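The real and synthetic features can then be trained together under an angular-margin loss. Below is a minimal sketch of the standard ArcFace formulation, which the paper’s loss resembles but does not match exactly; `weight`, a learnable prototype vector per class, is an assumption of this sketch, and the usage lines reuse names from the sketch above.

```python
# Minimal sketch of a standard ArcFace-style angular-margin loss (not the paper's exact variant).
import torch
import torch.nn.functional as F

def arcface_loss(embeddings, labels, weight, scale=64.0, margin=0.5):
    """Cross entropy over cosine logits with an additive angular margin on the target class."""
    cos = F.normalize(embeddings, dim=1) @ F.normalize(weight, dim=1).t()
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    target = F.one_hot(labels, num_classes=weight.size(0)).bool()
    logits = scale * torch.where(target, torch.cos(theta + margin), cos)
    return F.cross_entropy(logits, labels)

# Example: score real and synthetic features together (names from the sketch above).
weight = torch.nn.Parameter(torch.randn(10, 128))                # one learnable prototype per class
all_feats = torch.cat([features, fake_feats])
all_labels = torch.cat([labels, fake_labels])
loss = arcface_loss(all_feats, all_labels, weight)
```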
Results: The researchers extracted features from images using a ResNet-50. They applied those features to models trained with the ArcFace loss on two datasets that were pared down so underrepresented classes contained five examples each. Then they built models using their approach and compared the results. Their method increased average precision (AP), which summarizes precision across recall levels (1 is perfect), from 0.811 to 0.832 on Market-1501 and from 0.732 to 0.742 on DukeMTMC-reID.
Why it matters: There’s no need to generate synthetic examples, such as images, if we can synthesize their extracted features instead.
We’re thinking: Deep learning engineers like to use cats as examples, but these researchers focused only on the long tail.