
Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification

Conference Paper


Abstract


  • Learning discriminative image representations plays a vital role in long-tailed image classification because it eases classifier learning when classes are imbalanced. Given the promising performance contrastive learning has recently shown in representation learning, this work explores effective supervised contrastive learning strategies and tailors them to learn better image representations from imbalanced data, with the aim of boosting classification accuracy on such data. Specifically, we propose a novel hybrid network structure composed of a supervised contrastive loss to learn image representations and a cross-entropy loss to learn classifiers, where training progressively transitions from feature learning to classifier learning, embodying the idea that better features make better classifiers. We explore two variants of contrastive loss for feature learning, which differ in form but share the common idea of pulling samples from the same class together in the normalized embedding space and pushing samples from different classes apart. The first is the recently proposed supervised contrastive (SC) loss, which builds on the state-of-the-art unsupervised contrastive loss by incorporating positive samples from the same class. The second is a prototypical supervised contrastive (PSC) learning strategy, which addresses the intensive memory consumption of the standard SC loss and thus shows more promise under a limited memory budget. Extensive experiments on three long-tailed classification datasets demonstrate the advantage of the proposed contrastive learning based hybrid networks in long-tailed classification.
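
The ideas in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not give them: the temperature value, the linear decay of the weight `alpha`, the softmax over all class prototypes in `psc_loss`, and every function name are hypothetical choices made for illustration only.

```python
import math


def l2_normalize(v):
    # Project a vector onto the unit sphere (the "normalized embedding space").
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_sum_exp(values):
    # Numerically stable log(sum(exp(v))).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sc_loss(embeddings, labels, temperature=0.1):
    # Supervised contrastive loss: every other sample with the same label is a
    # positive for the anchor; all other samples in the batch are contrasted.
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        log_denom = log_sum_exp(
            [dot(z[i], z[j]) / temperature for j in range(n) if j != i]
        )
        log_probs = [dot(z[i], z[j]) / temperature - log_denom for j in positives]
        total += -sum(log_probs) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


def psc_loss(embeddings, labels, prototypes, temperature=0.1):
    # Prototype variant: each sample is compared with one vector per class
    # instead of with every other sample, so memory grows with the number of
    # classes rather than with the batch size.
    # Assumption: a plain softmax over all prototypes is used here.
    protos = {c: l2_normalize(p) for c, p in prototypes.items()}
    total = 0.0
    for v, y in zip(embeddings, labels):
        zi = l2_normalize(v)
        logits = {c: dot(zi, p) / temperature for c, p in protos.items()}
        total += -(logits[y] - log_sum_exp(list(logits.values())))
    return total / len(embeddings)


def hybrid_loss(feature_loss, classifier_loss, epoch, max_epoch):
    # Assumption: alpha decays linearly from 1 to 0, shifting the weight from
    # feature learning to classifier learning as training proceeds.
    alpha = 1.0 - epoch / max_epoch
    return alpha * feature_loss + (1.0 - alpha) * classifier_loss
```

With this weighting, early epochs are dominated by the contrastive term and late epochs by the cross-entropy term, which is one simple way to realize a progressive transition from feature learning to classifier learning.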

Publication Date


  • 2021

Citation


  • Wang, P., Han, K., Wei, X. S., Zhang, L., & Wang, L. (2021). Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 943-952). doi:10.1109/CVPR46437.2021.00100

Scopus Eid


  • 2-s2.0-85123182834

Start Page


  • 943

End Page


  • 952
