
Few-Shot Object Detection by Second-Order Pooling

Chapter


Abstract


  • In this paper, we tackle the challenging problem of few-shot object detection rather than recognition. We propose the Power Normalizing Second-order Detector, which consists of the Encoding Network (EN), the Multi-scale Feature Fusion (MFF), Second-order Pooling (SOP) with Power Normalization (PN), the Hyper Attention Region Proposal Network (HARPN) and the Similarity Network (SN). EN takes support image crops and a query image per episode and produces convolutional feature maps across several layers, while MFF combines them into multi-scale feature maps. SOP aggregates them per support image, while PN detects the presence of a visual feature instead of counting its frequency of occurrence. HARPN cross-correlates the PN-pooled support features against the query feature map to match regions and produce query region proposals, which are then aggregated with SOP/PN. Finally, support and query second-order descriptors are passed to SN. Our approach performs well because: (i) HARPN leverages SOP/PN to cross-correlate detected rather than counted support features with query features, which improves region proposals, (ii) SOP/PN capture second-order statistics per region proposal and factor out spatial locations, and (iii) PN limits the complexity of the space of functions over which HARPN and SN learn. These properties lead to state-of-the-art results on the PASCAL VOC 2007/12, MS COCO and FSOD datasets.
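
The SOP/PN step described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: SOP is taken to be the averaged outer product (autocorrelation matrix) of local feature vectors, and PN is taken to be an element-wise signed power function with exponent `gamma`, which is one common choice and not necessarily the operator used by the authors. The function names `second_order_pool` and `power_normalize` are hypothetical.

```python
import numpy as np


def second_order_pool(features):
    """Average the outer products of N local descriptors.

    features: array of shape (N, d), one d-dimensional feature per
    spatial location. Returns a (d, d) matrix. Assumed form of SOP.
    """
    n = features.shape[0]
    return features.T @ features / n


def power_normalize(matrix, gamma=0.5):
    """Element-wise signed power, an assumed stand-in for PN.

    With 0 < gamma < 1, large co-occurrence values are compressed,
    so the descriptor behaves more like a feature detector than a
    feature counter.
    """
    return np.sign(matrix) * np.abs(matrix) ** gamma


# Toy input: a 7x7 region with 8 channels, flattened to 49 locations.
rng = np.random.default_rng(0)
region_features = rng.standard_normal((49, 8))

descriptor = power_normalize(second_order_pool(region_features))

# Shuffling the spatial locations leaves the descriptor unchanged,
# which is the sense in which SOP factors out spatial locations.
shuffled = rng.permutation(region_features, axis=0)
descriptor_shuffled = power_normalize(second_order_pool(shuffled))
```

In this sketch the descriptor size depends only on the channel count, not on the region size, so support crops and query proposals of different spatial extents yield descriptors of the same shape that can be compared directly.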

Publication Date


  • 2021

Citation


  • Zhang, S., Luo, D., Wang, L., & Koniusz, P. (2021). Few-Shot Object Detection by Second-Order Pooling. In Lecture Notes in Computer Science (Vol. 12625, pp. 369-387). doi:10.1007/978-3-030-69538-5_23

International Standard Book Number (ISBN) 13


  • 9783030695378

Scopus EID


  • 2-s2.0-85103243204

Book Title


  • Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Start Page


  • 369

End Page


  • 387
