Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences

An Advanced Network for Object Detection

Date: 06-07-2021

Object detection is one of the most important computer vision tasks, and researchers have proposed numerous object detection methods based on Convolutional Neural Networks (CNNs). Still, the performance of these detectors is hindered by the diversity of object sizes and categories.

To obtain better feature representations, the use of multiscale features has been proposed. The way multiscale features are extracted and fused, as in the Feature Pyramid Network (FPN), has a great influence on the performance of the final detector. However, because FPN fuses features in only one direction, its fusion is insufficient to represent objects of similar size but different appearance.
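As a rough illustration of what one-directional fusion means, the following minimal sketch passes information only from deep to shallow levels. It is a simplification under stated assumptions: each level is a single-channel 2D grid, the 1x1 lateral and 3x3 smoothing convolutions of a real FPN are omitted, and the helper names (`upsample2x`, `add`, `top_down_fuse`) are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D grid given as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def add(a, b):
    """Element-wise sum of two grids of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_fuse(levels):
    """Fuse levels ordered shallow (large) -> deep (small), top-down only.

    Assumption: plain addition stands in for the learned layers of a real FPN.
    """
    fused = [None] * len(levels)
    fused[-1] = [list(r) for r in levels[-1]]      # deepest level is copied unchanged
    for i in range(len(levels) - 2, -1, -1):
        fused[i] = add(levels[i], upsample2x(fused[i + 1]))
    return fused


# Toy pyramid: constant values make the direction of information flow visible.
c3 = [[1.0] * 4 for _ in range(4)]   # shallow, 4x4
c4 = [[10.0] * 2 for _ in range(2)]  # middle, 2x2
c5 = [[100.0]]                       # deep, 1x1
p3, p4, p5 = top_down_fuse([c3, c4, c5])
```

In this toy pyramid the shallow output `p3` contains contributions from all three levels, while the deepest output `p5` is identical to its input: nothing from the shallow levels ever reaches it.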

A research team led by Prof. Dr. LU XiaoQiang from the Xi'an Institute of Optics and Precision Mechanics (XIOPM) of the Chinese Academy of Sciences (CAS) proposed a new multiscale feature fusion method, called Adaptive Multiscale Feature (AMF), which uses bidirectional feature fusion to overcome the one-directional fusion of FPN. The results were published in Neurocomputing.

Overview of the proposed AMF. (Image by XIOPM)

“The main problem of the backbone network is how to integrate the deep and shallow features reasonably, because using only the last layer of features makes it difficult to deal with multi-size objects.” Therefore, the unidirectional feature fusion of FPN should be avoided, and AMF is employed in the detector instead.

The AMF module has two parts, called Feature Scattering (FS) and Feature Redistribution (FR), which handle feature fusion and feature redistribution, respectively. First, based on Convolutional Long Short-Term Memory (ConvLSTM) networks, the fusion is carried out in two directions: the shallow features are enhanced by the deep features, and the deep features are in turn enhanced by the shallow features. Second, the two sets of features are further fused: for each level, channel-wise attention is used to assign features to the corresponding layer.
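The two steps can be sketched loosely as follows. This is not the published AMF implementation: a plain running sum stands in for the ConvLSTM, the channel attention is an unlearned sigmoid gate in the spirit of squeeze-and-excitation, and each level is reduced to a vector of per-channel values so that no resampling is needed. All function names are hypothetical.

```python
import math


def running_fuse(levels):
    """Accumulate features along the given order.

    Assumption: a running sum replaces the ConvLSTM state update, so each
    output carries information from every level visited before it.
    """
    out, state = [], [0.0] * len(levels[0])
    for feat in levels:
        state = [s + f for s, f in zip(state, feat)]
        out.append(state)
    return out


def bidirectional_fuse(levels):
    """levels ordered shallow -> deep; returns (bottom_up, top_down) per level."""
    bottom_up = running_fuse(levels)             # shallow information flows to deep
    top_down = running_fuse(levels[::-1])[::-1]  # deep information flows to shallow
    return bottom_up, top_down


def channel_attention(feat):
    """Scale each channel by a sigmoid gate computed from its own value."""
    gates = [1.0 / (1.0 + math.exp(-v)) for v in feat]
    return [g * v for g, v in zip(gates, feat)]


def redistribute(bottom_up, top_down):
    """Merge the two directions per level, then reweight channels."""
    return [
        channel_attention([b + t for b, t in zip(bu, td)])
        for bu, td in zip(bottom_up, top_down)
    ]


# Three levels (shallow -> deep), two channels each.
levels = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]
bottom_up, top_down = bidirectional_fuse(levels)
out = redistribute(bottom_up, top_down)
```

With both directions in place, the deepest level of `bottom_up` and the shallowest level of `top_down` each carry information from every level, which a single top-down pass cannot provide.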

To demonstrate the effectiveness of the proposed AMF for both anchor-free and anchor-based detectors, the researchers used FCOS and RetinaNet as baselines, representing anchor-free and anchor-based detectors, respectively. Experimental results on the COCO 2014 dataset show that the proposed AMF module outperforms the popular FPN-based detectors. For both anchor-free and anchor-based detectors, performance can be improved through AMF.

The proposed AMF method exceeds current state-of-the-art object detectors in accuracy.