Attentive RSA: Accelerating All FCN-based Detection Methods with Higher Accuracy

Speaker:        Yu Liu
                Chinese University of Hong Kong

Title:          "Attentive RSA: Accelerating All FCN-based Detection
                 Methods with Higher Accuracy"

Date:           Thursday, 19 October 2017

Time:           4:00pm - 5:00pm

Venue:          Lecture Theater G (near lift 25/26), HKUST

Abstract:

Fully convolutional networks (FCNs) have dominated object detection for
several years. By sharing computation across sliding windows, an FCN
eliminates most of the redundant calculation inherent in window-based
search. As a result, most recent state-of-the-art methods, such as Faster
R-CNN, SSD, YOLO and FPN, use an FCN as their backbone. This raises a
question: is there a fundamental way to accelerate the FCN itself, and
thereby accelerate all of these FCN-based methods?

Examining the pipelines above, we find three speed bottlenecks:
a) Not all levels of the image pyramid contain objects at a valid scale;
b) The anterior (early) layers of an FCN require far more computation
than the posterior (later) layers do (see the arithmetic sketch after
this list);
c) A scale-robust FCN needs a deep and wide structure to obtain both a
large receptive field and good adaptivity across scales.

In this talk, I will introduce our recent work, 'Attentive RSA', which
overcomes all three difficulties. First, a scale-and-location attention
module is proposed to select the valid levels of the image pyramid. Only
the largest valid image is fed into the anterior part of the FCN,
yielding the largest feature map; this removes bottleneck (a). Next,
instead of subsequently feeding the down-sampled images through the
anterior layers, we approximate their feature maps directly from the
largest feature map using a recurrent scale approximation (RSA) unit.
Most down-sampled images therefore never pass through the anterior
layers, which eliminates bottleneck (b). Finally, because the image
pyramid handles the multi-scale search, the detector (FCN) only needs to
separate foreground from background within a narrow scale range. In
contrast to the usual design described in bottleneck (c), the network can
therefore be very thin and shallow. Experiments show that Attentive RSA
accelerates the FCN by a factor of 6 while recalling 30% more of the
previously missed faces on three face detection benchmarks, and it
achieves first place on the FDDB benchmark.
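The overall flow of the pipeline can be summarized in a minimal
PyTorch-style sketch. All module names and interfaces here
(scale_attention, anterior_fcn, RSAUnit, detector, the per-level boolean
mask) are hypothetical illustrations of the idea described above, not the
authors' implementation:

    import torch.nn as nn
    import torch.nn.functional as F

    class RSAUnit(nn.Module):
        # Predicts the feature map of the half-resolution image directly
        # from the current feature map, so smaller pyramid levels never
        # pass through the expensive anterior layers.
        def __init__(self, channels):
            super().__init__()
            self.approx = nn.Sequential(
                nn.Conv2d(channels, channels, 3, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
            )

        def forward(self, feat):
            return self.approx(feat)  # approximate the 1/2-scale level

    def detect(image, scale_attention, anterior_fcn, rsa, detector,
               n_levels=5):
        # (a) Attention predicts which pyramid levels may contain
        # objects (hypothetical interface: one boolean per level).
        valid = scale_attention(image)

        # Run the expensive anterior layers ONCE, on the largest valid
        # image (level 0 = full resolution, level k = 1/2**k scale).
        top = min(s for s in range(n_levels) if valid[s])
        img = F.interpolate(image, scale_factor=0.5 ** top,
                            mode='bilinear', align_corners=False)
        feat = anterior_fcn(img)

        # (b) Roll out the feature maps of the smaller levels
        # recurrently with RSA instead of re-running the anterior
        # layers once per pyramid level.
        results = []
        for s in range(top, n_levels):
            if valid[s]:
                results.append(detector(feat))  # thin, shallow head (c)
            feat = rsa(feat)                    # approximate next level
        return results

Because the RSA unit only has to map one feature map to its half-scale
counterpart, it can be made far cheaper than the anterior layers it
stands in for, which is the intuition behind the reported speed-up.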


*********************
Biography:

Yu Liu is a first-year PhD student at the Chinese University of Hong
Kong, advised by Prof. Xiaogang Wang. Before that, he was a research
intern at Microsoft Research Asia and SenseTime. His research interests
include computer vision and machine learning, especially object detection
and recognition.

For more details please visit http://liuyu.us/.