Person Search: A New Research Paradigm

Speaker:        Shuang LI
                Chinese University of Hong Kong

Title:          "Person Search: A New Research Paradigm"

Date:           Monday, 25 September 2017

Time:           4:00pm - 5:00pm

Venue:          Lecture Theater F (near lift no. 25/26), HKUST

Abstract:

Automatic person search plays a key role in finding missing people and
criminal suspects. However, existing methods are based on manually cropped
person images, which are unavailable in real-world scenarios. Also, in
many criminal cases there might be only verbal descriptions of a suspect's
appearance. To make person search more practical in real-world
applications, we propose two new branches: (i) finding a target person in
a gallery of whole-scene images and (ii) using natural language
descriptions to search for people.

In this talk, I will first present a joint pedestrian detection and
identification network for person search in whole-scene images. An
Online Instance Matching (OIM) loss function, which scales to datasets
with numerous identities, is proposed to train the network. Then, I
will talk about natural-language-based person search. A two-stage
framework is proposed to solve this problem. The stage-1 network learns to
embed textual and visual features with a Cross-Modal Cross-Entropy (CMCE)
loss, while the stage-2 network refines the matching results with a latent
co-attention mechanism. In stage-2, the spatial attention relates each
word to its corresponding image regions, while the latent semantic
attention aligns different sentence structures to make the matching
results more robust to sentence structure variations. The proposed methods
achieve state-of-the-art results for person search.


*******************
Biography:

Shuang Li is an M.Phil student at the Chinese University of Hong Kong,
advised by Prof. Xiaogang Wang. She works in the Multimedia Lab with Prof.
Xiaoou Tang. Her research interests include computer vision, natural
language processing, and deep learning, especially image-text relationships
and person re-identification. She was a research intern at Disney
Research, Pittsburgh.