Transfer Reinforcement Learning for Task-oriented Dialogue Systems

PhD Thesis Proposal Defence

Title: "Transfer Reinforcement Learning for Task-oriented Dialogue Systems"


Mr. Kaixiang MO


Dialogue systems have attracted increasing attention in recent years. They
can be categorized into open-domain dialogue systems and task-oriented
dialogue systems. Task-oriented dialogue systems are designed to help
users finish a specific task and consist of four modules: the spoken
language understanding module, the dialogue state tracking module, the
dialogue policy module and the natural language generation module. One of
the most important modules is the dialogue policy module, which aims to
choose the best reply according to the dialogue context. In this proposal,
we focus on the dialogue policy of task-oriented dialogue systems.

Reinforcement learning is commonly used to learn the dialogue policy.
However, traditional reinforcement learning algorithms rely heavily on a
large amount of training data and accurate reward signals. Transfer
learning can leverage knowledge from a source domain to improve the
performance of a model in the target domain with little target data.
However, traditional transfer learning focuses on the supervised learning
setting and cannot handle knowledge transfer in the reinforcement learning
setting, since it does not consider the state. Transfer reinforcement
learning aims to transfer dialogue policy knowledge across different
domains. The states and actions in the target domain can be aligned with
those in the source domain, so the dialogue policy can be transferred from
the source domain to the target domain.

The key to transfer reinforcement learning is learning to build a mapping
between the source and target domains, and transferring only
domain-independent common knowledge while minimizing the negative transfer
caused by domain-dependent knowledge. In this proposal, we propose a
unified framework for transfer reinforcement learning problems in
task-oriented dialogue systems, addressing three questions: 1) How to
transfer dialogue policies across different users with different
preferences? 2) How to transfer as much common knowledge as possible when
it is mixed with domain-dependent knowledge? 3) How to transfer dialogue
policies across dialogue systems built with different sets of actions and
slots? We will use both large-scale simulation and large-scale real-world
datasets to validate this research. The proposal will also discuss the
difficulties of existing methods and point out some future research
directions.

Date:                   Monday, 4 December 2017

Time:                   10:00am - 12:00noon

Venue:                  Room 3494
                        (lifts 25/26)

Committee Members:      Prof. Qiang Yang (Supervisor)
                        Dr. Xiaojuan Ma (Chairperson)
                        Prof. Lei Chen
                        Dr. Ming Liu (ECE)

**** ALL are Welcome ****