Exploring Dependencies in Complex Input and Complex Output Machine Learning Problems

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Exploring Dependencies in Complex Input and Complex Output Machine
Learning Problems"

By

Miss Elham JEBALBAREZI SARBIJAN


Abstract

Multi-input and multi-output machine learning pose some of the chief 
challenges in the era of big data, where variety is a defining 
characteristic. These big datasets are too large and too complex to be 
handled by traditional machine learning methods, and new solutions must 
be found. In this thesis, we investigate the effect of dependencies 
among multiple inputs and multiple outputs, and we show that these 
dependencies help to solve the problems more accurately and less 
expensively, with fewer parameters. We choose prediction tasks on 
multi-label data (each label is equivalent to an output task) and 
multimodal data (each modality is equivalent to an input channel) as 
case studies.

Multi-label learning is an example of an extreme classification task 
with an extremely large number of labels (tags). User-generated labels 
for any type of online data can be sparse for an individual user but 
intractably large across all users. For example, in web and document 
categorization, image semantic analysis, protein function detection and 
social network analysis, multiple outputs must be predicted 
simultaneously. In these problems, modelling the dependencies between 
output labels improves the predictions. Many existing algorithms do not 
adequately address multi-label classification with label dependencies 
and a large number of labels. In this thesis, we investigate multi-label 
classification with dependencies between many labels, so that we can 
efficiently solve multi-label learning problems with an intractably 
large number of interdependent labels, such as the automatic tagging of 
Wikipedia pages.
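
As an illustration of why modelling label dependencies matters, the 
following minimal scikit-learn sketch contrasts binary relevance, which 
trains one independent classifier per label, with a classifier chain, 
which feeds earlier label predictions into later ones. The dataset, 
base model, and label count are illustrative assumptions, not the 
thesis code.

    from sklearn.datasets import make_multilabel_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split
    from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

    X, Y = make_multilabel_classification(n_samples=2000, n_classes=10,
                                          random_state=0)
    X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

    # Binary relevance: one classifier per label, dependencies ignored.
    br = MultiOutputClassifier(LogisticRegression(max_iter=1000))
    br.fit(X_tr, Y_tr)

    # Classifier chain: each label is predicted from the features plus
    # the labels predicted earlier in the chain, so dependencies between
    # labels are exploited.
    cc = ClassifierChain(LogisticRegression(max_iter=1000),
                         order='random', random_state=0)
    cc.fit(X_tr, Y_tr)

    for name, clf in [('binary relevance', br), ('classifier chain', cc)]:
        print(name, f1_score(Y_te, clf.predict(X_te), average='micro'))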

In this thesis, we have studied the nature of label dependencies and the 
efficiency of distributed multi-label learning methods. We have then 
proposed an assumption-free label sampling approach to handle a huge 
number of labels. Finally, we have investigated and compared 
chain-ordered label dependency learning and order-free learning methods 
for multi-label datasets.
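
A two-stage scheme of this flavour can be sketched in a few lines: 
train predictors only for a small sampled subset of the labels, then 
recover the remaining labels from that subset, relying on the label 
dependencies. This is a minimal illustration, not the thesis method; 
the dataset, classifiers, and subset size are assumptions.

    import numpy as np
    from sklearn.datasets import make_multilabel_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.multioutput import MultiOutputClassifier

    # Toy data: 20 interdependent labels stand in for a huge label set.
    X, Y = make_multilabel_classification(n_samples=1000, n_classes=20,
                                          n_labels=5, random_state=0)
    X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

    rng = np.random.default_rng(0)
    sampled = rng.choice(Y.shape[1], size=5, replace=False)  # label sample
    rest = np.setdiff1d(np.arange(Y.shape[1]), sampled)      # left-out labels

    # Stage 1: predict only the sampled labels from the input features.
    stage1 = MultiOutputClassifier(LogisticRegression(max_iter=1000))
    stage1.fit(X_tr, Y_tr[:, sampled])

    # Stage 2: recover the remaining labels from the sampled ones alone,
    # exploiting label dependencies instead of fitting one full model
    # per left-out label on the original features.
    stage2 = MultiOutputClassifier(LogisticRegression(max_iter=1000))
    stage2.fit(Y_tr[:, sampled], Y_tr[:, rest])

    Y_rest_pred = stage2.predict(stage1.predict(X_te))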

In the second part of our investigation of dependencies, we study the 
complexities of multimodal learning, since most learning tasks involve 
several sensory modalities, such as vision and speech, which represent 
our primary channels of communication and perception. We focus on how to 
exploit modality dependencies for multimodal fusion, in order to 
integrate information from two or more modalities for better prediction.

Our aim is to understand and modulate the relative contribution of each 
modality in multimodal inference tasks by investigating input modality 
dependencies. Moreover, we propose solutions to the curse of 
dimensionality that arises from high-order integration of data from 
several sources. We make several contributions to multimodal data 
processing. First, we have investigated various basic fusion methods. In 
contrast to previous approaches, which use simple linear or 
concatenation schemes, we propose to generate an $(M + 1)$-way 
high-order dependency structure (tensor) to capture the high-order 
relationships between M modalities and the output layer of a neural 
network model. Applying a modality-based tensor factorization method, 
which adopts different factors for different modalities, removes the 
information in a modality that can be compensated for by the other 
modalities, with respect to the model outputs. Moreover, this 
modality-based tensor factorization approach helps to reveal the 
relative utility of the information in each modality and handles the 
scale issues of the problem. In addition, it leads to a less complicated 
model with fewer parameters and can therefore act as a regularizer that 
helps avoid overfitting.
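
The flavour of such a factorized fusion can be conveyed in a short 
PyTorch sketch: rather than materializing the full $(M + 1)$-way outer 
product tensor, each modality keeps its own low-rank factor, and the 
fused output is a sum of rank-one components. The class name, rank, and 
dimensions below are illustrative assumptions, not the thesis 
implementation.

    import torch
    import torch.nn as nn

    class LowRankFusion(nn.Module):
        """Rank-r fusion of M modality embeddings without building
        the full (M + 1)-way tensor of outer products."""
        def __init__(self, dims, out_dim, rank=4):
            super().__init__()
            # One factor per modality; the +1 mimics appending a constant
            # 1 to each embedding so lower-order interactions survive.
            self.factors = nn.ParameterList(
                [nn.Parameter(0.1 * torch.randn(rank, d + 1, out_dim))
                 for d in dims])

        def forward(self, inputs):  # inputs: list of (batch, d_m) tensors
            fused = None
            for x, w in zip(inputs, self.factors):
                ones = torch.ones(x.size(0), 1, device=x.device)
                x1 = torch.cat([x, ones], dim=1)           # (batch, d_m + 1)
                proj = torch.einsum('bd,rdo->bro', x1, w)  # (batch, rank, out)
                # Elementwise product over modalities is an implicit
                # outer product in the factorized space.
                fused = proj if fused is None else fused * proj
            return fused.sum(dim=1)  # sum the rank-one components

    # Toy usage: three modalities, e.g. text, audio and video embeddings.
    model = LowRankFusion(dims=[8, 4, 6], out_dim=3, rank=4)
    out = model([torch.randn(2, 8), torch.randn(2, 4), torch.randn(2, 6)])
    print(out.shape)  # torch.Size([2, 3])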

According to our investigations and experimental results, we find that 
including the dependencies in the prediction tasks leads to approaches 
with simpler models and fewer parameters, while improving the prediction 
results. We aim to turn the dimensionality challenge of big data into an 
opportunity, by extracting the dependencies in the data and using them 
as extra information to solve the prediction problems. We have shown 
that divide and conquer based on label dependencies results in a smaller 
but more accurate method compared to methods which ignore the 
dependencies. We have then shown that a small subset of the labels can 
provide substantial information about the remaining labels, so that a 
small subset suffices to perform the prediction tasks. We have also 
compared order-based dependency extraction with order-free methods, 
concluding that the order-free methods are superior: they are more 
general and more accurate, especially for larger datasets. Finally, we 
have shown that a high-order integration of the modalities captures more 
of the inter- and intra-modality dependencies, but suffers from 
polynomial growth in dimensionality. We therefore propose a fully 
differentiable framework based on tensor factorization which can be 
included in any neural-based learning method. In a nutshell, our results 
demonstrate that the dependencies between multiple inputs or outputs can 
make the problem simpler, smaller, and easier to train, by combining the 
prediction tasks with dependency-based sampling, compression, or 
clustering methods.


Date:                   Tuesday, 22 September 2020

Time:                   10:00am - 12:00noon

Zoom Meeting:
https://hkust.zoom.us/j/97120508164?pwd=ejQrcGtzT0RhNWRBRVBQZ3FDNWN5Zz09

Chairperson:            Prof. Wenjing YE (MAE)

Committee Members:      Prof. Pascale FUNG (Supervisor)
                        Prof. Ming LIU
                        Prof. Tong ZHANG
                        Prof. Daniel PALOMAR (ECE)
                        Prof. Rada MIHALCEA (University of Michigan)