CSE Theory Double-Seminar

Title:  "CSE Theory Double-Seminar"
Time/Date:  Monday March 4, 2-3PM
Venue:  Room 3520

-----------------------------------------------------------------------
Talk 1:  Polynomial Time Algorithms for Constructing Optimal AIFV Codes
Speaker: Elfarouk Harb

Abstract: Huffman codes are optimal Fixed-to-Variable (FV) codes under
the condition that every source symbol can only be encoded by one
codeword. Relaxing this constraint permits constructing better FV codes.
More specifically, recent work has shown that AIFV codes can beat Huffman
coding. AIFV codes construct a set of different coding trees between which
the code alternates and are only "almost instantaneous" (AI): decoding a
word might require a delay of a finite number of bits. Current algorithms
for constructing optimal AIFV codes are iterative processes. One iteration
step improves the current set of trees to a ``better'' set. The process
has been proven to converge to the optimal code in a finite number of
steps, but with no known bounds on the convergence time.
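
As background (not part of the talk), the Huffman baseline can be built
with the classical greedy merge of the two least-probable subtrees. A
minimal Python sketch, with an invented `huffman_code` helper:

    import heapq
    from itertools import count

    def huffman_code(probs):
        """Classical Huffman construction: repeatedly merge the two least
        probable subtrees; each merge prepends one bit to every codeword
        in the merged subtrees.  `probs` maps symbol -> probability."""
        tie = count()  # tie-breaker so the heap never compares dicts
        heap = [(p, next(tie), {sym: ""}) for sym, p in probs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p0, _, code0 = heapq.heappop(heap)
            p1, _, code1 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in code0.items()}
            merged.update({s: "1" + c for s, c in code1.items()})
            heapq.heappush(heap, (p0 + p1, next(tie), merged))
        return heap[0][2]

    # e.g. huffman_code({"a": 0.5, "b": 0.3, "c": 0.2}) assigns codeword
    # lengths 1, 2, 2 -- the optimal single-tree FV code that AIFV codes
    # can improve on by alternating between trees.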

This paper derives a geometric interpretation of the space of AIFV codes,
permitting the development of new polynomially time-bounded iterative
procedures for constructing optimal AIFV codes. For the simplest case we
show that a binary search procedure can replace the current iterative
process. For the more complicated cases we describe how to frame the
problem as a linear program with an exponential number of constraints but
a polynomial-time separation oracle. This permits using the Grötschel,
Lovász, and Schrijver ellipsoid method to solve the problem in a
polynomial number of steps.
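
Very roughly, the simplest-case binary search can be pictured as bisection
on a single trade-off parameter: a monotone quantity $h(b)$, computable in
polynomial time from "best single tree for parameter $b$" computations,
changes sign at the optimum. The skeleton below is only that generic
bisection idea with an assumed oracle `h`; it is not the procedure from
the paper.

    def bisection(h, lo, hi, eps=1e-9):
        """Generic bisection on a monotone non-increasing function h,
        assuming h(lo) >= 0 >= h(hi).  Each call to h stands in for a
        polynomial-time tree computation; this is a schematic only."""
        while hi - lo > eps:
            mid = (lo + hi) / 2.0
            if h(mid) > 0:
                lo = mid        # optimum lies above mid
            else:
                hi = mid        # optimum lies at or below mid
        return (lo + hi) / 2.0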

This is joint work with M. Golin and will be presented at DCC 2019.


---------------------------------------------------------------------------
Talk 2:  Self-Improving Sorting under a Hidden Partition
Speaker: Kai Jin

Abstract: In a self-improving computational model, we are given many
instances $I_1,I_2,...,I_t,...$ of a specified problem, each of the same
size and generated according to a fixed but unknown distribution, and we
want to compute a function $\pi(I_j)$ defined by the problem efficiently
after receiving $I_j$. For example, in the sorting problem, an instance
$I_j$ contains $n$ elements $x_1,...,x_n$ (for a fixed $n$), and the
output function $\pi(I_j)$ contains the $n$ ranks of the given elements.
We aim to find an algorithm that starts with a ``learning phase'' where it
solves several instances (and learns something about the distribution and
builds data structures for future instances) and then enters a ``limiting
phase'' where it computes $\pi(I_j)$ in optimal time (using the
preprocessed data structures).
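
A toy sketch of the two phases (an assumed simplification for
illustration, not the algorithm from the talk or from the prior work): the
learning phase estimates quantile boundaries of the unknown input
distribution from training instances, and the limiting phase reuses them
as bucket-sort pivots.

    import bisect

    def learning_phase(samples, n_buckets=64):
        """Learn approximate quantile boundaries from training instances;
        these stand in for the preprocessed data structures."""
        pooled = sorted(x for inst in samples for x in inst)
        step = max(1, len(pooled) // n_buckets)
        return pooled[step::step]

    def limiting_phase(instance, boundaries):
        """Sort a fresh instance by routing each element into its learned
        bucket, then sorting each (typically tiny) bucket."""
        buckets = [[] for _ in range(len(boundaries) + 1)]
        for x in instance:
            buckets[bisect.bisect(boundaries, x)].append(x)
        return [x for b in buckets for x in sorted(b)]

The actual algorithms build per-element search structures so that the
limiting phase matches the entropy bound discussed next; the sketch above
only conveys the learn-then-exploit structure.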

In pioneering work, Ailon et al. [JC11] designed self-improving
algorithms for the sorting problem and the Delaunay triangulation problem.
In the limiting phase, their algorithm computes $\pi(I)$ (the ranks of the
$n$ given numbers, or the Delaunay triangulation of the $n$ given points)
in expected time $O(H(\pi(I))+n)$ (with high probability), where
$H(\pi(I))$ denotes the entropy of the output $\pi(I)$. This matches the
information-theoretic lower bound. However, their algorithm only works for
``product distributions,'' where each element is generated by an
independent source.
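
For reference, the entropy in this bound is the Shannon entropy of the
random output $\pi(I)$ over the instance distribution,
$H(\pi(I)) = -\sum_{\sigma} \Pr[\pi(I)=\sigma] \log_2 \Pr[\pi(I)=\sigma]$.
Any comparison-based algorithm needs $\Omega(H(\pi(I)))$ expected
comparisons to pin down the output and $\Omega(n)$ time to read the input,
which is why $O(H(\pi(I))+n)$ expected time is optimal.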

We extend the previous self-improving sorting algorithm by allowing some
dependency among the $n$ input numbers. In particular, we assume that
$x_1,...,x_n$ are partitioned into $g$ hidden groups, that the $k$-th
group is governed by a hidden variable $z_k$, and that all the elements in
this group are hidden functions of $z_k$, where $z_1,...,z_g$ are
generated by a product distribution. We show that, under some general
assumptions on these hidden functions, we can still compute $\pi(I)$ in
optimal time in the limiting phase, as before.
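
Concretely, a toy generator for this hidden-partition model (the names
`group_of`, `f`, and `sample_z` are placeholders invented here; the model
only assumes such hidden sources and functions exist and are unknown):

    import random

    def sample_instance(group_of, f, sample_z):
        """Draw independent hidden variables z_1..z_g (a product
        distribution), then set x_i = f[i](z_{group_of[i]})."""
        g = max(group_of) + 1
        z = [sample_z(k) for k in range(g)]
        return [f[i](z[group_of[i]]) for i in range(len(group_of))]

    # e.g. elements 1,2 share source z_0 and element 3 uses z_1:
    #   group_of = [0, 0, 1]
    #   f = [lambda z: z, lambda z: 2 * z + 1, lambda z: -z]
    #   sample_instance(group_of, f, lambda k: random.random())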

This is joint work with Siu-Wing Cheng and Man-Kwun Chiu.