Hexastore: Sextuple Indexing for Semantic Web Data Management

Speaker:	Dr. Panagiotis Karras
		National University of Singapore

Title:		"Hexastore: Sextuple Indexing for Semantic Web Data
		 Management"

Date:		Friday,  3 October 2008

Time:		2:00 - 3:00pm

Venue:		Room 4502 (via lifts 25/26), HKUST


Abstract:

Despite the intense interest towards realizing the Semantic Web vision,
most existing RDF data management schemes are constrained in terms of
efficiency and scalability. Still, the growing popularity of the RDF
format arguably calls for an effort to offset these drawbacks. Viewed
from a relational database perspective, these constraints are derived
from the very nature of the RDF data model, which is based on a triple
format. Recent research has attempted to address these constraints using
a vertical-partitioning approach, in which separate two-column tables
are constructed for each property. However, as we show, this approach
suffers from similar scalability drawbacks on queries that are not bound
by RDF property value. In this paper, we propose an RDF storage scheme
that uses the triple nature of RDF as an asset.
This scheme enhances the vertical partitioning idea and takes it to its
logical conclusion. RDF data is indexed in six possible ways, one for
each possible ordering of the three RDF elements. Each instance of an
RDF element is associated with two vectors; each such vector gathers
elements of one of the other types, along with lists of the third-type
resources attached to each vector element. Hence, a sextuple indexing
scheme emerges. This format allows for quick and scalable
general-purpose query processing; it confers significant advantages (up
to five orders of magnitude) compared to previous approaches for RDF
data management, at the price of a worst-case five-fold increase in
index space. We experimentally document the advantages of our approach
on real-world and synthetic data sets with practical queries.
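
As a rough illustration of the six-ordering idea described above, the
following Python sketch keeps one nested index per permutation of
subject (s), property (p) and object (o), and answers a triple pattern
by choosing the ordering whose leading positions are the bound ones.
The class name HexastoreSketch, its methods, and the sample triples are
hypothetical and are not taken from the paper. The sketch stores six
independent in-memory copies and makes no attempt to share storage
between indexes, so it says nothing about the space bound or the
performance figures quoted in the abstract.

```python
from collections import defaultdict
from itertools import permutations


class HexastoreSketch:
    """Toy in-memory triple store: one nested index per ordering of s, p, o."""

    # The six orderings: spo, sop, pso, pos, osp, ops.
    ORDERINGS = ["".join(perm) for perm in permutations("spo")]

    def __init__(self):
        # index[ordering][first][second] -> set of third elements
        self.index = {
            name: defaultdict(lambda: defaultdict(set))
            for name in self.ORDERINGS
        }

    def add(self, s, p, o):
        """Insert one triple into all six indexes."""
        triple = {"s": s, "p": p, "o": o}
        for name in self.ORDERINGS:
            a, b, c = (triple[k] for k in name)
            self.index[name][a][b].add(c)

    def match(self, s=None, p=None, o=None):
        """Return the set of (s, p, o) triples matching a pattern.

        None acts as a wildcard. Bound positions are placed first so the
        lookup starts from the index keyed on them.
        """
        pattern = {"s": s, "p": p, "o": o}
        bound = [k for k in "spo" if pattern[k] is not None]
        free = [k for k in "spo" if pattern[k] is None]
        k1, k2, k3 = bound + free
        top = self.index[k1 + k2 + k3]

        results = set()
        firsts = [pattern[k1]] if pattern[k1] is not None else list(top)
        for a in firsts:
            mid = top.get(a, {})  # .get avoids creating empty entries
            seconds = [pattern[k2]] if pattern[k2] is not None else list(mid)
            for b in seconds:
                leaf = mid.get(b, set())
                thirds = [pattern[k3]] if pattern[k3] is not None else list(leaf)
                for c in thirds:
                    if c in leaf:
                        found = {k1: a, k2: b, k3: c}
                        results.add((found["s"], found["p"], found["o"]))
        return results


if __name__ == "__main__":
    store = HexastoreSketch()
    store.add("karras", "worksAt", "NUS")
    store.add("karras", "studiedAt", "HKU")
    store.add("hexastore", "authoredBy", "karras")

    # A query bound only by object, i.e. not bound by property.
    print(store.match(o="karras"))
    # A query bound by subject and property.
    print(store.match(s="karras", p="worksAt"))
```

A pattern bound only by object is served directly from an
object-first index here, whereas a layout keyed only by property would
have to visit every property table to answer it.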


Biography:

Panagiotis Karras is a Lee Kuan Yew Postdoctoral Fellow at the National
University of Singapore. He received a Ph.D. in Computer Science from
the University of Hong Kong and an M.Eng. in Electrical and Computer
Engineering from the National Technical University of Athens. He has
also worked and studied at the University of Zurich, at the Technical
University of Denmark, at the Institute of Language and Speech
Processing in Athens, at Schlumberger Information Solutions in Oslo, at
the University of Karlsruhe, Germany, and at the University of Patras,
Greece. His research interests are in the design and analysis of
algorithms and data structures for massive data management, data stream
algorithms, geometric and spatial data management problems, data
anonymization, and indexing methods for semi-structured data. His work
has been published in major data engineering and data mining
conferences.