Effective and Scalable Methods for Debugging Concurrent Software Systems

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Effective and Scalable Methods for Debugging Concurrent Software Systems"

By

Mr. Shaoming Huang


Abstract

Multicore is here to stay. To keep up with the hardware innovation, software 
developers are undergoing a revolution from sequential programming to 
concurrent programming. This revolution, however, is slow and challenging due 
to the exponential complexity in reasoning about concurrency. In particular, 
Heisenbugs such as data races which are non-deterministic pervasively infect 
concurrent software, making concurrent program debugging notoriously difficult.

In this dissertation, we develop several effective methods for debugging 
concurrent programs along four directions: multiprocessor deterministic replay, 
predictive trace analysis, trace simplification, and data sharing reduction. We 
first present LEAP, a lightweight record and replay system that makes 
Heisenbugs reproducible on multi-core and multi-processors. Underpinned by a 
new local-order based replay theorem, LEAP is fast, portable, and 
deterministic. As long as a Heisenbug manifests once, LEAP is able to 
deterministically reproduce it in every subsequent execution, and more 
importantly, with much lower overhead compared to previous approaches.

We second present PECAN and TraceFilter, a persuasive predictive trace analysis 
system that predicts Heisenbugs from normal executions, and an efficient 
algorithm that significantly improves the scalability of predictive analysis by 
removing the trace redundancy. The salient feature of PECAN is that, in 
addition to predict Heisenbugs, it generates "bug hatching clips" that 
deterministically expose and validate the predicted bugs. With PECAN, 
programmers are provided with the full execution history and context 
information to understand the bug, which dramatically expedites the debugging 
process.

We third present LEAN and SimTrace, a dynamic and a static technique for 
simplifying concurrency bug reproduction through removing the computation 
redundancy and validating the trace equivalence. A simplified execution with 
fewer threads, fewer thread interleavings, and faster replay greatly reduces 
the debugging effort by reducing the number of places in the trace where we 
need to look for the cause of the bug and by speeding up the bug reproduction 
process.

We finally present Privateer, an execution privatization technique that soundly 
privatizes a subset of shared data accesses in a vast category of 
scheduler-oblivious concurrent programs. Underpinned by a privatization 
theorem, Privateer safely reduces the data sharing and isolates the erroneous 
thread interleavings without introducing any additional synchronization. With 
Privateer, many Heisenbugs are fixed and a wide range of concurrency problems 
are alleviated without impairing but, instead, improving the program 
performance.


Date:			Thursday, 29 November 2012

Time:			9:00am - 11:00am

Venue:			Room 3408
 			Lifts 17/18

Chairman:		Prof. Xueqing Zhang (CIVL)

Committee Members:	Prof. Charles Zhang (Supervisor)
 			Prof. Shing-Chi Cheung
 			Prof. Lin Gu
 			Prof. Jiang Xu (ECE)
                        Prof. Thomas Ball (Microsoft Research, USA)


**** ALL are Welcome ****