Making sense of software documentation with natural language processing

======================================================================
                Joint Seminar
======================================================================
The Hong Kong University of Science & Technology
Dept. of Computer Science and Engineering
Human Language Technology Center
-----------------------------------------------------------------------
Speaker:        Dr. Christoph Treude
                University of Adelaide

Title:          "Making sense of software documentation with natural
                 language processing"

Date:           Monday, 18 April 2016

Time:           4:00pm - 5:00pm

Venue:          Lecture Theater F (near lifts 25 & 26), HKUST

Abstract:

Knowledge management plays a central role in many software development
organizations. While much of the important technical knowledge can be
captured in documentation, there often exists a gap between the
information needs of software developers and the documentation structure.
To help developers access documentation more effectively, we are
developing approaches based on natural language processing to
automatically analyze and repackage software documentation into formats
that are better suited to its readers. This talk will focus on two such
approaches. First, I will present TaskNav, a user
interface for search queries that suggests tasks automatically extracted
from documentation in an auto-complete list along with concepts, code
elements, and section headers. In a field study, we found search results
identified through extracted tasks to be more helpful to developers than
those found through concepts, code elements, and section headers. Second,
I will present SISE, a machine-learning-based approach to automatically
augment API documentation with "insight sentences" from Stack Overflow --
sentences that are related to a particular API type and that provide
insight not contained in the API documentation of that type. In a
comparative study, we found that SISE produced more sentences judged to
add useful information not found in the API documentation than any of
the several baseline approaches. These results
indicate that natural language processing can be used to analyze and
repackage software documentation automatically, and that it can help
bridge the gap between documentation structure and the information needs
of software developers.


********************
Biography:

Christoph Treude received his Diploma degree in computer
science/management information systems from the University of Siegen,
Germany, and his PhD degree in computer science from the University of
Victoria, Canada. After postdoctoral positions in Canada and Brazil, he
is now a faculty member in the School of Computer Science, University of
Adelaide, Australia. His research interests include empirical software
engineering, natural language processing, and social media.