The user interface of VoiceCoach: (a) The user panel allows users to submit a query sentence via audio or text input. (b) The recommendation view presents recommended modulation combinations at different levels. (c) The voice technique view enables users to quickly locate and compare the contexts of a specific voice modulation skill in either one-line or multi-line mode. (d) The practice view provides users with real-time, quantitative visual feedback so that they can iteratively practice voice modulation skills.


The modulation of voice properties, such as pitch, volume, and speed, is crucial for delivering a successful public speech. However, it is challenging to master different voice modulation skills. Though many guidelines are available, they are often not practical enough to be applied in different public speaking situations, especially for novice speakers. We present VoiceCoach, an interactive evidence-based approach to facilitate the effective training of voice modulation skills. Specifically, we analyze the voice modulation skills in 2623 high-quality speeches (i.e., TED Talks) and use them as the benchmark dataset. Given a voice input, VoiceCoach automatically recommends good voice modulation examples from the dataset based on the similarity of both sentence structures and voice modulation skills. Immediate and quantitative visual feedback is provided to guide further improvement. Expert interviews and a user study provide support for the effectiveness and usability of VoiceCoach.