EmoCo: Visual Analysis of Emotion Coherence in Presentation Videos
Our visualization system supports emotion analysis across three modalities (i.e., face, text, and audio) at different levels of detail. The video view (a) summarizes each video in the collection and enables quick identification of videos of interest. The channel coherence view (b) shows the emotion coherence of the three modalities at the sentence level and provides extracted features for channel exploration. The detail view (c) supports detailed exploration of a selected sentence, with highlighted features and transition points. The sentence clustering view (d) provides a summary of the video and reveals the temporal patterns of emotion information. The word view (e) enables efficient quantitative analysis at the word level in the video transcript.
Emotions play a key role in human communication and public presentations. Human emotions are usually expressed through multiple modalities. Therefore, exploring multimodal emotions and their coherence is of great value for understanding emotional expressions in presentations and improving presentation skills. However, manually watching and studying presentation videos is often tedious and time-consuming, and tool support for efficient, in-depth, multi-level analysis is lacking. Thus, in this paper, we introduce EmoCo, an interactive visual analytics system to facilitate efficient analysis of emotion coherence across facial, text, and audio modalities in presentation videos. Our visualization system features a channel coherence view and a sentence clustering view that together enable users to obtain a quick overview of emotion coherence and its temporal evolution. In addition, a detail view and a word view enable detailed exploration and comparison at the sentence level and word level, respectively. We thoroughly evaluate the proposed system and visualization techniques through two usage scenarios based on TED Talk videos and interviews with two domain experts. The results demonstrate the effectiveness of our system in gaining insights into emotion coherence in presentations.