DragVideo: Interactive Drag-style Video Editing

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Defense

Title: "DragVideo: Interactive Drag-style Video Editing"

by

WANG Ruida

Abstract:

Video generation models have shown their superior ability to generate 
photorealistic video. However, accurately controlling (or editing) the video 
remains a formidable challenge. The main issues are: 1) how to give the user 
direct and accurate control over editing; 2) how to execute edits such as changing 
shape, expression, and layout without unsightly distortion and artifacts in the 
edited content; and 3) how to maintain spatio-temporal consistency of video 
after editing. To address the above issues, we propose DragVideo, a general 
drag-style video editing framework. Inspired by DragGAN, DragVideo addresses 
issues 1) and 2) with a drag-style video latent optimization method, which 
provides the desired control by updating the noisy video latent according to 
drag instructions through a video-level drag objective function. We address 
issue 3) by 
integrating the video diffusion model with sample-specific LoRA and Mutual 
Self-Attention in DragVideo to ensure the edited result is spatio-temporally 
consistent. We also present a series of testing examples for drag-style video 
editing and conduct extensive experiments across a wide array of challenging 
editing tasks, such as motion and skeleton editing, demonstrating that DragVideo 
can edit video in an intuitive manner that is faithful to the user's intention, 
with nearly unnoticeable distortion and artifacts, while maintaining 
spatio-temporal consistency. Traditional prompt-based video editing fails at the 
first two, and directly applying image drag editing fails at the last, which 
underscores DragVideo's versatility and generality.


Date            : 8 April 2024 (Monday)

Time            : 13:30 - 14:10

Venue           : Room 5501 (near lifts 25/26), HKUST

Advisor         : Prof. TANG Chi-Keung

2nd Reader      : Dr. CHEN Qifeng