Abstract
Traditional video editing interfaces represent a video as a collection of frames laid out along a timeline, which makes object-centric manipulation laborious. We enable simple and meaningful interaction for object-centric navigation and manipulation of long shot videos by introducing operators on three high-level video semantics: background mosaics, object motions, and camera motions. We estimate the scene background and represent each object's motion as a 3D space-time trajectory. Using these trajectories as the basic interaction elements, we define several object and camera operations as simple, intuitive curve manipulations. The object operations let users perform a variety of temporal manipulations on video objects by interactively editing their trajectories. The camera operations model the camera as a movable and scalable aperture and let users simulate pan, tilt, and zoom effects by creating new camera trajectories. With several example compositions, we demonstrate that our representation and operations allow users to perform numerous seemingly complex, high-level video manipulation tasks simply and interactively.
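To make the two kinds of operation concrete, the following minimal Python sketch treats an object trajectory as a list of (x, y, t) samples on the background mosaic and the camera as a rectangular aperture over that mosaic. It is an illustration under stated assumptions, not the implementation described in this paper: the names `retime` and `aperture`, the (x, y, t) sample layout, and the linear time remapping are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# Assumed sample layout: (x, y) position on the background mosaic, t in frames.
Point = Tuple[float, float, float]


def retime(trajectory: List[Point], shift: float = 0.0,
           speed: float = 1.0) -> List[Point]:
    """Shift and linearly rescale the time axis of a space-time trajectory.

    The spatial path is unchanged; only when the object is at each
    position changes. speed > 1 makes the object move faster.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not trajectory:
        return []
    t0 = trajectory[0][2]  # time of the first sample, used as the anchor
    return [(x, y, t0 + shift + (t - t0) / speed) for x, y, t in trajectory]


def aperture(center: Tuple[float, float], zoom: float,
             base_size: Tuple[float, float]) -> Tuple[float, float, float, float]:
    """Return the crop rectangle (left, top, width, height) on the mosaic.

    Moving `center` simulates pan and tilt; increasing `zoom` shrinks
    the window, which simulates zooming in once the crop is resampled
    to the output resolution.
    """
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    width = base_size[0] / zoom
    height = base_size[1] / zoom
    return (center[0] - width / 2, center[1] - height / 2, width, height)


if __name__ == "__main__":
    walk = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (2.0, 0.0, 14.0)]
    # Delay the object by 5 frames and play it at double speed.
    print(retime(walk, shift=5.0, speed=2.0))
    # A 640x480 camera centered at (100, 50) with 2x zoom.
    print(aperture((100.0, 50.0), 2.0, (640.0, 480.0)))
```

In this sketch a new camera trajectory would simply be a sequence of `(center, zoom)` pairs, one per output frame, each converted to a crop window with `aperture`.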