Abstract
Autonomous camera systems are often subjected to an optimization/filtering operation to smoothen and stabilize the rough trajectory estimates. Most common filtering techniques do reduce the irregularities in data; however, they fail to mimic the behavior of a human cameraman. Global filtering methods modeling human camera operators have been successful; however, they are limited to offline settings. In this paper, we propose two online filtering methods called Cinefilters, which produce smooth camera trajectories that are motivated by cinematographic principles. The first filter (CineConvex) uses a sliding window-based convex optimization formulation, and the second (CineCNN) is a CNN based encoder-decoder model. We evaluate the proposed filters in two different settings, namely a basketball dataset and a stage performance dataset. Our models outperform previous methods and baselines on both quantitative and qualitative metrics. The CineConvex and CineCNN filters operate at about 250fps and 1000fps, respectively, with a minor latency (half a second), making them apt for a variety of real-time applications.