Abstract
The ability to detect and track multiple moving objects, such as people and other robots, is an important prerequisite for mobile robots working in dynamic indoor environments. We approach this problem by detecting independently moving objects in image sequences from a monocular camera mounted on a robot. We use multi-view geometric constraints to classify each pixel as moving or static. The first is the epipolar constraint, which requires the images of static points to lie on the corresponding epipolar lines in subsequent images. The second uses knowledge of the robot's motion to bound the position of an image pixel along its epipolar line. This second constraint can detect objects that move in the same direction as the camera following them, a so-called degenerate configuration in which the epipolar constraint fails. To classify pixels robustly, a Bayesian framework assigns each pixel a probability of being stationary or dynamic based on these geometric properties, and the probabilities are updated as the pixels are tracked through subsequent images. The same framework also accounts for error in the estimated camera motion. Successful and repeatable detection and pursuit of people and other moving objects in real time, using a monocular camera mounted on a Pioneer 3DX robot in a cluttered environment, confirms the efficacy of the method.
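As a rough illustration of the epipolar constraint and the per-pixel probability update described above, the following Python sketch computes the distance of a tracked pixel from its epipolar line and applies one Bayes step. It is only a sketch under stated assumptions: the function names, the Gaussian likelihood for static pixels, the uniform likelihood for moving pixels, and the parameters `sigma` and `d_max` are all hypothetical choices made here for illustration, not the models used in the paper. The second, motion-based bound along the epipolar line is not shown.

```python
import numpy as np


def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])


def fundamental_from_motion(K, R, t):
    """Fundamental matrix for camera motion X2 = R @ X1 + t and intrinsics K.

    Static points then satisfy p2^T F p1 = 0 in homogeneous pixel coordinates.
    """
    K_inv = np.linalg.inv(K)
    return K_inv.T @ skew(t) @ R @ K_inv


def epipolar_distance(F, p1, p2):
    """Perpendicular distance (pixels) from p2 to the epipolar line of p1."""
    a, b, c = F @ np.array([p1[0], p1[1], 1.0])
    return abs(a * p2[0] + b * p2[1] + c) / np.hypot(a, b)


def update_static_probability(prior, distance, sigma=1.0, d_max=50.0):
    """One Bayes update of P(static) for a tracked pixel.

    Assumed likelihoods (illustrative only): a zero-mean Gaussian with
    standard deviation `sigma` if the pixel is static, and a uniform
    density over [-d_max, d_max] if it is moving.
    """
    like_static = np.exp(-0.5 * (distance / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    like_moving = 1.0 / (2.0 * d_max)
    evidence = like_static * prior + like_moving * (1.0 - prior)
    return like_static * prior / evidence
```

Calling `update_static_probability` once per frame, with the previous posterior as the new prior, accumulates evidence along a track. Note that a pixel displaced along its own epipolar line yields zero distance here, which is the degenerate configuration that the motion-based bound is meant to handle.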