We propose an efficient method that matches a human template model to the corresponding human object in a monocular video to estimate visually pleasing depths of the person at every frame. Instead of matching the silhouette of the projected 3D human template to that of the 2D object, we propagate the partially retrieved depth toward the boundary of the object. Our system matches a given 3D template model to a person in a monocular video with a small number of user inputs.
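As a concrete illustration of the propagation step, the sketch below fills the unretrieved pixels inside the object silhouette with the depth of their nearest retrieved pixel. The function name `propagate_depth`, the nearest-neighbour fill via SciPy's distance transform, and the array layout are our assumptions for illustration, not the actual implementation described here.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def propagate_depth(depth, known, mask):
    """Fill depth inside `mask` from the sparse samples marked by `known`.

    depth : (H, W) float array, valid only where `known` is True
    known : (H, W) bool array, pixels whose depth was retrieved
    mask  : (H, W) bool array, silhouette of the human object
    """
    # Index of the nearest retrieved pixel for every location (assumes
    # at least one pixel in `known` is True).
    idx = distance_transform_edt(~known, return_distances=False,
                                 return_indices=True)
    filled = depth[tuple(idx)]             # nearest-neighbour propagation
    return np.where(mask, filled, np.nan)  # keep depth only on the object
```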
We render depth maps from the matched results and overlay them on the corresponding frames of the video. The human object is divided into several regions based on color information, and the depth pixels of each segment are propagated separately to preserve detail in the results. We compared our results with depth maps painted by experienced artists; the comparison shows that our method efficiently produces viable depth maps of humans in monocular videos.
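A minimal sketch of the per-segment propagation follows, assuming a k-means color clustering (via scikit-learn) over the object pixels and reusing the hypothetical `propagate_depth` above; the segment count, the clustering method, and all names are illustrative assumptions rather than the method's actual design.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_and_propagate(image, depth, known, mask, n_segments=6):
    """Cluster object pixels by color; fill depth within each cluster."""
    ys, xs = np.nonzero(mask)
    # Color features of the object pixels, clustered into n_segments regions
    # (6 is an arbitrary default standing in for "several regions").
    labels = KMeans(n_clusters=n_segments, n_init=4).fit_predict(image[ys, xs])
    out = np.full(depth.shape, np.nan)
    for k in range(n_segments):
        seg = np.zeros_like(mask)
        seg[ys[labels == k], xs[labels == k]] = True
        if not (known & seg).any():
            continue                    # no retrieved depth in this segment
        # Propagate only within the segment so its depth boundary stays sharp.
        filled = propagate_depth(depth, known & seg, seg)
        out = np.where(seg, filled, out)
    return out
```

Propagating within each color segment independently, rather than across the whole silhouette, keeps depth discontinuities aligned with the region boundaries instead of smearing them.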