A method and apparatus for predicting an object of interest of a user receive an input image of a visible region of the user and gaze information including a gaze sequence of the user, generate weight filters for a per-frame segmentation image by analyzing a frame of the input image for input characteristics of the per-frame segmentation image and the gaze information, and predict the object of interest of the user by integrating the weight filters and applying the integrated weight filter to the per-frame segmentation image.
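
The flow described above (generate weight filters, integrate them, apply the result to the per-frame segmentation image) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the disclosure: the two filters (a Gaussian gaze heat map and a segment-size filter), integration by element-wise multiplication, the per-segment summed score, and all function and parameter names are hypothetical.

```python
import numpy as np


def gaze_weight_filter(shape, gaze_points, sigma=5.0):
    """Hypothetical filter: sum of Gaussians centred on each gaze sample (x, y)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    heat = np.zeros(shape, dtype=float)
    for gx, gy in gaze_points:
        heat += np.exp(-((xs - gx) ** 2 + (ys - gy) ** 2) / (2.0 * sigma ** 2))
    peak = heat.max()
    # Normalise to [0, 1]; an empty gaze sequence leaves an all-zero filter.
    return heat / peak if peak > 0 else heat


def size_weight_filter(seg):
    """Hypothetical filter: down-weight pixels belonging to very large segments."""
    areas = np.bincount(seg.ravel())       # pixel count per segment label
    area_fraction = areas / seg.size       # share of the frame per label
    return 1.0 - area_fraction[seg]        # per-pixel weight via label lookup


def predict_object_of_interest(seg, gaze_points, sigma=5.0):
    """Integrate the filters, apply them to the segmentation, return the best label."""
    # Assumed integration rule: element-wise product of the weight filters.
    integrated = gaze_weight_filter(seg.shape, gaze_points, sigma) * size_weight_filter(seg)
    # Apply to the segmentation image: accumulate the integrated weight per label.
    scores = np.bincount(seg.ravel(), weights=integrated.ravel())
    return int(np.argmax(scores)), scores


if __name__ == "__main__":
    # Toy frame: label 0 is background, labels 1 and 2 are two square objects.
    seg = np.zeros((40, 40), dtype=int)
    seg[5:15, 5:15] = 1
    seg[25:35, 25:35] = 2
    gaze = [(30, 30), (29, 31), (31, 30)]  # gaze sequence as (x, y) pixel samples
    label, scores = predict_object_of_interest(seg, gaze)
    print("predicted label:", label)
```

In this sketch the segment whose pixels collect the largest integrated weight is reported as the object of interest. A multi-frame variant would repeat the same steps for each frame and its portion of the gaze sequence.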