Visual object tracking algorithm is an important problem for many applications like surveillance system, robot vision system, and augmented reality system. Particularly, in the object tracking algorithm, it is important that the algorithm is robust to occlusions, motion blurring, and object deformation. The trackers using CNN is robust to object deformations, but it is weak to occlusion and motion blurring. However, trackers using LSTM is robust to occlusion and motion blurring, but relatively weak to object deformations. In this work, we developed an multimodal object tracking algorithm that can track an object stably by using CNN and Deep LSTM, which can learn more complex features than single LSTM. We used OTB-30 dataset to evaluate our tracker, and verified that our tracker has the highest performance than the other trackers.