This paper proposes a new approach to multi-object tracking based on semantic topic discovery. We dynamically cluster frame-by-frame detections and treat objects as topics, which allows the Dirichlet process mixture model to be applied. The tracking problem is thus cast as a topic-discovery task in which the video sequence is treated analogously to a document. This formulation handles object exclusivity constraints and track management without the need for heuristic thresholds. Variation in object appearance is modeled as the dynamics of word co-occurrence and is handled by updating the cluster parameters across the sequence during the dynamic clustering procedure. We develop two visual representations, one based on superpixels and one on the deformable part model, and integrate them into the automatic topic-discovery model for tracking rigid and non-rigid objects, respectively. Experiments on public data sets demonstrate the effectiveness of the proposed algorithm.
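To make the clustering view concrete, the following is a minimal illustrative sketch and not the inference procedure of this paper. It assumes that each detection is a small feature vector, that an isotropic Gaussian likelihood stands in for the visual-word representation, and that a greedy, Chinese-restaurant-process style assignment replaces full posterior inference. The function name `track_by_clustering` and the parameters `alpha`, `obs_var`, `prior_mean`, and `prior_var` are hypothetical and chosen only for illustration.

```python
import math


def log_gauss(x, mean, var):
    """Log density of an isotropic Gaussian with variance `var` at point `x`."""
    d = len(x)
    sq = sum((a - b) ** 2 for a, b in zip(x, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq / var)


def track_by_clustering(frames, alpha=1.0, obs_var=1.0,
                        prior_mean=None, prior_var=100.0):
    """Greedily assign per-frame detections to clusters (objects).

    frames: list of frames; each frame is a list of feature vectors.
    Returns one list of cluster ids per frame.
    All parameters are illustrative assumptions, not values from the paper.
    """
    clusters = []  # each cluster: {"mean": [...], "n": number of members}
    labels = []
    for detections in frames:
        used = set()  # exclusivity: a cluster takes at most one detection per frame
        frame_labels = []
        for x in detections:
            mu0 = prior_mean if prior_mean is not None else [0.0] * len(x)
            # Score for opening a new cluster: weight alpha times the
            # prior predictive density of the detection.
            best_k = None
            best = math.log(alpha) + log_gauss(x, mu0, prior_var + obs_var)
            for k, c in enumerate(clusters):
                if k in used:
                    continue
                # Score for joining cluster k: weight n_k times its likelihood.
                score = math.log(c["n"]) + log_gauss(x, c["mean"], obs_var)
                if score > best:
                    best, best_k = score, k
            if best_k is None:
                clusters.append({"mean": list(x), "n": 1})
                best_k = len(clusters) - 1
            else:
                c = clusters[best_k]
                c["n"] += 1
                # Running-mean update moves the cluster toward the new detection.
                c["mean"] = [m + (a - m) / c["n"] for m, a in zip(c["mean"], x)]
            used.add(best_k)
            frame_labels.append(best_k)
        labels.append(frame_labels)
    return labels
```

In this sketch, new objects appear when no free cluster explains a detection better than the prior does, and the per-frame `used` set is one simple way to express the exclusivity constraint. Calling `track_by_clustering` on a short sequence of two well-separated detections per frame keeps the same two cluster ids across frames, and a distant detection appearing later opens a third cluster.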