We present a new technique for blind source separation given only a single-channel recording. The main idea is to exploit the inherent time structure of sound sources by learning, a priori, sets of time-domain basis functions that encode the sources in a statistically efficient manner. We derive a learning algorithm using a maximum likelihood approach, given the observed single-channel data and the sets of basis functions. For each time point, we infer the source parameters and their contribution factors using a flexible but simple density model. We show separation results for two music signals as well as for two voice signals.
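
To make the setting concrete, the following is a minimal sketch of the forward (generative) side of such a model only, not the separation algorithm itself. It rests on several assumptions that the text above does not state: each source frame is a linear combination of its basis functions, the coefficients follow a Laplacian density (a stand-in for the unspecified density model), and the two contribution factors sum to one. The names `synthesize_frame`, `mix`, `basis_impulse`, and `basis_cosine` are hypothetical, and the two fixed bases are placeholders for bases that would be learned from training data.

```python
import math
import random


def laplace_sample(rng, scale):
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    # Assumption: a Laplacian prior stands in for the unspecified density model.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def synthesize_frame(basis, rng, scale=1.0):
    """Return (frame, coeffs) with frame[t] = sum_k coeffs[k] * basis[k][t]."""
    coeffs = [laplace_sample(rng, scale) for _ in basis]
    n = len(basis[0])
    frame = [sum(c * b[t] for c, b in zip(coeffs, basis)) for t in range(n)]
    return frame, coeffs


def mix(x1, x2, lam1):
    """Single-channel observation y = lam1 * x1 + (1 - lam1) * x2.

    Assumption: the contribution factors are constrained to sum to one.
    """
    lam2 = 1.0 - lam1
    return [lam1 * a + lam2 * b for a, b in zip(x1, x2)]


N = 8
rng = random.Random(0)

# Placeholder bases (hypothetical): impulses for source 1, cosines for source 2.
basis_impulse = [[1.0 if t == k else 0.0 for t in range(N)] for k in range(N)]
basis_cosine = [[math.cos(math.pi * k * (t + 0.5) / N) for t in range(N)]
                for k in range(N)]

x1, s1 = synthesize_frame(basis_impulse, rng)
x2, s2 = synthesize_frame(basis_cosine, rng)

lam1 = 0.6
y = mix(x1, x2, lam1)  # the only signal a separation method would observe
```

Under these assumptions, separation amounts to inverting this process: given only `y` and the two bases, estimate the coefficients `s1`, `s2` and the factor `lam1` that make the observation most likely.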