To optimally cope with continuous speech recognizer, we propose the stochastic lexicon model that effectively represents variations in pronunciation, Zn this lexicon model, the baseform of a word is represented by subword-states with a probability distribution of subword units as a two-level hidden Markov model (HMM) and this baseform is automatically trained by sample utterances. Also, the proposed approach can be applied to systems employing nonlinguistic recognition units.