Encoding facial expression dynamics is effective for classifying and recognizing facial expressions. Most facial dynamics-based methods assume that a sequence is temporally segmented before prediction. This requires the prediction to wait until a full sequence is available, resulting in prediction delay. To reduce the prediction delay and enable prediction "on-the-fly" (as frames are fed to the system), we propose a new dynamics feature learning method that allows prediction with partial (incomplete) sequences. The proposed method exploits the readiness of recurrent neural networks (RNNs) for on-the-fly prediction, and introduces novel learning constraints to induce early prediction with partial sequences. We further show that a delay in accurate prediction using RNNs could originate from the effect that the subject's appearance has on the spatio-temporal features encoded by the RNN. We refer to this effect as "appearance bias". We propose the appearance suppressed dynamics feature, which utilizes a static sequence to suppress the appearance bias. Experimental results show that the proposed method achieves higher recognition rates than state-of-the-art methods on publicly available datasets. The results also verify that the proposed method improves on-the-fly prediction at subtle expression frames early in the sequence, using partial sequence inputs.