In this paper, we propose a photo-realistic facial emotion synthesis method that uses a novel multi-level critic network together with a multi-level generative model. We devise a new facial emotion generator containing the proposed multi-level decoder, which synthesizes a facial image with a desired variation. In generative adversarial learning, the multi-level decoder and the multi-level critic network help the generator produce facial images that are both photo-realistic and variation-realistic. The multi-level critic network consists of two discriminators: a photo-realistic discriminator and a variation-realistic discriminator. The photo-realistic discriminator determines whether the multi-resolution facial images generated from the latent feature of the multi-level decoding module are photo-realistic. The variation-realistic discriminator determines whether the multi-resolution facial images exhibit natural variation. Experimental results show that the proposed facial emotion synthesis method outperforms existing methods in terms of both qualitative performance and quantitative expression recognition performance.
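To make the role of the two discriminators concrete, the following is a minimal sketch of how a generator-side adversarial loss could be aggregated over resolution levels. It is an illustration under stated assumptions, not the loss used in this work: the function name `generator_adversarial_loss`, the non-saturating log loss, the weighting factor `variation_weight`, the conditioning of the variation critic on a target expression label, and the toy critics are all hypothetical.

```python
import math
from typing import Callable, List, Sequence

Image = List[List[float]]  # grayscale image as rows of pixel intensities

# Hypothetical critic interfaces (assumptions, not taken from the paper):
# each critic returns a probability in (0, 1).
PhotoCritic = Callable[[Image], float]           # P(image is photo-realistic)
VariationCritic = Callable[[Image, int], float]  # P(variation looks natural for the label)


def generator_adversarial_loss(
    images: Sequence[Image],
    photo_critics: Sequence[PhotoCritic],
    variation_critics: Sequence[VariationCritic],
    target_expression: int,
    variation_weight: float = 1.0,
) -> float:
    """Average a non-saturating GAN loss over all resolution levels.

    `images[k]` is the generator output at resolution level k, and the
    k-th photo and variation critics both judge that same image.
    """
    if not (len(images) == len(photo_critics) == len(variation_critics)):
        raise ValueError("expected one image and one critic pair per level")
    if not images:
        raise ValueError("at least one resolution level is required")

    eps = 1e-12  # guards against log(0)
    total = 0.0
    for image, d_photo, d_var in zip(images, photo_critics, variation_critics):
        photo_term = -math.log(max(d_photo(image), eps))
        variation_term = -math.log(max(d_var(image, target_expression), eps))
        total += photo_term + variation_weight * variation_term
    return total / len(images)


# Toy usage: two resolution levels and critics that are maximally unsure.
def toy_photo_critic(image: Image) -> float:
    return 0.5


def toy_variation_critic(image: Image, expression: int) -> float:
    return 0.5


images = [
    [[0.0 for _ in range(2)] for _ in range(2)],  # level 0: 2x2
    [[0.0 for _ in range(4)] for _ in range(4)],  # level 1: 4x4
]
loss = generator_adversarial_loss(
    images,
    [toy_photo_critic, toy_photo_critic],
    [toy_variation_critic, toy_variation_critic],
    target_expression=3,
)
```

In this sketch every resolution level contributes one photo-realism term and one variation term, so the generator receives feedback at each decoding level rather than only at the final output. How the levels and the two terms are actually weighted is a design choice that this sketch leaves open.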