In this thesis, we propose a deep recursive autoencoder-based architecture with enhanced interaction between the encoder and decoder networks to improve its performance in image generation. In the first part of the thesis, we modify the architecture of the Deep Recurrent Attentive Writer (DRAW) by replacing the RNN at the encoder with a CNN, because RNNs are difficult to use for feature learning in several spatio-temporal domains, and even in images. This is mainly because an RNN has to remember far back in time to relate pixels that are horizontally or vertically aligned. In addition, CNNs are commonly used for image processing tasks, where they give state-of-the-art performance. In the second part of the thesis, the model is further modified to increase its expressiveness and, in turn, its performance. To this end, multiple stochastic layers are introduced into the architecture, which help the model generate complex data. Moreover, the interaction between the inference and generation networks is increased by adding skip connections between the recognizer and generator networks, which makes data generation more effective. Three variants of the Ladder Deep Convolutional Recurrent Writer (L-DCRW) are proposed, each with increased interaction between the recognizer network and the generator network. The first architecture trains the network to obtain the posterior by combining the mean and variance of the recognizer network (which act as Gaussian likelihoods) with the mean and variance of the generator network (which can be considered priors). In the second architecture, skip connections between the inference network and the generation network are introduced at the higher layers of the network, so that the higher layers no longer have to capture all the information and only need to learn abstract representations. Finally, an architecture with skip connections at all layers is presented.
Furthermore, in the last chapter of this thesis, the same ladder-network idea is applied to the DRAW architecture and tested. All architectures are evaluated on the MNIST and Omniglot datasets, and the results are analyzed.
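
To make the posterior combination used in the first L-DCRW variant concrete, the sketch below shows one common way to fuse a Gaussian likelihood with a Gaussian prior: a precision-weighted average, as used in ladder-style variational autoencoders. This is an assumption for illustration; the abstract does not state the exact formula, and the function name `combine_gaussians` and its arguments are hypothetical. It operates on plain Python lists, one entry per latent dimension.

```python
def combine_gaussians(mu_rec, var_rec, mu_gen, var_gen):
    """Fuse two diagonal Gaussians by precision weighting (illustrative sketch).

    mu_rec, var_rec: mean and variance from the recognizer (likelihood term).
    mu_gen, var_gen: mean and variance from the generator (prior term).
    All arguments are equal-length lists of floats; variances must be positive.
    """
    mu_post, var_post = [], []
    for m_r, v_r, m_g, v_g in zip(mu_rec, var_rec, mu_gen, var_gen):
        prec_r = 1.0 / v_r  # precision (inverse variance) of the recognizer
        prec_g = 1.0 / v_g  # precision (inverse variance) of the generator
        v = 1.0 / (prec_r + prec_g)  # posterior variance: precisions add
        mu_post.append(v * (m_r * prec_r + m_g * prec_g))  # precision-weighted mean
        var_post.append(v)
    return mu_post, var_post


# Equal variances: the posterior mean lies halfway between the two means.
mu, var = combine_gaussians([0.0], [1.0], [2.0], [1.0])
```

Under this assumed rule, whichever network is more confident (smaller variance) pulls the posterior mean toward its own estimate, and the posterior variance is never larger than either input variance.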