The expressive power of neural networks is critical for understanding the empirical success of deep learning. In this thesis, we study the expressive power of deep and narrow networks as a dual of classical results for shallow and wide networks. First, we study the universal approximation property of deep and narrow networks. In particular, we provide the first exact characterization of the minimum width of ReLU networks required for universal approximation. Second, we study the memorization power of deep and narrow networks. In particular, we aim to characterize the number of parameters necessary for memorizing $N$ data points. We show that $O(N^{\frac23})$ parameters are sufficient for deep and narrow ReLU networks to memorize $N$ data points, whereas $\Omega(N)$ parameters are necessary for their shallow and wide counterparts. We believe that our results provide new insight into the theory of the expressive power of deep and narrow networks.