In recent years, deep neural networks (DNNs) have achieved remarkable success in image classification and natural language processing tasks, owing to their large numbers of parameters and the abundance of available training data. By stacking many layers of neurons, a DNN can represent a function well suited to the given data. The downside, however, is that DNNs with many parameters require extensive training time, even when GPUs are used. As a result, there is a growing need for fast training methods that do not rely solely on hardware performance. Parallel computation has emerged as the most promising approach to fast training, and distributed computing systems that utilize multiple GPUs have been developed. Despite their advantages, existing parallel training methods such as data parallelism and model parallelism have limitations in effectively training large-scale DNNs and often lack a rigorous mathematical justification. To address these issues, this thesis proposes a novel parallel computation technique for deep neural networks.