Purpose: To accurately separate water and fat signals acquired with a bipolar multi-echo gradient-recalled echo (GRE) sequence using a convolutional neural network (CNN). Methods: A CNN was designed and trained to learn the mapping from multi-echo images acquired with the bipolar multi-echo GRE sequence to artifact-free water-fat-separated images. The artifact-free reference images used for training were obtained by applying iterative decomposition of water and fat with echo asymmetry and least-squares estimation (IDEAL) to signals at multiple TEs, each acquired with a single-echo GRE sequence. We also propose a data augmentation method that uses synthetic field inhomogeneity to generate multi-echo signals containing a variety of bipolar multi-echo GRE artifacts, with the aim of preventing overfitting and increasing separation accuracy. The CNN was trained on in vivo knee images and tested on in vivo knee, head, and ankle images. Results: In the in vivo experiments, the proposed CNN separated water and fat images accurately. Although it was trained only on in vivo knee images, it also separated water and fat in the other imaging regions. The proposed data augmentation prevented overfitting even with a limited number of training data sets and made the method robust to magnetic field inhomogeneity. Conclusion: The proposed CNN can obtain water-fat-separated images from multi-echo images acquired with a bipolar multi-echo GRE sequence, even though those images contain artifacts caused by the bipolar gradients.
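
As an illustration of the data augmentation idea described in the Methods, the following is a minimal sketch of how multi-echo signals with synthetic field inhomogeneity might be generated from reference water and fat images. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the single-peak fat model with a fixed chemical shift of about -440 Hz (roughly 3 T), the low-order polynomial field map, the reduction of bipolar artifacts to a phase error that alternates in sign between odd and even echoes, and the function names `synthesize_bipolar_echoes` and `random_smooth_field`.

```python
import numpy as np


def synthesize_bipolar_echoes(water, fat, field_map_hz, phase_error_rad,
                              tes_s, fat_shift_hz=-440.0):
    """Simulate multi-echo signals from water and fat images (illustrative only).

    Assumed signal model for echo n at echo time TE_n:
        s_n = (W + F * exp(i*2*pi*df*TE_n)) * exp(i*2*pi*psi*TE_n)
              * exp(i * (+/-1) * theta)
    where df is an assumed single-peak fat shift, psi is the field map in Hz,
    and theta is a phase error whose sign alternates with readout polarity.
    """
    water = np.asarray(water, dtype=np.complex128)
    fat = np.asarray(fat, dtype=np.complex128)
    echoes = []
    for n, te in enumerate(tes_s):
        chemical_shift = np.exp(2j * np.pi * fat_shift_hz * te)
        off_resonance = np.exp(2j * np.pi * field_map_hz * te)
        # Even-indexed echoes get +theta, odd-indexed echoes get -theta.
        sign = 1.0 if n % 2 == 0 else -1.0
        bipolar_phase = np.exp(1j * sign * phase_error_rad)
        echoes.append((water + fat * chemical_shift) * off_resonance * bipolar_phase)
    return np.stack(echoes, axis=0)  # shape: (n_echoes, ny, nx)


def random_smooth_field(shape, max_hz, rng):
    """Hypothetical synthetic field map: a random second-order 2D polynomial
    rescaled so that its largest absolute value equals max_hz."""
    y, x = np.meshgrid(np.linspace(-1.0, 1.0, shape[0]),
                       np.linspace(-1.0, 1.0, shape[1]), indexing="ij")
    c = rng.uniform(-1.0, 1.0, size=5)
    field = c[0] * x + c[1] * y + c[2] * x * y + c[3] * x**2 + c[4] * y**2
    peak = np.max(np.abs(field))
    return max_hz * field / peak if peak > 0 else field


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (64, 64)
    water = rng.random(shape)          # placeholder reference images
    fat = rng.random(shape)
    field = random_smooth_field(shape, max_hz=100.0, rng=rng)
    theta = 0.2 * np.ones(shape)       # assumed uniform phase error in radians
    tes = [0.0012, 0.0024, 0.0036]     # assumed echo times in seconds
    echoes = synthesize_bipolar_echoes(water, fat, field, theta, tes)
    print(echoes.shape)
```

Drawing a new field map for each training sample would expose a network to many off-resonance patterns for the same underlying water and fat images. A more faithful simulation of bipolar acquisitions would also need effects this sketch omits, such as spatially varying phase errors and misregistration along the readout direction.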