This letter presents a robust data association method for fusing camera and marine radar measurements to automatically detect surface ships and determine their locations with respect to the observing ship. In the preprocessing step for sensor fusion, convolutional neural networks are used to extract object features from camera images and to semantically segment radar images. The correspondences between the camera and radar image features are optimized using a pair of geometric parameters, and the positions of all the matched object features are then determined. The proposed method enables robust data association even without accurate calibration and alignment between the camera and radar measurements. The feasibility and performance of the proposed method are demonstrated using an experimental dataset obtained in a real coastal environment.
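The abstract does not name the two geometric parameters that are optimized, so the following is only a minimal sketch of the general idea under assumed parameters: a hypothetical camera-to-radar yaw offset and a hypothetical pixels-per-degree camera scale are searched over a grid, and the candidate pair that yields the largest number of gated bearing matches between camera detections and radar object centroids is kept. All function and variable names here are illustrative, not the authors' implementation.

```python
import numpy as np

def associate(camera_bearings_px, radar_points_xy,
              yaw_offsets=np.linspace(-10.0, 10.0, 41),      # deg, assumed search range
              px_per_deg_scales=np.linspace(18.0, 22.0, 21),  # assumed camera scale range
              gate_deg=1.5):
    """Toy two-parameter data association between camera and radar detections.

    camera_bearings_px : 1-D array of horizontal pixel coordinates of ships in the image
    radar_points_xy    : (N, 2) array of radar object centroids in the own-ship frame [m]
    Returns the best (yaw_offset, scale) pair and the list of matched (camera, radar) indices.
    """
    # Bearings of radar objects relative to the own ship, in degrees.
    radar_bearings = np.degrees(np.arctan2(radar_points_xy[:, 1], radar_points_xy[:, 0]))

    best = (None, None, [])
    for yaw in yaw_offsets:
        for scale in px_per_deg_scales:
            # Map camera pixel positions to bearings under the candidate alignment.
            cam_bearings = camera_bearings_px / scale + yaw

            # Greedy nearest-neighbour gating between the two bearing sets.
            pairs, used = [], set()
            for i, cb in enumerate(cam_bearings):
                j = int(np.argmin(np.abs(radar_bearings - cb)))
                if j not in used and abs(radar_bearings[j] - cb) < gate_deg:
                    pairs.append((i, j))
                    used.add(j)

            # Keep the alignment that explains the most camera-radar correspondences.
            if len(pairs) > len(best[2]):
                best = (yaw, scale, pairs)
    return best
```

Once the matched pairs are fixed, each camera detection inherits the range and bearing of its associated radar centroid, which gives the position of the object relative to the observing ship without requiring an accurately pre-calibrated camera-radar alignment.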