Since modern autonomous driving (AD) platforms offer a variety of sensors, it is intuitive to leverage complementary data from multimodal sensors to produce reliable 3D semantic segmentation. However, due to information loss and sub-optimal fusion strategies in existing multimodal methods, LiDAR-only methods currently occupy the top positions on dataset leaderboards. In this paper, we focus on two aspects to improve LiDAR-camera fusion segmentation performance: data augmentation and the fusion strategy. First, we propose a novel data augmentation that refines point-image patches. Second, we design an attention fusion block for a dual-branch segmentation network that accounts for the modality gap between LiDAR and the RGB camera. Experiments on nuScenes show that our proposed method outperforms the baseline methods on key classes.
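The attention fusion idea can be illustrated with a minimal sketch. The abstract does not specify the block's internals, so the following is an assumed cross-attention formulation (LiDAR features as queries, projected camera features as keys/values) with a residual connection, written in plain NumPy with hypothetical shapes:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(lidar_feats, cam_feats):
    """Illustrative cross-attention fusion (not the paper's exact block).

    lidar_feats: (N_points, d) per-point LiDAR branch features
    cam_feats:   (N_pixels, d) camera branch features projected to the
                 point cloud's feature dimension
    """
    d = lidar_feats.shape[-1]
    # Scaled dot-product attention: each point attends over camera features.
    scores = lidar_feats @ cam_feats.T / np.sqrt(d)   # (N_points, N_pixels)
    weights = softmax(scores, axis=-1)
    attended = weights @ cam_feats                    # camera context per point
    # Residual fusion keeps the LiDAR signal dominant when the
    # camera context is uninformative (modality gap).
    return lidar_feats + attended
```

In a real dual-branch network the queries, keys, and values would be learned linear projections and the camera features would first be aligned to the points via the LiDAR-to-image calibration; this sketch only shows the attention-weighted mixing step.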